What Data Leaves Your Device When You Use Apps On A Self-Hosted Server?

Jul 1

Less than the cloud, but not nothing. Self-hosting keeps the content (files, photos, messages, AI prompts) on hardware you control. What still leaves is connection-level traffic almost any device makes: DNS lookups, certificate checks, time sync, and the server name in TLS handshakes, plus app update checks and any push or federation you turn on. Most of it is an architecture choice you can shrink.

The first time you point a DNS log or a packet capture at your own self-hosted box, you get a small, useful shock. You moved your files off the cloud, stood up your own server, told yourself the data stays home now. Then you watch the traffic, and the box is still talking. Not loudly, and not your photos, but it is reaching out: a lookup here, a certificate check there, a steady tick to a time server somewhere.

The instinct is to feel cheated, like the privacy promise was a lie. It was not. What you are seeing is the difference between two very different kinds of leaving, and once you can tell them apart, the picture gets clearer instead of scarier. This post is the honest wiring diagram: what leaves a self-hosted server, what stays, and why nearly everything that still leaves is a decision you can change rather than a policy you have to trust. Self-hosting did not make you invisible. It was never going to. The point is what it did do instead.

Does Self-Hosting Mean Nothing Leaves Your Network?

No, and any honest answer starts there. Self-hosting moves your content (files, photos, messages, AI prompts) onto hardware you control, so the high-value payload stops leaving. But the device underneath still does DNS lookups, certificate checks, and time synchronization, and the operating system may phone home on its own. Self-hosting silences the application data layer, not the network itself.

Before any app is involved, a machine on the internet emits a predictable set of outbound connections. These are wiring facts, not threats, and naming them is the difference between an honest map and a marketing claim.

DNS lookups. Every time your server resolves a hostname (to reach an update server, a federation peer, a time pool, anything) it emits a DNS query. Unless encrypted, that query is visible to whoever runs the resolver and to anyone on the path. This is exactly why DNS logs are the standard tool for seeing what a device is doing: they reliably reveal telemetry endpoints, ad traffic, and chatty software.
Certificate revocation checks. When a client opens an HTTPS connection, it may check whether the server's certificate has been revoked. The Online Certificate Status Protocol (OCSP) does this by querying a responder, typically over port 80, for certificates in the chain. That check tells the responder, often the certificate authority, which site you are connecting to and when.
Time synchronization. Devices keep their clocks correct by reaching out to NTP servers. It is low-sensitivity traffic, but it is continuous, and TLS itself depends on it: OCSP responses are time-stamped, and a skewed clock makes valid certificates look invalid.
Operating-system telemetry. The host OS and its components phone home on their own schedule, independent of your apps.

Pi-hole deployments make the scale of this background noise visible, and the honest detail is that the junk is concentrated, not evenly spread. Across a whole network the blocked share can look modest: one documented sample blocked 8,081 of 268,793 queries in 48 hours, roughly 3 percent. But the average hides where the noise lives. A streaming or smart-TV device of the Roku class has been measured with roughly two-thirds of its DNS lookups going to ad and telemetry endpoints, while a working desktop sits closer to 1 percent. Self-hosting an app does not silence the device or the OS beneath it. It silences the application-layer data flow, the part that contains your content.
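The arithmetic behind those shares is easy to reproduce against your own query log. Here is a minimal sketch, assuming a simplified log in which each record is a (client, hostname, blocked) tuple. The record shape, the client names, and the example hostnames are hypothetical and are not Pi-hole's actual export format.

```python
from collections import Counter

def blocked_share(records):
    """Return {client: (blocked, total, share)} for (client, hostname, blocked) records."""
    totals = Counter()
    blocked = Counter()
    for client, _hostname, was_blocked in records:
        totals[client] += 1
        if was_blocked:
            blocked[client] += 1
    return {c: (blocked[c], totals[c], blocked[c] / totals[c]) for c in totals}

# Network-wide figure quoted in the text: 8,081 blocked out of 268,793 queries.
network_share = 8081 / 268793
print(f"network-wide blocked share: {network_share:.1%}")

# Hypothetical per-client records, only to show how an average can hide one noisy device.
sample = (
    [("tv", "ads.example", True)] * 2
    + [("tv", "video.example", False)]
    + [("desktop", "telemetry.example", True)]
    + [("desktop", "docs.example", False)] * 99
)
for client, (b, t, share) in sorted(blocked_share(sample).items()):
    print(f"{client}: {b}/{t} blocked ({share:.0%})")
```

Grouping by client rather than by network is the design choice that matters here, because it is the only view that shows which device is producing the noise.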

What Do Self-Hosted Apps Send on Their Own?

Mostly two things: update checks and optional telemetry. A self-hosted app still wants to know if it is out of date, so it checks a release endpoint. Mature projects increasingly ship telemetry off by default and let you choose what, if anything, reports. The difference from the cloud is that here the outbound call is a setting you own and can watch, not the product itself.

The cleanest illustration is Immich, the self-hosted photo platform. Its automatic version check originally sent HTTP requests straight to github.com to read the releases of the Immich repository. The maintainers moved that check to their own dedicated endpoint, version.immich.cloud, specifically to avoid every instance contacting GitHub and to stop hammering GitHub with requests. The change shipped through public pull requests, numbers 18572 and 27450. That is the whole topic in miniature: the outbound call did not disappear, but a project that treats egress as a design concern got to choose where it goes and minimize what it reveals.

Telemetry follows the same pattern when a project is disciplined about it. Nextcloud is the reference: its telemetry is disabled by default, and the administrator chooses what an installation reports, if anything. (Nextcloud separately prompts you to turn on a feedback or usage-statistics app, the source of the recurring "Help Nextcloud" notifications people mention. That prompt is opt-in, not silent collection.) The contrast with cloud SaaS is the point. In the cloud, telemetry is the product and you cannot turn it off. In a well-behaved self-hosted app, it is a setting you own.

That word "well-behaved" is doing real work. Not every self-hosted project is disciplined about its outbound calls, and the only way to be certain is to watch the traffic with a DNS log or an egress monitor. Self-hosting at least makes that possible. You cannot packet-capture a cloud provider's servers. You can capture your own.
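One way to make "watch the traffic" routine is to compare the hostnames your box actually resolves against the ones you expect it to contact. This is a minimal sketch, assuming you have already pulled a list of resolved hostnames out of your DNS log. The allowlist is illustrative: version.immich.cloud comes from the Immich example above, while pool.ntp.org and the "unexpected" hostname are placeholders for whatever your own setup uses.

```python
def unexpected_hosts(observed, allowed_suffixes):
    """Return observed hostnames that match none of the allowed domain suffixes."""
    def normalize(host):
        return host.lower().rstrip(".")

    def is_allowed(host):
        # Match the domain itself or any subdomain of it.
        return any(host == s or host.endswith("." + s) for s in allowed_suffixes)

    return sorted({normalize(h) for h in observed if not is_allowed(normalize(h))})

# Hypothetical allowlist for a small photo server.
EXPECTED = {"version.immich.cloud", "pool.ntp.org"}

observed = ["version.immich.cloud", "2.pool.ntp.org.", "metrics.vendor.example"]
print(unexpected_hosts(observed, EXPECTED))
```

Anything the function returns is not automatically a problem, but it is a hostname you should be able to explain.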

What Metadata Leaks Even When the Payload Is Encrypted?

The fact and target of a connection, even when its contents are sealed. Encryption protects what you send, not that you sent it or to whom. Unencrypted DNS reveals the hostname; the TLS SNI field has historically named the server in plaintext; the destination IP is visible to your ISP; and traffic timing and volume can profile activity. Encrypted DNS and Encrypted Client Hello shrink this, but coverage is still partial.

This is the part that survives "but it is all HTTPS," so it is worth walking through each piece alongside the lever that shrinks it. Plain DNS sends the hostname in the clear. Encrypting it with DoH or DoT is necessary but not sufficient on its own, because other parts of the connection still reveal the destination. The lever is real but partial, which is the shape of this whole section.
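To see what "in the clear" means, it helps to build a DNS query by hand. The sketch below constructs a minimal query message in the standard DNS wire format and then wraps the same bytes in the GET form defined for DNS over HTTPS. The resolver URL is a placeholder, and nothing is sent over the network. The base64url step is only an encoding; the confidentiality in DoH comes from the HTTPS connection that carries it.

```python
import base64
import struct

def build_dns_query(hostname, qtype=1, query_id=0):
    """Build a minimal DNS query message (RFC 1035 wire format) for one hostname."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QTYPE (1 = A), QCLASS (1 = IN)

def doh_get_url(resolver_url, query):
    """Wrap the same message as an RFC 8484 GET URL (base64url, padding stripped)."""
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"

query = build_dns_query("version.immich.cloud")
print(query)  # the labels are readable in the raw bytes a plain DNS lookup would send
print(doh_get_url("https://resolver.example/dns-query", query))  # placeholder resolver
```

The query bytes are identical in both cases. What changes is the envelope: plain DNS puts them on the wire as they are, while DoH sends them inside a TLS session to a resolver you chose.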

The TLS SNI field is the sharper example. Even with TLS 1.3, which encrypts most of the handshake, the Server Name Indication that names the server you are reaching has historically been sent in plaintext in the ClientHello. Any network observer (your ISP, a network admin, an on-path actor) can read it and learn which domain you are contacting, even though they cannot read the content. This is not theoretical: South Korea began inspecting SNI to enforce site blocking in February 2019.
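You can reproduce what an on-path observer sees without touching the network. The sketch below uses Python's standard ssl module with in-memory buffers to generate the first handshake message a client would send, then parses the server name straight out of the raw bytes. The hostname is a placeholder, the function names are my own, and the parser is deliberately minimal: it assumes a well-formed ClientHello in a single TLS record and does no hardening against truncated input.

```python
import ssl

def make_client_hello(hostname):
    """Produce the raw bytes of a ClientHello without opening a socket."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client has written its hello and is waiting for a server
    return outgoing.read()

def extract_sni(data):
    """Return the SNI hostname from a raw TLS ClientHello record, or None."""
    if len(data) < 6 or data[0] != 0x16 or data[5] != 0x01:
        return None  # not a handshake record carrying a ClientHello
    pos = 5 + 4                    # record header, then handshake type + 3-byte length
    pos += 2 + 32                  # legacy version + client random
    pos += 1 + data[pos]           # session id
    pos += 2 + int.from_bytes(data[pos:pos + 2], "big")  # cipher suites
    pos += 1 + data[pos]           # compression methods
    end = pos + 2 + int.from_bytes(data[pos:pos + 2], "big")
    pos += 2
    while pos + 4 <= end:
        ext_type = int.from_bytes(data[pos:pos + 2], "big")
        ext_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        body = data[pos + 4:pos + 4 + ext_len]
        if ext_type == 0:          # server_name extension
            # body: list length (2), name type (1), name length (2), name
            name_len = int.from_bytes(body[3:5], "big")
            return body[5:5 + name_len].decode("ascii")
        pos += 4 + ext_len
    return None

hello = make_client_hello("photos.example.org")
print(extract_sni(hello))
```

No keys and no decryption are involved. The parser only skips over fixed-layout fields, which is why any device on the path can do the same thing.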

The fix is Encrypted Client Hello, which encrypts the SNI and closes that last handshake gap, and its status is the part you have to state with care. ECH is real and deploying. Cloudflare enabled it by default for its customers in late 2023, and Firefox and Chrome added support in October 2023. But coverage is still thin. As measured in 2025, roughly 4.2 percent of the top 100,000 sites and 9.2 percent of the top 1,000,000 advertised ECH support, and server support remains concentrated at Cloudflare. ECH has also been actively blocked in Russia, from November 2024, as well as in Iran and China. So the SNI fix exists, but assuming it is universally on would be wrong.

Two residues do not fully close. The destination IP is inherently visible to your ISP, and timing and volume patterns can profile activity even when names and contents are hidden. These are the irreducible cost of using a network at all. This still matters for the self-hosting case because the connection metadata that remains is largely metadata about your own infrastructure (your server reaching an update endpoint, a time pool, or a federation peer), not a continuous stream describing your behavior to a company whose business is your behavior. The volume drops and the sensitivity drops. But "drop" is the accurate verb, not "eliminate."

What Leaves When You Add Push, Federation, and Lookups?

This is where a self-hosted setup quietly touches third parties. Mobile push generally routes through Apple (APNs) and Google (FCM) via a push proxy. Federation, by design, exposes the communication metadata between servers. And media apps fetch external metadata keyed to the title you add. Each is a feature with a cost, and each has an architecture that reduces what leaves.

A self-hosted app in isolation is easy to keep local. The egress shows up when you add the conveniences that made the cloud version attractive. Three are worth naming precisely.

Mobile push notifications are the single most common place a self-hosted setup touches Apple and Google. A self-hosted server cannot push directly to a stock mobile OS, so it hands the notification to a push proxy that forwards it to APNs or FCM, which deliver it to the device. Mattermost and Zulip both document running your own push proxy for exactly this reason, and Nextcloud routes its mobile notifications through a Push Proxy service rather than letting every server contact Google and Apple directly. The nuance is real: by default the notification content can pass through APNs or FCM, and the delivery token must be shared with them.

The first lever is an ID-only mode, where only an opaque message ID transits APNs or FCM and the device fetches the actual content directly from your server, so Apple and Google see that a message exists but not what it says. The fuller answer on Android is UnifiedPush, with a distributor such as ntfy or NextPush: a decentralized, open protocol that lets you receive push without Google's FCM at all, pointed at a server you choose. The caveat survives here too. Some apps still reach fcm.googleapis.com even under UnifiedPush depending on the build, and on iOS, fully bypassing APNs is not realistic for most apps. So these levers reduce what Apple and Google see; they do not zero it out on every app.
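The ID-only idea is simple enough to sketch. The code below is an illustration of the pattern, not the payload format of Mattermost, Zulip, Nextcloud, or any other project: the function name and field names are hypothetical. It splits one notification into the part that would transit the relay and the part that stays on your server for the device to fetch.

```python
import json
import secrets

def build_push_payloads(message_text, sender):
    """Return (relay_payload, server_record) for an ID-only push.

    relay_payload is what would be handed to the push relay (APNs or FCM).
    server_record stays on your own server until the device asks for it by ID.
    """
    message_id = secrets.token_urlsafe(16)  # opaque, random, carries no content
    relay_payload = {"id": message_id}
    server_record = {"id": message_id, "sender": sender, "text": message_text}
    return relay_payload, server_record

relay, record = build_push_payloads("Dinner at 7?", "alice")
print(json.dumps(relay))   # everything the relay gets to see
print(json.dumps(record))  # what the device later fetches directly from your server
```

The relay still learns that a notification was sent and when, which is the residue the text describes. What it no longer sees is the sender or the message.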

Federation is a feature, and features have a cost. A self-hosted Matrix homeserver keeps your account and message content under your control, and end-to-end encryption protects the message bodies. But federation is, by design, servers talking to other servers, and the metadata of that conversation is exposed between homeservers: who talks to whom, when, from which domain, plus device IDs and timestamps. End-to-end encryption conceals the contents, not the social graph. If you only ever talk to people on your own server, none of this leaves. The moment you federate, the metadata does. That is a deliberate trade worth naming plainly, rather than implying that end-to-end encryption makes a federated server fully private.

External metadata lookups are the quietest of the three. Self-hosted media apps often enrich your library from public databases. Jellyfin, for example, fetches movie and show metadata and artwork from providers such as TMDB, and optionally TheTVDB, fanart.tv, or AniDB. That means an outbound request keyed to the title you are adding. You can avoid it by supplying local .nfo metadata and artwork, in which case Jellyfin reads locally and the lookups stop. So the egress here is opt-out and content-specific, a query about a single title, not a stream of your viewing behavior. It is not nothing, but it is small and it is yours to switch off.
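Writing local metadata can be scripted. This is a minimal sketch, assuming the Kodi-style NFO layout (a movie root element with title, year, and plot children) and a file named movie.nfo in the film's folder. The function name and sample values are hypothetical, it writes into a temporary directory rather than a real library, and it does not touch any Jellyfin library settings.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def write_movie_nfo(folder, title, year, plot=""):
    """Write a minimal Kodi-style movie.nfo into folder and return its path."""
    movie = ET.Element("movie")
    ET.SubElement(movie, "title").text = title
    ET.SubElement(movie, "year").text = str(year)
    if plot:
        ET.SubElement(movie, "plot").text = plot
    path = Path(folder) / "movie.nfo"
    ET.ElementTree(movie).write(str(path), encoding="utf-8", xml_declaration=True)
    return path

# Demonstration in a throwaway directory standing in for one movie folder.
with tempfile.TemporaryDirectory() as library:
    nfo = write_movie_nfo(library, "Home Movie", 2021, "Family trip footage.")
    print(nfo.read_text(encoding="utf-8"))
```

After adding files like this, it is worth confirming with a DNS log that the provider lookups for those titles have actually stopped.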

What Genuinely Stays Local (the Part That Matters Most)?

The content, which is the whole point. On a self-hosted setup, file and document contents, photo and video libraries, message bodies, and AI prompts and outputs stay on your hardware, and so does the bulk of behavioral telemetry. The continuous stream describing what you opened and when, the thing cloud SaaS collects as its product, is simply not generated for a third party.

It would be dishonest to catalog the leaks without being equally precise about what stops leaving, since that is the substance of the self-hosting case.

File and document contents. Files in a self-hosted Nextcloud or Seafile live on your disk. The sync payload moves between your devices and your server, not through a provider that can read, index, or train on it.
Photo and video libraries. A self-hosted Immich keeps the actual images and their analysis, the faces and places, on your hardware. There is no cloud copy being scanned.
Message bodies. On a self-hosted chat server, message content stays on your server, and with end-to-end encryption it is unreadable even to the server admin. (As the last section noted, federation metadata is the exception, not the message content.)
AI prompts and outputs. When a model runs on your own machine, your prompts are processed locally and never transmitted to a third-party AI service. This is the cleanest "no pipe" case, because the inference itself happens on-device.
The bulk of behavioral telemetry. The continuous record of what you opened, when, for how long, and in what order is simply not generated for a third party, because there is no third party in the loop for the core app function.

The structural contrast is the takeaway. In the cloud model, your content is the input to someone else's system, and your privacy depends on that company's policies and incentives. In the self-hosted model, your content is the input to your own system, and the only things that leave are the connection-level residues from the earlier sections, most of which you can shrink further.

Which Remaining Leaks Are Choices, Not Trust?

Nearly all of them. Each residual leak has an architectural answer that does not require trusting a privacy policy: run your own encrypted DNS resolver, enable ECH where supported, use ID-only or UnifiedPush for notifications, connect your devices with WireGuard instead of a hosted tunnel, and supply local metadata to stop title lookups. In the cloud you reduce exposure by reading a policy and hoping; self-hosted, you reduce it by changing the wiring.

The reason this whole topic resolves to privacy-as-architecture is that the residues each have a lever, and pulling the lever is a structural change, not a promise you are asked to believe.

Leak	Architecture lever
DNS metadata	Run your own resolver and encrypt DNS (DoH/DoT). Query names stop being exposed on the path, and no third-party resolver logs you. A self-hosted resolver gives a zero-logging relationship you can verify.
SNI metadata	Enable ECH where it is supported. Partial coverage today, but a structural fix to the last handshake leak rather than a policy.
Push notifications	Use ID-only push, so Apple and Google relay an opaque ID and not the content, or UnifiedPush with a self-hosted distributor to remove or minimize the third-party relay.
Remote access	Connect your own devices with WireGuard, where the keys live on your machines and no third-party service relays your traffic or holds your connection metadata. Tailscale adds a coordination server that knows device IPs and topology but cannot decrypt the traffic; Headscale is the self-hosted coordination plane that removes even that.
External lookups	Supply local metadata (for example Jellyfin .nfo files) to stop the title queries.

The pattern is the entire argument. In the cloud, you reduce exposure by reading a policy and hoping the company means it. Self-hosted, you reduce exposure by changing the wiring, and then you can confirm it with a packet capture on your own network. That is the difference between a privacy promise and privacy by structure.

This is where the Companion Intelligence approach lives. On a Companion Core, the AI and the apps run on hardware you own, so the content (your prompts, files, photos, messages) has no pipe to leave through. That guarantee is architectural, not a policy you are asked to trust. The connection-level residues this post documents are the honest remainder, and CI leans on the same levers rather than a promise: a self-hosted resolver with encrypted DNS, Tailscale for the owner's private remote access, and a Cloudflare Gateway that provisions per-app HTTPS addresses for reaching your apps. If you want the longer version of why cloud convenience quietly compounds its costs, we wrote about the hidden costs of cloud convenience, and on doing this yourself, running your own AI without the cloud covers the adoption side.

The honest line is the one worth keeping. Self-hosting does not make you invisible on the internet. It makes the part that describes you stay home, and it turns the rest into decisions you can inspect. If that trade is the one you have been trying to think through, the people comparing real setups, real DNS logs, and real push configurations are in the Discord, and that is the room where this stops being theory.

Frequently Asked Questions

Does self-hosting mean no data leaves my network?

No. Self-hosting stops your content from leaving: files, photos, message bodies, and AI prompts stay on hardware you control. But the device still makes baseline connections any networked machine makes: DNS lookups, certificate revocation checks, and time synchronization, and the operating system may phone home on its own. Self-hosting silences the application data layer, not the existence of the network.

Can someone see which sites my server connects to even if everything is HTTPS?

Often yes. HTTPS hides the content, not the connection. Unencrypted DNS reveals the hostname you are resolving, and the TLS SNI field has historically named the server in plaintext in the handshake, so an ISP or on-path observer can learn the domain. Encrypted DNS (DoH/DoT) and Encrypted Client Hello close most of this gap, but the destination IP remains visible, ECH coverage is still partial, and ECH is blocked in some countries.

Do self-hosted apps still send telemetry?

Some do, but on your terms. Most apps check a release endpoint for updates. Mature projects ship telemetry off by default and let you choose what reports: Nextcloud telemetry is disabled by default, and Immich moved its version check to a dedicated endpoint to avoid every instance contacting GitHub. Unlike the cloud, where telemetry is the product, here it is a setting you own and can watch with a DNS log or egress monitor.

Does using a mobile app with my self-hosted server send data to Apple or Google?

By default, usually yes, for push notifications. A self-hosted server cannot push directly to a stock phone, so it hands the notification to a push proxy that forwards it to Apple (APNs) or Google (FCM). ID-only mode reduces this to an opaque message ID, with the device fetching the content from your server, and on Android UnifiedPush can avoid Google's FCM entirely. On iOS, fully bypassing APNs is not realistic for most apps, and some Android apps still reach FCM even under UnifiedPush.

How can I check what my self-hosted server is sending?

Watch the traffic. A DNS log (for example a Pi-hole or your own resolver) shows every hostname your server resolves, and an egress monitor or packet capture on your own network shows the outbound connections directly. This is the core advantage of self-hosting over the cloud: you cannot packet-capture a cloud provider's servers, but you can capture your own. The architecture makes inspection possible, though doing it is work most people skip.