Encrypting Internal Networks
Since writing this post, I've run across a number of examples of other companies and organizations doing the same thing. While setting this up for a home network isn't quite the same as setting it up for an office, seeing prior interest in the idea gives me some confidence — and reminds me that most ideas aren't new.
I'm in the middle of building a set of distributed backup servers for my family. These are devices that get plugged into their networks. Other devices on the network can then pair with the backup device and send it files, which it compresses, encrypts, and backs up to a remote server. Files are then synchronized across devices, Dropbox-style.
My family isn't particularly technical. My parents unplug their router every night to save energy. My siblings have multiple IoT devices that take up common IP addresses. And everyone is spread across multiple operating systems.
When these devices are deployed, I'm not going to be there to manually find their IPs. I'm also not going to reset them whenever one of my family members moves. So I need each device to be able to handle setting its own hostname whenever it connects to a network. It needs to be short, memorable, and consistent. It also needs to work on every one of their devices — no editing hostnames on individual devices, or hoping that their router supports managing this.
I also want these devices to communicate over SSL, because I don't trust the security of the networks they'll be sitting on. I'm not guarding against every piece of malware; if someone's computer gets a virus, fine. I just don't want literally everyone behind the same NAT to be able to look at what they're backing up. Obviously, importing self-signed certificates or setting up a certificate manager on their devices is completely out of the question.
I'm going to be talking about how we can build a system that meets all of these requirements. But first, let's go over a few potential objections to the requirements themselves.
SSL is overkill, your NAT is the security
How many third-party IoT devices do you have sitting on your home network right now? How many of them regularly connect to external sites? Are you sure that they're not vulnerable to malware? Do you have a smart TV (i.e., any modern TV) run by an amoral company that spies on you? Do you trust your TV not to packet-scan your network? When guests come over to your house, do you put them on a guest network instead of your personal wifi, in case their devices are compromised? Are you certain that your guest network is actually isolated from your private network?
NATs are an important part of security, but we want to be responsible and practice defense in depth. We're all adults who realize that network security is a continuum, and we care about adult things like DNS rebinding attacks.
Keep in mind, I'm also not just worried about security on curated networks that I control. I'm worried about security on networks that I don't control. I want to be able to walk into a coffee shop, connect a device to that public network, and trust that every single communication I have with that device will be encrypted.
LAN device communication should happen through a centralized, remote server
The first problem here is that maintaining external servers adds extra complexity to my architecture. I don't like complicated things. Complicated interfaces raise my risk of security vulnerabilities. They make it harder for me to quickly iterate or test changes. They make deployment into a chore.
They're also more expensive. Part of the point of keeping my infrastructure local is that it costs less. I know I'll experiment more if I'm not running as many cost calculations when I'm planning a project. A low-powered Raspberry Pi can run on pennies a day. A Linode server costs, at a minimum, $5 a month.
Finally, I disagree that it is beneficial for users to be sending large amounts of data over the Internet. I want devices that can still be controlled even when you lose broader connectivity. In this particular instance, I am synchronizing data across multiple networks. But many of my future projects won't send any sensitive data off the network, ever — including user commands.
I will be using a few remote calls to pull this off, but they'll be coordinated using an extremely small, scalable server that will be cheap and reusable across multiple projects. It will never send commands to any of my devices, so the risk of a remote attacker using it to gain access to my local devices will be basically nil.
The theory
DNS is arguably a bad system for the overall web, but it has some advantages: it works, it's pretty reliable, and there are tons of free providers. Let's use that.
There is no rule that says a DNS record can't point to a local IP address, and I'm certainly not the first person to take notice of that. Linksys routers, for example, regularly take advantage of DNS rerouting, but because they're… well… routers, they usually just set themselves as your DNS server to do so.
We're going to take advantage of Cloudflare's free DNS services to make our stuff work. We'll buy one domain name for $10 a year, and we'll use subdomains for all of our little hacky projects. For large commercial ventures we'll need to set up an enterprise account or roll our own DNS resolver. But for personal projects, this will be entirely sufficient.
All of the DNS settings will be set by the device itself whenever it connects to the Internet. In short, this means that whenever we plug a device into the network we'll be able to use a regular, publicly registered domain name to connect to it. Every device will get its own subdomain; nothing will be shared.
Importantly, because we're using individual domains, we can now add per-device SSL encryption. For obvious reasons, LetsEncrypt does not issue SSL certificates for intranet domains. But it will issue SSL certificates based on DNS records. We can use them to verify ownership of our domain, and then point that domain at an internal IP address.
So once we set up DNS, the plan is to set up the device or local service to automatically connect to LetsEncrypt and manage its own encryption. This means that we won't ever need to store or transfer certificates around, which would be its own potential violation of LetsEncrypt's terms (and just a bad idea in general).
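To make the first half of that plan concrete, here is a minimal Python sketch of how a device might work out its own LAN address and describe the DNS record it wants. This is an illustration under assumptions, not the tooling I'm actually running: the function names, the `backup-1.example.com` hostname, and the dictionary keys are all hypothetical, and the record is only built locally, not sent to any DNS provider.

```python
import ipaddress
import socket


def local_ip() -> str:
    """Best-effort guess at the address this machine uses on its LAN."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets. It only asks the OS
        # which local address it would route from. 192.0.2.1 is a reserved
        # documentation address, so nothing real is ever contacted.
        sock.connect(("192.0.2.1", 9))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, the device is offline).
        return "127.0.0.1"
    finally:
        sock.close()


def a_record(subdomain: str, zone: str, ip: str, ttl: int = 60) -> dict:
    """Describe an A record pointing subdomain.zone at ip.

    The key names are illustrative, not a confirmed provider schema.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("A records hold IPv4 addresses")
    return {
        "type": "A",
        "name": f"{subdomain}.{zone}",
        "content": str(addr),
        "ttl": ttl,
    }


if __name__ == "__main__":
    print(a_record("backup-1", "example.com", local_ip()))
```

A short TTL is a guess at a sensible default here, since the address is expected to change whenever the device moves.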
The point
Serving our apps off of an auto-updating external subdomain instead of directly from an IP address means:
- Anyone on the correct network can connect to our service from any device without memorizing an IP address.
- We can plug our device into (nearly) any network and connect to it just the same, even if the IP address changes.
The big benefit here is the overall plug-and-play nature of the setup. Nobody has to install anything. As soon as they're connected to the network, everything just works.
If we're serving every single one of our internal sites with their own externally issued SSL certificates:
- New members on our network can connect securely without going through warning prompts or installing anything on their devices.
- No one else on the same network will be able to intercept or modify our traffic, meaning we’ll be able to guarantee at least a small degree of safety even on potentially hostile networks.
- If anyone else tries to pretend to be our service on a separate network, users who connect will get a giant warning because the certificates won’t match.
That last point is particularly important. Our certificates will be generated locally, on-device. They will never leave that device, and the device itself will handle proving to LetsEncrypt it has control of its subdomain.
If we were connecting to an unsecured IP address, any other network we connected to would have the ability to place their own service at that IP address. We might be able to see that we were on the wrong network and avoid browsing to that IP, but any cron jobs, background processes, or already opened browser tabs wouldn't be as smart. With SSL, those same connections will abort or throw warnings if they're served an invalid certificate.
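As a small illustration of that last claim, here is roughly what a background script gets from Python's standard library when it connects with default settings. The helper names are my own and the hostname you pass in is whatever your device registered; the point is only that the default context requires a valid certificate and checks that it matches the hostname.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Default client context: certificates are required and hostnames checked."""
    return ssl.create_default_context()


def peer_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host and return its validated certificate.

    Raises ssl.SSLCertVerificationError if the certificate is untrusted,
    expired, or issued for a different hostname, so an impostor on another
    network fails here instead of receiving our data.
    """
    ctx = make_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    ctx = make_context()
    print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A script only loses this protection if it goes out of its way to disable verification, which is exactly what self-signed setups tend to encourage.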
A few messy details
There are multiple ways we can handle setting DNS records. Every service on our network needs to be able to check its own IP address and update its DNS records to point to it. In order to keep SSL completely localized, the device also needs to be able to complete DNS challenges on its own.
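For reference, the DNS challenge itself is simple enough to sketch. Under the ACME specification (RFC 8555), the device proves control of a name by publishing a TXT record at `_acme-challenge.<domain>` whose value is the unpadded base64url encoding of the SHA-256 digest of the key authorization. The Python below computes just that value; the function names are my own, the example key authorization is made up, and a real ACME client would do this for you.

```python
import base64
import hashlib


def dns01_record_name(domain: str) -> str:
    """Name of the TXT record the ACME server looks up for domain."""
    return f"_acme-challenge.{domain}"


def dns01_txt_value(key_authorization: str) -> str:
    """TXT value for a DNS-01 challenge.

    key_authorization is the challenge token, a dot, and the base64url
    thumbprint of the account key, as built by the ACME client.
    """
    digest = hashlib.sha256(key_authorization.encode("utf-8")).digest()
    # base64url without the trailing '=' padding
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    # Placeholder values, purely for illustration.
    print(dns01_record_name("backup-1.example.com"))
    print(dns01_txt_value("example-token.example-thumbprint"))
```

The private key never appears in this exchange, which is why the device can complete the challenge without anything sensitive leaving it.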
The first way we can do this is just connect the device directly to Cloudflare. This is by far the worst method, because it means that any one compromised device will give an attacker access to the full domain. That would compromise every subdomain (and thus every single service) we were deploying. We want to allow services to update their own records, but we don't want services to be able to update records for anyone else.
The better way to move forward is to set up a separate service that privately handles connections to Cloudflare, and that protects devices from each other.
When the service hands out a new subdomain, it should also hand out some kind of a domain-specific authentication token for future updates. All of the security best practices for keys and passwords apply here.
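Here is one way the token side of that service could look, as a minimal in-memory Python sketch. The `SubdomainBroker` class and its methods are hypothetical, and a real service would need persistent storage, rate limiting, and the actual calls to the DNS provider. It assumes random per-subdomain tokens, stores only their hashes, and compares them in constant time.

```python
import hashlib
import hmac
import secrets


class SubdomainBroker:
    """Hands out one token per subdomain and checks it on later updates."""

    def __init__(self) -> None:
        # subdomain -> SHA-256 hex digest of its token (never the token itself)
        self._token_hashes: dict[str, str] = {}

    @staticmethod
    def _hash(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def register(self, subdomain: str) -> str:
        """Claim a subdomain and return its token. The token is shown once."""
        if subdomain in self._token_hashes:
            raise ValueError("subdomain already claimed")
        token = secrets.token_urlsafe(32)
        self._token_hashes[subdomain] = self._hash(token)
        return token

    def authorize(self, subdomain: str, token: str) -> bool:
        """True only if token is the one issued for this exact subdomain."""
        expected = self._token_hashes.get(subdomain)
        if expected is None:
            return False
        return hmac.compare_digest(expected, self._hash(token))


if __name__ == "__main__":
    broker = SubdomainBroker()
    kitchen = broker.register("kitchen")
    broker.register("attic")
    print(broker.authorize("kitchen", kitchen))  # the device's own record
    print(broker.authorize("attic", kitchen))    # someone else's record
```

Because a token is only ever valid for the subdomain it was issued with, a compromised device can rewrite its own record and nothing else.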
DNS caching is a thing. Cloudflare claims to be able to push out a DNS update within a few seconds. It's possible that there are private networks or setups where DNS records are cached for far longer. My guess is that for most people, this problem will be rare. But it could be a problem.
Finally, it's worth reiterating one last time: we want to be using a separate domain and a separate SSL certificate for literally every service. You absolutely should not be putting a wildcard certificate on your domain. That would be a violation of LetsEncrypt's policies, and it would just be plain insecure.
The whole point is that an attacker shouldn't be able to gain access to our entire network by infecting one device. So if we're not going to be issuing brand new subdomains and certificates on every device, then why are we going to any of this trouble in the first place?
Where next?
This post has been heavy on theory but light on details.
I've verified that this works, and am in the process of moving over parts of my own internal network to this architecture. Assuming that the theory holds, I'll eventually release the tools I'm using under Open Source licenses. At that point, I'll write a more technical post breaking down exactly how everything works.
If you've noticed a mistake or a security risk in this article, please let me know. I don't want to push infrastructure onto my network that makes me less secure.
In the absence of credible objections or concerns, my goal is to make it trivial for literally every single server within my NAT to be running SSL from its own domain. I think automation is an important part of this goal. For a local deployment, I should be able to run one command to provision a new subdomain and generate a new SSL certificate.
As I move more and more towards self-hosting local services, it's important for me to treat internally hosted services with a much greater level of care and respect. The problems I attempt to address here are an important step towards a world where intranet services are first-class citizens alongside public servers.