Sherlock Holmes and Dr. John Watson strolled through my neighborhood on a pleasant September evening. What case they were working I have no notion, but I distinctly heard this exchange from my upstairs window:
“Small enough house, this,” said Watson, “and the yard is a horror, but the trim’s lately painted and the gutters new. Minor civil servant?”
“An academic librarian,” Holmes declared confidently, “with a professional interest in institutional repositories and, hm, quite likely other library technology as well. Possibly even one of Michael Gorman’s ‘blog people.’ Dried fruit is a favorite breakfast food, kept in a Whirlpool refrigerator—odd, that; I’d have thought this one a bit of a conservationist.”
“Remarkable, Holmes!” exclaimed Watson. “However did you determine so much without even a sight of the house’s inhabitants?”
“Elementary, my dear Watson,” said Holmes, drawing a Raspberry Pi out of his pocket. “I merely sniffed DNS queries on the house network.”
The Domain Name System (DNS) is the system that matches IP addresses to domain names. For example, as I type this, the domain dsalo.info is mapped to the IPv4 address 126.96.36.199. Any device needing to know this—to browse this website, for instance—sends a “DNS query” containing the domain name into the system, which (through rather roundabout means; here is a good explainer if you care to know more) determines and returns the corresponding IP address.
Here’s the thing. DNS queries and responses presently travel over the Internet in the clear, unencrypted. A teensy little Raspberry Pi computer attached to a network can indeed sniff that network for them! This allows the Holmeses of this world to learn quite a lot about the Internet behavior of the sniffed network’s denizens, even if every single website they visit is HTTPS-encrypted. It’s much like the NSA’s cell-phone “metadata”: content is unavailable, but there’s plenty to be learnt without it.
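To make “in the clear” concrete, here is a minimal Python sketch of what a classic, unencrypted DNS query looks like on the wire, and how little work it takes to read the domain name back out of it. The function names build_dns_query and sniff_domain and the query ID 0x1234 are illustrative choices, not part of any library, and the sketch assumes a plain single-question query with no name compression. It sends nothing over the network; it only builds and inspects the bytes.

```python
import struct


def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build the raw bytes of a standard DNS query for an A (IPv4) record."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each dot-separated label is prefixed with its length,
    # and the whole name ends with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.strip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


def sniff_domain(packet: bytes) -> str:
    """Recover the queried domain from raw query bytes, as an eavesdropper could."""
    labels = []
    pos = 12  # the DNS header is always 12 bytes long
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1 : pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


query = build_dns_query("dsalo.info")
print(query)                # the labels "dsalo" and "info" are readable as-is
print(sniff_domain(query))  # prints dsalo.info
```

No key, password, or cleverness is involved: the domain name sits in the packet as ordinary ASCII text, which is why a sniffer needs nothing more than access to the network.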
The Not-so-Secret Royalty of the Internet is presently trying to fix this mess. Last I looked, there were three different proposals in the hopper for securing DNS queries from random detectives and other malfeasors. Though none of them is receiving unanimous acclaim, DNS-over-HTTPS appears to be something of a frontrunner due to ease of implementation.
Whichever proposal wins out, I would hope libraries would have the sense to implement it as soon as practical.
Below is the list of domain names on which Holmes based his deductions. I got the list by turning Wireshark loose to packet-capture my own home network one morning when nobody else was home, then saving the sniffed traffic as a “packet-capture” (colloquially “pcap”) file and running Wireshark’s Statistics --> Resolved Addresses command on it. By all means see if you can reproduce Holmes’s deductions! As I tell my students, I’m a hypocrite: I deplore the Internet of Things, yet I have an internetted Thing in my very own home.
# Resolved addresses found in /Users/Dorothea/Dropbox/Courses/510/ExamplePCAP.pcap
# 72 entries.