Teaching adversarial thinking

In case you missed it: A couple of months ago a law prof brought on the wrath of academic Twitter by suggesting that students spend a week eavesdropping on the conversations of others to listen for people betraying their own security and privacy, a thing that people quite commonly do. Some of academic Twitter, self included, was initially entranced, until other parts of academic Twitter asked whether casual snoops (or even not-casual snoops) were really an okay thing to turn our students into? Especially when many of our students are still so unaware of the workings of privilege that snooping can take on exceptionally sinister overtones when applied to certain populations?

So the initially-entranced folks, self included, backed off our enthusiasm, and the furor seems to have mostly died down. I, however, am still stuck with a pedagogical problem: as an instructor in introductory information security, I actually do have to teach people to snoop on, and even attack the privacy and security of, other people and systems. I know that sounds horrifying. I know it does! And it definitely gets into some pretty dark-gray gray areas. But stick with me just a bit longer while I explain.

Over a longish period of information-security work, it’s become clear that the only way to have any confidence at all that a system (in the large sense, so not just “a technological system” but “a technosocial system, emphatically including the people involved or even enmeshed in it”) is secure or private (not, of course, the same thing) is to test it by attacking it:

  • To test whether deidentification suffices to anonymize a dataset (spoiler: it rarely if ever does), researchers try to reidentify one or more people in it, often by linking it against other available data that overlaps with it. See, for example, the Narayanan and Shmatikov paper that doomed the Netflix recommender-system contest; a toy sketch of this kind of linkage attack follows this list.
  • To test the security of a given piece of software, you ultimately try to break it. Yes, there are tools (e.g., “Google dorks,” “vulnerability scanners,” “fuzzers,” even Shodan) to locate obvious or common problems, but they’re not enough. A creative, lateral-thinking human being is much better at finding exploitable holes in code than a computer is. (A bare-bones fuzzing sketch also appears after this list.)
  • To prioritize and test for holes in systems (again, “system” writ large), you first think like an adversary: what are the crown jewels in this system, and how would someone who wants them attack it? This is called “threat modeling,” and thinking-like-an-adversary is a crucial part of it; without that, you end up defending against what Bruce Schneier calls “movie-plot threats” while ignoring the gaping problems right under your nose (as, for example, Equifax certainly did). A key insight in threat modeling, captured concisely in this xkcd cartoon, is that your enemies attack with the simplest method likely to work.
  • And once you have your threat model, you test how well your system resists it by, well, attacking your system where you’ve identified it as potentially vulnerable! This often takes the form of “penetration testing,” which can be done on physical systems, social systems, technological systems (such as networks or software), or any combination of the three. My favorite example of a pentest that goes after all three types of system is this absolutely astounding Twitter thread, which I use in my intro course, and after which I named the class’s messing-around server “Jek.”
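
To make the reidentification bullet concrete, here is a toy sketch of the simplest flavor of linkage attack: join a “deidentified” table to an auxiliary dataset on quasi-identifiers. This is not the actual Narayanan-and-Shmatikov technique (which exploits sparse, high-dimensional rating data); every name, column, and record below is invented for illustration.

```python
# Toy linkage attack: a "deidentified" dataset still carries quasi-identifiers
# (ZIP code, birth year, sex) that can be joined against auxiliary data in
# which people appear under their real names. All records here are made up.

deidentified = [
    {"record_id": 1, "zip": "53703", "birth_year": 1989, "sex": "F", "diagnosis": "asthma"},
    {"record_id": 2, "zip": "53715", "birth_year": 1972, "sex": "M", "diagnosis": "diabetes"},
]

# Auxiliary data an attacker might scrape or buy (voter rolls, LinkedIn, etc.).
auxiliary = [
    {"name": "Alice Example", "zip": "53703", "birth_year": 1989, "sex": "F"},
    {"name": "Bob Example",   "zip": "53715", "birth_year": 1972, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

def reidentify(deid_rows, aux_rows):
    """Link each 'anonymous' record to auxiliary rows sharing its quasi-identifiers."""
    hits = []
    for row in deid_rows:
        key = tuple(row[q] for q in QUASI_IDENTIFIERS)
        matches = [a["name"] for a in aux_rows
                   if tuple(a[q] for q in QUASI_IDENTIFIERS) == key]
        if len(matches) == 1:            # a unique match means the record is reidentified
            hits.append((matches[0], row["diagnosis"]))
    return hits

print(reidentify(deidentified, auxiliary))
# [('Alice Example', 'asthma'), ('Bob Example', 'diabetes')]
```

The point is the familiar one: removing names doesn’t remove identifiability when a handful of ordinary attributes is enough to single someone out.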

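Likewise, here is roughly what “tools find the obvious or common problems” looks like in miniature: a bare-bones mutation fuzzer throwing mangled inputs at a deliberately buggy, invented parser. Real fuzzers (AFL, libFuzzer, and friends) are far more sophisticated about coverage, but the shape is the same, and none of them will spot the kinds of logic flaws a creative human will.

```python
import random

def fragile_parser(data: bytes) -> None:
    """A deliberately buggy stand-in for real parsing code (purely illustrative)."""
    if len(data) < 4 or data[:2] != b"HD":
        raise ValueError("not an HD blob")          # expected rejection, not a bug
    declared_len = data[2]
    payload = data[4:]
    # Bug: trusts the declared length byte instead of the actual payload size.
    checksum = sum(payload[i] for i in range(declared_len)) % 256
    if checksum != data[3]:
        raise ValueError("bad checksum")            # also an expected rejection

def mutate(seed: bytes) -> bytes:
    """Flip a few random bytes in a known-good input."""
    blob = bytearray(seed)
    for _ in range(random.randrange(1, 4)):
        blob[random.randrange(len(blob))] = random.randrange(256)
    return bytes(blob)

# A well-formed seed: header, declared length 3, correct checksum, payload "abc".
SEED = b"HD\x03" + bytes([sum(b"abc") % 256]) + b"abc"

def fuzz(trials: int = 10_000) -> list:
    """Throw mutated inputs at the parser; keep anything that crashes it outright."""
    crashes = []
    for _ in range(trials):
        blob = mutate(SEED)
        try:
            fragile_parser(blob)
        except ValueError:
            pass                                    # rejected cleanly: fine
        except Exception:
            crashes.append(blob)                    # unexpected crash: worth investigating
    return crashes

print(f"{len(fuzz())} crashing inputs found")
```

The seed-and-mutate loop is the whole trick; everything clever in real fuzzers goes into deciding which inputs and mutations are worth keeping.
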
So I can’t get around it. If I’m to prepare students to take information privacy and security seriously, never mind enter actual infosec and privacy careers, I have to show them how to think like a Garbage Human (which is how I often phrase it in class), and I have to show them how to attack systems (writ large). How do I do this without turning them into Garbage Humans themselves?

This isn’t exactly a new problem in infosec, of course; the learn-to-defend-by-attacking paradox is the earth out of which Certified Ethical Hacker, CIP{M|T|P}, and similar tech-plus-thinking-about-law-and-ethics certifications grew. It’s not even a new problem generally—if we were to strip academe of everything that could be used to Garbage Human, how much of academe would be left? (Yes, yes, plenty of computer scientists claim that computer science would be left. Those computer scientists are wrong, wrong, wrong, wrong, wrong about that.)

What I ended up doing, because I felt more than a little bad about accepting the law prof’s assignment idea so uncritically, was going back through my syllabus, assignments, and class slides looking for how I’d approached gray areas and put guardrails around students’ growing potential for Garbage Humanning. What I found actually fell into a rather small number of techniques:

  • Clearly and often laying out stuff that’s either illegal or so Garbage Humanny that it should be. For example, I use altering physical mail envelopes as an analogy to various address-spoofing attacks… but I also explicitly point out that mail tampering is amazingly illegal in the US and they shouldn’t do it. In person in the classroom, I am not at all shy about labeling certain practices Garbage Human territory.
  • Giving copious examples of how real people and organizations have been harmed by attack techniques. I can’t control whether my students use what I teach them to Garbage Human. I can control whether they can reasonably use the excuse “I didn’t know this could hurt anybody!” and I definitely try to.
  • When students in my class perform actual reconnaissance, attack, or forensics maneuvers, they’re doing it on me, on themselves (a good habit to get into! and certainly how I prep any assignment where they’ll be looking at me or my data), or on canned datasets created for the purpose (yes, I use the Greg Schardt/Mr. Evil dataset, for lack of one that’s more recent). They’re not doing it on unwitting and possibly-extra-vulnerable targets. Again, the techniques they’re learning absolutely can be repurposed for Garbage Humanning—but I’m clear that I don’t want them doing that, and I don’t give them any actual practice kicking down.
  • Keeping the emphasis on “attack to defend” throughout. They’re not learning adversarial thinking and attack techniques to turn into Garbage Humans, but to equip themselves to defend themselves, their loved ones, and those for whom they are in some way responsible against the depredations of Garbage Humans.
  • Being open about my own dilemmas vis-à-vis Garbage Humanning. For example, I am unbelievably tempted to pull a Narayanan-and-Shmatikov on the Minnesota learning-analytics dataset, the one from several Soria, Nackerud, et al. publications. Even though I don’t actually have that dataset (and don’t want it, good gravy, what a terrifying responsibility), I’d bet Large Sums of Money that knowing the cohort entry year (which, yes, they published) is enough all by itself to find some folks in the dataset via LinkedIn or a date-bracketed Google dork against the University of Minnesota’s website, and I might even be able to find some folks in their painfully-low-n outlier groups. Possible? Unequivocally, without question. I’m not even good at reidentification and reconnaissance techniques, and I am sure I can do this. Ethical? … Well, that’s a tough one, which is why I haven’t actually done it.

Is this enough? I don’t know. I’m certainly still kicking the problem around in the back of my head, because if I can do better than I’m doing, I want to.