Unizin Not-Common-Knowledge Data Model

I’m doing a talk this Tuesday for a campus IT conference. Should be a good time, for certain values of that phrase. I’ll post a link to the slides here afterwards.

While writing the talk (I’m one of those dorks who scripts out talks word-for-word, though I do it in the slangy, choppy rhetorical style I actually talk in; academese is not my speech register and I don’t pretend it is), I ran across the Unizin Common Data Model, which, if I understand correctly, underlies the Unizin Data Platform, a giant data warehouse for student data. The platform will hold data from all Unizin member institutions.

To Unizin’s credit, they have a data dictionary publicly available, though every time I’ve tried to get just the table listing (or ERD?) it hasn’t worked. Still, the list of column/attribute names is there, and this list is a swift and daunting education in student Big Data.

See for yourself by all means, but here are some specific parts of the data dictionary I suggest looking at:

  • Everything from the Incident and IncidentPerson tables (conveniently, the table name is the first column in the data dictionary and is how the dictionary is ordered), especially the RelatedToDisabilityManifestationInd column
  • the LearnerAction and LearnerActivity tables, noting for the record that hashing the LearnerID is not anything like a sufficient privacy guarantee
  • the Person table and related tables, which are detailed to an extent that gives me nightmares
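On the LearnerID point, here is a rough Python sketch of why a bare hash doesn't hide much when the thing being hashed comes from a small, guessable set. Everything specific in it is my assumption for illustration: the six-digit ID format, the choice of unsalted SHA-256, and the function names. I don't know what hashing scheme Unizin actually uses.

```python
import hashlib

# ASSUMPTION: learner IDs are six-digit numbers. This format is hypothetical.
ID_DIGITS = 6


def hash_id(learner_id: str) -> str:
    """Unsalted SHA-256, standing in for whatever hash is really used (assumption)."""
    return hashlib.sha256(learner_id.encode("utf-8")).hexdigest()


def build_lookup(candidate_ids):
    """Hash every plausible ID once and map each hash back to its ID."""
    return {hash_id(candidate): candidate for candidate in candidate_ids}


# Enumerate the whole (small) ID space: 000000 through 999999.
all_ids = (f"{n:0{ID_DIGITS}d}" for n in range(10 ** ID_DIGITS))
lookup = build_lookup(all_ids)

# A "de-identified" record carries only the hashed ID...
hashed = hash_id("042317")

# ...but anyone holding the lookup table can reverse it immediately.
recovered = lookup.get(hashed)
print(recovered)
```

A salted or keyed hash would defeat this particular table, but the same learner would still get the same pseudonym on every row, so their records stay linkable to each other, and detailed activity patterns can be identifying all by themselves.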

Have fun asking yourself why on earth a learning-management system needs to know all this… and considering the Equifax-level horror if it is ever breached.

Kanopy and Elsevier: united in password mishandling?

My introductory information-security course contains both undergraduates and iSchool graduate students. Every once in a while I get to drop in a library- or archives-specific tidbit, and today (the first class meeting after Spring Break), I had two among all the other news:

Shortly after the Kanopy breach broke, Jessamyn West passed on a very important question from Dan Turkel to Kanopy on Twitter: “Are you [Kanopy] storing user passwords in plaintext?”

Let’s back up and examine that question a moment, shall we?

“Plaintext” is information-security jargon for “not encrypted.” “Encrypted,” for our purposes, means “changed such that the original data cannot easily (or ideally at all) be figured out.” So, when Elsevier actually broadcast passwords in plaintext to all and sundry via some web dashboard, it disobeyed one of the fundamental best practices in infosec. If Kanopy was storing its passwords in plaintext, that’s just as bad.

(How do you know if a user’s password is correct, if you can’t store it figure-outably? Well, you know exactly how you changed it. When the user enters their password, you just change it the same way you originally changed the stored password, at which point you can compare the results.)
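For the curious, that store-and-compare trick can be sketched in a few lines of Python. The usual name for the one-way "change" is hashing, and it is normally combined with a random per-user value called a salt. This is a minimal sketch, not a recommendation: the function names are mine, and the algorithm (PBKDF2 from Python's standard library) and the iteration count are illustrative assumptions. I have no idea what Kanopy or Elsevier run.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# ASSUMPTION: illustrative work factor only; check current guidance before relying on it.
ITERATIONS = 600_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Store both; never store the password itself."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Change the submitted password the same way, then compare the results."""
    _, candidate = hash_password(password, salt)
    # Constant-time comparison, so timing doesn't leak how close a guess was.
    return hmac.compare_digest(candidate, stored_digest)


# At signup: keep only the salt and the digest.
salt, stored = hash_password("correct horse battery staple")

# At login: redo the same change and compare.
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("Tr0ub4dor&3", salt, stored))
```

The salt means two users with the same password end up with different stored values, and the deliberately slow hash makes bulk guessing expensive for anyone who steals the database.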

Nobody is supposed to store passwords in plaintext! Ever! Much less broadcast them in plaintext to all and sundry on a web dashboard! (What you are supposed to do with them is… complicated, and keeps changing as password-cracking software and hardware improve. Check with your favorite infosec expert, okay? And consider multi-factor authentication.) So what Turkel was asking Kanopy boils down to “okay, you were caught being careless; exactly how careless were you?”

Kanopy never answered, at least not on Twitter. This… does not exactly inspire confidence. Nor has Elsevier’s post-incident public relations on Twitter, which as best I can tell has substantially amounted to “it wasn’t that bad!” “everybody else has breaches too!” and similar sad, disingenuous deflections of responsibility. There are best practices in handling security incidents—perhaps unsurprisingly, infosec refers to them by the term “incident response.” These are not them.

I hope to have more to say about incident response in time, because it’s a thing more libraries will find themselves stuck doing—including when our vendors should but don’t—and the first step is always “have a plan for it.”