A quick-and-dirty review of Metadata for Digital Collections, 2nd ed.

I was incredibly grateful to the libraries’ interlibrary loan folks during pandemic lockdown for getting me a scan of chapter 10 (“Designing and Documenting a Metadata Scheme”) of the first edition of Steven J. Miller’s Metadata for Digital Collections. I can and do bricolage a lot of things in my syllabi, but there just is no substitute for this chapter, not anywhere.

That said, that was the only chapter of the book I used during the pandemic, because its first edition, while fine for its day, was getting (shall I say) rather long in the tooth, such that bricolage offered better options. (I am not the sort of instructor who rushes reflexively to pick out a textbook anyway. I am extremely opinionated about instructional design; many textbooks rub me the wrong way. Some, of course, are just garbage from the word go.) So I definitely perked up my ears (and asked for a review copy) when the second edition came out last year.

Verdict: Not perfect, but aside from one truly distressing and Seriously Not Okay topic omission (discussed below), pretty good. Good enough that yes, I’m going to require it despite its cost and the issues I have with it. The unity of voice, careful audience-aware prose, and exceptionally useful apparatus (glossary, crosswalks, plentiful sample records) are often enough an improvement over my usual bricolage. The irreplaceable Chapter 10 of the first edition is Chapter 12 of the second, and it’s as irreplaceable as ever.

That said, it ain’t perfect. Notable imperfections, one major, the rest minor:

  • There’s no discussion whatever—seriously, not a single word—about inclusion in metadata or controlled vocabularies, which these days is… well, straight-up, it’s unacceptable. (Cis white male authors gonna cis white male, I fear. Where were the reviewers and editor here? This shouldn’t have got past them.) Bricolage can fill this gap, however; there’s plenty of brilliant work out there.
  • I’m not sad METS is gone (was it in the first edition, even? I don’t recall), but I am a little sad there’s not a practical discussion of how systems handle file chunking and ordering, because they do, they just mostly do it with filenames or system-specific setups, not metadata. A brief mention of PCDM and the Oxford Common File Layout might not go amiss, even.
  • Calling the RDF serialization N-Triples “a subset of Turtle” is accurate while conveying absolutely no useful information whatever. What students need to know is that N-Triples is a commonly-available output serialization for stored RDF online—if you click on a random website link to a linked-data representation of something, you’re most likely getting N-Triples—and its syntax is very verbose and restrictive.
  • There are a few relatively minor technical gaffes, nothing dealbreaking. (The XML declaration is optional in XML 1.0, y’all! An XML document without it should parse just fine.)
  • I think the coverage of Qualified Dublin Core is somewhat out of proportion to its actual importance in the field.
  • I think (based on a lot of classroom experience) that the book mis-sequences RDF and XML. The grotesquely clunky design of XML namespace declarations is a lot easier for students to assimilate once they’re used to Turtle’s cleaner, clearer @prefix declarations. That said, Miller is careful enough to sandbox chapters in this book that it’s quite possible to assign the RDF chapter out-of-sequence, and that is what I plan to do.
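
For anyone who wants to check the XML-declaration point from the list above, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. The record, the `title_of` helper, and the "Sample photograph" title are all made up for illustration; only the Dublin Core namespace URI is real. It parses the same tiny record with and without the declaration and pulls out the `dc:title` value either way.

```python
import xml.etree.ElementTree as ET

# The real Dublin Core elements namespace URI.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A hypothetical record; note there is no <?xml ...?> declaration at the top.
WITHOUT_DECLARATION = (
    f'<record xmlns:dc="{DC_NS}">'
    "<dc:title>Sample photograph</dc:title>"
    "</record>"
)

# The same hypothetical record with the declaration prepended.
WITH_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n' + WITHOUT_DECLARATION


def title_of(xml_text: str) -> str:
    """Parse the record and return the text of its dc:title element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # ElementTree expands the dc: prefix to {namespace-URI}localname.
    return root.find(f"{{{DC_NS}}}title").text


print(title_of(WITHOUT_DECLARATION))
print(title_of(WITH_DECLARATION))
```

The same sketch touches the sequencing point too: the `xmlns:dc="..."` attribute does the job that `@prefix dc: <http://purl.org/dc/elements/1.1/> .` does in Turtle, and the parser quietly expands `dc:title` to the full namespace-plus-name form. That expansion is the part students tend to find easier once they have already seen prefixes in Turtle.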

My most longwinded (sorry) issue with the book is its near-obsessive OCLC bootlicking. In this our year 2023, CONTENTdm is the worst available digital-collections demo option for a textbook; it’s an obsolete proprietary crap sandwich, never mind that it belongs to likely the most evil, odious pseudo-non-profit in libraryland (and believe you me, there are several contenders for that dubious crown). Of the remaining CONTENTdm customers I know, not one is happy with it; most are contemplating or actually implementing a migration off it. Omeka exists (and is far simpler for a harried instructor to let the class kick the tires on than CONTENTdm is; it’s a one-click install at my webhost). AtoM exists. Hell, DSpace exists. There are even a couple-three new contenders (one of which I actually plan to take a look at). The choice to advertise CONTENTdm is just embarrassing. So is the choice to feature OCLC’s little-known, barely maintained, and largely useless OAIster.

In short: mostly-well-done update, solid choice for a metadata classroom; needs a chapter on inclusion, a few minor fixes, and the total cessation of toadying to OCLC.