What You’re Up Against

There’s a lot of blue-sky thinking in institutional repositories (IRs) right now. It’s usually what informs the planning process—and lest we forget, planning processes don’t usually include librarians or consultants who have actually done the work to run an IR.

When you hear these things, look out! Because this is the reality.
  • You hear: “The software is free!” The reality: The software is useless.
  • You hear: “Faculty have all their work just lying around, and they’ll love to have a place to store it!” The reality: If it’s important, it’s been published. If it’s not, they don’t know where it is and don’t care.
  • You hear: “It’s essential infrastructure!” The reality: Nobody needs these things.
  • You hear: “The metadata’s easy, just Dublin Core!” The reality: The word “metadata” scares faculty.
  • You hear: “We’re fixing scholarly communication!” The reality: Publishers have us outgunned with our own faculty.
  • You hear: “All we have to do is set it up! Maybe a little marketing… faculty will do the rest!” The reality: Faculty won’t hear about it, understand it, or do anything with it.
  • You hear: “All we need is a mandate!” The reality: We can’t tell faculty what to do.

Faculty draw a careful line between what’s important and what isn’t, and they don’t leave important stuff just lying around—or they don’t think they do, anyway. It’s really no different than the print world. When a faculty member retires, does he come by with tidy files? No, with big ugly dusty unorganized boxes! His computer is no different! As for “essential infrastructure”—I’d love to see someone try to get Cliff Lynch to repeat that. I suspect he’s already embarrassed by it.

What’s worse, there is no good repository software. There is no good repository vendor service. By “good” I mean “something that faculty are going to find useful and flock to on their own initiative.” It’s all broken. And why is it broken? It’s broken because the fundamental ideology on which we opened these services was deeply mistaken. What did we believe? We believed in our own little field of dreams: “if you build it, they will come.”

This was a nice idea! I love this idea! But it was wrong. We built it. They aren’t coming. I’m sure all of you here have read as much of the literature as I have, so this probably isn’t news! But it still is news to our library administrators, to the top thinkers in open access, even to the very programmers who make our software! The thing is, though, none of them gets blamed when uptake is low. That’s on us.

If I sound frustrated, well, I am. I won’t lie to you, this is a really hard business to be in right now. It will be until academic libraries reframe some expectations and think a little bit harder, and that’s not happening yet. Part of the reason I’m here is that I want it to.


So let’s talk about our faculty, and why they aren’t flocking to our services. A few of them are genuinely on our side, and we should hug them and love them and call them George and otherwise appreciate them, because they are the real true believers. Peter Suber, for example, is doing his level best to evangelize and counteract the fear, uncertainty, and doubt emanating from many quarters.

Physics and computer science are light-years ahead of the rest of us. The trick there is, they’re not engaged with us. They’re engaged with their disciplines and their disciplinary repositories. This is only natural; it’s what we should expect. But the build-it-and-they-will-come ideology didn’t allow for it, and so we don’t have a plan to work with it. Without that plan, they ignore us, because we have nothing for them.

Then there’s anthropology. You have to love these people. Their professional society sells out to Wiley/Blackwell, but do they go gently into that good toll-access night? They do not. They raise hell, they start new journals, they make themselves heard, and bless them for it.

Faculty senates are ratifying calls for faculty to retain some of their intellectual-property rights, and that’s great; if your institution hasn’t, get your library administration lobbying for it. Even this, though—none of these IP-rights documents as I’ve read them has been even symbolically linked with an IR. There’s no answer to the question “okay, I have my rights, now what?” Nobody is tracking what happens when the addenda get used. Nobody is systematically collecting materials covered by the addenda. So ideologically speaking this is great, a real step forward—but in pragmatic terms, it doesn’t help repositories at all.

These faculty, these activist faculty are, unfortunately, the few among their peers. Most faculty are hanging out on benches asleep waiting for the bus to pass by.

And this is because faculty don’t have the same concerns we do. Faculty do not think about pricing when they submit articles hither and yon! What they care about is their reputation in their field. They care very much about the health of their scholarly societies, which is another reason to be very, very careful when talking about open access to them. A lot of societies put all their cashflow eggs in the journal-subscription basket, and they’re in a very poor position to move to open access. Those are problems that are very salient to faculty. Library problems? Not so much. To some extent, that’s our fault; we’ve done a little too well at insulating them from the cost of journals.

A lot of faculty simply do not realize what their publishing agreements are making them sign over. They think they own their work when they don’t. This is a hard problem to solve, because—look, go up to a faculty member and tell him he hasn’t made the best possible deal for himself. Then get on the bus to the hospital when he pushes your teeth in. I exaggerate—but not much; these discussions get really contentious.

The other problem is workflow. Faculty are generally receptive to the idea of self-archiving, but they’re not willing to put much—or indeed any!—effort into it. Even the less-than-ten-minutes per article it takes to stick something in an institutional repository is like pulling teeth!

We have not as yet done enough to help with that. Most folks with IRs have now realized that the tech is the easy part; it’s the services that make or break the project. And the state of automation is also poor; we need to connect a great big pipe between faculty desktops and IRs, and we just haven’t done it yet.

So we go to faculty saying, deposit your work, here, it’s just a few web forms, no big deal. And they’re like, keystrokes? I have to type actual keystrokes? If you look underneath, though, there’s some other stuff going on there that you have to contend with. We’re asking them to do something they’ve genuinely never worried about before!

  • Faculty have never preserved their work before. Why should they have to do it now?
  • Do they know where their work is?
  • Do they own rights to it? Do they care?
  • Our interfaces are horrible. All of them.
  • Do they trust us to do this? Do they know we do it?
  • What’s in it for them?

So what’s our value proposition? If they do this work, what’s in it for them? Will it help them get tenure? Will it help them get their next grant? Will it help them keep their in-progress research organized? Will it help them collaborate across institutional boundaries? No? No. Well, what good is it, then?

It’s not as though there aren’t repository-like functions that researchers need, from version control to access control to high-quality data storage to you name it. But IRs can’t do any of that. And what IRs do (collection, metadata, and preservation), faculty are pretty sure they don’t need, and they’re not wrong.

So the “build it and they will come” ideology is bankrupt. It’s failed. Even though we predicated our services on it.

Policies and Procedures

The persistence of this bankrupt ideology has a number of pernicious effects that you need to watch out for and be ready to counteract. It starts at the beginning, right during the planning process. There’s a lot out there about the planning process for an institutional repository, and frankly, ninety percent of it is complete garbage. Useless happytalk. Technical platforms, collection policy—that stuff is useless. The only collection policy is “you take what you can get.”

So libraries go through the planning process, and they think to themselves, we don’t need to throw a whole lot of staff at this! The faculty are typing the keystrokes! Well, I’ve explained why that one doesn’t fly. You need to be planning for what I call “mediated deposit,” which is where you do the keystrokes on behalf of faculty. I’ll talk more about that in a minute.

And then you have to deal with faculty and the whole intellectual-property rights thing, and for that I’ll point you to Peter Murray-Rust’s blog at Cambridge. Peter is a chemist and explains beautifully why this is such a problem.

And then there’s repository software. Remember that planning process? It probably assumed that out-of-the-box software was good enough for the purpose, because goodness knows that’s the line Stevan Harnad pushes. It’s garbage. You need to plan for developer and designer hours, and you need to be prepared to push open-source developers for features they don’t understand that you and your faculty need, such as document versioning. Let me tell you, I am not a popular person among DSpace developers. I got cut down to size by MacKenzie Smith the other week, in fact. But DSpace will not move unless we move it!

And then there are your wonderful colleagues at the library you work in. I work with some wonderful, wonderful people! Unfortunately, they have no idea what I do, no idea that I need their help, and no incentive to give it! And the library literature, and the library professional organizations, are not helping. IRs are a hot research topic right now, so I for one get poked a lot, like a lab rat. More surveys than you could shake a stick at—I could hire a whole full-timer just to deal with surveys and interviews! The thing is, all this poking never manages to tell me anything I don’t already know. It doesn’t give me strategies to deal with all the stuff I’m talking about today. And it doesn’t give me hope. So it’s really pretty useless.

In fact, one of the commonest reactions I get from fellow authority-obsessed librarians is, “Why do you have that stuff?” Now, what “that stuff” is varies. It may be student projects, learning objects, images, special collections stuff, whatever. But IRs were sold, again, on the “access to the peer-reviewed literature” ideology, so nobody has any real idea how completely dependent we are on the kindness of faculty who don’t have any particular reason to be kind to us!

It’s not just library colleagues, either. Anybody read the report from Ithaka on university presses?[1. Brown, Griffiths, and Rascoff. “University Publishing in a Digital Age.” Ithaka, 2007.] Did you see the remark calling IRs dusty attics? Ooooh, diss! And yes, all of this hurts. It is frustrating and demoralizing and it can really make you wonder why you’re doing this. Watch out for that. It’s not you. It’s not your fault. It’s the system you’re embedded in.

Now, there’s hope. It’s coming from England and Australia, mostly. You want to watch what the Monash University guys are doing with ARROW and DART and ARCHER and all that stuff.[2. Treloar and Groenewegen. “ARROW, DART, and ARCHER: A quiver full of research repository and related projects.” Ariadne 51, April 2007.] You want to watch the SWORD initiative, helmed in England, and all the nice code that Les Carr and the EPrints people are writing. That’s where hope is. It’s not here. Not yet.

After all this doom and gloom, I’m sure you’re wondering what I can possibly tell you that will help you get content into your IR. Well, I’m here to tell you that you have essentially two choices. One, you can build a service that is useful to faculty on faculty terms. How do you find out what faculty need? Well, I’d watch the higher-ed literature for mention of what’s being called “cyberinfrastructure” or “research computing,” because honestly, that’s where the action is. Failing that, focus groups can be educational, and you may be able to pair with campus IT to offer them.

The thing is, what faculty want is not what you got. You got an IR. They don’t want an IR. Your job, and the job of your developers, is going to be turning an IR into something they want. Usage statistics, versioning, a pipeline from existing campus storage, whatever—you have to figure out how that works. Alternately, if you have a cyberinfrastructure initiative happening on campus, you need to jump on that bandwagon. Trust me, you can’t help Big Science, because they take care of themselves. But you can help the people who aren’t big enough to be Big Science, and IT will be happy to offload them to you.

Your other option is to do it for them. Be a collection developer. Type the keystrokes. Canvass departmental websites and disciplinary repositories. Now, I warn you, library admin is not going to like this option, because it takes staff and ongoing commitment and both of those cost money. Still. One or the other, because the alternative is an empty repository. Nothing else works.

Oh, they’ll tell you other stuff works. Marketing? Marketing does not work. I will show you all the marketing that’s actually useful: a trifold brochure that you can hand people. Get a cheap color inkjet printer, 28-pound paper from the office-supply store, and go to town. Mandates? Yeah, right, in your dreams. I mean, if you see a chance to attach the repository to somebody’s review process, do it! But don’t waste your breath chasing mandates. That takes senior management, which you aren’t.

So if you’re still in this business after that, let’s talk software. If you have the choice, my suggestion is EPrints (and yes, I run DSpace). Its developers are on the ball, offering features that faculty will actually find useful. It’s got statistics that don’t suck. It’s got an author-contact button. It’s got embargoes built-in. Just really basic things that DSpace is too busy arguing about Java frameworks to code in.

No matter what you choose, though, you need to bargain for some developer and designer time for customization. Because of the way these software packages work, you’ll need customization help on every upgrade of the software, and upgrades happen pretty often because the software is young still. If you can’t get this help, pick up your toys and go home. Seriously.

Next, you need to be thinking about how to hook up the IR with other campus services, and again you’ll need your friendly IT folks for this. This is why I mention SWORD, which is a new web-services deposit gizmo based on the Atom Publishing Protocol. It’s got plugins for DSpace and EPrints, and if you’re running one of those two packages, you want it. If you’ve got a local webspace thing, that’s what I mean: you want an “Archive It!” button in there. It won’t be easy. I don’t have these things yet, though I’m certainly pushing for them. But you can’t get what you don’t ask for.
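For the IT folks in the room, here is a rough sketch of what an “Archive It!” button might do behind the scenes. It is an illustration, not anybody’s production code. It assumes a SWORD 1.x-style endpoint that accepts a zipped package by HTTP POST, and the header names (X-Packaging, X-On-Behalf-Of) and the packaging identifier are given as I recall them from that profile, so check them against your own repository’s SWORD documentation. The collection URL, account names, and the function build_sword_deposit are all made up for the example.

```python
import base64
import urllib.request


def build_sword_deposit(collection_url, package_bytes, filename,
                        username, password, on_behalf_of=None):
    """Build (but do not send) an HTTP POST that deposits a zipped package.

    Assumption: the repository exposes a SWORD 1.x-style collection URL and
    uses HTTP Basic authentication. Header names are recalled from that
    profile and should be verified locally.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        # Assumed packaging identifier for a METS-wrapped DSpace package.
        "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
    }
    if on_behalf_of:
        # Mediated deposit: library staff deposit in the faculty member's name.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


# Hypothetical usage: every value below is a placeholder.
request = build_sword_deposit(
    "http://repository.example.edu/sword/deposit/123456789/1",
    b"placeholder zip bytes",
    "article.zip",
    "ir-staff",
    "not-a-real-password",
    on_behalf_of="faculty1",
)
```

Passing the request to urllib.request.urlopen would actually send it. The on_behalf_of argument is the part that matters for mediated deposit: staff or a campus service push the file, and the faculty member types nothing.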

Finally, you need a service model. A value proposition. Something other than the tired old stuff about citation impact and preservation capacity that faculty just don’t listen to. There are some possibilities, and no, you don’t have to do it all, just pick some things that you know you can do and do well.

  • Research assessment and faculty bibliographies. At Wisconsin we’ve been working on a little gizmo called the BibApp, in collaboration with UIUC, and we’ve found out that although faculty think it’s really important to keep their online bibliographies up-to-date, they’re too lazy to do the work. In a lot of disciplines, we can automate that kind of work through RSS search feeds from the bigger databases and indexes. Build a SHERPA checker, and throw a few students at downloading suitable articles for the repository, and you’re really onto something. Or we think we are.
  • Retirement archiving. It’s not big now, but it’s going to become a growth industry. Talk to your campus archivists and records managers. See what they’ve got that you can have.
  • Publishing and copyright are confusing as all get-out now and only going to become more so. All you need to get into this business is to be more up-to-date than faculty, and frankly, that ain’t hard.
  • I’ve talked about cyberinfrastructure already. You can’t build it on your own, but you can certainly try to be part of it. You can also reach out to your campus’s needs for preservation of small-conference proceedings and local publications. Figure out what you can offer that’s useful.
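To make the bibliography idea from the first bullet concrete, here is a minimal sketch of the two automated steps: read an RSS search feed, then keep the articles whose journals allow self-archiving. This is not how the BibApp works; it is just the general shape of the idea. The sample feed, the journal names, and the ARCHIVING_POLICY table are invented for illustration. A real service would fetch feeds from the databases you license and look up publisher policies in something like SHERPA rather than a hand-typed table.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 search feed; a real one would come from a database
# or index your campus subscribes to.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Saved search: author = Example, A.</title>
    <item>
      <title>An Example Article</title>
      <link>http://example.org/articles/1</link>
      <source url="http://example.org/journal-a">Journal A</source>
    </item>
    <item>
      <title>Another Example Article</title>
      <link>http://example.org/articles/2</link>
      <source url="http://example.org/journal-b">Journal B</source>
    </item>
  </channel>
</rss>"""

# Hypothetical stand-in for a publisher-policy lookup:
# True means the journal permits self-archiving.
ARCHIVING_POLICY = {"Journal A": True, "Journal B": False}


def parse_feed(xml_text):
    """Turn each RSS <item> into a small bibliography record."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "journal": item.findtext("source", default="").strip(),
        })
    return entries


def deposit_candidates(entries, policy):
    """Keep only entries whose journal is known to allow self-archiving."""
    return [e for e in entries if policy.get(e["journal"], False)]


bibliography = parse_feed(SAMPLE_FEED)
candidates = deposit_candidates(bibliography, ARCHIVING_POLICY)
```

Everything in bibliography goes on the faculty member’s online publication list; everything in candidates goes to the students who download articles for the repository. Journals missing from the table are treated as “no,” which errs on the side of not depositing something you have no right to.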

Dealing with your colleagues who don’t know what open access is, don’t care, and don’t even know how to introduce you to faculty: The first law is, you must make yourself a valued colleague. They won’t lift a finger for the IR, but you’ve won half the battle if they’ll help you just because you’re you. The second law is, you need to make a pest of yourself with your library’s administrators. This is because they have power and connections that you do not. You can’t stump for a mandate. You can’t require electronic theses and dissertations. You can’t allocate library staff time or funding. You’re not allowed to talk to provosts and deans and IT administrators. So start beating the drum now, and don’t stop.

Finally, I want to point out one last barrier. And that’s the profession, which is not serving us. Cataloguers have AUTOCAT, web librarians have web4lib… we’ve got nothing. We don’t have journals. We don’t have consistent conferences; the DASER conference was great, but it’s dead, and Open Repositories isn’t in the country most years. I’m not going to the next one; I can’t afford to. We don’t have a professional organization, or even a special-interest group. Not even a mailing list!

This is seriously holding us back. We aren’t sharing code; for this and other reasons, innovation in repository-space is frankly languishing except at universities with in-house developers. We aren’t sharing policies and solutions to policy problems—I get all kinds of email from people who are facing something novel and are ashamed of asking questions! This is absurd. We’re not sharing ideas, not on any large scale. And we’re not sharing the burden. This is a hell of a job to be in. Support is helpful. It’s not out there.

What makes all this worse is that we’re not being talked to. We’re being talked at. This is the first full-on presentation I’ve given about the experience of running an IR, and I’m one of the louder IR managers out there, honestly. If you want more people like me, and I’m not saying you do, but if you do, you need to speak up, on evaluation forms and when calls for papers come out.

I can’t fix this alone. I’ve tried. A journal I pitched in on folded before its first issue came out. I started a bulletin board and mailing list; they folded for lack of interest. I’m fresh out of ideas. I’d love to see some people here pick up the ball.

Thanks, and good luck out there.