Choosing a metadata standard

MARC, MODS, MARCXML, TEI, EAD, Dublin Core, METS, RDF, topic maps, ETD-MS... metadata standards abound. How do you pick the right ones?

Shockingly, technical merit is close to the bottom of the decision stack. Many other concerns come first.

  • What is the problem domain? Pick the right tool for the job—or at least discard obviously wrong tools. If you are marking up metadata for electronic theses, EAD is not going to help you, designed as it is for archival finding aids. ETD-MS is what you want. METS and DIDL are at heart administrative and structural metadata; they are designed for fitting complex groups of digital files together. Don't look at them if all you need is a standard for bibliographic-style descriptive metadata. If you're trying to get your catalogue data out of MARC into something more hackable, you have a few choices—but EAD, ETD-MS, METS, and DIDL are not among them.

    Be aware, too, of the distinction between a metadata standard and a meta-standard that applies to metadata. Saying that you want "XML metadata" is meaningless unqualified; many metadata standards are expressed in XML (EAD, METS, and the TEI Header for starters) and others not natively tied to XML can still be expressed in it (such as Dublin Core and even MARC). Some standards contain "envelopes" for others; for example, METS can include or point to any number of different kinds of descriptive metadata.

  • Is the choice baked into the system? If you are starting up a repository that will emit OAI-PMH records, get used to Dublin Core, because OAI-PMH requires every repository to offer unqualified Dublin Core (the oai_dc format) as its baseline metadata format. If Dublin Core is good enough for your modest purposes, stop there; the decision has been made for you.

  • What are similar projects using? A literature search for the ever-present "How I Done It Good" articles may actually be useful. It can't hurt to contact some of the project principals directly, either; they will have invaluable information about tradeoffs and pitfalls.

  • What else do you have to interoperate with? Will any of your metadata go into your MARC catalogue? You'll want to make sure you can find or construct an appropriate crosswalk. Is OAI-PMH in your future? Examine Dublin Core. Do you want the metadata viewable on the Web? Then ask how easy it is to query from a database, or transform directly to HTML. Want other people to use it? Then pick something easily explained and manipulated—even if it's not a library-created standard.

  • What kind of usage infrastructure is there? Rolling your own infrastructure is tedious at best. The more training materials and venues that exist for a metadata standard, the easier it will be to learn and ramp up in production. The more software that already exists for creating, storing, querying, and displaying this metadata, the less you have to create.

    Be careful, though; if a given metadata standard is poorly documented or supported only by expensive proprietary software, what expense and hassle are you locking yourself into if you adopt it? I rarely recommend topic maps because topic-map software implementations are so expensive, much as I love them in theory. Expensive, convoluted, and proprietary systems are also one reason many systems librarians dislike MARC.

  • What will this metadata do? If it has to be stored in a relational database, XML-based metadata schemas may not be the best choice (though many are at least feasible). If you need non-experts to create metadata (especially through a web form), highly granular or complex metadata standards are likely not the best choice.

  • Is it a good standard that encourages good metadata? This question comes last on the list, but that doesn't mean you should ignore it. Check for the right level of granularity, ease of creation, ease of access and comprehension, flexibility for hacking and recombination, solid best practices, and a lively support community.
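
To make the points about XML expression and the OAI-PMH base layer a little more concrete, here is a minimal sketch of writing a simple Dublin Core record as XML in the oai_dc wrapper. The two namespace URIs are the standard ones; the function name `to_oai_dc`, the dict-of-lists input shape, and the sample thesis record are all assumptions made up for illustration, and a production repository would also need schema locations and the surrounding OAI-PMH response.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the oai_dc container and the DC elements.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(record):
    """Serialize {element name: [values]} as an oai_dc XML string.

    Hypothetical helper: the input shape is an assumption, not part of
    any standard.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in record.items():
        for value in values:
            # Every Dublin Core element is optional and repeatable.
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample record for an electronic thesis.
thesis = {
    "title": ["A Hypothetical Thesis on Metadata"],
    "creator": ["Doe, Jane"],
    "type": ["Text"],
}
xml_text = to_oai_dc(thesis)
print(xml_text)
```

The same dict could just as easily be written to database rows or an HTML page, which is part of why a flat standard like Dublin Core is easy to explain and manipulate.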

Above all, don't panic. In these days of crosswalks, your library can probably recover from a bad decision without too much expense, as long as you're not tossing out an expensively customized infrastructure along with it. (The travails of MARC are instructive here. Converting MARC to something else is already fairly feasible. The problem we have yet to solve is what to do about all our systems that depend on MARC!)
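
For readers who have not met a crosswalk before, the idea can be sketched in a few lines. This is a deliberately simplified illustration, not a real crosswalk: the four tag-to-element pairs are a tiny subset in the spirit of published MARC-to-Dublin-Core mappings, the input is an invented dict of MARC tags to plain strings (ignoring subfields and indicators), and the names `MARC_TO_DC` and `crosswalk` are hypothetical.

```python
# Illustrative, heavily simplified mapping of MARC tags to Dublin Core
# elements. A real crosswalk also handles subfields, indicators, and
# many more fields.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def crosswalk(marc_fields):
    """Map {MARC tag: [values]} to {DC element: [values]}.

    Also returns the sorted list of tags that had no mapping, so the
    information lost in conversion is visible rather than silent.
    """
    dc = {}
    unmapped = []
    for tag, values in marc_fields.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append(tag)
            continue
        dc.setdefault(element, []).extend(values)
    return dc, sorted(unmapped)


# Invented sample record.
record = {
    "100": ["Doe, Jane"],
    "245": ["A hypothetical title"],
    "650": ["Metadata", "Cataloging"],
    "999": ["local data"],
}
dc_record, lost = crosswalk(record)
print(dc_record)
print("Unmapped tags:", lost)
```

Reporting the unmapped tags matters: moving from a granular standard to a simpler one nearly always discards something, and a pilot run is the time to find out what.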

Read before you choose; pilot before you implement; evaluate wisely—and all should be well.

You write: "I rarely recommend topic maps because topic-map software implementations are so expensive..." I'm curious about this statement. Why, do you think, are Topic Maps implementations always so expensive? And what is "expensive", anyway?

Well, the last time I looked into serious topic map applications, the only software available was commercial -- and not off-the-shelf with relatively predictable costs, but highly customized, support-intensive consultant-ware. Open-source applications, though they do exist, appear immature at present.

This is not an insult to topic maps, which I do love -- just a caution that the topic-map software environment is not as mature as libraries typically like to see. In a year or three, I may well change my tune, because I'd rather write XTM than RDF any day of the week.
