Tagging
Executive Summary
Tagging refers to the process by which users assign terms meaningful to them to a resource in the online environment. The rise of social bookmarking Web sites has propelled tagging systems into the mainstream.
What It Is
Tagging is the process by which users assign personal keywords (“tags”) to resources. The related concept of folksonomy is the set of labels that emerges from the tagging process; the term is a contraction of the words “folk taxonomy.” The rise of social bookmarking Web sites has propelled tagging systems into the mainstream. Two of the most visible examples of this phenomenon are Del.icio.us, a site for tagging Web bookmarks, and Flickr, a site for sharing and tagging photographs. Many tagging systems require that tags be a single word; others don’t, and conceptually there’s nothing to prevent tags from being multiple words long.
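As a rough illustration of how a folksonomy emerges from individual tagging acts, the sketch below treats each act as a (user, resource, tag) triple and aggregates the triples per resource. The user names, URL, and tags are invented for the example, and the function name folksonomy is hypothetical; real tagging sites may model their data quite differently.

```python
from collections import defaultdict

# Each tagging act is a (user, resource, tag) triple. All values are made up.
taggings = [
    ("alice", "http://example.org/paella", "recipes"),
    ("alice", "http://example.org/paella", "todo"),
    ("bob", "http://example.org/paella", "recipes"),
    ("bob", "http://example.org/paella", "spanish food"),  # multi-word tag
]

def folksonomy(taggings):
    """Map each resource to its tags and the number of users who applied each."""
    users_by_tag = defaultdict(lambda: defaultdict(set))
    for user, resource, tag in taggings:
        users_by_tag[resource][tag].add(user)
    return {
        resource: {tag: len(users) for tag, users in tags.items()}
        for resource, tags in users_by_tag.items()
    }

labels = folksonomy(taggings)
print(labels)
```

Nothing in this structure is decided in advance: the set of labels for a resource is simply whatever its users happened to type.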
Many of the tags users create are essentially subjects for the resource: for example, a topic covered by a Web site, or the people or location shown in a picture. Others are more administrative in nature and, because tags are personal, may be meaningful only to the individual who created them: for example, the name of a project to which a resource was relevant, or “todo” or “unread” to remind a user to take some sort of action. Still others, such as “español,” bring out other non-topical aspects of a resource.
Tagging receives criticism from library circles for lacking some of the benefits a predefined controlled vocabulary offers. The primary arguments are that tagging doesn’t offer synonym control or distinguish between two meanings of a word. Interestingly, the criticisms given generally don’t cover another, more powerful, feature of controlled vocabularies also lacking in tagging systems—known relationships between terms.
What Can Be Done With It
The taglines of del.icio.us (keep, share, discover) and Flickr (store, search, sort, share) give insight into the goals and possibilities of tagging systems. The goal of a tagging system is generally to let individual users manage resources for personal use. Most provide some facility to share tags (and possibly resources) with others, and some store copies of resources for the user. The key is that tagging is a personal phenomenon: the primary goal is to assist an individual with resource management tasks.
Tagging sites have built upon the data users create for that personal management purpose to provide other services, such as resource discovery. Both the kinds of resources that tagging sites tend to manage and the tagging process itself support users who casually browse for interesting resources. However, many tagging implementers realize that synonym and homonym control could improve the services provided on top of tag data. A Web search should reveal various methods currently in various stages of implementation to attempt to identify (and in some cases, prevent) synonyms and homonyms in tagging systems, and identify relationships between tags.
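One simple way to suggest relationships between tags is to count how often two tags appear on the same resource. The sketch below is only an illustration of that idea, not the method used by any particular site; the photo identifiers, tags, function names, and the min_count threshold are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each resource with the set of tags users applied to it.
resource_tags = {
    "photo1": {"nyc", "newyork", "skyline"},
    "photo2": {"nyc", "newyork", "bridge"},
    "photo3": {"nyc", "skyline"},
    "photo4": {"apple", "fruit"},
}

def cooccurrence(resource_tags):
    """Count how many resources carry each unordered pair of tags."""
    pairs = Counter()
    for tags in resource_tags.values():
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs

def related_tags(tag, pairs, min_count=2):
    """Return tags that co-occur with `tag` on at least `min_count` resources."""
    related = []
    for (a, b), count in pairs.items():
        if count >= min_count and tag in (a, b):
            related.append(b if a == tag else a)
    return sorted(related)

pairs = cooccurrence(resource_tags)
print(related_tags("nyc", pairs))
```

Co-occurrence only signals that two tags are connected in some way. It does not say whether they are synonyms (as “nyc” and “newyork” might be) or merely related (as “nyc” and “skyline” might be), and it does nothing to separate two meanings of a tag such as “apple.”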
Tag clouds (see, for example, the tag cloud for the Flickr all-time most popular tags) are graphic representations of frequently used tags from a particular service. The size and weight of a tag’s font represent its frequency relative to the other tags. Tag clouds provide an at-a-glance view of the contents of a tagged collection.
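The sizing rule behind a tag cloud can be sketched in a few lines. The example below assumes a plain linear scale between a minimum and a maximum font size; the tag counts, the point sizes, and the function name font_sizes are invented for illustration, and an actual service may scale differently (for instance, logarithmically).

```python
def font_sizes(tag_counts, min_size=10, max_size=32):
    """Map each tag to a font size (in points) scaled linearly by frequency."""
    if not tag_counts:
        return {}
    low, high = min(tag_counts.values()), max(tag_counts.values())
    if low == high:
        # All tags are equally frequent, so give them all a middle size.
        return {tag: (min_size + max_size) // 2 for tag in tag_counts}
    span = high - low
    return {
        tag: round(min_size + (count - low) * (max_size - min_size) / span)
        for tag, count in tag_counts.items()
    }

# Hypothetical usage counts for three tags.
counts = {"wedding": 500, "party": 300, "cat": 100}
sizes = font_sizes(counts)
print(sizes)
```

The most frequent tag gets the largest size, the least frequent gets the smallest, and everything else falls proportionally in between.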
Examples
Flickr is an online photo sharing site that states its two main goals as “help people make their photos available to the people who matter to them” and “enable new ways of organizing photos.”
Del.icio.us is a site for tagging and sharing Web sites, including “our favorite articles, blogs, music, restaurant reviews, and more.”
Technorati is a popular site for exploring creator-tagged blog entries (among a few other things).
LibraryThing is an application of tagging to personal book collections. It can pull bibliographic information from the Library of Congress or amazon.com, saving users the time of entering this information manually. A free account allows a user to tag and manage up to 200 books. The site also features disambiguation mechanisms whereby users can tell the system that two books represent the same work, or two authors represent the same person. Perhaps LibraryThing’s most important and innovative feature is providing recommendations for new books to read based on the tags a user has applied to their collection.
PennTags is a tagging site intended for the University of Pennsylvania community. In addition to tagging Web sites, PennTags allows users to tag records from the Penn online catalog and individual journal articles. It represents an interesting move by a library into the tagging community.
Steve is an initiative from the museum community to harness the efforts of knowledgeable users in describing museum resources. Steve is unusual for a tagging initiative in that users don’t select their own resources to tag, but rather are presented with system-selected resources for tagging. It remains to be seen whether tagging in this type of environment produces data as rich as that provided by users describing their own resources.
Connotea is a tagging site intended for “researchers and clinicians” to “keep links to the articles you read and the websites you use, and a place to find them again.” It markets itself as a way for scientists to organize scholarly reference lists, and originated from the Nature Publishing Group.
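To make the idea of tag-based recommendation (mentioned for LibraryThing above) more concrete, here is a minimal sketch that ranks titles by how much their tags overlap with the tags a user has already applied. It uses Jaccard similarity purely as an assumption; it is not a description of how LibraryThing or any other site actually computes recommendations, and the titles, tags, and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets: shared tags / all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(my_tags, catalog, limit=3):
    """Rank catalog titles by tag overlap with the user's own tags."""
    scored = [(jaccard(my_tags, tags), title) for title, tags in catalog.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]

# Hypothetical user tags and catalog.
my_tags = {"fantasy", "dragons", "series"}
catalog = {
    "Book A": {"fantasy", "dragons"},
    "Book B": {"cooking", "italy"},
    "Book C": {"fantasy", "series", "romance"},
}
picks = recommend(my_tags, catalog)
print(picks)
```

Titles that share no tags with the user are left out entirely rather than being ranked last.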
Who Should Be Using It
A significant percentage of your patrons already are, in a variety of environments. Libraries should be thinking about how tagging technology can help improve our services. Initiatives like PennTags and LibraryThing show some of this potential. Many other applications are possible – which is right for your library?
Related Technologies
Libraries are familiar with various forms of terminology control, including controlled vocabularies, thesauri, classification, and ontologies. Some of the goals of each of these technologies, and tagging, are the same, although there are some important differences in scope and implementation. Which is appropriate in any given situation depends on many factors, and the decision between them should be made carefully based on a full understanding of each of the options. There is also no inherent reason any given system has to use only one of these and not the others. A library catalog could contain cataloger-supplied subject headings from an appropriate controlled vocabulary, and also allow user tagging of materials. Smart systems could use the two together to provide better retrieval for users. Embracing the benefits of tagging does not automatically or necessarily signal a rejection of traditional subject authority control.
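A catalog that uses subject headings and user tags together might look something like the sketch below. This is an assumption-laden illustration rather than a description of any existing system: the Record class, the search function, the weights, and the sample headings and tags are all invented, and the choice to weight cataloger-supplied headings above tags is arbitrary.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    title: str
    subject_headings: set = field(default_factory=set)  # cataloger-supplied
    user_tags: set = field(default_factory=set)         # patron-supplied

def search(records, term, heading_weight=2, tag_weight=1):
    """Return titles matching `term` in headings or tags, best matches first."""
    term = term.casefold()
    results = []
    for record in records:
        score = 0
        # A heading matches if the term appears anywhere within it.
        if any(term in h.casefold() for h in record.subject_headings):
            score += heading_weight
        # A tag matches only if it equals the term exactly (ignoring case).
        if any(term == t.casefold() for t in record.user_tags):
            score += tag_weight
        if score:
            results.append((score, record.title))
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in results]

# Illustrative records; the headings and tags are examples only.
records = [
    Record("Title One", {"Cookery, Spanish"}, {"paella", "recipes"}),
    Record("Title Two", {"Spain -- Description and travel"}, {"cookery", "todo"}),
    Record("Title Three", {"Cookery, Italian"}, {"cookery"}),
]
hits = search(records, "cookery")
print(hits)
```

In this sketch a search for “paella” still finds Title One even though no subject heading mentions it, which is the kind of gain the two approaches can offer when used in concert.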
More Information
Bearman, David and Jennifer Trant. “Social Terminology Enhancement through Vernacular Engagement,” D-Lib Magazine, 11(9), 2005. http://www.dlib.org/dlib/september05/bearman/09bearman.html
Hammond, Tony, Timo Hannay, Ben Lund, and Joanna Scott. “Social Bookmarking Tools (I): A General Review,” D-Lib Magazine, 11(4), 2005. http://www.dlib.org/dlib/april05/hammond/04hammond.html
Kroski, Ellyssa. “The Hive Mind: Folksonomies and User-Based Tagging,” http://infotangle.blogsome.com/2005/12/07/the-hive-mind-folksonomies-and-user-based-tagging/
Lund, Ben, Tony Hammond, Martin Flack, and Timo Hannay. “Social Bookmarking Tools (II): A Case Study – Connotea,” D-Lib Magazine, 11(4), 2005. http://www.dlib.org/dlib/april05/lund/04lund.html
Shirky, Clay. “Ontology is Overrated: Categories, Links, and Tags” http://www.shirky.com/writings/ontology_overrated.html
You’re It! A Blog on Tagging http://tagsonomy.com/

Well done, Jenn!
My only quibble is with "known relationships between terms." My understanding is that some tagging systems are experimenting with creating (or at least suggesting) relationships based on clustering or similar algorithms. Obviously this won't always work, and it's more difficult to characterize the relationship than merely to recognize it, but that's different from saying that tagging systems cannot demonstrate such relationships at all.
Dorothea said, "My understanding is that some tagging systems are experimenting with creating (or at least suggesting) relationships based on clustering or similar algorithms."
Ab-so-lutely! Various tagging sites are experimenting with this (and synonym control, and homonym disambiguation, and...). I don't know that any is really in the lead or causing a revolution, but a quick Google (I'm allowed to say that, it's now officially in the OED as a verb) shows many efforts in this area. I tried to cover this in the entry, with:
"A Web search should reveal various methods currently in various stages of implementation to attempt to identify (and in some cases, prevent) synonyms and homonyms in tagging systems, and identify relationships between tags."
The great thing about TechEssence being a Wiki is we can make this clearer if we want. Suggestions for how to do that are welcome. :-)
On that note, I realize now I forgot to include the point that drove me to write about this in the first place: that library-style subject analysis and tagging aren't an either/or proposition - we should be looking for ways to use them in concert. I'll have to go back and add that in...
University of Pennsylvania, not Penn State.
LibraryThing is currently providing statistical correlations between tags and LCSH headings; I believe they plan to provide this to third parties in bulk machine-readable format as well. The idea of correlating tags to controlled terms seems potentially very interesting, suggesting the use of tags to supplement lead-in vocabularies, among other things.
Georgia Tech is also working on a similar project. Feed it a term and it will try to give 'appropriate' LCSH or MeSH or whatever based on it. The opposite will (hopefully) also be true. "These folksonomic tags appear frequently for this given LC Subject Heading", etc. We hope to have enough data that we'll be able to start making this available as a service by the end of the year.
Oops, fixed now. Thank you!
Nice. Where are you getting your data from?