Monday, April 25, 2005

Considering (ID3) tags

I've had occasion to be thinking about certain types of metadata recently. Actually, that's a somewhat pretentious and jargon-y way of stating it. To be both more precise and more colloquial--always a good combination--I've been working on tagging my digital music collection. Some of my findings and experience are doubtless specific to digital music or to my personal requirements and priorities, but I think much of it probably applies more broadly.

For example, why tags in the first place? Well, because there's a bloody lot of data--songs and other sound clips in this case. As a result, while you may want to make hand-crafted playlists for some situations, for your day-to-day background music you might well want the computer to do some of the work. And that means tagging the songs with relevant characteristics that can be manipulated with some relatively simple rules to create playlists. The analogies with other sorts of media aren't perfect, but in either case neither totally manual selection nor completely unaided computer search is fully effective.
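To make "relatively simple rules" a bit more concrete, here is a minimal sketch in Python. The song data and the field names (`genre`, `rating`, `intensity`) are made up for illustration and don't reflect how any particular jukebox program stores its tags.

```python
# Hypothetical tag data; titles, field names, and values are illustrative only.
songs = [
    {"title": "Song A", "genre": "Folk", "rating": 5, "intensity": 2},
    {"title": "Song B", "genre": "Rock", "rating": 3, "intensity": 4},
    {"title": "Song C", "genre": "Folk", "rating": 4, "intensity": 1},
    {"title": "Song D", "genre": "Jazz", "rating": 2, "intensity": 2},
]


def build_playlist(songs, genres, min_rating, max_intensity):
    """Return the titles of songs that satisfy a few simple tag rules."""
    return [
        s["title"]
        for s in songs
        if s["genre"] in genres
        and s["rating"] >= min_rating
        and s["intensity"] <= max_intensity
    ]


# Quiet, well-liked folk and jazz for background listening.
background = build_playlist(
    songs, genres={"Folk", "Jazz"}, min_rating=4, max_intensity=2
)
print(background)  # ['Song A', 'Song C']
```

The point is only that a handful of consistently filled fields plus a few comparisons is enough to do the selection work; the hard part is the tagging, not the rule.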

I'll go into the specifics of what I've done and what I'm doing with my music collection in a future post. However, let's first consider some of the more general characteristics of such a scheme. I wish I could pretend that this was a top-down analysis. In reality, it's more like trial and error--and is still ongoing. Be that as it may:
  • Eschew unnecessary complexity. With multiple thousands of songs et al. in my database, each additional field could mean a lot of work entering data. If that field isn't going to be used effectively to create playlists, consider omitting it.
  • Automation is our friend. To the degree that a utility or your jukebox program can auto-fill a field, that's a big win. For example, J. River Media Center, which I use, can populate an "Intensity" field and a "Beat" field. Even though I've found that these computer-generated fields correspond only modestly to my personal perceptions of these attributes, they're essentially "free."
  • Build off the "standard" ID3 tagging infrastructure as much as possible. Unfortunately, once you get beyond the standard artist, album, etc. fields (that is, the truly standardized ID3 tags), programs start having a lot of trouble interchanging the information. Even the seemingly standardized "Rating" tag isn't. My J. River Media Center can interchange rating information with my iPod, but a lot of tag editors don't seem to see the rating tags that it generates. Thus, for example, if you want to create a "subgenre" tag, you may want to consider keeping the standard "genre" tag and using something like an existing "keywords" field to hold the subgenre data.
  • Use fields that you can fill consistently, meaningfully, and without too much mental effort for each choice. For example, I've been toying with a "Mood" or "Situation" field but have had trouble filling in entries in a consistent way that I could then meaningfully use to build a playlist.
  • For anything like genre, subgenre, mood, etc., draw out a taxonomy or set of choices that you intend to use. Modify as required but at least you have a starting point.
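As a rough sketch of the last two ideas together (a written-down taxonomy, plus subgenres kept in a "keywords" field alongside the standard genre), something like the following Python could flag entries that stray from the list. The taxonomy contents, the `check_tags` helper, and the dictionary layout are all assumptions for illustration, not a description of any real tag editor.

```python
# Hypothetical taxonomy: each standard genre maps to its allowed subgenres.
TAXONOMY = {
    "Hip-Hop": {"gangsta rap", "southern", "alternative rap"},
    "Folk": {"traditional", "singer-songwriter"},
}


def check_tags(song):
    """Return a list of problems with a song's genre and keywords tags."""
    genre = song.get("genre")
    if genre not in TAXONOMY:
        return [f"unknown genre: {genre!r}"]
    problems = []
    # Subgenres live in the "keywords" field; the genre tag stays standard.
    for keyword in song.get("keywords", []):
        if keyword not in TAXONOMY[genre]:
            problems.append(f"{keyword!r} is not a listed subgenre of {genre}")
    return problems


print(check_tags({"genre": "Hip-Hop", "keywords": ["southern"]}))
print(check_tags({"genre": "Hip-Hop", "keywords": ["crunk"]}))
```

An empty list means the tags fit the taxonomy; anything else is a prompt either to fix the tag or to extend the taxonomy deliberately.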
Anyway, that was my retroactively arrived-at starting point. More specifics coming.

1 comment:

Anonymous said...

Interesting article. I have the exact same problem in terms of Subgenres (Hip-Hop: gangsta rap, southern, NYC alternative rap etc.)