Wednesday, April 27, 2005

The Nikon RAW Format Rumpus

There's been considerable noise from many quarters over the Adobe vs. Nikon tiff about Nikon encrypting the white balance data in its model D70 RAW files. Adobe yelled that it wouldn't support Nikon's RAW format in Photoshop. Nikon came back with a not-entirely-satisfactory offer to make the SDK available to some developers.

My intent here isn't to recapitulate the debate--which now appears to be at least somewhat overblown anyway. (For the edification of non-photographers, RAW is a camera (or at least sensor)-specific format for the digital data captured by the sensor. Because it hasn't been further manipulated or compressed in a lossy way (a la JPEG), it's the highest-quality way to store images on cameras that support it.)

I do find one aspect of this mess particularly troublesome, however. And it's not Nikon's lack of openness--which is a boneheaded PR move that I can't see benefiting them. (Nor am I convinced that Adobe's motives in taking this public were necessarily pure.) Rather, it's the fact that Adobe could credibly invoke the DMCA (Digital Millennium Copyright Act) as the reason it couldn't decrypt Nikon's format.

I'm not a lawyer, but I'm far from convinced that the DMCA--and specifically its anti-circumvention provisions--would apply here. After all, it's the metadata for the photographer's own data (picture) that's being decrypted. But the fact that Adobe can claim concern about violating the DMCA--and that people widely accepted that concern as valid--should itself be troubling.

We don't really know what a judge or a jury could decide are the limits of the DMCA. Certainly "DMCA" gets thrown around by the anti-IP crowd as a bogeyman on the order of the RIAA--but that doesn't mean it's right either.

Monday, April 25, 2005

Considering (ID3) tags

I've had occasion to be thinking about certain types of metadata recently. Actually, that's a somewhat pretentious and jargon-y way of stating it. To be both more precise and more colloquial--always a good combination--I've been working on tagging my digital music collection. Some of my findings and experience are doubtless specific to digital music or to my personal requirements and priorities, but I think much of it probably applies more broadly.

For example, why tags in the first place? Well, because there's a bloody lot of data--songs and other sound clips in this case. As a result, while you may want to make hand-crafted playlists for some situations, for your day-to-day background music you might well want the computer to do some of the work. And that means tagging the songs with relevant characteristics that can be manipulated with some relatively simple rules to create playlists. The analogies aren't perfect with other sorts of media but, in both cases, neither totally manual selection nor unaided computer search is completely effective.
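To make that concrete, here's a toy sketch in Python of what "simple rules over tags" means. The field names and songs are invented for illustration, and real jukebox programs bury this logic behind a "smart playlist" dialog, but the underlying idea is the same: filter on the tags, then use the result as the playlist.

```python
# Toy example: a "library" is just a list of tagged songs; a playlist rule
# is a filter over those tags. Field names here are purely illustrative.
from typing import Dict, List

Song = Dict[str, object]

def build_playlist(library: List[Song], genre: str,
                   min_rating: int, max_intensity: int) -> List[Song]:
    """Pick songs in the given genre whose rating is high enough and whose
    (computer-generated) intensity is low enough for background listening."""
    return [
        s for s in library
        if s.get("genre") == genre
        and int(s.get("rating", 0)) >= min_rating
        and int(s.get("intensity", 10)) <= max_intensity
    ]

library = [
    {"title": "Song A", "genre": "Jazz", "rating": 4, "intensity": 3},
    {"title": "Song B", "genre": "Jazz", "rating": 2, "intensity": 8},
    {"title": "Song C", "genre": "Rock", "rating": 5, "intensity": 9},
]

print([s["title"] for s in build_playlist(library, "Jazz", min_rating=3, max_intensity=5)])
# -> ['Song A']
```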

I'll go into the specifics of what I've done and what I'm doing with my music collection in a future post. However, let's first consider some of the more general characteristics of such a scheme. I wish I could pretend that this was a top-down analysis. In reality, it's more like trial and error--and is still ongoing. Be that as it may:
  • Eschew unnecessary complexity. With multiple thousands of songs et al. in my database, each additional field could mean a lot of work entering data. If that field isn't going to be used effectively to create playlists, consider omitting it.
  • Automation is our friend. To the degree that a utility or your jukebox program can auto-fill a field, that's a big win. For example, J. River Media Center, which I use, can populate an "Intensity" field and a "Beat" field. Even though I've found that these computer-generated fields correspond only modestly to my personal perceptions of these attributes, they're essentially "free."
  • Build off the "standard" ID3 tagging infrastructure as much as possible. Unfortunately, once you get beyond the standard artist, album, etc. fields (that is, the truly standardized ID3 tags), programs start having a lot of trouble interchanging the information. Even the seemingly standardized "Rating" tag isn't. My J. River Media Center can interchange rating information with my iPod, but a lot of tag editors don't seem to see the rating tags that it generates. Thus, for example, if you want to create a "subgenre" tag, you may want to consider keeping the standard "genre" tag and using something like an existing "keywords" field to hold the subgenre data (see the sketch below).
  • Use fields that you can fill consistently, meaningfully, and without too much mental effort for each choice. For example, I've been toying with a "Mood" or "Situation" field but have had trouble filling in entries in a consistent way that I could then meaningfully use to build a playlist.
  • For anything like genre, subgenre, mood, etc., draw out a taxonomy or set of choices that you intend to use. Modify as required but at least you have a starting point.
Anyway, that was my retroactively arrived at starting point. More specifics coming.
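As an illustration of the "keep the standard genre tag, stash the subgenre somewhere else" suggestion above, here's a small sketch using the Python mutagen library and a user-defined TXXX frame. This is just one way to do it--it assumes mutagen is installed and that your other tools will tolerate (or at least ignore) a custom frame--so treat it as a sketch rather than a recommendation:

```python
# Sketch: store a subgenre in a user-defined ID3 frame (TXXX) so the
# standard genre tag stays untouched for programs that only read that.
from mutagen.id3 import ID3, TXXX

def set_subgenre(path: str, subgenre: str) -> None:
    tags = ID3(path)                      # load the file's existing ID3 tags
    tags.add(TXXX(encoding=3,             # 3 = UTF-8
                  desc="SUBGENRE",
                  text=[subgenre]))
    tags.save()

def get_subgenre(path: str) -> str:
    frames = ID3(path).getall("TXXX:SUBGENRE")
    return frames[0].text[0] if frames else ""

set_subgenre("song.mp3", "Bebop")
print(get_subgenre("song.mp3"))           # -> Bebop
```

The appeal of a custom (or keywords-style) field is that other programs still see the standard genre they expect, while your own playlist rules can key off the extra data.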

Friday, April 22, 2005

Adobe-Macromedia Humor

This Translation From PR-Speak to English of Selected Portions of Adobe’s ‘FAQ’ Regarding Their Acquisition of Macromedia is both pricelessly funny and spot-on. I just want to know why, in this day and age, an "industry leading" program like Adobe's GoLive (which does admittedly do many things rather well) can get away with still being so crash prone after all these years.

Wednesday, April 20, 2005

Some Additional Good License Commentary

from Tim Bray as well.

I likewise have wondered all along about the dual-licensing option (i.e. licensing OpenSolaris under both the GPL and CDDL). It's a complicated issue though. How would the patent indemnification work under those circumstances? Would Solaris just be strip-mined for the benefit of Linux? (And should Sun ultimately care?) Would it be compatible with third-party closed-source code linked to OpenSolaris?

In any case, it's clear that Sun and CDDL have prodded the GPL-is-the-only-true-Open-Source-license forces into action.

The Open Source license debate continues

Simon Phipps over at Sun has what seems to me a particularly nice overview of the three main Open Source license "families."
  • BSD-style licenses (of which I regard Apache v2 as the state-of-the-art) place no restriction on whether derived creations are returned to the commons. Thus the creative works of the community surrounding a commons created by a BSD-style license may be returned to that commons, may be applied to a different source-commons or may be incorporated into a closed-source work.
  • GPL-style licenses require that derived creations, both resulting from the original commons and created newly around the commons, be licensed under the same license. The commons is thus enriched as it is used, but innovations created outside the commons can very easily be found to be licenseable only under the GPL and thus need to be compulsorily added to the commons - the artisan will often find that there is no freedom of choice in this regard.
  • MPL-style licenses (of which I regard CDDL to be the state-of-the-art at present) require that creations derived from files in the commons be licensed under the same license as the original file, but allow newly-created files to be licensed under whatever license the creator of the file chooses. This is a win-win; the commons is continually enriched, and the artisan retains the freedom to license innovations in any way that's appropriate.
This seems to capture things pretty well. I've actually changed my own thinking a bit. I've tended to look on the MPL (and CDDL) as part of the BSD "family" of licenses contra the GPL. I think the main criterion I was considering was friendliness to intermingling with proprietary code--which you can do with both BSD (essentially without restriction) and the MPL (so long as you keep the source code files separate).

I think that view's still valid, but a view that explicitly recognizes that the MPL and CDDL do require giving back to the commons at some level is more complete in important ways. Thus, I like this three-bucket taxonomy.

Wednesday, April 13, 2005

If We Buy It, Will Use Come?

I'm sure that we've all had the experience of buying something that we just had to have and then, after the excited opening of the UPS box and the equally excited showing off to your many (bored) friends, the long sunset of said device on the dusty shelf. I know I have.

I've come to the conclusion that personally applying the sort of "use cases" that companies like Intel try to apply to their designs can be helpful. Basically, this boils down to "what are you trying to do?" and "how would you use device X to accomplish this?" Some of this same sort of thinking is described in Alan Cooper's The Inmates Are Running the Asylum.

That's all very theoretical. So let's look at a specific, personal example. Last time I had a new cell phone to buy (the old one was slipping into a coma), I got to thinking about a smartphone. It would be a bit pricey as would the services, but it would be cool. Time to think about usage models.

OK. I travel a fair bit and like to stay in touch. A smartphone sounds like the ticket, but what would it incrementally do that was important to me? Hmm. Email of course! OK, but I already have broadband at just about any hotel I'm staying at and at most conference sites too. I could even have it at the airport and at Starbucks if I were willing to pay the fees (which would doubtless be less than the smartphone data services would be). I have a nice, compact laptop (a Fujitsu P5000) that I always travel with. A laptop which, by the way, is far more friendly to navigating web sites and dealing with attachments than any smartphone would be.

I could go on with the personal details but, although they're relevant to me, they may not be for you. Cost tradeoffs, for example. This would have been coming out of my pocket. The bottom line is that there's real value in working through what a given device would let you do that isn't already covered in some other way.

Of course, it can sometimes be hard to predict or grok radically new usage models in advance. Especially when it comes to communication, so much depends on what others in your circle do. If text messaging is the norm, you text message. If it isn't, you don't.

That said, use cases are at least a framework. If you can't think of a practical (or fun!) reason why you'd use a new gadget or piece of software to do things differently--you probably (though not certainly) won't.

Monday, April 11, 2005

The Centralized Ideal

So many of our network-inspired ideals imply--directly or indirectly--a hankering for a neat-and-tidy centralized sense of authority and control (or at least a singular guarantor of great QoS).

Historical sources and antecedents made such centralized command-and-control explicit for philosophical reasons. Edward Bellamy's citywide pneumatic tube systems--contra the relative anarchy of the real-life bicycle messengers of a later era--were at least the product of an explicitly socialist utopian fantasy.

However, such an almost Jungian yearning for authoritative centralization also comes from less explicitly political sources. Whether Isaac Asimov's Encyclopedia Galactica from his Foundation series or Jerry Pournelle's descriptions of brain implants that could ask any question of some (presumably) know-it-all computer (Oath of Fealty), the implicit assumption has always been that there's a neat and correct source of all information.

We (hopefully) realize today that the reality is messier. There's no Encyclopedia Galactica; there's Google. Google (together with its sister search engines) gives us the keys to the Web's treasures. Of course, it also gives us access to the "Net of a million lies." Nothing neat there.

Wikipedia Redux

Last year, when Wikipedia was still a discovery to be made even for many in the technorati, I effused praise. Since that time, I've had a chance to use it quite a bit more, as well as to read opinion both cogent and highly informed (such as from a past editor of the Encyclopedia Britannica) and more emotionally derived. My intent here (for now) isn't to debate Wikipedia's governance model or how it could be better--but rather, a year or so later, to ask how I feel about it. Is the Open Source model working in this case? Certainly quite a few people seem to be skeptical--even on such profoundly Open Source venues as Slashdot. Perhaps the raw democracy of Wikipedia clashes with the inherent elitism of Open Source projects, where some "expert" (even if self-appointed) is the ultimate arbiter of what gets included.

Well, I still feel pretty good about Wikipedia. Up to a point. And with some specific reservations.

In general, when I'm looking for facts--or at least a starting point for finding and confirming facts--about topics historical, technical, and cultural, Wikipedia's a pretty darned good jumping-off point. It may not be the ultimate authority, and it may or may not have the level of detail that I need, but it's a solid start. And if the text, as twisted and pulled by the efforts of too many non-professional editors, isn't the crispest or the smoothest, that doesn't devalue the information too much. Encyclopediae never were Nabokov.

That said, Wikipedia users would benefit by understanding its limitations--just as they would by understanding the limitations of National Review, The Nation, supposedly authoritative scholarly books, or even the Encyclopedia Britannica.

For example, it's been said that news is the first draft of history. News is inherently non-authoritative. While the bombs are dropping, it's unrealistic to expect the writer to understand or know all the precedents, background, particulars, and consequences. The better journalists get most of the facts right and understand the context better than most, but their accounts are still inevitably incomplete.

Yet Wikipedia effectively tries (through its community) to integrate news as it happens--and doesn't do a great job of it, for the same reasons that those who write history do so years after the fact.

Wikipedia also struggles with controversy. It's got various mechanisms that attempt to deal with the push and pull of radically different opinions. But, for any controversial subject, the ultimate result is (at best): "here are some of the arguments; make up your own mind." That's perhaps the best answer in situations where expert opinion is legitimately divided; it's much less satisfactory when a small but vocal minority strongly contests the majority view. These issues are particularly pronounced in the dark corners. The biographies of George W. Bush and John Kerry will at least be viewed and debated by partisans from both sides. More obscure religious, ethical, and scientific debates may not be.

In spite of these limitations, I find Wikipedia an increasingly valuable resource. But you'll use it best if you understand its limitations.

Friday, April 08, 2005

Open Source Incivility Redux

For the most part, I try to keep mostly out of the politics and inside-the-beltway debate around Open Source and concentrate on the technology and what it means for users. But every now and then it all bursts forth. Thus my Open Source Incivility note of a while back which garnered the expected volume of comments--some thoughtful, many illiterate and foaming at the mouth.

However, a blog post by a fellow analyst and former colleague of mine, Steve O'Grady, prompts me to comment. Apparently Yankee Group's Laura DiDio has been getting late-night calls at home from the less civil members of the "community." I have no reason to doubt the veracity of these claims; one sees plenty of people online trying to track down the cell phone and other numbers of other Open Source critics, such as those involved with the Alexis de Tocqueville Institute.

I don't know Laura personally. I do know she's done a number of TCO studies concerning Linux and Windows that have the usual problems of TCO studies; the numbers depend a lot on the situation and therefore are hard to generalize. Thus I'm pretty skeptical of most studies that purport to show X cheaper than Y in the general case.

However, the "community's" critique of Laura started much more specifically. It started, as so much Open Source fervor currently does, with the SCO case. As one of the analysts who looked (under NDA) at snippets of SCO Unix code side-by-side with Linux code, she concluded that they were, in fact, identical. The community reaction was vicious. After all, Laura wasn't a programmer. Who was she to compare code?

Without recapitulating the whole matter, suffice it to say that there was identical code. It's unlikely that the matches have any significance from a copyright law perspective. But there are matches. I wouldn't sign SCO's NDA, but they did show me the Linux code that was purportedly the recipient of the copying. It wasn't hard to track down the ancestral Unix code. And I have been a programmer. Not that one needed to be to see that various blocks of code matched quite clearly. Rob Enderle--who I realize many in the Open Source community also like to pillory--details the history quite accurately; initial claims of "there was no code copied" transmogrified into "the copied code doesn't matter."

I'm inclined to agree that it doesn't matter and that the quantity we've seen is small. But that's certainly little justification to pile on someone who merely confirmed the facts--there was some identical code--and even less to intrude on their personal life.

Thursday, April 07, 2005

Onfolio Serendipity

As I noted earlier, one of the things that annoyed me about Microsoft OneNote was its relatively weak tools for capturing information off the web. Basically, you can do the usual cut & paste thing or snapshot part of the screen, but that's about it. Even just capturing a full web page takes unofficial add-ons and, even then, you end up with an image file--i.e. something that you can't search.

But OneNote did get me thinking about capturing research materials from the web. After all, it's something I spend a lot of time doing as an analyst but don't really have a great system for. I end up with a messy combination of bookmarks (mostly unorganized), printouts, and saved HTML files (also mostly unorganized). Enter Onfolio.

In a nutshell, I'm absolutely sold on Onfolio as a research tool and as an RSS aggregator. It integrates right into your browser (Firefox in my case although it also works with Internet Explorer) as a sidebar. It can capture complete web pages and even complete web sites. It's searchable. It can export. One can even take notes with it although I haven't been primarily using it for that. Maybe I'll start now that I've decided to keep it. There's a free download of the beta 2.0 version available on the company's site. If you do any amount of web-based research (doesn't everybody?), definitely give it a look.

Here are some more details from one of the developers, Joe Cheng:
We provide the familiar hierarchical-folders organization model, but also provide extremely fast full-text search (including a find-as-you-type mode) and Search Folders a la Outlook 2003. We are also way more suitable for capturing stuff from the web, as that is really our primary focus. (Our actual note-taking abilities are more of an afterthought, compared to capturing existing files and web content, but should still be a lot better than what you're doing now.)

We also provide several ways to bulk export items. You can drag the root folder of a collection to the file system; your notes and captured web pages will become MHT files. You can use our Web Publishing feature to "publish" a collection to a local directory, which will give you HTML for your notes and captured web pages. And finally you can use our included cfs2xml command line utility to export a collection as XML + data files.

Onfolio 2.0 Release Candidate is currently available as a free download on www.onfolio.com, when the final version comes out there will be $30 and $100 editions (the latter is intended for professional researchers).

Wednesday, April 06, 2005

Tablet PC, Phantom PC

A few of my recent posts have had me passing over various software packages because I don't have a Tablet PC. Am I an oddball (at least in this particular regard)? Hardly. Although "pure"--that is, no-keyboard--Tablet PCs have always seemed a bit of an oddity (most of us who have used computers for a long time can type faster than we can write), it really seemed to me a couple of years back that hybrids would become one of those convergence devices that really clicked. Type on them like a regular notebook when you had hardcore typing to do. Fold the screen back and write on it when you had more graphical tasks to accomplish, like annotating a PowerPoint slide.

But it didn't happen. Perhaps the hardware never got reliable enough or close enough to the prices of regular notebooks. Perhaps Microsoft never put on the full-court press needed to get people really thinking about a new notebook model, as suggested here.

In any case, it seems a bit of a pity. Even if we've become pretty good typists, there are a lot of things we can't easily do with a keyboard. And hauling our Wacom Tablets around really isn't a very good option.

Tuesday, April 05, 2005

Notes and Mind Maps

If someone hasn't preached the gospel of mind maps to you yet, don't worry--they will. I'm not going to delve into them in any depth here, except to note that a lot of people are big devotees of them for note taking. Personally, I've fiddled with them a bit for organizing thoughts for presentations, but I haven't found them as useful for taking notes. Perhaps it's because I don't use a tablet or Tablet PC--which is useful if not required. Perhaps it's because I tend to take down a lot of "data" when I'm note taking rather than a "sparse array" of connecting threads and thoughts. I suspect an editor is better for the former; a mind map may be good for the latter. This was brought to mind by a particularly interesting discussion on the subject over at Creating Passionate Users. For a free taste of what mind maps are about, you may want to give FreeMind a look.

Monday, April 04, 2005

Text Editing Rediscovered

So, what do you suppose I settled on after spending way too much time over several weeks testing various programs for notetaking? (OneNote and Evernote were just the tip of the iceberg; I tried out just about anything remotely relevant that was Open Source, freeware, or had a usable demo. Nothing quite floated my boat.)

Drum roll, please.....

A text editor. It's fast; it can restart with the same windows open; it can easily search in a directory; it can have multiple windows open; and so on. To my thinking, for this purpose--that is, notetaking rather than programming--Edit Plus works best for me. But I recognize that editors can be a very personal and idiosyncratic choice. In any case, Edit Plus has lots of nice features for this kind of use--like easy access to wordwrap on/off, a nice directory screen, and a fairly uncluttered appearance. But, as I say, I'm not going to try to argue anyone into a particular text editor. It's not free, but at $30 it's cheap enough.

Mind you, I'm not totally happy with this option. For example, a text editor doesn't let me associate other files with my notes by embedding them. Nor does it let me embed graphics. I guess I'll have to fire up a full-fledged word processor for those occasions when I need something more. But it's feeling like what I'll use for now in this imperfect software world.

Evernote, A free OneNote Alternative

Evernote's a free notetaking program (for Windows) somewhat in the same vein as Microsoft's OneNote. It's not Open Source, but it's a free beta; the authors apparently intend to keep a basic level of the program as freeware while possibly charging for some more advanced features. Free is good. But is the program itself good enough to replace OneNote?

For my needs, the answer is a qualified yes. It's less polished in some ways. The particulars are a bit hard to describe but I'd call it "rough around the edges." Things just don't necessarily work like you expect them to. Also, a number of features are still missing--notably, any export function besides basic cutting and pasting. It also lacks a lot of OneNote's more sophisticated Tablet PC and file sharing support, but that was fine with me.

Evernote does have some nice points. For one thing, if its user interface is still a bit rough, it also avoids OneNote levels of garishness. More importantly, it also lets you view your notes on a time axis (rather than the usual hierarchical organization). This is a very nice feature that should be a standard. After all, how many times do you know roughly when you took a note about something, but not where you put it?

However, at the end of the day, Evernote and OneNote are alike in a way that I've decided is probably a showstopper for me. They lock your notes up in a proprietary file format with no way to do a mass export to something more standardized. Maybe if I were to find a program that just blew my socks off and would automatically export all my notes to a hierarchical collection of Rich Text Format files or whatever, I might change my mind. But, for now, I've shifted my focus to organizing standalone files in standard formats.