Links for 1-25-07

Private Tags in

It seems that has recently added a feature that allows you to save tags as private. It's not particularly obvious; to access it, you have to go into your settings and enable "private saving." This seems useful for a couple of scenarios:
  • You expose your "tag cloud" through a widget on your blog or elsewhere and you simply don't want to pollute it with sites that may just pertain to some current, short-term task. (e.g. if you saved some pointers to an item that you're in the process of buying).
  • You save some sites that pertain to personal content. Presumably, you have anything truly sensitive password-protected. But that doesn't necessarily mean you want to spread the link far-and-wide. Or you may just not want to publicize an interest in a particular type of information--whether health-related or otherwise.
Of course, you don't have to use to save sensitive links but it's such a handy tool that it's good to be able to do so. (With the caveat, which should go without saying, that any confidential information needs to be protected by more than obscurity (e.g. an unpublicized link) anyway.)

One of the commenters on this post suggests that an alternative to making posts public or private individually is to create separate users for public and private links. I haven't experimented enough to comment on which approach is generally easier. [UPDATE: My initial take is that having multiple user ids on is awkward to manage because of the way the site works. You have to do a lot of explicit logging on and off; the site leaves you logged on pretty much indefinitely from a given browser image and there seems to be no way to post a tag to a specific account other than the currently logged-in one.]

As an analyst, I keep on the lookout for better ways to visualize and organize information. I'll cop to being an unstructured sort of guy; most of the things I try don't amount to much. For example, I latched onto taking notes using Freemind (a mind-mapping program) for a time. But I ended up back in a text editor. In a similar vein, I've tried to incorporate some "best practices" from Getting Things Done (such as doing things immediately if they can be done in five minutes). But I've never been much for structured systems as such.

That said, via Guy Kawasaki, here's a great "periodic table" of visualization methods that's well worth checking out. Guy also notes:

Ralph Lengler and Martin J. Eppler created it. You might also enjoy reading their paper, entitled “Towards a Periodic Table of Visualization Methods for Management.”

Another Search Oddity (this time Google Groups)

Earlier this month, I ran across some apparently unusual search results using Microsoft Live Search. I still think they were peculiar but I wasn't able to pin anything down conclusively. See the posts here and here.

Now I have something odd from Google Groups (i.e. their Usenet archive) in the form of content that used to be there but now appears to be missing. I understand that Google sometimes removes content for copyright and other reasons, but I can't imagine that's the case here. It's just an old post from the newsgroup discussing an old DOS-based program of mine, Directory Freedom.

I was going through some old paper files and I ran across a hardcopy of an old (1996) posting that heaped (rather overly effusive) praise on my program. The hardcopy was a printout of a Google Groups search result. I can't tell the exact date, but there's a 2001 Google copyright notice on the bottom so I'd assume it's from 2001 or not much later. (A scan of the hardcopy is here.) Anyway, just for kicks, I thought I'd pull up the link and save a copy of the page. And I couldn't find it by searching Google Groups.

I'm not sure what to make of this. I first tried searching using some unique terms from the post and didn't come up with anything. Then I reverted to using the same search string as I had originally ("gordon haff") and going through all the results by hand. Still nothing. As far as I can tell, the post is just gone (or at least removed from search results).

Have I missed something obvious? Or is there some other explanation for this result?

[UPDATE: Mystery (mostly) solved thanks to Brandon, who tracked down a search string that did work. (See comments.) My name is misspelled in the post in question. So why did it show up in my previous search? I can't say for sure but perhaps Google was just returning "looser" search results five years ago when there were a lot fewer sites and posts than there are today.]

Links From Mashup Camp

As I've written about on Illuminata Perspectives, I'm spending most of this week at Mashup University and Mashup Camp at MIT. Here are a few things that particularly caught my eye that I'll be checking out further.

My first link isn't directly connected to "mashups" (see this piece by my colleague Jonathan Eunice for a broader discussion of the whole mashup phenomenon). However, the principal investigator for the FutureBOSTON competition connected with this event, Tom Piper, has led a lot of interesting work previously. I highly recommend checking out Beyond the Big Dig, a series of case studies about urban open space projects that have many parallels--both good and ill--to the continued wrangling over open space that the Big Dig freed up in Boston.

If you're into boating, check out Virgil Zetterlind's site. It's a boating and fishing blog, but apropos the topic at hand, it's a mashup of NOAA (National Oceanic and Atmospheric Administration) data, such as buoy locations and water depth, with Google Earth. It's still a work-in-progress, but what's already in place is really worth checking out.

I'll have to dig into some of the development tools that I've seen a bit deeper before I can have a real opinion on them, but the Boxely UI Toolkit from AOL for developing rich desktop applications is one interesting possibility. It allows for all sorts of interesting animations and transitions for multimedia content as well as for the basic interface elements. A preview is available for download.

A few other resources that may be worth checking out:

  • - General resource for mashups and Web2.0 APIs
  • - Simplify mashups development (including RSS and screen-scrapes)
  • - The AOL guys were really pushing these as an alternative to heavier-weight XML

From the dot bomb Hall of Fame

This tongue-in-cheek E*Trade Super Bowl ad from 2000 is probably the ultimate capture (at least with the benefit of hindsight) of the dot-com Super Bowl TV ad craziness.

(Their commercial the following year, in the midst of the deflating bubble, captured the zeitgeist of that time pretty well too, but I don't see a copy online.)

(As a side note, was a fun source of entertaining TV ads during the first Internet boom but they (wisely from a business perspective) morphed into a subscription site for the advertising industry. These days, YouTube provides a convenient, if not entirely legal, alternative.)

The Hard(ware) Side of Collaboration

Better collaboration among remote teams and groups is certainly going to require new software and better ways of thinking about the whole collaboration process. But new types of hardware interfaces are also going to be part of the mix. Here's an interesting video of one such potential device (via Stephen Shankland of CNET):

Jeff Han is a research scientist for New York University's Courant Institute of Mathematical Sciences. Here, he demonstrates—for the first time publicly—his intuitive, "interface-free," [multipoint] touch-driven computer screen, which can be manipulated intuitively with the fingertips, and responds to varying levels of pressure.

This is cool stuff. There are actually simpler devices that are available for sale today, however, that could be useful for collaboration if they were cheaper, more widely deployed, and (therefore) software were widely available to exploit them.

Take for example Wacom's Cintiq (or a tablet PC). One of the things that I find really difficult about doing some things remotely is that I can't easily do a "napkin sketch." Yes, as my co-workers well know, it's going to be pretty horrid from an artistic perspective. But sometimes there's no other comparable way to communicate a quick idea. Sure, I can draw something with a mouse or a low-end graphics tablet--or sketch something on paper and scan it--but these are pretty awkward emulations of throwing something up on a whiteboard in a face-to-face meeting. A Cintiq or other form of directly writable monitor seems likely to become a pretty important ingredient in remote collaboration. I expect that the day will come, when such devices are cheap and ubiquitous, when people will find it hard to imagine that they ever lived without them.

Size Disconnects

I'm a fairly fervent supporter of the thesis that "the market" connects buyers and sellers pretty efficiently. But sometimes there are disconnects that make me scratch my head. Take the "standard" frame sizes and the "standard" print sizes.

Frames and mattes still tend to come in dimensions sized for age-old 8x10", 11x14", and 16x20" sizes--or any of a variety of other sizes that correspond to the photo paper sizes that I purchased a quarter-century or so ago. (These, in turn, tend to be oriented to large format 4" x 5" film--which is a bit more square than "35mm" film (24 x 36mm). Technically, 8x10 is a 1.25:1 aspect ratio vs. 35mm's 1.5:1.) Yet, inkjet paper has sizes that are more attuned to some traditional paper standards such as 8.5x11" and 13x19" (Super B) (in the US). As a result, typical inkjet paper prints don't fit in standard frames. Furthermore, most people tend to think that the more rectangular formats (i.e. longer horizontal or vertical dimensions) are more attractive--all other things being equal. Therefore, it's not simply a case of the old "4x5" dimensions being better.

I'd be interested in thoughts on this. But, as a practical matter, it means that I can't fit prints made with off-the-shelf paper in my inkjet printer into a standard frame. Perhaps most frame-able photos still come from more traditional photographic sources but I find this a bit hard to believe given the prevalence of inkjet printers and, at least to my eye, the superior aesthetics of wider aspect-ratio images.
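The aspect ratios discussed above are easy to check. The following is a minimal sketch in Python, not anything from the original post; the `aspect_ratio` helper and the `SIZES` table are names made up for illustration, and the dimensions are just the ones quoted in the text.

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Return long side divided by short side as an exact Fraction."""
    long_side, short_side = max(width, height), min(width, height)
    # Going through str() keeps values such as 8.5 exact (17/2).
    return Fraction(str(long_side)) / Fraction(str(short_side))


# Dimensions quoted in the post (inches, except the 35mm film frame in mm).
SIZES = {
    "8x10 frame": (8, 10),
    "11x14 frame": (11, 14),
    "16x20 frame": (16, 20),
    "35mm film (24x36mm)": (24, 36),
    "8.5x11 inkjet paper": (8.5, 11),
    "13x19 inkjet paper": (13, 19),
}

for name, (w, h) in SIZES.items():
    print(f"{name}: {float(aspect_ratio(w, h)):.2f}:1")
```

Run as written, this shows the frame sizes clustering around 1.25:1 to 1.27:1, 8.5x11" paper at roughly 1.29:1, and 13x19" paper at about 1.46:1, which is close to 35mm's 1.5:1.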

Amazon Music Store in 2007?

Paul Lamere of Sun Labs, the principal investigator for "Search Inside the Music," has an interesting post with his thoughts about the rumors that Amazon may launch a music store shortly. Admittedly, this rumor has been around for a while, but Paul makes a very strong case for why it could be very significant--especially if the music isn't DRM-ed (as seems to be the current speculation).

Of course, much would depend on how many tracks Amazon could get access to; for example, eMusic doesn't have DRM but it's also lacking many of the big music hits--which still represent a lot of volume, Long Tail notwithstanding. However, that aside, Paul makes a good case for how Amazon's "focus on discovery" and metadata-oriented Web services could give them some real advantage. And he's absolutely right. As I've discussed before, rich--and automatically captured--metadata is an absolutely key ingredient in filtering and ordering digital data, including music. (Amazon also has the social component down with user ratings, which could further enhance the value of digital albums and tracks.)

Amazon has a great set of web services built around their data. Using Amazon's web services, one can get access to book descriptions, book cover images, reviews, pricing information - just about any piece of data in Amazon's database is exposed via their web services. Exposing their data in this fashion places Amazon at the center of the online literary ecosystem. Any startup company that wants to be in a business related to books will use Amazon's API because it is easy, the data is of high quality and it is free. This is good for the startup, and even better for Amazon since all of those startups end up sending their customers to Amazon. Amazon is already a big part of the music ecosystem. They already have lots of data for music CDs that is available via their web APIs. They are probably the largest supplier of album art on the web. The Amazon part number - the ASIN - is used throughout the web as an unambiguous identifier for an album. Once Amazon starts to sell individual tracks, I would expect that Amazon will create an ASIN or an equivalent for each track in their database. This track-level identifier may become the primary way of identifying tracks in the music world since Amazon makes it so easy to get all of the information about an item once you have the ASIN. This could be a key enabler in the next generation of music - a ubiquitous song ID tied to deep metadata.

Paul concludes that "in 2007 we may see the tipping point for digital music and Amazon may be at the center of it all."

More on Microsoft's Live Search

Over the weekend, I spent some time digging a bit deeper into the seeming anomalies with Microsoft Live search results that I discussed last week.

My original intent was to take a purely quantitative approach. I planned to search using the names of a few people who I knew would have quite a few articles, blog posts, and/or quotes that would mention Microsoft--as well as many that didn't. And then I would simply summarize the results of those searches run using Live, Google, and Ask, and thereby see if there was a pattern of differences--and specifically if "Microsoft" continued to appear more frequently in the Live results than in the others.
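As a rough illustration of the counting approach described above, here is a minimal Python sketch. It assumes the result titles or snippets have already been collected by hand into lists; the `mention_rate` function, the engine labels, and the sample snippets are all invented placeholders, not real search results or any engine's API.

```python
def mention_rate(results, term="microsoft", top_n=10):
    """Fraction of the first top_n result snippets that mention term."""
    top = results[:top_n]
    if not top:
        return 0.0
    hits = sum(1 for snippet in top if term.lower() in snippet.lower())
    return hits / len(top)


# Placeholder snippets, invented purely for illustration.
sample = {
    "engine_a": ["Post about Microsoft licensing", "Profile page", "Post about servers"],
    "engine_b": ["Profile page", "Post about servers", "Post about storage"],
}

# One rate per engine, so the engines can be compared side by side.
rates = {engine: mention_rate(snippets) for engine, snippets in sample.items()}
for engine, rate in rates.items():
    print(f"{engine}: {rate:.0%} of top results mention the term")
```

A simple substring count like this treats every result the same way, whether it is a single post or a whole blog front page.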

This methodology turned out to be problematic. One problem is that many of the results are "dynamic content." For example, a CNET page might list a changing set of stories on the side of the page--which could include stories with a Microsoft-related headline. Who knows what specific contents were present when the page was spidered by a given search engine? A similar issue occurs with dynamic blogrolls. In addition, some searches pointed to the main page for a blog--which has many posts (and therefore is much more likely to contain any given search term). It doesn't seem that this should really be "scored" the same way as a search result that returns a specific post. Finally, not all the search results were relevant (i.e. some pointed to different people); the engines varied in this respect, which would have added further noise to data based solely on counting occurrences of "Microsoft." Thus, I went back to a more impressionistic approach.

My results probably are still the oddest. I discussed the image search in last week's posting. In Web search, it's the two top results that are the real outliers. Microsoft is prominently mentioned in the #1 and #2 results for Gordon Haff: & These are individual posts and neither is returned anywhere near the top by either Google or Ask. Other Live results that include mention of Microsoft (e.g. and email-collaboration.html) are also returned by at least one other search engine.

I also ran searches using a couple other names. A search on Stephen Shankland (all searches were run at about 5PM on January 6) also returned Microsoft in the top two results--including a headline: and However, Google also returned a number of high-ranked hits that included "Microsoft" (starting in the #3 position).

My conclusion after relooking at the results I described last week as augmented by these other search results? There are still puzzling oddities although nothing that I would call a smoking gun that proves systematic bias toward search results containing "Microsoft." The results produced by the search on my own name continue to be particularly hard to explain, given that they point to specific posts about Microsoft that are not highly ranked by any other search engine that I examined. However, it's also true that my admittedly limited testing using other names didn't turn up other examples that were anywhere near as compelling. I'll be interested to see if anything else turns up on this, but for now I'm just moving it to the "that's weird" bucket.

Five Things You (May) Not Know About Me

OK. I've been tagged for the blog meme/chain letter of the moment. I try to assiduously avoid such things. And I definitely don't have any photos as amusing as a very fresh-faced Catherine with Ozzy Osbourne. Nonetheless, for you Catherine, here we go. "Five Things You Don't Know About Me." (Most of which shouldn't be particularly surprising to anyone who has really dived into my website.)

  1. Out of engineering school, I designed offshore drilling rigs. In addition to spending a lot of time in shipyards and flying around in helicopters, I worked on various mechanical subsystems (including piping systems such as ballast control) on the Ocean Odyssey which is now an equatorial satellite launch platform.
  2. During the nineties or thereabouts, I spent much of my free time developing DOS software (shareware and freeware). Although I wrote a variety of things, I spent most of my time on a file manager called "Directory Freedom" that grew out of a couple of PC Magazine utilities and some subsequent work by Peter Esherick of Sandia Labs. It clocked in at a whopping 39KB because it was written in x86 assembler. (Shudder....) I sometimes get emails from people who (still!) use it--which is nice.
  3. When I was at Dartmouth, I helped found The Dartmouth Review, an alternative (conservative) off-campus newspaper that recently celebrated its 25th anniversary. I also created the last word quotes column; at one point I tried to get a compilation published as a book, but didn't find a publisher.
  4. I've climbed 20K ft.-plus peaks including Island Peak in Nepal and Cotopaxi in Ecuador. I haven't done this recently because I've been recuperating from a broken foot that I suffered in the Grand Canyon.
  5. And (most obscure, but probably least interesting), I grew up right next to a Nike Hercules base west of Philadelphia (very near where Unisys is today--Burroughs at the time) in an old farm house. (Which should have keyed me into the issues associated with old farm houses such as the one in which I'm now living--but I digress.) This was the PH-82 site--later purchased by the University of Pennsylvania and now a housing development.

Microsoft Live Boosting "Microsoft" in Searches?

Google has been taking some heat recently. As Blake Ross notes: "Google is now displaying “tips” that point searchers to Google Calendar, Blogger and Picasa for any search phrase that includes “calendar” (e.g. Yahoo calendar), “blog” and “photo sharing,” respectively." These tips aren't part of the search results themselves, but they appear above the search results and are therefore arguably even more prominent than the #1 search result.

Google has also been criticized of late for advertising its own products using AdWords. Although it makes its ad buys using the exact same mechanism as anyone else, a lot of people still feel that it's not a level playing field if Google is simply taking money from one pocket to put it in another.

However, none of this involves "Google-biasing" the search results themselves. Contrariwise, I had a very interesting experience over on Microsoft Live image search earlier today. I searched for my name (Gordon Haff) and the top result was... unexpected.

The first result is indeed associated with me; it's an ad that I used to illustrate this Illuminata Perspectives entry. What is manifestly curious however is that this posting just so happened to be about Microsoft!

Was it perhaps because this particular posting of mine was so overwhelmingly popular that it just percolated to the top of the search rankings? I think that unlikely. I can go for page after page of Google Image search without this image appearing. Ditto for

Did the presence of "Microsoft" and/or other keywords boost this particular result way up in the standings? It certainly seems so. Which would be a big no-no indeed if you're ostensibly providing unbiased search results.

[UPDATE: In regular web searches on Microsoft Live, my Blogger profile is the #1 entry. However, the #2 and #3 entries in a Gordon Haff search also feature "Microsoft" prominently even though they are nowhere near the top in other search engines.]

[UPDATE #2: Google Tips are apparently gone, at least for now.]