- How Land Rover, England's Ugliest Station Wagon, Became One of the World's Most Luxurious Brands | Adweek
- If We Still Used Punch Cards - YouTube - RT @LivingComputers: In this video @GHaff uses concise visuals to demonstrate punched card obsolescence:
- Twitter / ghaff: Hmm. Bing's running this Bing ... - Hmm. Bing's running this Bing vs. Google web search contest. Were smoked in my (blind) trial.
- The 5th Tenet of Open Hybrid Cloud: Start With an IaaS Private Cloud | tentenet.net - An under-the-covers look at #redhat technologies for hybrid IaaS by @bryanwche
- Mary Meeker’s 2013 internet trends: all the slides plus highlights - Quartz
- Doc Searls Weblog · Let’s help Airbnb rebuild the bridge it just burned - Not to defend Airbnb here, but I suspect we're going to see a lot of stress points as sharing services try to move out of the margins to something more mainstream.
- Forecast 2013 Registration - RT @opendatacenter: Enterprise tends to turn to private #cloud, but will it stay that way? @ghaff will discuss on a panel at #Forecast13:
- Why Do Not Track is destined to fail (DNT is DOA) | getwired.com - "Rather than driving efforts like DNT, which fundamentally cannot occur (in the manner users think those words mean “do not track”), we’d do a lot better as an industry to drive standards that delineate what types of information a specific site or tracking engine like Google Analytics or Adobe’s Omniture products can collect on you. But even if you throw those back at users, they’ll be overwhelmed. Perhaps the best angle is to reinforce that no activity on the Internet is totally anonymous, and no matter how hard you try, you cannot ever completely prevent being tracked."
- On the Internet, Nobody Knows You're a Dog
- NetAppVoice: Securing The Cloud: Why You Need Cast-Iron Guarantees - Forbes
- Data Science eBook by Analyticbridge - 2nd Edition - Data Science Central
- Memo to this year’s YC class: It’s damn hard to build an enterprise company | PandoDaily - RT @utollwi: Memo to this year’s YC class: It’s damn hard to build an enterprise company #SaaS
- http://maps.bpl.org's photos. - RT @flickrock: @ghaff Check out that Flick gallery using Flickrock!
- Flickr: Norman B. Leventhal Map Center at the BPL's Photostream - Cool map resource on Flickr from the Boston Public Library
- Google IO — Benedict Evans - "My main impression of Google IO was not so much any specific announcement as the overwhelming sense of ambition and self-confidence."
Friday, May 31, 2013
Links for 05-31-2013
Thursday, May 23, 2013
Data in, hunches out
"Going with your gut is out." That line, from Russell Reynolds Associates managing director Shawn Banerji, neatly summed up a big chunk of yesterday's MIT Sloan CIO Symposium. There was discussion of the computing side of things as well. I especially liked EMC CTO John Roese's description of public clouds evolving to a "chaotic" (as in heterogeneous/hybrid) pool of resources in which special purpose clouds would have specific functions. But the majority of the day revolved around data.
Not necessarily "big data" by the way. One panelist—Jack Norris of MapR—even remarked that the "big data term is probably short lived." But, rather, the pervasive use of data, in whatever form and at whatever speed, to drive decision-making. As Annabelle Bexiga, the CIO of financial services firm TIAA-CREF, put it: "Big data is just a richer set of data. [It's a] natural evolution of where data is going."
Erik Brynjolfsson is the Director of the MIT Center for Digital Business. He spoke to how the data explosion was a revolution in technology that also required a revolution in management.
He offered the example of publishing, described as historically a "culture of lunches"—which is to say a culture of hunches and people networks. But Amazon brought a culture of numbers to the industry. And things haven't been the same since.
A 2011 paper, which Brynjolfsson co-authored with Lorin Hitt and Heekyung Hellen Kim, found that data-driven decision making at firms resulted in 5 percent higher productivity than at firms which weren't so data oriented.
Brynjolfsson also discussed what might be called ambient data, data collected almost incidentally from sources such as cell phone records. He offered the example of Street Bump in Boston, which uses an iPhone app to find potholes. At the same time, he observed that the app did best at finding potholes in the Back Bay, Beacon Hill, and other relatively upscale Boston locales. Why? Because that's where iPhones predominate. As Sloan's Andrew McAfee would elaborate in his closing keynote (paraphrasing science fiction author William Gibson), "the future is already here. It's just unevenly distributed."
The panel which Brynjolfsson moderated also touched on some of the privacy issues associated with this ambient data. The MIT Media Lab's Sandy Pentland told how a big data commons, created by French telco Orange, was used to reduce commuting times in the Ivory Coast by 10 percent by rearranging bus routes using location-based data from mobile phones. Pentland went on to note, with more than a bit of understatement, that this sort of thing is politically controversial in places like the UK.
For his closing keynote, Andrew McAfee took as his jumping off point a fascinating graph that appears in Ian Morris' Why the West Rules—For Now. At the risk of offering up spoilers, the central thesis of the book is that, viewed from the perspective of today, the level of worldwide social development prior to the industrial revolution is effectively in the noise.
(Although McAfee didn't get into this, the answer to the book title's question is basically that the West was better positioned to create and take advantage of the industrial revolution when the factors making it possible came together. It's a great read. If nothing else, it's a good history of the world from the perspective of the Western and Eastern core.)
The industrial revolution had such an impact because it overcame the limitations of human muscles. (I suppose, given farm animals, it would be more accurate to say it overcame the limits of human and domesticated mammal muscles generally.) In any case, though, McAfee's thesis is that today we're starting to overcome the limitations of our individual minds.
He laid out four elements to this:
Cyborgs—as in new combinations of people and machines.
Open—which will define successful organizations in a variety of ways. (If you want to delve more deeply into this thought, I point you to a presentation I gave at ProductCamp Boston a few weeks back.)
Data-driven—because for the first time we have data-driven visibility in all sectors of the economy.
Evolving—for which McAfee offered the example of the car rental industry, which evolved only incrementally since it was founded after the Second World War but has since seen radically new services, from Zipcar to Lyft, made possible by the Web and mobile phones.
So it's more than data. But data—along with the compute needed to operate on it and the networks needed to move things around and tie them together—is a common thread. Big challenges lie in gaining access to the right data, even within single organizations. Cutting across data silos was also a theme heard more than once throughout the day. Asking the right questions matters too. As McKinsey's Michael Chui summed up that thought: "Be data driven. But don't suck at it."
Tuesday, May 21, 2013
Links for 05-21-2013
- Photographer Shares His Lightning Quick Lightroom Workflow
- Python 3 Metaprogramming - YouTube
- Has Intel finally landed that elusive Atom deal? | ITworld - RT @apatrizio: My debut blog with ITworld on the chips biz: rumors of an #Atom-powered #Samsung #Galaxy tab.
- New Orleans Buck
- (403) http://blogs.forrester.com/james_staten/13-05-16-hybrid_cloud_future_too_late - RT @Staten7: Hybrid Cloud? You are already hybrid. The question: What are you doing about it? #Forr blog: @stefanried
- Twitter / krishnan: Best definition of PaaS I have ... - RT @sravish: HILARIOUS (and perhaps true?) -> “@krishnan: Best definition of PaaS I have come across - ”
- Making Big Data Technologies Work in the Enterprise - “A lot of the architectures and products that technology managers may have been accustomed to for traditional transactional activity don’t map well to a big-data world,” says Gordon Haff, corporate technology evangelist at Red Hat and the author of Computing Next, a book on cloud computing. “You very much need to think about an architecture in the context of big data.”
- Ten years on: How did that cloud strategy pan out? • The Register
- www.martinlamonica.com/wp-content/uploads/2013/05/Making-Big-Data-Technologies-Work-in-the-Enterprise.pdf - Nice piece on big data techs for enterprise by @mlamonica (w quotes from me)
- Netflix, Reed Hastings Survive Missteps to Join Silicon Valley's Elite - Businessweek
- 'Weeds' Creator Kohan Dishes on How Netflix Drives Hollywood Insane - Businessweek
- The Serious Superficiality of The Great Gatsby : The New Yorker - "Baz Luhrmann’s “The Great Gatsby” is lurid, shallow, glamorous, trashy, tasteless, seductive, sentimental, aloof, and artificial. It’s an excellent adaptation, in other words, of F. Scott Fitzgerald’s melodramatic American classic. Luhrmann, as expected, has turned “Gatsby” into a theme-park ride. But he’s done it in exactly the right way. He hasn’t tried to make the novel more respectable, intellectual, or realistic. Instead, he’s taken “The Great Gatsby” very seriously just as it is."
- Twitter / Caterina: LinkedIn says I could apply ... - RT @Caterina: LinkedIn says I could apply to be Sr. Product Manager of Mobile for Flickr! Plus some other jobs. Woo!
Thursday, May 16, 2013
My first MOOC (Massive Open Online Course)
I recently completed my first Massive Open Online Course (MOOC), a term that presumably is at least passingly inspired by MMORPGs, an online gaming genre that's most popularly represented by World of Warcraft.
The class was on Gamification and was well-taught by Wharton prof Kevin Werbach. But my focus here isn't to review or critique this particular class but, rather, to offer more general reactions to the instructional method. To reflect on what these courses seem to do well or at least handle relatively naturally, and what they struggle with. It's just a sample size of one—well, 1.5 actually as I'm currently taking a Data Science course that offers some additional insights—but I think it nonetheless exposes certain patterns.
I also encourage those interested in the topic to read Nathan Heller's "Is College Moving Online?" in the New Yorker, a thorough examination of the state of MOOCs and their potential effects on education—both for good and ill.
The format
The format for Gamification—which it seems is fairly typical—is built around a series of lectures. These consist of conventional PowerPoint slides with video of the instructor superimposed or off to the side in a small window. The production values are generally high and the combination of slideware and video is engaging.
There's a syllabus with links to various articles and other (free) materials. Prof. Werbach actually has a book on the topic of Gamification but it wasn't required for the course. None of the Coursera courses I've taken a look at had much if any in the way of stuff to buy.
The course then had a series of multiple-choice quizzes and a multiple-choice final exam—plus three written assignments of increasing length and scoring weight. My current Data Science course likewise has a series of lectures. But, in this case, the score comes from a series of programming and other assignments that relate to lecture topics although they are more hands-on and practical.
Type of schedule
Gamification, like the course I'm currently taking, comes from Coursera, which has a large course catalog from a wide range of schools. It was started by two Stanford professors and has received $16 million in funding from Kleiner Perkins Caufield & Byers.
Coursera's model, like that of the non-profit edX, is to offer classes in more or less real time—by which I mean, an eight-week course has a fixed start date followed by an end date about eight weeks later. Depending on the class, there may be more or less flexibility in how closely students have to hew to a weekly schedule for assignments, quizzes, and the like. But, fundamentally, the class is on a calendar and you can't dip in and out on the basis of work or family obligations, travel schedules, and inclination.
The downsides to this approach are obvious. There are a couple of classes I've considered but opted against because they overlapped periods when I wouldn't have been able to devote much time to them.
On the other hand, after taking a class, I better appreciate why one might want to run a class to a schedule. Discussion boards, assignments (especially peer-graded ones—more on those in a bit), the availability of staff to answer course questions or address problems, and just getting forced into the "flow" of a class all require or at least greatly benefit from a schedule and associated incentive structure. (Education has a lot in common with a gamification system.)
In a related vein, I also better understand why you might not want to break a course into overly granular chunks, i.e. one- or two-week classes. I'm not just talking about any particular administrative overheads associated with putting a course in a catalog, but a broader set of transaction costs borne by all the participating actors, such as figuring out how the class is run and understanding or dealing with prerequisites or tools. In many cases, it seems these would collectively just add too much overhead to a short course unless that course were more intensive than most people with a full-time job could undertake.
Does the fact that these Coursera courses so reflect the form and content of traditional university classes simply reflect tradition and the fact that much of the content was originally developed for such classes? Perhaps to a degree. On the other hand, whatever issues higher education may have today, it's also likely that not everything about traditional class instruction is wrong.
Another form of MOOC largely goes self-paced, even if related lectures still largely parallel the content of a semester of classes. Machine-graded quizzes can still exist, as can other types of computer- or self-evaluated assignments. Udacity, another VC-funded MOOC, follows this model today.
Other types of online instruction that increasingly diverge from a true MOOC model include iTunes University and the many instructional videos on YouTube. More focused sites, such as Codecademy, teach specific skills.
I think both more- and less-structured forms will have their place. Many of us have a limited ability to schedule courses that follow a fixed schedule and find the flexibility of watch-when-you-can attractive. At the same time, I appreciate the relative discipline and other potential benefits provided by a more formal course structure.
Grading and certification
Coursera, like edX, follows another aspect of most traditional university courses; it grades you. In Coursera's case, this takes the form of a Certificate of Completion based on your performance in various assignments and quizzes as determined by the individual class—70 percent in the case of Gamification. (Coursera is also starting to offer a "Verified" version for certain classes.)
I suspect that, at this point, a Coursera certificate is more of a gamification element, i.e. a motivator, than something that's especially useful outside of Coursera. However, it's also true that Coursera's business model will ultimately depend on being able to sell the ability to gain meaningful certifications—which means that they need to be able to grade.
Schools have, of course, been grading forever. But remember what the "M" in MOOC stands for? It's "massive," indicating that it's not at all unusual to have 50,000 people sign up for a MOOC. (Although far fewer will complete it.) It's obviously not practical for a professor and a few TAs to grade at that scale. In the future, I wouldn't be surprised to see hybrid models in which something "MOOC-like" is augmented with one-on-one and one-to-few interaction with professors and TAs for a fee—indeed we're starting to see examples of such—but let's stick to the subject at hand for today, given that what's actually being paid for starts to become a complicated question.
A couple of observations about the state of grading in MOOCs.
Multiple choice works well, subject to the limitations of multiple choice. Given well-written questions, there's no more ambiguity than with any other multiple-choice exam. It's easy and instantaneously graded by computer. And modest feedback can be made available with respect to the answers.
Computers also do well at grading freeform but well-bounded and unambiguous answers, such as a numerical solution with only one correct response. It's "42" or it's wrong.
However, start talking about more open-ended problems, even in quantitative topics like programming, and automated graders can start failing answers in unexpected ways. Without going into all the gory details, the autograder for the first assignment in my current data science course was highly sensitive to, for example, how the programmer chose to parse text fields and to any quirks in the output's format. While these aren't fatal flaws, they're indicative of how automated grading challenges magnify exponentially the more creativity is allowed in solving an open-ended problem.
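To make the format-sensitivity point concrete, here is a minimal sketch contrasting a strict grader with a more forgiving one. It is not the course's actual autograder; the function names, the sample output, and the tolerance are all hypothetical, chosen only to illustrate how two equally correct answers can be scored differently.

```python
def strict_grade(submitted: str, expected: str) -> bool:
    """Hypothetical strict grader: passes only on an exact character match."""
    return submitted == expected


def tolerant_grade(submitted: str, expected: str, tol: float = 1e-6) -> bool:
    """Hypothetical tolerant grader: compares whitespace-separated tokens,
    ignoring case for words and allowing a small tolerance for numbers."""
    sub_tokens = submitted.split()
    exp_tokens = expected.split()
    if len(sub_tokens) != len(exp_tokens):
        return False
    for s, e in zip(sub_tokens, exp_tokens):
        try:
            # Both tokens parse as numbers: compare numerically.
            if abs(float(s) - float(e)) > tol:
                return False
        except ValueError:
            # At least one token is not a number: compare as text.
            if s.lower() != e.lower():
                return False
    return True


# Made-up example: same answer, different spacing, case, and number formatting.
expected = "happy 0.5\nsad -0.25\n"
submitted = "Happy  0.50\nsad\t-0.250"

print(strict_grade(submitted, expected))    # False
print(tolerant_grade(submitted, expected))  # True
```

Even the tolerant version only handles the quirks its author thought of in advance, which is the heart of the problem: every extra degree of freedom in the answer is another case the grader has to anticipate.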
Also problematic is peer review. On the one hand, this offers a way for massive scale evaluation of freeform text in a way that it's hard to imagine computers tackling anytime soon. On the other hand, across a huge class with students of all ages, skills, language abilities, and interest, it's not hard to imagine that the evaluations can be… quirky—even given a relatively detailed scoring rubric.
I didn't personally have a huge problem with my results. But it was obvious from the discussion boards that a lot of people took low evaluations made without comment and evaluations made with obvious disregard for the scoring instructions very personally. And it's worth observing that, given a large class, statistics suggests that some will have simple "bad luck" with those they draw to evaluate their written assignments—even given multiple graders, multiple assignments, and algorithms to screen bad actors.
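The "bad luck" argument can be put into rough numbers. The sketch below assumes, purely for illustration, that each submission is scored by five independent peer graders and that each grader has a 10 percent chance of being careless; neither figure comes from Coursera, and real graders are unlikely to be fully independent. It computes the chance that a majority of a student's graders are careless.

```python
from math import comb


def p_unlucky(n_graders: int, p_careless: float) -> float:
    """Probability that a strict majority of n independent graders are careless
    (a binomial tail sum). Both parameters are illustrative assumptions."""
    need = n_graders // 2 + 1  # smallest count that forms a majority
    return sum(
        comb(n_graders, k) * p_careless**k * (1 - p_careless) ** (n_graders - k)
        for k in range(need, n_graders + 1)
    )


# Hypothetical figures: 5 graders per submission, 10% careless graders,
# and 10,000 students who actually turn in the assignment.
p = p_unlucky(5, 0.10)
print(f"Chance of a careless majority: {p:.5f}")
print(f"Expected unlucky students out of 10,000: {10_000 * p:.0f}")
```

Under these assumptions the per-student chance is under 1 percent, yet that still works out to dozens of students in a class of this size, which is consistent with how visible the complaints were on the discussion boards.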
At the current stage of MOOCs, I personally find it easy enough to shrug my shoulders and get on with things. I have lots of diploma and certificate things gathering dust somewhere. But say a class has a written assignment contributing, say, 33 percent of the grade. To the degree this grade has meaningful consequences for the student (for a resume or for tuition reimbursement if MOOCs start charging in some form), that's a problem. It's also fair to say that, real world significance aside, grades can be a motivating factor for many as well.
This hasn't been intended as an exhaustive look at the "grading issue" but it's been evident to me it's something that will have to be at least improved on—not that students are ever wholly happy about their grades—as MOOCs look to start collecting money from people and offering meaningful certifications.
Overall
I got a lot out of this course and it looks as if there's a lot of quality content out there—more certainly than I'll have time to dig into. I'm also happy to see both for-profit and non-profit initiatives probing at ways to make higher education better and more efficient.
What I don't have a real opinion on is what the effect of Coursera, edX, and their ilk will be. The effect will certainly be uneven although one wonders if some aspects of MOOCs can replace elements of classes even at elite institutions. (Though I'd note that we've had the technology to replace 500-student freshman lectures for at least a decade.) I do suspect, or at least hope, that MOOCs—or at least MOOC methodologies—can replace low-value-add broadcast education in many situations.
One of the reasons that this particular crystal ball is cloudy is that higher education is often this odd hybrid of credentials, socialization, and learning that can be impersonal, highly personalized, solitary, involving lots of peer interaction, or some combination of all of those. MOOCs clearly don't address all those modes but they arguably can do a subset rather well.
Monday, May 13, 2013
Links for 05-13-2013
- The Guerilla Guide to R
- Nathan Heller: Is College Moving Online? : The New Yorker
- Head First SQL: Hands On
- The Numbers News - Analysis: The Most Popular Character Names in Movies
- Good description of python and unicode
- Design of Computer Programs: Programming Principles - Udacity
- SQL Tutorial
- Data Science Toolkit
- Obama Signs Open Data Executive Order: U.S. Government Data To Be Made Freely Available - Forbes - "The order states that “going forward, newly generated government data shall be made freely available in open, machine-readable formats, while appropriately safeguarding privacy, confidentiality, and security. This requirement will help the Federal government achieve the goal of making troves of previously inaccessible or unmanageable data easily available to entrepreneurs, innovators, researchers, and others who can use those data to generate new products and services, build businesses, and create jobs.”"
- Project Open Data - "Data is a valuable national resource and a strategic asset to the U.S. Government, its partners, and the public. Managing this data as an asset and making it available, discoverable, and usable – in a word, open – not only strengthens our democracy and promotes efficiency and effectiveness in government, but also has the potential to create economic opportunity and improve citizens’ quality of life."
- Mechanical MOOC
- How to hire data scientists and get hired as one — Tech News and Analysis
- Data journalism | Media | The Guardian
- Moleskine logo contest dubbed 'Molescheme' by angry designers - mUmBRELLA
- GeoJSON and KML data for the United States
Podcast: Kinvey's Sravish Sridhar on Backend-as-a-service
Listen to MP3 (0:16:26)
Listen to OGG (0:16:26)
Tuesday, May 07, 2013
Links for 05-07-2013
- The Harold and Maude Project Home Page
- Cheating to Learn: How a UCLA professor gamed a game theory midterm | Which Way L.A.?
- Course Catalog for Free Online Classes - Udacity
- Stanford Online
- (404) http://t.co/XkxdxR8I - RT @opensourceway: Since it is conference season: Writing an Excellent Post-Event Wrap Up Report | Hawthorn Landings | …
- Solving Equation of a Hit Film Script, With Data - NYTimes.com
- 50 Years of Stupid Grammar Advice - The Chronicle Review - The Chronicle of Higher Education - "Notice what I am objecting to is not the style advice in Elements, which might best be described the way The Hitchhiker's Guide to the Galaxy describes Earth: mostly harmless. Some of the recommendations are vapid, like "Be clear" (how could one disagree?). Some are tautologous, like "Do not explain too much." (Explaining too much means explaining more than you should, so of course you shouldn't.) Many are useless, like "Omit needless words." (The students who know which words are needless don't need the instruction.) Even so, it doesn't hurt to lay such well-meant maxims before novice writers."
- jsonviewer.net
- a beginners guide to streamed data from Twitter (tecznotes)
- English and Dravidian: Unlikely parallels | The Economist - RT @TheEconomist: Pairings of languages, like Old English and French, have created fascinatingly hybrid languages around the world
- | data.gov.be
- Where to Find Open Data on the Web – ReadWrite
- Data: Where can I find large datasets open to the public? - Quora
- Many Eyes
- Public Data Sets : Amazon Web Services
- HEXAWE | multi boss | piggy tracker | net audio | label & radio
- Behind the Data: Max Shron of OkCupid | visualizing.org
- IDC: Virtualization's March To Cloud Threatens VMware – ReadWrite
How Open is Eating the World (and what it means for marketing)
I gave this presentation at ProductCamp Boston last weekend. I confess to this not being an especially self-documenting deck. I'll try to get a video or narrated version up at some point. You can also download the PDF.
Friday, May 03, 2013
Book Review: Vintage Tomorrows
You probably know O'Reilly for their programming books. However, they also publish books on a variety of geeky themes—a number of which I've rather enjoyed. So I readily accepted their PR agency's offer to review a copy of Vintage Tomorrows: A Historian and a Futurist Journey Through Steampunk Into the Future of Technology. One author, James H. Carrott, is a freelance historian and former Xbox 360 hardware product manager. The other, Brian David Johnson, is a futurist at Intel.
The steampunk themes explored in the book that resonated the most with me were those of exploration and making.
The Victorian era—steampunk's nominal origin and venue—was a time of great scientific exploration and wonderment, when things like carbon arc lights were still marvels.
To light the sea underwater, there is a strong light or 'ships lantern' in an exterior enclosure at the aft end of the top deck. The enclosure is tall enough for Nemo to rest his elbows on it while he gazes on the surface of the ocean (perhaps 1.2 m or 47 inches) and the sides must be pretty nearly vertical, also. The windows are Fresnel lenses with annular rings, like windows in lighthouses. Fresnel lenses can be circular, square, or cylindrical, surrounding a light, so Verne may have had any of these in mind. This light illuminates the sea all around the Nautilus so it is more a flood light than a search light with a narrow beam. The light source is an electrical carbon arc in a vacuum, with graphite points. The exterior light or lantern of the Nautilus combines the best technology of Verne's day.
Claire Hummel says in the book that "I really love the Victorian sense of exploration, never giving up on exploring new things and new worlds. We have covered most of the planet but we still discover new deep sea creatures or go into outer space. That's why I like steampunk—at its core it's about discovery and wonder."
According to the authors of Vintage Tomorrows—and their many interviewees—steampunk is also a celebration of making, the antithesis to mass-produced, featureless goods. As Cory Doctorow puts it in an interview: "Technology should be tinkerable." This dovetails into ideas such as individuality, the ability to change, and control over technology.
As Carrott notes:
There was a time when you could take apart devices. A pocket watch is just one example. People took apart things like rotary phones, transistor radios and cigarette lighters. An ordinary person could take one of these apart, understand how it worked, and maybe even put it back together! Empowering, right? You were smarter than the device. You understood how it worked.
The book features lots of interesting interviews, rumination, and dinner party conversations around steampunk and vaguely related topics. I'm sure that I (and many of you) would have enjoyed being guests at those dinners and other events.
That said, the prologue warns that "We couldn't tell this story in a traditional manner. It literally defied our every attempt. So we gave in and let it lead us."
In fact, the book is not really a narrative on the topic as you'd probably expect, but more of a journal or memoir about writing the book and filming the associated documentary. It doesn't really have a narrative flow as such.
I also confess to finding that the style of Carrott's writing in particular—most of the book explicitly separates the voices of the two authors—often seemed to be about making interviews as much about himself as his subject:
"Well, there's something going on," he [science fiction author William Gibson] agreed. "There's something wider going on culturally that I don't identify with steampunk, but I think steampunk might be another slightly more exotic symptom of it."
I couldn't suppress a grin. Bill-freaking-Gibson (I know that's not his middle name) had just, without prompting or a direct question, affirmed the suspicions I'd voiced from the story of this project. There's something bigger going on. And that's what this chapter is really about—the something bigger.
Far from being unique, the above text typifies much of the book. It's not a case of an author occasionally personalizing some experience. It's about constant interjection.
Furthermore, the book makes frequent references to works such as William Gibson and Bruce Sterling's The Difference Engine, a book which the index tells me is mentioned on eight different pages. Yet, although it also profusely quotes Cory Doctorow's introduction to a 2010 edition of The Difference Engine, it nowhere really explains what it is about this work that makes it worthy of so much attention in a book about steampunk, or how it relates to the various steampunk themes discussed throughout.
Such is probably the nature of interviews; interviewees don't always provide a lot of context for what is, to them, well-plowed ground. But that's the value of wrapping interviews with additional narrative and background. Of which there's too little here.
Ultimately, this book contains plenty of interesting—even fascinating—primary source material. And that may well be enough for someone with a strong interest in the topic at hand. It was (barely) for me. But I can't help feeling that Vintage Tomorrows succeeds better as source material for a book about a "journey through steampunk into the future of technology" than at being that book.