Back in April, I sat down with Jon Udell to record a podcast about visualization, lightweight data exploration, and the future of Excel. This was the start of a weekly series of podcasts that Jon has been holding with technology innovators.
Recently, Jon has been posting transcripts of these interviews created using the excellent CastingWords transcription service. This is an interesting experiment. We’ve recently been thinking that a key problem with data is not so much accessibility as usability. Most business folks can access more data than they can handle, but finding insights in the data or transforming it into a different, usable form is really hard. One solution is tools that take a stream of data and multicast it into several usable forms, like MP3 audio and the printed word.
Having these transcripts available lets me explore different Chris-centered hypotheses around Jon’s podcasts. For instance, am I funnier than Jon’s average guest? Did Jon Udell talk more than usual during our conversation? Answers: I’m slightly funnier than the average guest (4 laughs during our conversation), but nowhere near as funny as Gary McGraw (19 laughs). Yes, Jon did have a lot to say during our conversation about data visualization—he spoke about 1/3 of the words, which is higher than his average.
Here’s a table showing some information about Jon’s podcasts.
|A conversation with...||Words spoken (Jon is red)|
|Frank Martinez about governance and tolerance||||||||||||||||||||||||||||||
|Andy Singleton about building global teams|||||||||||||||||||||||||||||||||||||||
|Anders Hejlsberg about the May 06 preview of LINQ||||||||||||||||||||||||||||||||
|Harnessing collective intelligence: Nathan McFarland and Benjamin Hill||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|Kingsley Idehen about open source Virtuoso||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|Chris Gemignani about data analysis and visualization||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|Steve Burbeck about multicellular computing|||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|Gary McGraw about software security||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|Peter Rodgers about the 1060 NetKernel|||||||||||||||||||||||||||||||||
The bar charts in this table were created using a variation of the REPT tip that I’ve discussed as a lightweight data exploration method for spreadsheets. The tip involves repeating the bar character "|" a certain number of times.
This approach also works great in HTML. In the chart above, I plot a "|" for each 100 words that a person speaks. The bars are formatted to 8 point Times New Roman and colorized using a span element. A title in the span element also gives you the ability to hover over any bar and see who was speaking.
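As a rough illustration of the approach just described, here is a minimal Python sketch that turns word counts into colored bars. It is not the script used for the table above: the function names `bar_span` and `bar_cell`, the use of integer division to get whole bars, and the word counts in the usage note are all assumptions made for the example.

```python
from html import escape

WORDS_PER_BAR = 100  # one "|" per 100 words, as described in the text


def bar_span(speaker, words, color):
    """Return a colored <span> holding one "|" per 100 words spoken.

    Assumption: partial hundreds are dropped (integer division).
    The title attribute provides the hover text naming the speaker.
    """
    bars = "|" * (words // WORDS_PER_BAR)
    return "<span style='color:{}' title='{}'>{}</span>".format(
        color, escape(speaker, quote=True), bars
    )


def bar_cell(segments):
    """Wrap several (speaker, words, color) segments in one table cell."""
    spans = "\n".join(bar_span(s, w, c) for s, w, c in segments)
    return (
        "<td style='font-family: Times New Roman; font-size: 8pt'>\n"
        + spans
        + "\n</td>"
    )
```

For example, `bar_cell([("Jon", 1000, "red"), ("Peter", 2000, "blue")])` produces a cell with 10 red bars followed by 20 blue bars. The counts 1000 and 2000 are made-up inputs, not figures from the transcripts.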
The table was generated by pulling down the interview transcripts and parsing them in Python. Here’s what the HTML looks like for one of the bar graphs:
<td style='font-family: Times New Roman; font-size: 8pt'>
<span style='color:red' title='Jon'>||||||||||</span>
<span style='color:blue' title='Peter'>||||||||||||||||||||</span>
</td>
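The parsing step can be sketched along these lines. This is a hedged example, not the original script: it assumes each speaker turn in a transcript starts with a line of the form `Name: text`, which may not match the actual transcript layout, and the names `SPEAKER_LINE`, `words_by_speaker`, and `share_of_words` are invented for illustration.

```python
import re
from collections import Counter

# Assumed transcript format: a turn begins with "Name: first words ...".
# Lines without such a prefix are treated as a continuation of the turn.
SPEAKER_LINE = re.compile(r"^(\w[\w ]*):\s*(.*)$")


def words_by_speaker(transcript):
    """Count whitespace-separated words spoken by each speaker."""
    counts = Counter()
    current = None
    for line in transcript.splitlines():
        line = line.strip()
        match = SPEAKER_LINE.match(line)
        if match:
            current, text = match.group(1), match.group(2)
        else:
            text = line
        if current is not None:
            counts[current] += len(text.split())
    return counts


def share_of_words(counts, speaker):
    """Fraction of all words in the conversation spoken by one speaker."""
    total = sum(counts.values())
    return counts[speaker] / total if total else 0.0
```

A pattern this simple will misread any ordinary sentence that happens to begin with a few words followed by a colon, so real transcripts would probably need a stricter rule, such as matching only against a known list of speaker names.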