Wordtree for Visual Text Exploration

Analytics is often about having the right tool for the job. When your data is text, using traditional analysis tools (e.g. Excel, OLAP tools) is like peeling a mango with a chainsaw.

There are a number of visual exploration tools specifically designed for text data, including:

  • Word clouds like Wordle (fun but superficial);
  • Network diagrams like Visual Thesaurus (good for individual words, not text);
  • Trend graphs like Baby Name Voyager or Google Trends;
  • Granular presentations for interacting with and exploring individual phrases, e.g. We Feel Fine and Twistori;
  • "Word trees" that let you navigate through lines of text to understand the most frequent words, relationships between words, and common phrase and sentence structures.

It is quite difficult to find a Word Tree in the wild. The brilliant team at IBM's Many Eyes was the first to make Word Trees generally available. The same Many Eyes team has also created an alternative approach to visual text exploration with a tool called Phrase Net.

Phrase Net

Recently, we built a slightly different take on the Word Tree in Concentrate, our tool that lets users explore huge search query lists to see how people use search keywords. For geeky entertainment, we created a special Concentrate demo account with the lyrics of songs from Rolling Stone's 500 Greatest Songs of All Time. Click here to sign in to the demo (press Submit and then choose WordTree at the top).

Here's how our Word Tree works:

  • The box at the center is your starting point. When you open a Word Tree, it will contain the most common word in the text data. You can edit this box to "re-center" the wordtree (name that tune):

[Wordtree images]
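For readers who want to experiment with the idea, the behavior described above (start from the most common word, then branch into the words that follow it) can be sketched roughly as below. This is only an illustration and not how Concentrate or Many Eyes is implemented: the function names, the simple tokenizer, and the sample search queries are all assumptions made up for the example, and it only branches to the right of the center word.

```python
import re
from collections import Counter

# Hypothetical sample data: a handful of made-up search queries.
SAMPLE_QUERIES = [
    "cheap flights to boston",
    "cheap flights to denver",
    "cheap hotels in boston",
    "flights to denver",
    "flights from boston",
]


def tokenize(line):
    """Lowercase a line and split it into simple word tokens."""
    return re.findall(r"[a-z']+", line.lower())


def most_common_word(lines):
    """Return the most frequent word across all lines (the default center)."""
    counts = Counter(word for line in lines for word in tokenize(line))
    return counts.most_common(1)[0][0]


def build_word_tree(lines, root):
    """Build a nested dict of the words that follow each occurrence of root.

    Every node looks like {"count": int, "children": {word: node}}.
    """
    tree = {"count": 0, "children": {}}
    for line in lines:
        words = tokenize(line)
        for i, word in enumerate(words):
            if word != root:
                continue
            tree["count"] += 1
            node = tree
            # Walk the rest of the line, adding one branch level per word.
            for following in words[i + 1:]:
                child = node["children"].setdefault(
                    following, {"count": 0, "children": {}}
                )
                child["count"] += 1
                node = child
    return tree


def render(node, label, depth=0):
    """Return indented text lines for the tree, most frequent branches first."""
    lines = [f"{'  ' * depth}{label} ({node['count']})"]
    by_count = sorted(node["children"].items(), key=lambda kv: -kv[1]["count"])
    for word, child in by_count:
        lines.extend(render(child, word, depth + 1))
    return lines


if __name__ == "__main__":
    center = most_common_word(SAMPLE_QUERIES)
    print("\n".join(render(build_word_tree(SAMPLE_QUERIES, center), center)))
```

"Re-centering" the tree is then just a matter of calling build_word_tree again with a different root word. A two-sided tree could be approximated by running the same walk over the words that precede the center word.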

While these more advanced visualizations are a start, I suspect there is a lot of room for other tools and techniques to visually explore text data. I'd be curious to hear about other tools you've seen along these lines.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

4 comments


May 26, 2009
Jen said:

I had a lot of fun playing with the masters.org word trees this year: http://www.masters.com/en_US/visualization/index.html It did feel very linear though ... seems like there needed to be more ways to explore. OTOH, since I truly was just playing without an objective in mind, it's tough to say what I would "need".


May 26, 2009
tim said:

You may already be aware of these, but just in case:

www.notcot.com/archives/2008/04/stefanie_posave.php

neoformix.com/2008/StephaniePosavec.html

neoformix has lots of other text analyses scattered through the archives.


June 4, 2009
Aseem said:

One of the things that would be cool is to be able to color-code the words in terms of an event (orders, getting to page x, email capture, etc.). That way you could look to see/create new high-conversion phrases. Of course you could end up with really dumb combinations, but it would be interesting.


June 4, 2009
Zach said:

Aseem, our wordtree actually does that -- it just requires that you have that additional metric for each word/phrase/sentence as you suggest.
