Bubble, bubble toil and trouble

Recently we wanted to show how Concentrate, our new long-tail search analytics tool, could give you a view of search patterns across travel websites. As political junkies, we were inspired by this chart from our friends at the NY Times.

NY Times candidate word bubble chart

The first tool we tried, simply on principle, was Excel 2003. As expected, making a NY Times-quality bubble chart in Excel 2003 is a hard problem. Here's a draft of how far I got before giving in to label fatigue.

Excel NY Times bubble

The bubbles themselves aren't tough, but getting the labels right is hard. I'd love to see a solution, so if any reader wants to tackle it, eternal fame can be yours. Here is a CSV if you want to try.

travelpatterns.csv

Another of the tools we use at Juice is NodeBox, which we used to make this:

Concentrate pattern comparison

Here's the code that made the graph.

The power of a programmatic approach like this is that by changing a line or two, you can get the following. Click for a larger version. Click the text for the code.
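The original NodeBox script is not reproduced here. As a rough illustration of the programmatic approach, here is a minimal sketch in plain Python (not NodeBox) that sizes each bubble so that its area, rather than its radius, tracks the value, and writes the row of bubbles out as SVG. The function names, the layout, and the sample data are all assumptions made for illustration; this is not the code that produced the graph above.

```python
import math
from xml.sax.saxutils import escape


def bubble_radius(value, max_value, max_radius=60.0):
    """Scale the radius so that bubble *area* is proportional to value."""
    return max_radius * math.sqrt(value / max_value)


def bubble_row_svg(items, max_radius=60.0, gap=10.0):
    """Lay out (label, value) pairs left to right and return an SVG string."""
    max_value = max(value for _, value in items)
    height = 2 * max_radius + 2 * gap
    x = gap  # left edge of the next bubble
    parts = []
    for label, value in items:
        r = bubble_radius(value, max_value, max_radius)
        cx = x + r
        cy = height / 2
        parts.append(
            '<circle cx="%.1f" cy="%.1f" r="%.1f" '
            'fill="steelblue" fill-opacity="0.6"/>' % (cx, cy, r)
        )
        # Labels are simply centered on each bubble; no overlap handling.
        parts.append(
            '<text x="%.1f" y="%.1f" text-anchor="middle" '
            'font-size="11">%s</text>' % (cx, cy, escape(label))
        )
        x = cx + r + gap
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="%.0f" height="%.0f">'
        "%s</svg>" % (x, height, "".join(parts))
    )


# Hypothetical sample data, NOT taken from travelpatterns.csv.
sample = [("hotels in [x]", 400), ("flights to [x]", 100), ("[x] car rental", 25)]
svg = bubble_row_svg(sample)
```

Changing `max_radius`, the fill color, or the sort order of `sample` is the kind of one-line tweak that produces a noticeably different chart. Label placement is deliberately naive here, which is exactly the part that was painful in Excel.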

With great power comes a great need to exercise restraint. Otherwise you end up like these poor chaps. Must... flex... restraint... muscles...

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

17 comments (only the last 5 are shown)


January 16, 2009
chip said:

Rob Bovey has an xy chart labeler that may have helped on the original Excel version. I use it a lot and it provides a good degree of flexibility on placement.

http://www.appspro.com/Utilities/Utilities.htm

The labels are not dynamic which is a drawback. It works on other types of charts too.


January 18, 2009
Andy Cotgreave said:

Hi Clint,
Yes, I did initially add the text. However, in Tableau it somewhat overwhelmed the circles. I did try to format the text to grey and shrink it, but the text only served to confuse things.


January 19, 2009
Chandoo said:

Hi Chris,

Good stuff...

I have tried the same in Excel while keeping the labels right (I guess so). You can take a look at the chart and downloadable Excel file here: http://chandoo.org/wp/2009/01/19/excel-bubble-chart/

Let me know your comments


February 9, 2009
David Franta said:

Didn't really find another place to post this, but interesting article posted by Cringely (ZDnet fame) about how JP Morgan mangled a bubble chart recently -

http://blog.cringelysmortgage.com/2009/01/29/whats-wrong-with-wall-street/


February 22, 2009
Mike Chelen said:

How about using the Google Charts API scatter plot? http://code.google.com/apis/chart/types.html#scatter_plot
It allows variable bubble sizes, and has been used in some similar charts such as http://www.xefer.com/twitter

Introducing Concentrate for Long Tail Search Analytics

We are thrilled to introduce Concentrate™, an innovative long-tail search analytics tool. Concentrate is for SEO and paid search professionals who want to make sense of search keyword data and make the most of search investments.

Check out the demo here. Or try out the free version here (you’ll need admin access to a Google Analytics account).

We built Concentrate because we saw a fundamental conflict in the world of search analysis: On the one hand, search keyword data is terrifically interesting and valuable. It can tell you what your visitors and customers want and how they think about you and your products.

Juice Analytics keywords

Unfortunately, search query data is also big, messy, and hard to get your hands around. In a typical month, the Juice site gets over 10,000 visits from over 7,000 unique keywords.

Even if I could somehow wrap my head around our top 100 keywords, I’d only understand 25% of the visits. For people spending money on search engine optimization or paid search campaigns, that’s a big blind spot to accept.
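To make the long-tail arithmetic concrete, here is a minimal Python sketch of how one might compute the share of visits covered by the top N keywords. The function name and the sample counts are hypothetical, chosen only so that the head keywords cover a quarter of the visits; they are not the Juice site's actual numbers.

```python
def top_n_share(visits_by_keyword, n):
    """Return the fraction of all visits that come from the n busiest keywords."""
    counts = sorted(visits_by_keyword.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:n]) / total


# Hypothetical data: two "head" keywords plus a long tail of one-visit keywords.
visits = {"head keyword a": 15, "head keyword b": 10}
for i in range(75):
    visits["tail keyword %d" % i] = 1

share = top_n_share(visits, 2)  # the two head keywords vs. all 100 visits
```

With this made-up distribution, the two head keywords account for 25 of 100 visits, so `share` is 0.25 and the remaining 75% sits in the tail.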

We want you to understand and act on all your search data. Concentrate ingests data from sources that most sites already have available (e.g., Google Analytics, Omniture, Coremetrics, Hitwise, and Compete), enhances this data by finding common patterns and query types, and visualizes search phrases for exploration and analysis.

Over the next couple of weeks, we will share examples of some of the interesting things you can do with Concentrate, including:

Pattern identification to condense the long tail into keyword phrases with similar structures. For example, here are some common search patterns from a cooking web site (the “[x]” represents a wildcard).

Patterns

Keyword visualization to show the connections between keywords and the relative performance of phrases. This wordtree shows the frequency of words within phrases (size) and average time spent on site (color).

Wordtree
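As a rough idea of what pattern identification could look like, here is a minimal Python sketch that replaces one word at a time with the “[x]” wildcard and counts how many distinct phrases share each resulting pattern. This is an assumed, simplified approach for illustration only, not Concentrate's actual algorithm; the function names and the sample queries are hypothetical.

```python
from collections import Counter


def wildcard_patterns(phrase):
    """Return every pattern made by swapping one word of the phrase for [x]."""
    words = phrase.split()
    if len(words) < 2:
        return []  # a lone word would collapse to just "[x]"
    patterns = {
        " ".join(words[:i] + ["[x]"] + words[i + 1:]) for i in range(len(words))
    }
    return sorted(patterns)


def common_patterns(phrases, min_count=2):
    """Count patterns across distinct phrases; keep those seen min_count+ times."""
    counts = Counter()
    for phrase in set(phrases):
        counts.update(wildcard_patterns(phrase))
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


# Hypothetical queries for a cooking site.
queries = [
    "chicken soup recipe",
    "tomato soup recipe",
    "lentil soup recipe",
    "how to cook rice",
    "how to cook quinoa",
]
patterns = common_patterns(queries)
```

On these sample queries, the three soup searches condense into "[x] soup recipe" and the two how-to searches into "how to cook [x]". A real implementation would likely also weight patterns by visits and handle multi-word wildcards.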

Congratulations to Chris, Pete, and Sal for all their hard work, diligence, and creative problem solving to launch this solution.


8 comments (only the last 5 are shown)


January 10, 2009
Daniel Waisberg said:

Looks amazing, I will implement it and start working for my own website. I think that for search marketing / SEO companies this will be a killer tool. It can add a huge value!


January 12, 2009
Bjoern Sjut said:

Hi,

has there already been testing with foreign languages? I could volunteer to integrate it with a German content heavy site to test the behaviour on umlauts, etc.


January 12, 2009
Bjoern Sjut said:

Oh, I can shed some light on this already: my most important keywords for our German sites are "error#" and "unicode error#" :-(


January 12, 2009
Pete Skomoroch said:

Bjoern,

Thanks for the feedback. I just fixed that unicode error for you and reloaded your list. Concentrate should run without errors on foreign languages, but some of the text processing components (stopwords, stemming, etc) are only fully supported in English at the moment. Let me know how the new results look and we will work on incorporating more international features.


February 27, 2009
Pauli Price said:

On the final validation stage, where I entered the bounce rate for my first keyword, the application met with an unhandled exception because it couldn't find the Google Analytics keyword file. Perhaps because there were spaces in my site name? Unfortunately it also spit out all kinds of diagnostic information you probably don't want the casual observer to see. You really want to trap that unless the login is a privileged account.

Anyway, help doesn't go to a help screen or anything - it appears that clicking on 'help' brings one to the account page, so I figured I'd post my tale of woe here.
