Depth and Discovery: Powering Visualizations with the Google Analytics API

At Juice, we work with web analytics APIs large and small, from Google, comScore and Omniture. The Google Analytics API is our favorite. It powers the world's best, most widely deployed analytics site. And it powers Juice products like Concentrate (innovative search analytics) and Vasco de Gapi (a tool for exploring the Google Analytics API).

The Google Analytics API team approached us to explore new ways of looking at data with the API, and we were excited by the possibilities. We've been working on our own visualization framework, JuiceKit, which integrates the power of the Flare Visualization Library with Adobe Flex.

The result is Analytics Visualizations, two visualizations powered by the Google Analytics API that are free to use. You just need a Google account with access to Google Analytics data to explore your own data.

Analytics Visualizations Home Page

Referrer Flow

Curious about what sites are linking to you and what content is benefiting the most? Referrer Flow answers those questions and shows how results change over time. Here is a brief video introduction:

Referrer Flow is a stream of daily treemaps showing pageviews and bounce rates for various groupings of your website's pages. You can group by combinations of page title, referrer and URL. Clicking on the treemap filters all the data by the page, referrer or URL that you clicked on. Click again to clear your filter.

Keyword Tree

A list of top keywords isn't enough to really understand how people are searching and finding your site. Keyword Tree visually displays the most frequently used search keywords and how they are used together. Here's a video overview:

You'll see a frequently used search term at the center, surrounded by the words and phrases that are most often used in combination with it. Pick a different starting word by typing into the box in the upper right or by selecting from the top words across the bottom of the screen. The words are sized by their frequency of use and colored by bounce rate (or % new visitors or average time on site). Roll over a word to see details about that combination of connected words.

Depth and Discovery

In designing these visualizations we focused on the question: how can we let users uncover the unexpected? That means designing targeted visualizations focused on limited, well-defined issues. Referrer Flow monomaniacally focuses on a single question: "What pages are people viewing on your site and where are they coming from?" Keyword Tree is laser-focused on word ordering and what that means for keyword performance.

The Google Analytics reporting tool is a great general-purpose reporting solution. It gives advanced users everything they need to answer specific questions. However, its generality limits its ability to focus on two issues: depth and discovery.

The Google Analytics API is Google's solution to this problem. It's an opportunity both for businesses like ours that can create new ways of analyzing data, and for large sites that can use the API for integration, custom analytics, and more.

Thanks to Nick Mihailovski at Google for his gracious support, help and encouragement, and to Avinash Kaushik for inspiring this idea.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

8 comments (only the last 5 are shown)


December 15, 2009
DSLR said:

I'm new with GA and co. but your tool is really useful, at a glance you can read a lot of things...To improve readability of both views (referrer flow in particular) you can add a "loupe", a magnifier on screen movable by mouse to expand details of the chart. Thanks again from Italy!


January 4, 2010
Affan Laghari said:

Hello,
Excellent tool though it doesn't need my praise! It would be very helpful though if you can add an option to select start/end dates and some conversion metric. That can help find valuable patterns over longer periods.
Btw, I found you people from Avinash's blog and have been roaming around on your other tools namely Vasco de Gapi, Concentrate Me and JuiceKit. Rare to find such intelligent tools. Please keep up the good work.


January 4, 2010
yulia said:

Hi guys, found your site through Avinash's blog. I love the keyword tree tool. Been playing with it all day...

Question -- is there a way to print the trees? Also, is there a way to scroll? Those would be nice functionalities... Sorry if they are already there and I'm just too slow to find them :)

Thanks for the great (and really useful) tools!


March 4, 2010
Jean said:

Hello,

Is Juicekit still actively developped ? In the git repo the last commit date is August 30, 2009.

Does juicekit work with flex 4 and Flash Builder 4 beta ?

Hope this fantastic tool will continue to improve.


March 4, 2010
Sal Uryasev said:

Hey Jean,

Juicekit is under very active development, as we actively use it internally for all our work. If you investigate the unstable branch, you will notice a number of new features and improvements. There is also more work on an internal branch that should get merged in. As far as I know, we do not have a stable release ETA, but I know that we want to do one.

Delivering Data in Excel: The DTP Framework

Here at Juice we build fewer Excel dashboards than we used to. Excel itself is a decidedly imperfect vessel for any serious development--it's simply too easy to veer off the disciplined track into the underbrush.

Even so, Excel remains a playground where we can do surprising things. For instance, check out our Excel lightbox and an Excel tagcloud. We could appropriate everything that you find on the webbiest of Web 2.0 websites and build our Uruk-hai equivalents.

The key to staying on the rails when building Excel tools--whether dynamic dashboards or simple data explorations--is discipline. At Juice, we use a methodology that we call "DTP" (Data Transform Present). The foundation of DTP is the rigorous separation of data from presentation. This is similar to Model-View-Controller, a well-known approach to building computer user interfaces. I'm going to cover some of the key principles and we'll follow up with an example later on the blog.

Data

Data is the raw material of any visualization or report. It needs to be easy to add data or change data without having to change anything else about your dashboard.

We store raw data with dimensions preceding metrics in blocks in separate worksheets. If you want to sound pretentious, you can call this "first PivotTable normal form". Key points:

  • Have one worksheet for each data source.
  • Call these sheets "Data", or "{Title} Data".
  • Place them at the end of your workbook.
  • Keep the data snug against the top left of the worksheet. This allows us to use dynamic ranges, which let you add data and have it automatically incorporated in all PivotTables.
  • Ensure that column names are in the first row.
  • Place your dimensions before your metrics.

Transform

We use PivotTables to transform the data into the structure we need.

  • Call these sheets "Transform" or "{Title} Transform".
  • Create one sheet for each issue that you are exploring. This doesn't mean that you will only create one PivotTable. You may have multiple PivotTables to support different views or perspectives on an issue.
  • Turn on "show items with no data" for row and column dimensions. We are seeking predictability: we want the PivotTable to always be the same size regardless of what the page field filters are.
  • Place all the dimensions that aren't used as rows or columns in the PivotTable as page fields. Every dimension should have a home.
  • Set all PivotTables to not store data and to refresh on open.

Present

The Presentation page copies data from the Transform page(s) and formats it for display. It also allows users to control what data is being displayed.

  • Build a user interface to interact with your data. There are many ways to let people interact with your data, but one of the easiest is to use a PivotTable as your interface. This is described below.
  • We use an in-house style guide for graphs that you can see in our Chart Chooser.
  • If the Presentation page is likely to be printed, preset the print range.
  • When copying data from the Transform page to the Presentation page, blank values will come out as zeros. We use a simple formula, =IF(Transform!A2<>"",Transform!A2,""), to ensure that blanks remain blanks.

Using a PivotTable as your interface

A simple way to let people manipulate your data is to place a PivotTable containing only page fields and no data on the Presentation sheet. A Visual Basic macro, triggered to run whenever this master PivotTable changes, then pushes any changes made to it out to all the PivotTables on your Transform sheets.

Here is the code to make this happen.

This drives our PivotTables in concert and ensures they stay in sync.


That's a basic overview of our DTP technique. You can try a simplified version of DTP here.

DTP Example.xls

We'll be back soon to talk through this example.

14 comments (only the last 5 are shown)


February 26, 2009
Jacob said:

I am having trouble getting rid of zeroes from my charts. Where exactly does the formula =if('Transform!A2'<>"",'Transform!A2', "") go?

Thanks in advance


April 2, 2009
nicholas said:

You say that you don't use Excel that often anymore to create dashboards. What tools do you use or recommend these days to build dashboards?


April 2, 2009
Zach said:

Nicholas, Most of our dashboards are web applications using Flex and our open-source visualization library JuiceKit (www.juicekit.org).


June 10, 2009
Patrick said:

Wow - Thanks so much, I love it and this make life with Pivottables so much easier! Goes right into our weekly reports!
One question: I always thought I know Pivottables pretty good - but how do I add Pagefields without Data so that the blue frame does not show up like in the example file? Thanks for your help! I love your tools and have been an avid user of the Chart Cleaner for years now. :-)


December 12, 2009
shawnify said:

Typo in third paragraph: "DTP" (Data Tansform Present)

Bubble, bubble toil and trouble

Recently we wanted to show how Concentrate, our new long-tail search analytics tool, could give you a view of search patterns across travel websites. As political junkies, we were inspired by this chart from our friends at the NY Times.

NY Times candidate word bubble chart

The first tool we tried, simply on principle, was Excel 2003. As expected, making a NY Times-quality bubble chart in Excel 2003 is a hard problem. Here's a draft of how far I got before giving in to label fatigue.

Excel NY Times bubble

The bubbles themselves aren't tough, but getting the labels right is hard. I'd love to see a solution, so if any reader wants to tackle it, eternal fame can be yours. Here is a CSV if you want to try.

travelpatterns.csv

Another of the tools we use at Juice is NodeBox, which we used to make this:

Concentrate pattern comparison

Here's the code that made the graph.

The power of a programmatic approach like this is that by changing a line or two, you can get the following. Click for a larger version. Click the text for the code.

With great power comes a great need to exercise restraint. Otherwise you end up like these poor chaps. Must... flex... restraint... muscles...

17 comments (only the last 5 are shown)


January 16, 2009
chip said:

Rob Bovey has an xy chart labeler that may have helped on the original Excel version. I use it a lot and it provides a good degree of flexibility on placement.

http://www.appspro.com/Utilities/Utilities.htm

The labels are not dynamic which is a drawback. It works on other types of charts too.


January 18, 2009
Andy Cotgreave said:

Hi Clint,
Yes, I did initially add the text. However, in Tableau it somewhat overwhelmend the circles. I did try to format the text to grey and shrink it, but the text only served to confuse things.


January 19, 2009
Chandoo said:

Hi Chris,

Good stuff...

I have tried the same in excel while keeping the labels right (I guess so). You can take a look at the chart and downloadable excel here: http://chandoo.org/wp/2009/01/19/excel-bubble-chart/

Let me know your comments


February 9, 2009
David Franta said:

Didn't really find another place to post this, but interesting article posted by Cringely (ZDnet fame) about how JP Morgan mangled a bubble chart recently -

http://blog.cringelysmortgage.com/2009/01/29/whats-wrong-with-wall-street/


February 22, 2009
Mike Chelen said:

How about using the Google Charts API scatter plot? http://code.google.com/apis/chart/types.html#scatter_plot
It allows variable bubble sizes, and has been used in some similar charts such as http://www.xefer.com/twitter

Setting DJANGO_SETTINGS_MODULE

Here's a bash function I use for Django development to quickly set DJANGO_SETTINGS_MODULE.

setdsm() {
    # Append the parent directory and the current directory to PYTHONPATH
    # (without leaving an empty entry when PYTHONPATH starts out unset),
    # then set DJANGO_SETTINGS_MODULE.
    export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$PWD/..:$PWD"
    if [ -z "$1" ]; then
        # No argument: use <name of the current directory>.settings
        export DJANGO_SETTINGS_MODULE="${PWD##*/}.settings"
    else
        # Otherwise use the module name passed as the first argument
        export DJANGO_SETTINGS_MODULE="$1"
    fi

    echo "DJANGO_SETTINGS_MODULE set to $DJANGO_SETTINGS_MODULE"
}

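I put this in my .bash_profile. Then a quick setdsm sets DJANGO_SETTINGS_MODULE to the settings.py in the current directory and adds the current directory and its parent to PYTHONPATH. Passing an argument, as in setdsm myproject.settings, sets DJANGO_SETTINGS_MODULE to that module instead.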
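
When no argument is given, the module name comes from the last component of the current directory. Here is a minimal sketch of just that step, pulled out so it can be tried on its own. The helper name settings_module_for and the example path are made up for illustration and are not part of the function above.

```shell
# Hypothetical helper (for illustration only): print the settings module
# that would be derived for a given project directory.
settings_module_for() {
    dir="${1%/}"                  # drop a single trailing slash, if any
    echo "${dir##*/}.settings"    # keep only the last path component
}

# Example with a made-up project path
settings_module_for /home/chris/projects/mysite   # prints "mysite.settings"
```

The `##*/` expansion strips the longest prefix ending in a slash, so only the directory name is left and no external command such as basename is needed.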

4 comments


October 16, 2008
Ro said:

Hi Chris,

1. i put the script into .bash_profile, which i had to create first
2. then i opened the konsole, entered into my projects dir which is on a windows drive (sth like /media/sda2/Document .../..),
3. called `django-admin.py runserver`

but i got the error "Settings cannot be imported because environment variable DJANGO_SETTINGS_MODULE is undefined"

any suggestion what might be the problem?


November 20, 2008
Boffo Bob said:

"windows drive" is the problem.

wtflol


November 21, 2008
Not Boffo Bob said:

Works - excellent.


February 19, 2009
Sampath Girish.M said:

Hi Chris Gemignani,
I am getting an error while running the following command 'python manage.py shell'
as:
function setdsm() error at '()'
I copied the function you placed here and got that error. Is there any rectification for the above problem, u can help me out and I will be very much thankful to u......

If possible post a reply to my email id please..... Its girishmsampath@gmail.com
so that i can solve my problem faster.

Thanks,
Sampath Girish.M
