Airline and Airport Traffic and Delays: A JuiceKit Visualization Demo
By Sal Uryasev
September 28, 2009
Find more about:
visualization,
treemap,
juicekit
To fly is to be frustrated. If you've been traveling for long, you no doubt have your opinions about what airlines and airports are the biggest sources of suffering. Whether it is weather delays, getting stuck on the tarmac due to air traffic, maintenance problems, or missing a connection, it all feels outside of your control.
But a little knowledge can help. The Bureau of Transportation Statistics has maintained a giant database of air traffic information covering decades of flights -- point of origin, flight times, flight delays, type of delay, etc. It is 72 gigabytes of data...just the type of data that needs some visualization. JuiceKit to the rescue.
We've put together a pair of visualizations that can make this data accessible to your average non-data-monkey traveler:
- Treemap uses size to represent the number of flights by airline and by point of origin. The color is used to show delay time -- we've got all sorts of delay metrics, each of which tells an interesting story.
- US map uses size to represent the number of flights and the color to display delay time. Filtering by airline yields additional details.
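For the technically curious, here is a rough sketch of the aggregation behind both views: group flights by airline and origin, then use the flight count for size and the average delay for color. This is not the demo's actual code (the demo is built on JuiceKit and Flare); the sample records and the `treemap_cells` name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (airline, origin airport, delay in minutes).
flights = [
    ("Delta", "ATL", 12),
    ("Delta", "ATL", 0),
    ("Delta", "JFK", 30),
    ("Southwest", "MDW", 5),
    ("Southwest", "LAS", 15),
]


def treemap_cells(records):
    """Group flights by (airline, origin) and return size and mean delay per cell."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [flight count, summed delay]
    for airline, origin, delay in records:
        cell = totals[(airline, origin)]
        cell[0] += 1
        cell[1] += delay
    return {
        key: {"size": count, "avg_delay": delay_sum / count}
        for key, (count, delay_sum) in totals.items()
    }


cells = treemap_cells(flights)
for (airline, origin), cell in sorted(cells.items()):
    # "size" would drive the box area, "avg_delay" the box color.
    print(airline, origin, cell["size"], round(cell["avg_delay"], 1))
```

Swapping the delay column for a specific delay type (weather, air traffic, maintenance) would give the different color metrics described above.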
There are some interesting insights that pop out when you build a visualization of this data.
The different airline strategies are quickly apparent in the treemap. Hub-and-spoke airlines (Delta, Continental) have one or two dominant boxes (origin location), surrounded by lots of small locations. A point-to-point airline like Southwest looks entirely different with lots of similarly sized boxes.
Flipping between delay types uncovers some unexpected results. For example, you might expect weather delays to be heavily correlated by airport. The data shows something a little different: Comair appears to be abnormally impacted by weather delays -- as if a dark cloud chases around their airplanes. While Comair might be overstating weather delay data to prevent paying for meal vouchers, a more reasonable Wikipedia investigation suggests that Comair flies smaller weather-susceptible Bombardier airplanes.
A few details about this demo for our technical audience:
For those of you following JuiceKit development, this is a demo of some of the newer features available in our open source JuiceKit 1.2 distribution, and some of the features that will be coming to the 1.3 version. Treemap styling is now elegant and crisp, allows for white borders, and fixes a couple of rendering bugs. There is a new tree-level depth feature that can make it easier to navigate treemaps with lots of layers. The airports map demonstrates a geographic layout built using GeoLayout JuiceKit and Flare components. A major improvement demonstrated by the airline-selector dropdown is the ability to keep nodes consistent between data reloads. This allows us to animate the nodes even though they are generated by our new LiveQuery component.
Vasco de Gapi: Google Analytics API Explorer
By Sal Uryasev
April 29, 2009
Find more about:
googleanalytics
api
python
Are you ready to explore the Google Analytics API?
At Juice, we were very excited about the public release of the Google Analytics Data Export API. Our product Concentrate has been running on a hackish home-brew Google Analytics export tool since its release last November, and we were happy to be able to relaunch as a Customer Example of the Google Analytics Data Export API.
Today, we are releasing a new, free tool called Vasco de GAPI. Vasco is a web-based tool for exploring the API, for downloading complex slices of data using the API, and even for automatically generating code that lets coders easily replicate the API calls in question. Instead of describing it in more detail, I am just going to demo it.
I am going to start with a relatively rare but curious functionality of Google Analytics. I keep track of who wrote each blog using a Google Analytics user-defined setting that is set to the author's name for each specific blog post. Slicing our blog by author can be cool for me as an employee so that I can brag during my yearly review about how many visitors I bring in or what natural search visits we get for free as a result of my posting. For the demo, I'm going to discover the natural keywords that bring traffic to my blogposts on the website.
Let's get started.
The first step is to authenticate using Google's OAuth system.

I select ga:keyword as a dimension.

ga:pageviews is the metric I am interested in. The results will automatically get sorted by the first metric, so I do not need to explicitly specify a sort value.

I set ga:userDefinedValue as a filter, and filter it to saluryasev, and select this last week as a reference point.

Here is the list of parameters that Vasco de GAPI is passing to Google.
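For readers who want to see the shape of such a request, here is a minimal sketch that assembles the choices made in the steps above into a query string. It is not the code Vasco generates. The feed address and parameter names are written from memory of the Data Export API and should be checked against Google's documentation, and the profile id `ga:12345` and the `build_query` helper are hypothetical.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumed feed address of the Data Export API; verify before relying on it.
FEED_URL = "https://www.google.com/analytics/feeds/data"


def build_query(profile_id, author, end=None, days=7):
    """Build a request URL for keyword pageviews filtered to one author."""
    end = end or date.today()
    start = end - timedelta(days=days - 1)  # inclusive window of `days` days
    params = {
        "ids": profile_id,                              # hypothetical profile id
        "dimensions": "ga:keyword",
        "metrics": "ga:pageviews",                      # no sort: first metric is the default
        "filters": "ga:userDefinedValue==" + author,
        "start-date": start.isoformat(),
        "end-date": end.isoformat(),
    }
    return FEED_URL + "?" + urlencode(params)


url = build_query("ga:12345", "saluryasev", end=date(2009, 4, 29))
print(url)
```

The sketch only builds the URL; a real call would also need to carry the authentication token obtained in the first step.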

What are my results?

It turns out that of all my posts, the Google Trends API that I put out about a year ago drives the most natural traffic to our site. Hopefully, this will change with a few more blog posts, but this is still rather interesting data. I could target that specific audience with something Google-trendy. On an unrelated note, a slap to my face was that Zach's name sent fifteen users to my blogposts. Go figure. Sixteen users searched on my last name, and were probably looking for my more popular father.
To get at the rest of the data, I can click the download link at the bottom of the page or, for developers, another link downloads working code that will replicate this exact pull.
Vasco runs using an open source Python gdata wrapper for the API that can be downloaded here. This wrapper is powerful, and I will write another blogpost about it next week. It is plugged into the Google gdata module, and as such allows all forms of authentication available to gdata users, including OAuth, AuthSub, and clientside.
Hopefully, Vasco de GAPI can help all other potential explorers sail smoothly through the API. When it comes to data, Google is just a great company. They have had powerful APIs for most of their major services for years, and while the Analytics API is a latecomer, it actually is more powerful than the analytics interface itself. This sort of openness is something to be envied by all other analytics and web companies in the market.
By the way, please let me know if the explorer theme works well. It was a lot of fun working on a project with a slightly esoteric approach.
2 comments
Toby Murdock said:
really cool.
congrats zach & team. :-)
Dirnov said:
Amazing! Not clear for me, how offen you updating your www.juiceanalytics.com.
Dirnov
Enhanced Google Analytics: Firefox Plugin
By Sal Uryasev
April 13, 2009
Find more about:
search,
google
analytics
There is new life in the tool that shows change in Google Analytics. A year after releasing our Greasemonkey script, we are pleased to release an updated version of the Enhanced Google Analytics script as a free Firefox Plugin. For those already using the older Greasemonkey script, you can skip ahead to the What's new? and How do I get this plugin? sections of the page. For the rest, you may be wondering: Why does my Google Analytics need change?
Change, and why it is important
When I first started working at Juice Analytics, my boss Zach showed me a part of his daily Google Analytics routine. He would open up the Referring Sites page and glance at all of our 942 referrers. Using his superior intellect and capacity for remembering random urls, Zach would discover interesting deviations in the traffic from sites linking to our blog.
Our top referrers looked more or less similar day to day. Even once you get past the more recognizable top sites such as Twitter and Google, the various somethingblog.com pages, without context, often look a lot like somethingelseblog.com. To top it off, most of the information is not even specifically interesting. Our chartchooser.juiceanalytics.com domain sends us consistent regular referrals, but so what? Day to day, I don't even really care about Google or Twitter unless something changes. With change, I know whether someone has posted something new about me, sending valuable traffic. A good read on the topic is Avinash's rant about "actionable analytics".
Our Firefox plugin is designed to allow analysts to get more action out of what changed in the Referring Sites and Keyword Reports. Here are a couple examples of the plugin in action from our Google Analytics account:

What's new?
Our focus for this release has been to improve functionality, to reduce the barrier to entry for new users, and to allow automatic updates for the plugin. The new version of the script works nearly instantaneously, and the installation involves only two clicks (in contrast to the 7 clicks of the Greasemonkey version). As a Firefox plugin, updates are now automatic and require no reinstall. Keyword sensitivity has been raised to 50% for consistency. As a slight bonus, the design and layout of the form and buttons is now sleeker and the table stands out in a pretty Google blue.
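To make the idea of "change" concrete, here is a small sketch of one way to flag referrers or keywords whose traffic moved by 50% or more. It is not the plugin's actual code (the plugin runs as JavaScript inside Firefox), and it assumes the baseline is the equally long period just before the recent one, which this post does not spell out. The visit counts and the `unusual_traffic` name are invented for the example.

```python
def unusual_traffic(recent, baseline, threshold=0.5):
    """Return {source: relative change} for sources whose visits moved by >= threshold."""
    flagged = {}
    for source in set(recent) | set(baseline):
        before = baseline.get(source, 0)
        after = recent.get(source, 0)
        if before == 0:
            # Brand-new source: there is no baseline to divide by, so flag any traffic.
            if after > 0:
                flagged[source] = float("inf")
            continue
        change = (after - before) / before
        if abs(change) >= threshold:
            flagged[source] = change
    return flagged


# Hypothetical visit counts: the last 3 days vs. the 3 days before that.
recent = {"twitter.com": 90, "google.com": 400, "someblog.com": 45}
baseline = {"twitter.com": 50, "google.com": 410, "oldblog.com": 20}

result = unusual_traffic(recent, baseline)
print(result)
```

A steady referrer such as the hypothetical google.com entry drops out of the result, which is the point: only the sources that changed are left to look at.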
Greasemonkey itself is no longer required for the plugin, but you may want to keep it around for any of the other cool scripts available from the community. If you ever find yourself wishing that something about the web looked different, acted different or had different functionality, there may be a Greasemonkey script to ease your pain.
How do I get this plugin?
First, you need Firefox 2.0+.
If you are a user of the equivalent older Greasemonkey version of this script, you may want to go ahead and uninstall it. Go to Tools=>Greasemonkey=>Manage User Scripts..., select Google Analytics Downloader, and uncheck the Enabled box.
If you never had the script installed, or once you have removed it, simply click here to go to the Mozilla add-on site, select the checkbox and click the button. Once installed, navigate to Google Analytics, go to either the Referring Sites or Keyword pages, and click the blue button.
Happy analyzing!
39 comments
Andrew said:
You claim that it is for Firefox 2.0+. Do you plan on supporting 3.5 anytime soon?
Sal Uryasev said:
Hey Andrew,
I just tested the add-on against the latest Firefox 3.5, and there were no conflicts, so I enabled add-on installation for the latest beta versions. I think the addons.mozilla site takes up to 24 hours or so to propagate changes, so check back then. Thanks for pointing this out.
-Sal Uryasev
Bjoern said:
Quick question: The most current version I see is 0.54 - and that was released on March, 19. Is this just the blog post following up on that release or am I missing a newer version somewhere?
Best, Bjoern
Bjoern said:
Oh, and: The plugin doesn't work correctly if G.A. is run in another language than English, e. g. German - I get empty tables on that occasion.
Ophir said:
Hi,
There's a bug - it takes the 1st account in your account access list and pulls data on that one only, regardless of the account you selected to view.
Ben said:
Hi,
I'm getting the same bug as Ophir. The plugin is only displaying data from the first profile of my first account regardless of the account I'm viewing.
Matt said:
Yep, same bug here too. Uses first account no matter what.
Brandon said:
I just installed "Enhanced Google Analytics: Firefox Plugin" I'm a visual learner and I don't see the tool bar, icon..etc. How do I use the Enhanced Google Analytics: Firefox Plugin to get key words and see the analytics/states
Thank you,
Zach said:
The "Who sent me unusual traffic?" button is displayed in the top bar of the Referring Sites and Keyword Reports, next to the Email button.
Sal Uryasev said:
Thanks for all the comments regarding the first account bug!
I finally managed to overcome the yoke of many other projects, and submit a fix.
The fix won't roll out automatically until the Mozilla people finally approve the add-on, but existing users can install the 0.55 version from this link:
https://addons.mozilla.org/en-US/firefox/addons/versions/11120
Hideki said:
Thanks for such a useful plugins.
3 "Who sent me unusual traffic?" buttons are displayed in Google Analytics. I had installed your plugins before. Is this a cause?
I uninstalled the plugin and then I installed the plugin again but still 3 buttons are seen.
Is there any way to solve this?
I would appreciate if you might help me.
Thanks,
kenan said:
I have 3 buttons on the top too. The first button works the others don't. Still the function of the script seems ok. So prolly it's just a cosmetic bug.
Indurango said:
Hello,
I installed "Enhanced Google Analytics: Firefox Plugin"
And it works fine. Thanks.
BUT, is there a way to change the analyse going back to 30 days instead of 3 days with the new firefox plugin ?
Regards,
Indurango
jamed said:
Hey thanks for the great write up, I wasn't familiar with the new GA plugin. I installed and was wondering if there was any other way of going back further in the past other than 3 days??
Jamedy
http://www.academyX.com
Sal Uryasev said:
Hey Indurango and Jamed,
The current version does not have that option, but I plan to add it into the next incarnation of the extension. The only way to get at a different number is to go back to the older Greasemonkey version mentioned at the top of this post. The Greasemonkey version has an option for changing the 3 days to a different number.
-Sal
Tobias said:
Nice plugin. If it could also change the Analytics report to display the complete referrer path right away instead of having to click on the domain it would rock. :)
Edwin said:
How can I return results for non-paid keywords only? When I try to select them, this add-on still shows the cpc keywords in the results as well.
Sunaina said:
Hi,
I just added the plugin and I see the "Who sent me unusual traffic button" on the said pages. But I don't understand where to spot the information the plugin is supposed to provide. I dont see anything as shown in the screenshots above and my data doesn't look very different. Please advise.
Bernardo Contopoulos said:
Hi Sal.
Thank you for the plugin. I was wondering though why it shows three exact same buttons on the interface of my analytics. However, only one works (clickable).
Tim said:
Hi,
Great idea for a plugin, this will be really useful, thanks very much for making it available.
It seems to work properly for me, but I have a question on how the calculation is made - could you let me know exactly what is meant by '50% higher/lower traffic over the last 7 days'?
I have looked at some of the keywords in the list it brings up and have compared their visits for the past 7 days with the 7 days prior and found the change to be less than 50%. Am I misunderstanding the calculation?
Many thanks,
Tim.
John said:
Love the plugin. Is there a way to get the data for the date range in analytics rather than 3 or 7 days depending on the report?
Thanks,
John.
James said:
I click the button, the page blinks (button says 'loading') but then the data disappears and its back to the same page again.
Is something supposed to stay on the screen? I'm using the latest version of Firefox...soooooo..
Dave said:
tried the plugin, did not work.
Sandra said:
Unfortunately the plugin is not working for me. After I click the button it just displays a 'Loading' sign, but doesn't load.
Ben said:
Is there a version that's compatible with FF 3.5.2?
Brent said:
Hi,
I'm on FF 3.5.2 and the plugin doesn't seem to be working. Anyone else having issues, or am I having isolated problems?
Richard said:
work for me people, ff 3.5.2!
just logon Analytics and click on button "who send me unusual traffic?" and poof! have fun
Brent said:
hmmm yeah, definitely not working for me on FF 3.5.2 after a few separate installs/uninstalls. It must have a conflict with one of my previously installed addons.
Brent said:
oh jeez... i'm a n00b. for everyone else out there, who like me, thought the button would look the same/be in the same place, it is actually at the top of the page now - right in between the "email" and "add to dashboard" buttons - it's not in the same place it used to be! So, I was incorrect that this wasn't working in FF 3.5.2 - it is I that is not functioning correctly :P
Mel said:
Doesn't work for me. Didn't work for me before when it required Greasemonkey either. FF 3.5.2 Cookies are enabled.
angela said:
Lovely plug-in. Well, when it works. It installed fine on my 5-year-old XP laptop. But on my 64-bit Vista, it ACTED like it was installing fine but I never see the "who sent me unusual traffic" button.
I'm using FF 3.5.3. Any known issues? Should I play around with removing other plug-ins?
Debbie said:
Isn't working for me either. I have Firefox 3.0.14. I have the button but it doesn't seem to do anything.
Dan said:
Doesn't seem to work for me either. Are there known conflict with FF 3.5.3 or maybe even compatibility issues with Better GA plugin?
Debbie said:
Yes, definitely a compatibility issue with Better GA plugin. I had the same problem and just disabled the Better GA plugin and now it works fine.
Phil said:
I'm a bit confused....It says "Referring sites with 50% higher traffic over the past 3 days." What is the baseline for the calculation? 50% higher than when? Does it look at the average amount of traffic referred going back forever? And then look at the current time frame selected in GA and look at the difference?
e.g. If google usually sends my site 1,000 visits a day but sent my site 1,500 visits on Tuesday 11/17 (yesterday). Is that the 50% increase?
What if I want to analyze traffic from a month ago?
Mark said:
Not compatible with FF 3.5
Lee said:
sooo why the heck doesn't GA do this out of the box??
wa said:
please update the plugin for ffox 3.6
mariusz said:
please update to the newest firefox - this plugin is simply brilliant
Mashing Google Analytics With External Data
By Sal Uryasev
June 9, 2008
Find more about:
googleanalytics
reporting
google
A couple months ago, we put together a Greasemonkey tool that sucked data out of Google Analytics, and after mining it for trend information, integrated it back into the GA interface. This week's tool combines and extends Google Analytics with data from an outside source.
Here is a quick alpha of our Greasemonkey integration of external data reporting into Google Analytics for Kampyle, a "feedback analytics service." Click on the images to zoom in.
Clicking on the 'Kampylize' tab queries the Kampyle site in real-time to populate the standard GA data table.
Our friends at Kampyle run a service that allows website owners to put a feedback button on individual pages of their website. All information submitted by the user is uploaded to a central Kampyle database that compiles the user feedback with web page url and standard internet statistics such as the name of the browser. Website owners can access a server-end service that consists of a reporting site complete with summary data tables, graphs, and charts.
Since both sites are web-based reporting suites segmented in a similar fashion (individual website, date, web browser, etc.), they integrate together naturally. There is a lot of value in placing related data side by side, allowing users to get a more holistic picture of web site performance. If you have other ideas of data sources that would fit neatly with Google Analytics, let us know and we'll consider building the integration.
If you're interested in technical details, continue to Open Juice to see how this is all accomplished...
Tufte-Style Comparison Chart Generator
By Sal Uryasev
May 6, 2008
Find more about:
tufte
pil
comparison
chart
generator
Last week, we shared a rendition of a Tufte graphic using just a few lines of Nodebox code. As our commenters pointed out, Python is great, but it may not be every business analyst's carnal desire to learn a programming language just to generate some nifty graphs. I spent some time to push Chris's Nodebox rendition into a PIL-based Windows tool that can generate the same sort of comparison graph from an Excel file on the fly.
The result is The Comparison Chart Generator 1.0. The installation instructions are relatively simple. Unzip the zip file, and run comparisionchartgenerator.exe.
Alternatively, we have a new Excel chart that creates the same effect using only Excel functionality. Download the Excel Tufte Line Chart here.
If you are using the Chart Generator, start with some data in an Excel (xls) or Comma Delimited (csv) format. The data for this graph has to be contained within the first sheet starting with cell A1, as in the following picture.

Select an input file. There are a couple example files bundled with the download.

After selecting a file, you'll be prompted to modify a few of the basic options available for the chart.

Finally, save the result as a jpeg.

Here is the same image found in Tufte's textbook processed using the Comparison Chart Generator. It is generated using the csv example file bundled with the download.

Those of us who have undergone LASIK eye-improvement surgery may still prefer the sharp, crisp Nodebox results, but for the rest of us, this image looks pretty good. Let us know if this tool is useful. If there is enough of a positive response, we may consider expanding functionality for other fancy Tufte-esque charts.
If you do prefer Nodebox, I have an updated script here. This pushes the script up to 20 lines of code or so, but the extra 9 lines allow the labels to push themselves apart on their own. If you want to look at the source code for the Windows program, you can get it here. I used py2exe to compile it into an executable. The code, however, has not been thoroughly commented or cleaned as of yet, so edit it at your own risk.
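For anyone wondering what "pushing labels apart" involves, here is a minimal sketch of one simple approach: walk the labels from top to bottom and nudge any label that sits too close to the one before it. This is not the code from the Nodebox script or the Windows program; `spread_labels` and the sample positions are made up for illustration, and a single greedy pass like this only ever moves labels in one direction.

```python
def spread_labels(positions, min_gap):
    """Shift label positions so neighbours end up at least min_gap apart.

    Returns the adjusted positions in the same order as the input.
    """
    # Visit labels in order of their desired position, remembering original indices.
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    placed = [0.0] * len(positions)
    previous = None
    for i in order:
        y = float(positions[i])
        if previous is not None and y - previous < min_gap:
            y = previous + min_gap  # too close: push this label past the previous one
        placed[i] = y
        previous = y
    return placed


# Hypothetical vertical pixel positions for four labels, 5 pixels minimum spacing.
adjusted = spread_labels([10, 11, 30, 12], min_gap=5)
print(adjusted)
```

Labels that already have room, like the one at 30 here, stay where the data puts them; only the crowded ones move.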
21 comments
lucas said:
Keep going, guys! I'm looking forward to seeing other Tufte-esque charts here.
And thanks a lot for the Nodebox, what a amazingly useful piece of software!
Asim said:
sal,
it took me a while to put all the pieces together. "using python...using excel..." but i realised that you may be interested in using resolver one:
http://www.resolversystems.com/products/
(i'm certain you've heard of it before, but let me describe it for the benefit of others)
it integrates a spreadsheet environment with a built in ironpython interpreter. that way, you wouldn't have to mess around with PIL and py2exe.
watch the one minute screencast:
http://www.resolversystems.com/screencasts/resolver-one-in-one/
and download it for free under a non-commercial license. big down side: only for windows (i'm a mac user, and don't enjoy working in a virtualised environment).
hope this is of interest to you, take care.
asim
Bilsko said:
Just tried it out on my Vista machine with Excel 2007 and it works great. Of course, I had to save the file as .xls so compchart could read it (it still baffles me that Microsoft had to go and introduce .xlsx as a file type...)
Rob said:
I just tried to run the .exe. file and got an error: "The specified module could not be found. Loadlibrary (pythondl) failed"
Any idea what this means and (more importantly) how to get around it?
Thx
johnny m said:
Awesome! However, all I get are export errors. But you have inspired me to begin to learn Python.
Traceback (most recent call last):
File "comparisonchartgenerator.py", line 247, in <module>
File "Image.pyc", line 1405, in save
File "JpegImagePlugin.pyc", line 409, in _save
File "ImageFile.pyc", line 493, in _save
IOError: encoder error -2 when writing image file
Madelaine said:
Cool, thanks. I might use this for gene expression data sometime.
derek said:
That's very nice. For extra sharp crispness, can you arrange for the imnage to be saved as GIF or PNG? Generally speaking, JPG is a very bad format to choose for graphs. The compression algorithm, which was designed for photographs with their smooth color gradients and few sharp edges, handles text, lines, and solid blocks, with their uniform fields of few colors, and many sharp edges, very badly, and the file is almost never as compact as a GIF acheives.
The image above shows the characteristic "newsprint smudged by fingers" visual effect of text in jpegs, and the file is 57K. You should find a lossless compression format both sharper in appearance and smaller in size.
Sal said:
I picked JPEG as a default since the PNG format is less known within Windows. Functionality for PNG is already included in the program, but is not obvious. When you are offered to save the file, ignore the *.jpg suggestion, and simply name it "whateveryouwant.png". You will have the output converted into the right format.
The GIF format is also built in if you want to try it out, but for some reason the PIL library that I used has not been creating great-looking GIF images. I would avoid them. The PNG looks very nice though.
derek said:
Thanks. Unfortunately, it may call itself a PNG, but it's still got jpeg artifacts. Also, bizarrely, the pseudo-PNG comes out at 60K compared to the jpeg's 40K.
There's no reason for such a simple graphic to have that kind of bloat. At the risk of tooting my own trumpet, see this 800x600 graph (http://i146.photobucket.com/albums/r264/del_c/politics-charts/DoDDeaths3.png), which I think packs a fair bit more info into only 13.5K.
(and the 400x300 thumbnail version (http://i146.photobucket.com/albums/r264/del_c/politics-charts/DoDDeaths2small.png), designed to fit into the narrow column of a blog, is a mere 3.9K!)
Chris Gemignani said:
Derek,
We've had a number of problems getting a high quality image out of the Python Imaging Library (PIL). For this application, GIF would be best, but PIL was producing some ugly files.
Those graphics are really nice. Excel, too!
We use ImageMagick in house, but we can't package that in an app. A nice approach when using Excel is to output an image slightly bigger than you need then scale it down slightly with ImageMagick. This gives you anti-aliased lines and text that you don't get by default from Excel. It's what we used to produce the Colbert Bump graphs.
Nick said:
Hi,
This looks great! But for some reason the download link for the source for the windows version does not seem to work - I'd love to study the code, to learn how to use basic python to make my own tufte-esque charts.
Christian said:
Thank you for this post, it looks great! I love Tufte's work and read your blog frequently in Google Reader.
The output file (.png or .jpg) could be of a much wider use if it was a .wmf file, because this would enable me to change the colour of one line or text and make any additions I like with Illustrator. Is it possible to get a .wmf version? That would be fantastic.
Sal said:
Code should be accessible.
Most of the code deals with the GUI interface and with parsing excel/csv files. The actual PIL interaction starts around line 196.
I don't believe that PIL actually supports the wmf format. I am fixing up a presentable version of this sort of graph in Excel to add to the next version of chartchooser (http://chartchooser.juiceanalytics.com/). I'll put up a draft version of that when I have it cleaned up - it should be sufficiently editable to not need Illustrator.
Kasper said:
Great tool. One question: Is there a way to change the number of decimals shown? Currently it seems to show just on decimal, whatever the number format in the xls-spreadsheet.
Sal said:
As promised, I posted an excel chart of the same graph. You can find the link near the top of the page.
Jose Hernandez said:
I have an alternative post on a dynamic Excel bumpchart that combines charts with the cell grid. You can donwload it at http://sites.google.com/a/visual-catalyst.com/info_displays/Home/tufte_example_bumpchart.xls?attredirects=0
This display works for all versions of Excel. I'm working on a how to that describes how you can extend this type of chart.
Christof said:
Excellent work. I'm impressed!
John said:
awesome - using it right now. More Tufte style charting programs please!
Andrew said:
Can you do a chart with more than two columns?
Ahem. said:
I think you're missing the point Edward Tufte was making when he made his original chart. Because he took into consideration that the data was all going in the same direction (down) he was able to design a chart where it was pre-planned that there wouldn't be any x's or crossing lines.
(See http://nymag.com/daily/entertainment/2007/06/edward_tufte_and_the_triumph_o.html)
Edward Tufte would find another solution to the data above.
Travis said:
"Because he [Tufte] took into consideration that the data was all going in the same direction (down) he was able to design a chart where it was pre-planned that there wouldn't be any x's or crossing lines."
Not true. Do some googling on Tufte and "bumps chart" or "bumps races" for great examples
4 comments
Hadley Wickham said:
For other explorations and visualisations of this dataset, see the 2009 ASA data expo: http://stat-computing.org/dataexpo/2009/posters/
Jon said:
When viewing the treemap grouped by Airports, it would be fantastic to have two data label options: full name of the airport or the IATA code. It makes it easier for those that have traveled enough to identify some airports by their three-letter code than their name.
Chris Gemignani said:
@Hadley: Thanks for the reference to the ASA papers. I'm a fan of some of the small-multiple displays and SAS's heatmap was nice.
However, an animated display--like ours--that reveals information progressively is approachable and explorable in a way that the posters aren't. Media matters!
Sal Uryasev said:
Hey Jon,
I like your idea, and I implemented it in a slightly modified format. Thanks!