Vasco de Gapi: Google Analytics API Explorer
By Sal Uryasev
April 29, 2009
Find more about:
googleanalytics
api
python
Are you ready to explore the Google Analytics API?
At Juice, we were very excited about the public release of the Google Analytics Data Export API. Our product Concentrate has been running on a hackish home-brew Google Analytics export tool since its release last November, and we were happy to be able to relaunch as a Customer Example of the Google Analytics Data Export API.
Today, we are releasing a new, free tool called Vasco de GAPI. Vasco is a web-based tool for exploring the API, downloading complex slices of data through it, and even automatically generating code that lets coders easily replicate the API calls in question. Instead of describing it in more detail, I am just going to demo it.
I am going to start with a relatively obscure but curious feature of Google Analytics. I keep track of who wrote each blog post using a Google Analytics user-defined setting that is set to the author's name on that post. Slicing our blog by author is handy for me as an employee: I can brag during my yearly review about how many visitors I bring in or how many natural search visits we get for free as a result of my posting. For the demo, I'm going to discover the natural search keywords that bring traffic to my blog posts on the website.
Let's get started.
The first step is to authenticate using Google's OAuth system.

I select ga:keyword as a dimension.

ga:pageviews is the metric I am interested in. The results will automatically get sorted by the first metric, so I do not need to explicitly specify a sort value.

I add ga:userDefinedValue as a filter, set it to saluryasev, and select the past week as the date range.

Here is the list of parameters that Vasco de GAPI is passing to Google.
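For readers who cannot see the screenshot, the query built in the steps above can be sketched as a plain URL. This is a minimal sketch, assuming the Data Export API's `https://www.google.com/analytics/feeds/data` feed endpoint and its `ids`, `dimensions`, `metrics`, `filters`, `sort`, `start-date`, and `end-date` parameters. The profile id `ga:12345`, the dates, and the `build_query_url` helper are placeholders of mine, and the explicit `sort` is there only to make the default ordering visible.

```python
from urllib.parse import urlencode

# Assumed feed endpoint of the Data Export API.
FEED_URL = "https://www.google.com/analytics/feeds/data"


def build_query_url(table_id, start_date, end_date, author):
    """Build the keyword-by-pageviews query for one author's posts."""
    params = [
        ("ids", table_id),                       # hypothetical profile id
        ("dimensions", "ga:keyword"),
        ("metrics", "ga:pageviews"),
        ("filters", "ga:userDefinedValue==" + author),
        ("sort", "-ga:pageviews"),               # descending by the first metric
        ("start-date", start_date),              # placeholder date
        ("end-date", end_date),                  # placeholder date
    ]
    return FEED_URL + "?" + urlencode(params)


url = build_query_url("ga:12345", "2009-04-22", "2009-04-28", "saluryasev")
print(url)
```

The request itself would still need to be authenticated, which is the part Vasco handles through OAuth.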

What are my results?

It turns out that of all my posts, the Google Trends API that I put out about a year ago drives the most natural traffic to our site. Hopefully, this will change with a few more blog posts, but this is still rather interesting data. I could target that specific audience with something Google-trendy. On an unrelated note, a slap to my face was that Zach's name sent fifteen users to my blogposts. Go figure. Sixteen users searched on my last name, and were probably looking for my more popular father.
To get at the rest of the data, I can click the download link at the bottom of the page or, for developers, another link downloads working code that will replicate this exact pull.
Vasco runs using an open source Python gdata wrapper for the API that can be downloaded here. This wrapper is powerful, and I will write another blogpost about it next week. It is plugged into the Google gdata module, and as such allows all forms of authentication available to gdata users, including OAuth, AuthSub, and clientside.
Hopefully, Vasco de GAPI can help all other potential explorers sail smoothly through the API. When it comes to data, Google is just a great company. They have had powerful APIs for most of their major services for years, and while the Analytics API is a latecomer, it is actually more powerful than the Analytics interface itself. This sort of openness is something to be envied by all other analytics and web companies in the market.
By the way, please let me know if the explorer theme works well. It was a lot of fun working on a project with a slightly esoteric approach.
Real-World Tufte Graphics in 11 Lines of Code
By Chris Gemignani
May 2, 2008
Find more about:
tufte
graphics
design
nodebox
python
One of the troubles with Tufte is the frustrating infeasibility of his approach to design for real people in business. One of his recommendations is to use Adobe Illustrator.
Adobe Illustrator is a big serious program that can do almost anything on the visual field (other than Photoshop an image). Most of my sparkline work was done in Illustrator. Fortunately all graphic designers and graphic design students have the program and know how to use it, so find a colleague who knows about graphic design.
Raise your hand if you have a graphic design assistant at your beck and call. I thought not.
One of the tools we use for rapid prototyping at Juice is NodeBox.
NodeBox is a Mac OS X application that lets you create 2D visuals (static, animated or interactive) using Python programming code and export them as a PDF or a QuickTime movie. NodeBox is free and well-documented.
All true. But it's more helpful to think of NodeBox as a free Adobe Illustrator that you can program in the world's easiest programming language. Oops, here's the right link.
I wanted to see if we could reproduce the following graph from The Visual Display of Quantitative Information, p 158.

Here's the code. It's 11 lines of code if you exclude entering the data and setting things like fonts and colors.
size(500,700)
font('Palatino')
fontsize(12)
stroke(0.4) # a medium grey for lines
fill(0.2) # a slightly darker grey for text
# data = (label, first, last, first-label-fudge-factor, last-label-fudge-factor)
data = [ ('Sweden', 46.9, 57.4, 0., 0.),
('Netherlands', 44.0, 55.8, .3, 0.),
('Norway', 43.5, 52.2, 0., 0.),
('Britain', 40.7, 39.0, 0., 0.),
('France', 39.0, 43.4, 0., 0.6),
('Germany', 37.5, 42.9, 0., -0.4),
('Belgium', 35.2, 43.2, 0., 0.),
('Canada', 35.2, 35.8, .8, 0.4),
('Finland', 34.9, 38.2, -0.5, 0.),
('Italy', 30.4, 35.7, 0.3, -0.3),
('United States', 30.3, 32.5, -0.3, 0.),
('Greece', 26.8, 30.6, 0.4, 0.),
('Switzerland', 26.5, 33.2, -0.2, 0.1),
('Spain', 22.5, 27.1, 0., 0.3),
('Japan', 20.7, 26.6, 0., 0.), ]
text("Current Receipts of Government as a Percentage of "
"Gross Domestic Product, 1970 and 1979", 20, 70, width=215)
text("1970", WIDTH*.28, HEIGHT*0.03)
text("1979", WIDTH*.68, HEIGHT*0.03)
def ypos(val):
    # calculate a vertical position by scaling between 10% and 90%
    # of the height of the image
    return HEIGHT * (0.9 - 0.8 * (val - minval) / (maxval - minval))

# find the minimum and maximum values in the range
alldata = [d[1] for d in data] + [d[2] for d in data]
minval, maxval = min(alldata), max(alldata)
for label, start, end, startfudge, endfudge in data:
    align(RIGHT)
    text(label, 0, ypos(start+startfudge)+4, width=0.25*WIDTH)
    text("%0.1f" % start, 0.25*WIDTH, ypos(start+startfudge)+4, width=0.07*WIDTH)
    align(LEFT)
    text(label, WIDTH*.75, ypos(end+endfudge)+4)
    text("%0.1f" % end, 0.68*WIDTH, ypos(end+endfudge)+4, width=0.07*WIDTH)
    line(WIDTH*.33, ypos(start), WIDTH*.67, ypos(end))
Here's what the result looks like.

We have some great follow-ups to this planned for next week. We'll reimplement this code with the Python Imaging Library, which will open things up for Windows users. We also have some great plans for mashing these graphics up with our just-released Google Analytics API.
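The fudge factors in the data above were tuned by hand so that neighbouring labels don't overlap. As a rough sketch of how that nudging might be automated, the plain-Python snippet below reuses the script's scaling formula (with the range passed in explicitly so it runs outside NodeBox) and pushes labels apart until they sit at least `min_gap` pixels from each other. The `spread_labels` helper, the `min_gap` value, and the greedy top-to-bottom strategy are assumptions for illustration, not part of NodeBox or of Tufte's original.

```python
HEIGHT = 700  # same canvas height as size(500,700) in the script above


def ypos(val, minval, maxval, height=HEIGHT):
    """Scale a value into the band between 10% and 90% of the image height.

    Larger values land nearer the top (smaller y), as in the NodeBox script.
    """
    return height * (0.9 - 0.8 * (val - minval) / (maxval - minval))


def spread_labels(positions, min_gap=14):
    """Hypothetical helper: walk the labels from top to bottom and push each
    one down just enough to sit min_gap pixels below the previous label."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    adjusted = list(positions)
    for prev, cur in zip(order, order[1:]):
        if adjusted[cur] - adjusted[prev] < min_gap:
            adjusted[cur] = adjusted[prev] + min_gap
    return adjusted


# 1970 values for a few countries from the data above
starts = {'Sweden': 46.9, 'Belgium': 35.2, 'Canada': 35.2,
          'Finland': 34.9, 'Japan': 20.7}
minval, maxval = 20.7, 57.4  # range across both years in the data above
raw = [ypos(v, minval, maxval) for v in starts.values()]
for name, before, after in zip(starts, raw, spread_labels(raw)):
    print("%-8s %6.1f -> %6.1f" % (name, before, after))
```

A greedy pass like this only ever moves labels downward, so a crowded chart could push the last labels off the canvas; hand-tuned offsets can move labels in either direction.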
23 comments
Clint said:
Chris,
You tout that "It's 11 lines of code if you exclude entering the data and setting things like fonts and colors"
How long did it take you to code and what's the comparable length of time for a designer in Illustrator?
Seems to me that savvy python scriptors are just as rare as designers so I'm not sure there's a winner here.
Asim said:
If you're looking at visualisation using Python check out R:
http://www.r-project.org/
and the corresponding Python package:
http://rpy.sourceforge.net/
Here are some examples of using R:
http://addictedtor.free.fr/graphiques/thumbs.php
I've used an R/RPy combination successfully in work and academic assignments. One downside is that it's difficult to set up RPy on Linux/Macs.
Tony said:
I'm with Clint on this one. Six in one, half dozen in another... If you are a programmer, sure, Python would be the obvious choice.
My thinking is that people would rather learn Illustrator where the work is visible versus Python where it's a lot of unfamiliar characters in specific strings that translates into an end product.
Now to your credit, Python and Nodebox don't cost $599 like Illustrator. So that's a big plus if you have programming skills or want to learn them.
Chris Gemignani said:
@Clint/Tony:
It's leverage, leverage, leverage. The code solution lets me produce 1000 graphs for no more than the cost of producing one. It lets me produce next month's graph for no more than the cost of producing this one. It lets me build an API like http://code.google.com/apis/chart/. Admittedly this takes yet more skills and experience, but the problem is getting easier, not harder.
Of course there's a benefit to free and open source too. I don't need a purchase order, and I don't need to talk to my boss to get something done.
The time spent for this project is probably about the same as Illustrator. It was about 10 minutes to get to basic, working code. Then an hour of extra primping to make it pretty for the blog. Frankly, I'm not really sure how someone would produce an accurate technical drawing like this in Illustrator. Tufte mentions an Excel data import function, but that sounds like extra complexity too.
@Asim: I'm aware of the R stuff. We don't use it, but it's great. This NodeBox approach is more pixel-perfect particularly if you're seeking a very specific look.
Sal said:
Code, however, is very reusable. Your programmer only needs to create it once, and then, with minor adjustments, any similar graph can be drawn by non-programmers.
Tony said:
@ Chris - Great point!
Nick said:
Speak of a coincidence - While browsing Tufte's site looking for advice on programs that make tables like this (actually looking for a way to reproduce the cancer survival rates ones) I followed a link to this site on cleaning up Excel graphs and end up finding what I was looking for in the first place! I was even thinking about learning to use python to do it too....
Scott Zakrajsek said:
First, I just wanted to say that I love the tips on this blog.
I'm really interested in the follow up and would like to see the flexibility and visual aesthetic of Python. I'm still a big fan of Illustrator, I think that knowledge of a few basic AI tools can deliver a large variety of graph types. Check out this link, a great example of clean data visualization done w/ illustrator:
http://feltron.com/index.php?/content/2007_annual_report/P0/
Brendan O'Connor said:
I actually use R (http://www.statmethods.net/) for static data visualizations like this (e.g. a color wheel of words: http://blog.doloreslabs.com/?p=11). It's definitely a weird choice, but (1) I think its data management and mathematical list operations are easier than Python or Ruby, and (2) it has a small amount of GUI integration. I see that NodeBox is a bit better than PIL on those points though...
Shane said:
Great post. I'm very interested in hearing about other methods that don't require OS X.
Andrew said:
Very true insights on using Adobe Illustrator. My background isn't in graphic design, and I haven't spent a number of years taking courses in AI. I generally find that Adobe's software is obtuse and confusing. Perhaps it is easy to use for fanatics, but for occasional, reluctant users it's a nightmarish experience.
Thanks for providing an alternative for the rest of us.
Mike said:
For Linux users NodeBox can be run using QT, there is some info about installing at http://dev.nodebox.net/wiki/Qt
Using this method NodeBox is running fine for me but the code above shows an error:
Traceback (most recent call last):
File "/home/luser/try-qt/nodebox/gui/qt/__init__.py", line 534, in _compileScript
self._code = compile(source + "\n\n", self.scriptName, "exec")
File "<untitled>", line 7
<h1>data = (label, first, last, label-fudge-factor)</h1>
^
SyntaxError: invalid syntax
(not sure if this is a problem with the code or the NodeBox QT version)
Mike said:
Update- got QT NodeBox to run on Ubuntu 8.04 and run the updated script from http://media.juiceanalytics.com/downloads/tufte_nodebox_forcepush.py just fine!
The font('Palatino') command was still showing an error but it worked fine with that line removed ;)
Big thumbs up for Tufte on Linux using NodeBox :D
Sal said:
Whoa - nice find Mike. I have it running on Ubuntu 8.04, and will definitely use this in server-side applications.
I think I found the bug with the font setup in the Nodebox Qt code. If you open up /try-qt/nodebox/graphics/qt.py and go to line 884, and change 'return f.exactMatch()' to 'return f', the font feature works again. You can even download the Palatino font and point to it with the full path.
Pradeep Gowda said:
I've implemented an in-browser version of this graphic using JavaScript and the processing.js library.
http://pradeepgowda.com/programming/tuft-graphics-processingjs.html
Jonno said:
I'm a statistician and have had similar frustrations with implementing interpretable graphs. A common tool I would use for this is R - a free statistical programming language with excellent graphing capabilities. The code would be about the same length as NodeBox (at a guess).
http://www.r-project.org/
Chris Gemignani said:
Who's up for a multi-language infographics shootout?
Tim said:
That's cool !
I was wondering if there was a way to generate these graphics through command line ? that way we could embed this in web application and get the graphics generated dynamically
note: looks like comments in your code got converted to html (# -> h1)
Kragen Javier Sitaker said:
Is there a way to get old-style numerals with NodeBox? I suppose you have to find an installed font on your Mac with old-style numerals.
Pradeep's processing.js demo is awesome, but from the screenshot lacks antialiasing. (I'm not yet a Firefox 3 Achiever.)
Luke said:
Dude, why reproduce the errors ("fudge factors") in the original?
The Dude said:
@Luke: Dude, the fudge factors are not errors. They are there so that the text labels do not overlap.
Michael Galloy said:
I made an IDL implementation (results: http://michaelgalloy.com/wp-content/uploads/2008/08/receipts.png). It wasn't too bad to have it automatically compute the fudge factors (at least in simple cases).
Ahem. said:
I think you're missing the point Edward Tufte was making when he made his original chart. Because he took into consideration that the data was all going in the same direction (down) he was able to design a chart where it was pre-planned that there wouldn't be any x's or crossing lines.
(See http://nymag.com/daily/entertainment/2007/06/edward_tufte_and_the_triumph_o.html)
Edward Tufte would find another solution to the data above.
Kaizen and Juice 2.0
By Chris Gemignani
April 30, 2007
Find more about:
design
juice
python
Kaizen may be the art of continuous improvement, but today we’re happy to showcase the art of discontinuous improvement. In one big bang, we’re introducing a new logo, a new website, and a new platform to deliver web services and tools to make your life better.
The new logo is the product of months of pixel pushing and brainstorming. I’ll detail the evolution of the logo in a future post, but for the moment I’ll leave you with a comparison of the old and new logos.
| old Juice logo | ![]() |
| new Juice logo | ![]() |
The website redesign is an effort to improve the “discoverability” of our site. Good articles were mouldering in the archives. It was hard to find old or popular articles. Search was barely existent. A follow up article will trace the evolution of the site design.
We built the new site using Python and Django. This is a dynamic platform that gives us a lot of power to add new features, tools, and applications. We’re excited about what we will be able to bring you—we have a whiteboard full of ideas just awaiting implementation.
The new site, while better, isn’t perfect. Despite our efforts, there may be links that don’t work or screencasts that neither screen nor cast. We'd love to hear your reaction to the new design. Please leave a comment to tell us what you think or if you find anything that's broken. We'll fix it right away. With your help, we’ll make this site and this community better in a process of continuous improvement—Kaizen.
We've gotten a lot of positive comments about the design. I wanted to thank rockbeatspaper, the web design consultants who worked with us to create this site. A great company and a terrific job.
30 comments
Kristine said:
Congrats on the new launch. I've uncovered a few issues as I navigated around the site...
- unable to download Zach's B-eye presentation. Sent to Juice's error page.
- green links to other posts at the bottom of several articles are not working for me. Again, I'm sent to the error page.
Thanks for all the good info!
Chris Gemignani said:
Kristine,
Thanks! The green links issue is fixed. We'll fix the presentation link.
Bill said:
- The logo link doesn't link back to the main site.
- All links from the feed show up page not found.
- Link to Excel Chart Cleaner shows up page not found.
- Looks a little web 2.0 for an analytics group.
Patrick Ibison said:
-OMG! Is web 2.0 already passe? I am so far behind... you'd think an analytics group would have told me.
Mario said:
I don't want to spoil the party, but isn't the logo very similar to Amazon's (the arrow below the name)?
Great website by the way.
Michael said:
The new site looks great. All the cool blogs are going with the 3 column look.
I noticed that the Excel chart cleaner link is not working. boo hoo for me, now I will have to reformat all my charts by hand.
Jon said:
Hi,
I really like the filters on the right - really makes it easier to find relevant articles.
One problem I've spotted... the Excel lightbox download (from http://www1.juiceanalytics.com/writing/2006/11/lightboxing-images-in-excel/) doesn't appear to be working.
Cheers,
Jon
Kyle said:
Hey guys
I like the new design, much more accessible - except for the blue, which grates a bit. Also, on your people pages there looks to be an open link tag - a bunch of stuff under 'Contact $x' is a link to 'mailto'.
Cheers
Kyle
Chris Gemignani said:
3. I think the links in the feed will clear up
5. Mario, we went back and forth 100 times over whether the logo looks too Amazon! You're not the first to note the similarity. We'll talk about this more in the logo post.
All: The download links should be fixed.
Thanks!
derek said:
Nice! I have just a couple of points:
After reading these comments I hunted around the top for "Juice home page" or similar. I eventually found "Back to writing" at the bottom, which I don't think is as intuitive. I always ask bloggers to include a word or graphic link to their home page www.whatever.com, at the top left or top centre, in any other page. You may feel differently.
Have you considered keeping the front page fresh with "Most recent comments", listing author and topic at the top right? Kaiser Fung's Junk Charts does this very successfully, I think, helping to keep the conversation going and avoiding what Edward Tufte calls "recency bias" in discussions of older articles.
Chris Gemignani said:
Thanks Derek. You can also click on the "Juice" logo to go back to the home page.
When we asked readers what they wanted to see in the site, one of the most requested features was better access to comments. One option is a "most recent comments" streamer on the home page. I'm personally dissatisfied with this because I don't think there's enough of a way to show the context of the comment. Nonetheless, we may do this once the dust settles.
Another idea I like is to let someone subscribe to an email stream of comments on a particular post--most useful if you've posted and want to see the followup conversation without checking back.
Christian Westarp said:
The link to the screen cast for doubling up excel charts no longer works.
ken said:
Nice update... two navigational issues:
1. When you click on, say, Excel in the "By Topic" section, you get the first article only -- but there are 44 articles available. Is it possible to see all of them? Same goes for Monthly archives: I can see the post and the most recent 6 (in the box at the bottom) -- but what if there are 14 posts that month?
2. In the previous By Topic example, it takes you to the first post from that topic. Then you need to click on the "read comments" before you get the navigation back and forth to other articles (note: can't do much about it, but in this case, it is listed by date, not by topic). However, the middle option "Back to Writing" takes you to the home page. This seems a bit jarring as you'd expect it to take you to the original post, without comments.
Also:
1. It seems like the "Elsewhere" links show up on the home page, but not on other pages (just shows the feed).
2. Blog comments websites don't appear to be showing up.
Josh said:
Wow...what a shock today! Very well done fellas. And look at the little sparkline-like graphs over on the right to show who has the most postings, most postings in a topic, by date, etc. Talk about practicing what you preach. VERY nice touch! Keep up the good work.
Coe said:
I just found you guys, so can't compare to a previous design, but I like what I see. A few comments:
1. If the browser window is small, the "Juice" logo overlaps the "Solutions" item on the navigation menu. My window is not even that small, but it's not full-screen.
2. It is not very clear that when you are on a writing topic page, the right-hand topic list now shows the number of articles that have the specified topic *and* the topic that the page applies to. It took me a while to realize why I was seeing 44 Excel articles on the main writing page and only 10 when I happened to be looking at the analytics topic page.
3. Continuing from that thought, when I clicked on "Excel" from the analytics topic page, I expected it to show me that there were 10 articles - the intersection of "Analytics" and "Excel" rather than 44.
4. Echoing another post, when I click on a topic, I would like to see a list of all articles with that topic. I am really enjoying browsing the articles on this site, and would like to see more.
5. In the more-on-excel-in-cell-graphing article, the link "http://www.juiceanalytics.com/downloads/Excel%20in-cell%20graphing%20ideas.xls" appears to be broken (along with another .xls link that I saw somewhere but I forget where).
I wouldn't normally bother with a post like this, but this is such a high-quality site, I appreciate the opportunity to help make it even better.
Jeff said:
http://www.decilogratis.com/img/200612/1116_amazon-logo.jpg
Looks familiar :)
Great job guys, love it
Mary said:
Nice, but I miss the color orange. Don't you think something should be orange?
Chris Gemignani said:
A few notes on fixes:
- Blog commenter's names are now linked to their websites if they give one.
- Several download links have been fixed
- Screencasts are still broken
- Navigation (Previous article, back to writing, next article) has been partly fixed
Thanks to all. More fixes are coming.
peter said:
there is nothing juicy about the blue colour under the letters. the blue line does not show that u are "peeling" data to its utmost usefulness(unlike the orange). that street corner picture u had, i identified with it (people with one passion, solve client problem, pushing applications to the extreme), but great site, amazing content, well structured except the blue line. Inspiration to the world of data analytics initiates.....
Miguel Marcos said:
Hi. A link from the RSS file, "Top 10 Problems with Excel", is not working. It comes up with the following URL:
http://www1.juiceanalytics.com/writing/2007/04// and the text on the page is as follows:
"Page not found
The JuiceBox can't find the page you asked for. Maybe try a search."
Chris Gemignani said:
Peter, I miss the picture too. We used to have a rotating picture in the header--all things that evoked some aspect of analytics for us, and I miss that even more.
However, there was just no way to reconcile the pictures with this sleeker site design. They have reappeared on our business cards, but that's a story for another day.
We went back and forth about the orange in the logo. We heard far too many comments that the new logo mirrored Amazon and orange just made the parallel too stark. I don't buy the Amazon argument, but I'll make that argument another day too.
Thanks!
Chris Gemignani said:
Site fix update:
- All the broken links to images and downloads should be fixed.
- A bug in the comment system was fixed which caused all comments to attach to the most recent post.
- Links to screencasts are still broken
Thanks, everyone.
GrahamC said:
Nice to see you are experimenting, sadly doesn't work for me.
Your new logo makes it appear that you are now an Amazon subsidiary.
I did have problems trying to search through the archives in the past, so any work to improve that is great.
But.
The new web layout is 66% cruft - 3 columns, only one of which i'm normally interested in (the article). Additionally it looks like you've dumped those giant boxes at the bottom of the page , just because you could.
Sorry, far preferred the old layout (especially with the nice random header images)
GrahamC said:
EEEK, even worse - i've been reading this through Netvibes and I went to the main page just to double check my views and uh, as a reader, I can't find a single button which takes me to yesterdays (this) article.
I've only now worked out that it's in one of the big boxes as 'recent'.
For being all about presenting information in the useable manner i think you need to drink your own kool-aid.
P.s. Still love the content though :o)
Chris Gemignani said:
GrahamC: I agree that getting to previous articles, especially in filtered views is a major usability problem. We'll fix it; give us time.
Chris Gemignani said:
There were a few changes to the blog today that should make frequent readers happy.
- The writing page now shows recent posts and comments.
- A few Internet Explorer CSS problems have been cleaned up as well.
Jonah said:
A few big complaints:
1) Bookmarked pages no longer work (permalinks changed, no redirects).
2) Can't browse through archives start to finish. There are 21 posts from Jan 2005. I can see one at a time. And can't see more than a few titles ahead.
3) No dates on posts in archives, so it's tricky to know if links are in fact, the archives I'm looking for.
After 20 minutes of looking for a bookmark on animated scatterplots, I stumbled across it: http://www.juiceanalytics.com/writing/2005/6/
Sadly, under the new design, the animation isn't there. Instead I get code: [FLASH] http://www.juiceanalytics.com/flash/tigerwoodsfinal , 440, 430 [/FLASH]
Juice is usually right on the money with presentation. But you have deviated from standards. Blog standards: date based archiving, categorical archiving, (scrolling across all stories in a given archive, abbreviated or full text), and individual archiving.
You've replaced standards with some filing system that pushes the most popular archives into view at the expense of all others.
David Parker said:
I've tried to get used to the new look - I have.
The functional layout is fine. However, I miss the hip looking photo banner. And the bold green titles look too squeezed together, heavily aliased and generally cheap and ugly.
Jon Peltier said:
I wondered what happened to this blog. The RSS feeds just stopped, but I never got around to visiting the site itself. Finally I found it today from Chris' post in another blog, and discovered that I'd missed several months of discussion. You should have sent out an announcement using the old RSS feed.
My first impressions of the new layout are positive, by the way.
kcmarshall said:
I spotted a bug and thought I'd report it.
The post-specific topic links don't work properly. For example, on this post the topics are "Design, Juice, Python".
The Python link is:
http://www.juiceanalytics.com/writing/?/writing/topics/python/
but should be:
http://www.juiceanalytics.com/writing/topics/python/
Regards!
Kevin
Python Geocoding Help
By Chris Gemignani
February 28, 2006
Find more about:
googleearth
juice
python
Yahoo recently released a nifty geocoder API that's free for small (<50,000 lookups per day), non-commercial applications. Rasmus Lerdorf (Yahoo's PHP king) has written a nice introduction to using this geocoder in your PHP apps. In that spirit, here's a cheap and cheerful Python class that we use to geocode addresses.
from urllib.parse import urlencode
from urllib.request import urlopen
from xml.dom.minidom import parse


class GeocoderError(Exception):
    """Raised when the lookup or the XML parsing fails."""


class Geocoder:
    """
    Look up a location using the Yahoo geocoding API.

    Requires a Yahoo appid which can be obtained at:
    http://developer.yahoo.net/faq/index.html#appid

    Documentation for the Yahoo geocoding API can be found at:
    http://developer.yahoo.net/maps/rest/V1/geocode.html
    """
    def __init__(self, appid, address_str):
        self.address_str = address_str
        self.addresses = []
        self.result_count = 0

        parms = {'appid': appid, 'location': address_str}
        try:
            url = 'http://api.local.yahoo.com/MapsService/V1/geocode?' + urlencode(parms)
            # parse the xml contents of the url into a dom
            dom = parse(urlopen(url))
            results = dom.getElementsByTagName('Result')
            self.result_count = len(results)
            for result in results:
                d = {'precision': result.getAttribute('precision'),
                     'warning': result.getAttribute('warning')}
                for itm in result.childNodes:
                    # if precision is zip, the Address element has no child text node
                    if itm.childNodes:
                        d[itm.nodeName] = itm.childNodes[0].data
                    else:
                        d[itm.nodeName] = ''
                self.addresses.append(d)
        except Exception as exc:
            # wrap any network or parsing failure in a single error type
            raise GeocoderError(str(exc)) from exc

    def __repr__(self):
        s = "Original address:\n%s\n\n" % self.address_str
        s += "%d match(es) found:\n\n" % self.result_count
        for addr in self.addresses:
            s += """Match precision: %(precision)s
Location: (%(Latitude)s,%(Longitude)s)
%(Address)s
%(City)s, %(State)s %(Zip)s

""" % addr
        return s


if __name__ == "__main__":
    sample_addresses = ['555 Grove St. Herndon,VA 20170', '1234 Greeley blvd, springfeld, va, 22152', '50009']
    for addr in sample_addresses:
        g = Geocoder('YahooDemo', addr)
        print('-' * 80)
        print(g)
All you need to use this is a Yahoo application id.
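To see what the parsing loop does without calling the service, here is a small offline sketch that runs the same logic over a canned response. The sample XML is hypothetical: it is modeled only on the fields the class above reads (`precision`, `Latitude`, `Longitude`, `Address`, `City`, `State`, `Zip`), and its values are made up rather than real API output. The `parse_results` helper is mine, and unlike the class it skips the whitespace text nodes between elements.

```python
from xml.dom.minidom import parseString

# Hypothetical response; values are illustrative, not real API output.
SAMPLE_XML = """<?xml version="1.0"?>
<ResultSet>
  <Result precision="address" warning="">
    <Latitude>38.97</Latitude>
    <Longitude>-77.39</Longitude>
    <Address>555 GROVE ST</Address>
    <City>HERNDON</City>
    <State>VA</State>
    <Zip>20170</Zip>
  </Result>
  <Result precision="zip">
    <Latitude>41.65</Latitude>
    <Longitude>-93.47</Longitude>
    <Address></Address>
    <City>SAMPLETOWN</City>
    <State>IA</State>
    <Zip>50009</Zip>
  </Result>
</ResultSet>"""


def parse_results(xml_text):
    """Return one dict per <Result>, mirroring the loop in the Geocoder class."""
    dom = parseString(xml_text)
    addresses = []
    for result in dom.getElementsByTagName('Result'):
        d = {'precision': result.getAttribute('precision'),
             'warning': result.getAttribute('warning')}
        for itm in result.childNodes:
            if itm.nodeType != itm.ELEMENT_NODE:
                continue  # skip whitespace text nodes between elements
            # an empty element (Address on a zip-level match) has no children
            d[itm.nodeName] = itm.childNodes[0].data if itm.childNodes else ''
        addresses.append(d)
    return addresses


for addr in parse_results(SAMPLE_XML):
    print("%(precision)s: (%(Latitude)s, %(Longitude)s) %(City)s, %(State)s %(Zip)s" % addr)
```

Splitting the parsing from the HTTP call like this also makes the class easier to test, since the network is the part most likely to fail.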
You now have four different ways to geocode your company's vital address. If you have suggestions or improvements, let us know. This code is public domain.
Restoring romance to the sports page
By Chris Gemignani
January 31, 2006
Find more about:
design
python
sparklines
visualization
Why do our sports pages look like this?

Instead of this?
| Eastern Conference | |
| Atlantic | |
| Nets | ![]() |
| 76ers | ![]() |
| Celtics | ![]() |
| Raptors | ![]() |
| Knicks | ![]() |
| Central | |
| Pistons | ![]() |
| Cavaliers | ![]() |
| Bucks | ![]() |
| Pacers | ![]() |
| Bulls | ![]() |
| Southeast | |
| Heat | ![]() |
| Wizards | ![]() |
| Magic | ![]() |
| Hawks | ![]() |
| Bobcats | ![]() |
| Western Conference | |
| Pacific | |
| Suns | ![]() |
| Clippers | ![]() |
| Lakers | ![]() |
| Warriors | ![]() |
| Kings | ![]() |
| Southwest | |
| Spurs | ![]() |
| Mavericks | ![]() |
| Grizzlies | ![]() |
| Hornets | ![]() |
| Rockets | ![]() |
| Northwest | |
| Nuggets | ![]() |
| Timberwolves | ![]() |
| Jazz | ![]() |
| SuperSonics | ![]() |
| Trail Blazers | ![]() |
Those green and red lines are "sparklines"--a term invented, I believe, by Edward Tufte. They are little, word-size graphics that show a trend more quickly and clearly than one could describe it. In this case, each sparkline shows an NBA team's record throughout the season; a green up bar is a win, and a red down bar is a loss.
In less space than a standard standings listing, we see the sustained excellence of the Pistons, the steadiness of the Spurs and Mavericks, the Raptors recovering from their awful start, the wheels falling off the Pacers, the mystery that is the Nets. These large multiples of small graphics recover some of the romance and drama that is a season.
For a really beautiful example of sparklines applied to sports, look to Tufte's professional example here. If you know Python, Grig Gheorghiu has written a simple tool for generating sparklines.
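If you'd like to try the win/loss encoding yourself, here is a minimal sketch that turns a string of results into a tiny SVG image: a green bar above the midline for a win, a red bar below it for a loss. This is not Grig's tool or Tufte's method. The `winloss_sparkline` function, the SVG output format, and the bar sizes are all assumptions of mine, and the sample record is made up.

```python
def winloss_sparkline(results, bar_width=3, bar_height=8, gap=1):
    """Return an SVG string with one bar per game.

    'W' draws a green bar in the top half, anything else a red bar
    in the bottom half.
    """
    bars = []
    for i, r in enumerate(results):
        x = i * (bar_width + gap)
        if r == 'W':
            y, color = 0, 'green'
        else:
            y, color = bar_height, 'red'
        bars.append('<rect x="%d" y="%d" width="%d" height="%d" fill="%s"/>'
                    % (x, y, bar_width, bar_height, color))
    width = len(results) * (bar_width + gap)
    return ('<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">%s</svg>'
            % (width, 2 * bar_height, ''.join(bars)))


# Made-up sequence of results, not a real team's record
svg = winloss_sparkline("WWLWLLWWWL")
print(svg)
```

Saving the output to a `.svg` file and opening it in a browser should show ten small bars; a full season is just a longer string.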
10 comments
Brian Cantoni said:
You might also check out Joe Gregorio's Sparklines work (http://bitworking.org/projects/sparklines/). He created (also in Python) a CGI script / web service, including an interactive demo page where you can create your own.
Ben Finney said:
> Why do our sports pages look like this? [numbers in a table]
> Instead of this? [graphics in a table]
Perhaps because the former is text that is accessible to those without graphical capability, and the latter is restricted to a smaller audience.
Chris said:
Ben,
I just don't believe that some people don't have the capability to understand and appreciate simple infographics. Our brains are specialized pattern recognizers; text is the aberration!
The biggest problem with infographics in the sports page is developing a shared visual language, so that someone who reads USA Today doesn't have to learn something new to browse the sports page in the NY Times.
Ben Finney said:
> I just don’t believe that some people don’t have the capability to understand
> and appreciate simple infographics.
I take it you don't believe visually impaired people exist either?
There are many people who are unable to use the internet in a way that makes graphics meaningful. Some of those people are unable to use *any* graphical information in a meaningful way. Eschewing textual information leaves all those people with no information.
Chris said:
I'll concede that point: visually impaired people do exist!
Given their text-integrated, word-like capability, sparklines could potentially be a lot more accessible to the visually impaired than traditional charts (imagine if a sparkline were replaced with a one-second audio tone). I wonder if anyone is working on this?
JimJJewett said:
If they can read the table in a newspaper, they can read the graphic.
Some disadvantages that I notice.
(1) It is harder for someone else to read it to you (or OCR it, or index it, or ...).
(2) The sparkline relies heavily on color, and color newspaper ink costs more.
(3) You don't have a magical number (like .524) to throw around.
(4) It is harder to display multiple types of information. For example, the sparklines above do not display which games were home/road or in-division, so those percentages are lost.
Chris said:
Thanks Jim.
Jeremiah McNichols raised a lot of similar points in this post: http://thinkingpictures.blogspot.com/2006/07/sparklines-handle-with-care.html.
I don't want people to take the exact sparkline I'm showing too literally: the sparkline could be redesigned to show the home/road data, for instance. Personally, I think disadvantage #3 matters most.
Wayne Frazer said:
As a former sports editor and newspaper publisher, I can almost guarantee that system would never fly, mainly for the second reason given by Jim above.
To be able to use spot color without running up astronomical prices, you have to have color running on another page adjacent in the printing process, i.e. 1/8/9/16 in the web printing process. Putting color willy-nilly throughout the paper would drive cost through the roof.
Also, space is at a huge premium. While I like the sparkline's ability to convey the momentum of the team, the amount of space it would take to be clearly visible on low-quality newsprint paper would be tremendous, and it doesn't tell any other story than trend.
Pete Jelliffe said:
You don't need color to show win/loss; you can simply show up/down. But while I like the graphic, and it makes it easier to compare relative records and streaks, you can't quote it. You can't rattle off these stats to friends during a conversation.
I would definitely include summary stats at the end, like total win/loss, games back and win %.
Tom Snider-Lotz said:
I love sparklines, and use them at work. But for the sports page, as a fan, I want to know how many games behind my team is, especially as the end of the season approaches. I want to compare numbers across divisions if wild card slots are at stake.
Sparklines would make a great supplement to the table, but not a replacement. Tufte himself makes a case for using tables when the data warrant it.

2 comments
Toby Murdock said:
really cool.
congrats zach & team. :-)
Dirnov said:
Amazing! It's not clear to me how often you update www.juiceanalytics.com.