Tufte-Style Comparison Chart Generator
By Sal Uryasev
May 6, 2008
Last week, we shared a rendition of a Tufte graphic using just a few lines of Nodebox code. As our commenters pointed out, Python is great, but it may not be every business analyst's carnal desire to learn a programming language just to generate some nifty graphs. I spent some time turning Chris's Nodebox rendition into a PIL-based Windows tool that can generate the same sort of comparison graph from an Excel file on the fly.
The result is the Comparison Chart Generator 1.0. Installation is simple: unzip the zip file and run comparisionchartgenerator.exe.
Alternatively, we have a new Excel chart that creates the same effect using only Excel functionality. Download the Excel Tufte Line Chart here.
If you are using the Chart Generator, start with some data in an Excel (xls) or Comma Delimited (csv) format. The data for this graph has to be contained within the first sheet starting with cell A1, as in the following picture.
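For readers who would rather script the input step, here is a minimal sketch of reading that kind of data from a csv file in Python. The layout is an assumption made for illustration (a header row, then one row per item with a label in column A and two numbers in columns B and C), the function name is hypothetical, and the sample values are made up. It is not the parsing code used inside the tool.

```python
import csv
import io

def read_comparison_rows(csv_text):
    """Parse (label, left_value, right_value) rows from CSV text.

    Assumed layout (hypothetical): a header row first, then one row per
    item with a label in column A and two numbers in columns B and C.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column titles
    rows = []
    for record in reader:
        if not record or not record[0].strip():
            continue  # skip blank lines and rows without a label
        rows.append((record[0].strip(), float(record[1]), float(record[2])))
    return header, rows

# Made-up sample data, only to show the expected shape.
sample = "Item,Before,After\nAlpha,46.9,57.4\nBeta,44.0,55.8\n"
header, rows = read_comparison_rows(sample)
print(header)
print(rows)
```

Each tuple in rows carries what one line of the comparison chart needs: a label, a left-hand value, and a right-hand value.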

Select an input file. There are a couple of example files bundled with the download.

After selecting a file, you'll be prompted to modify a few of the basic options available for the chart.

Finally, save the result as a jpeg.

Here is the same image found in Tufte's textbook processed using the Comparison Chart Generator. It is generated using the csv example file bundled with the download.

Those of us who have undergone LASIK eye-improvement surgery may still prefer the sharp, crisp Nodebox results, but for the rest of us, this image looks pretty good. Let us know if this tool is useful. If there is enough of a positive response, we may consider expanding functionality for other fancy Tufte-esque charts.
If you do prefer Nodebox, I have an updated script here. This pushes the script up to 20 lines of code or so, but the extra 9 lines allow the labels to push themselves apart on their own. If you want to look at the source code for the Windows program, you can get it here. I used py2exe to compile it into an executable. The code, however, has not been thoroughly commented or cleaned as of yet, so edit it at your own risk.
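To give a rough idea of what "labels pushing themselves apart" can look like, here is a small sketch in plain Python. It is not the code from the Nodebox script; the function name and the one-pass, push-downward approach are assumptions chosen for brevity. It walks the labels from top to bottom and moves any label that sits too close to the one above it.

```python
def spread_labels(positions, min_gap):
    """Return new y positions, in the original order, at least min_gap apart.

    Labels are visited from smallest y to largest; a label closer than
    min_gap to the previous one is pushed down just far enough to clear it.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    adjusted = list(positions)
    previous = None
    for i in order:
        if previous is not None and adjusted[i] < previous + min_gap:
            adjusted[i] = previous + min_gap
        previous = adjusted[i]
    return adjusted

print(spread_labels([10, 12, 13, 40], min_gap=5))  # prints [10, 15, 20, 40]
```

A one-directional sweep like this is easy to follow but lets a tight cluster drift downward. A more careful version might push neighbors apart symmetrically around their shared midpoint.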






21 comments
lucas said:
Keep going, guys! I'm looking forward to seeing other Tufte-esque charts here.
And thanks a lot for the Nodebox, what an amazingly useful piece of software!
Asim said:
sal,
it took me a while to put all the pieces together. "using python...using excel..." but i realised that you may be interested in using resolver one:
http://www.resolversystems.com/products/
(i'm certain you've heard of it before, but let me describe it for the benefit of others)
it integrates a spreadsheet environment with a built in ironpython interpreter. that way, you wouldn't have to mess around with PIL and py2exe.
watch the one minute screencast:
http://www.resolversystems.com/screencasts/resolver-one-in-one/
and download it for free under a non-commercial license. big down side: only for windows (i'm a mac user, and don't enjoy working in a virtualised environment).
hope this is of interest to you, take care.
asim
Bilsko said:
Just tried it out on my Vista machine with Excel 2007 and it works great. Of course, I had to save the file as .xls so compchart could read it (it still baffles me that Microsoft had to go and introduce .xlsx as a file type...)
Rob said:
I just tried to run the .exe file and got an error: "The specified module could not be found. Loadlibrary (pythondl) failed"
Any idea what this means and (more importantly) how to get around it?
Thx
johnny m said:
Awesome! However, all I get are export errors. But you have inspired me to begin to learn Python.
Traceback (most recent call last):
  File "comparisonchartgenerator.py", line 247, in <module>
  File "Image.pyc", line 1405, in save
  File "JpegImagePlugin.pyc", line 409, in _save
  File "ImageFile.pyc", line 493, in _save
IOError: encoder error -2 when writing image file
Madelaine said:
Cool, thanks. I might use this for gene expression data sometime.
derek said:
That's very nice. For extra sharp crispness, can you arrange for the image to be saved as GIF or PNG? Generally speaking, JPG is a very bad format to choose for graphs. The compression algorithm, which was designed for photographs with their smooth color gradients and few sharp edges, handles text, lines, and solid blocks, with their uniform fields of few colors and many sharp edges, very badly, and the file is almost never as compact as a GIF achieves.
The image above shows the characteristic "newsprint smudged by fingers" visual effect of text in jpegs, and the file is 57K. You should find a lossless compression format both sharper in appearance and smaller in size.
Sal said:
I picked JPEG as a default since the PNG format is less known within Windows. Functionality for PNG is already included in the program, but is not obvious. When you are offered to save the file, ignore the *.jpg suggestion, and simply name it "whateveryouwant.png". You will have the output converted into the right format.
The GIF format is also built in if you want to try it out, but for some reason the PIL library that I used has not been creating great-looking GIF images. I would avoid them. The PNG looks very nice though.
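For anyone scripting this rather than using the save dialog: PIL's Image.save picks the output format from the filename extension when no format is passed, which is presumably why the rename trick works. The sketch below mimics that lookup in plain Python. The mapping and the function name are assumptions for illustration, not PIL's own table.

```python
import os

# Assumed mapping for this sketch: only the formats mentioned in this thread.
FORMATS_BY_EXTENSION = {
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".png": "PNG",
    ".gif": "GIF",
}

def format_for(filename, default="JPEG"):
    """Return the image format name implied by the filename's extension."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMATS_BY_EXTENSION.get(extension, default)

print(format_for("whateveryouwant.png"))  # prints PNG
```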
derek said:
Thanks. Unfortunately, it may call itself a PNG, but it's still got jpeg artifacts. Also, bizarrely, the pseudo-PNG comes out at 60K compared to the jpeg's 40K.
There's no reason for such a simple graphic to have that kind of bloat. At the risk of tooting my own trumpet, see this 800x600 graph (http://i146.photobucket.com/albums/r264/del_c/politics-charts/DoDDeaths3.png), which I think packs a fair bit more info into only 13.5K.
(and the 400x300 thumbnail version (http://i146.photobucket.com/albums/r264/del_c/politics-charts/DoDDeaths2small.png), designed to fit into the narrow column of a blog, is a mere 3.9K!)
Chris Gemignani said:
Derek,
We've had a number of problems getting a high quality image out of the Python Imaging Library (PIL). For this application, GIF would be best, but PIL was producing some ugly files.
Those graphics are really nice. Excel, too!
We use ImageMagick in-house, but we can't package that in an app. A nice approach when using Excel is to output an image slightly bigger than you need and then scale it down slightly with ImageMagick. This gives you anti-aliased lines and text that you don't get by default from Excel. It's what we used to produce the Colbert Bump graphs.
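To see why drawing big and scaling down smooths edges, here is a toy sketch in plain Python. It halves a tiny grayscale image by averaging each 2x2 block of pixels. The plain box filter and the 2x factor are assumptions for illustration and are not necessarily what ImageMagick does when it resizes.

```python
def downsample_2x(pixels):
    """Halve a grayscale image by averaging each 2x2 block of pixels."""
    height, width = len(pixels), len(pixels[0])
    result = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = (pixels[y][x] + pixels[y][x + 1]
                     + pixels[y + 1][x] + pixels[y + 1][x + 1])
            row.append(total / 4)
        result.append(row)
    return result

# A hard black/white diagonal edge drawn at double size (0 = black, 255 = white).
big = [
    [0, 255, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 0],
]
small = downsample_2x(big)
print(small)  # prints [[63.75, 255.0], [0.0, 63.75]]
```

The pixels along the edge come out as an in-between gray (63.75) instead of pure black or white, which is the anti-aliasing effect described above.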
Nick said:
Hi,
This looks great! But for some reason the download link for the source for the windows version does not seem to work - I'd love to study the code, to learn how to use basic python to make my own tufte-esque charts.
Christian said:
Thank you for this post, it looks great! I love Tufte's work and read your blog frequently in Google Reader.
The output file (.png or .jpg) could be of a much wider use if it was a .wmf file, because this would enable me to change the colour of one line or text and make any additions I like with Illustrator. Is it possible to get a .wmf version? That would be fantastic.
Sal said:
Code should be accessible.
Most of the code deals with the GUI interface and with parsing excel/csv files. The actual PIL interaction starts around line 196.
I don't believe that PIL actually supports the wmf format. I am fixing up a presentable version of this sort of graph in Excel to add to the next version of chartchooser (http://chartchooser.juiceanalytics.com/). I'll put up a draft version of that when I have it cleaned up - it should be sufficiently editable to not need Illustrator.
Kasper said:
Great tool. One question: Is there a way to change the number of decimals shown? Currently it seems to show just one decimal, whatever the number format in the xls-spreadsheet.
Sal said:
As promised, I posted an excel chart of the same graph. You can find the link near the top of the page.
Jose Hernandez said:
I have an alternative post on a dynamic Excel bumpchart that combines charts with the cell grid. You can download it at http://sites.google.com/a/visual-catalyst.com/info_displays/Home/tufte_example_bumpchart.xls?attredirects=0
This display works for all versions of Excel. I'm working on a how-to that describes how you can extend this type of chart.
Christof said:
Excellent work. I'm impressed!
John said:
awesome - using it right now. More Tufte style charting programs please!
Andrew said:
Can you do a chart with more than two columns?
Ahem. said:
I think you're missing the point Edward Tufte was making when he made his original chart. Because he took into consideration that the data was all going in the same direction (down) he was able to design a chart where it was pre-planned that there wouldn't be any x's or crossing lines.
(See http://nymag.com/daily/entertainment/2007/06/edward_tufte_and_the_triumph_o.html)
Edward Tufte would find another solution to the data above.
Travis said:
"Because he [Tufte] took into consideration that the data was all going in the same direction (down) he was able to design a chart where it was pre-planned that there wouldn't be any x's or crossing lines."
Not true. Do some googling on Tufte and "bumps chart" or "bumps races" for great examples.