Delivering Data in Excel: The DTP Framework
By Chris Gemignani
January 27, 2009
Find more about:
excel
dtp
framework
dashboard
Here at Juice we build fewer Excel dashboards than we used to. Excel itself is a decidedly imperfect vessel for any serious development--it's simply too easy to veer off of the disciplined track onto the underbrush.
Even so, Excel remains a playground where we can do surprising things. For instance, check out our Excel lightbox and an Excel tagcloud. We could appropriate everything that you find on the webbiest of Web 2.0 websites and build our Uruk-hai equivalents.
The key to staying on the rails when building Excel tools--either dynamic dashboards or simply to explore data--is discipline. At Juice, we use a methodology that we call "DTP" (Data Tansform Present). The foundation of DTP is the rigorous separation of data from presentation. This is similar to a well-known approach when building computer user interfaces called Model-View-Controller. I'm going to cover some of the key principles and we'll follow up with an example later on the blog.
Data
Data is the raw material of any visualization or report. It needs to be easy to add data or change data without having to change anything else about your dashboard.
We store raw data with dimensions preceding metrics in blocks in separate worksheets. If you want to sound pretentious, you can call this "first PivotTable normal form". Key points:
- Have one worksheet for each data source.
- Call these sheets "Data", or "{Title} Data".
- Place them at the end of your workbook.
- Data is snug to the top left of the spreadsheet. This allows us to use dynamic ranges. Dynamic ranges let you add data and have it automatically incorporated in all PivotTables.
- Ensure that column names are in the first row.
- Place your dimensions before metrics.

Transform
We use PivotTables to transform the data into the structure we need.
- Call these sheets "Transform" or "XXXXXXX Transform".
- Create one sheet for each issue that you are exploring. This doesn't mean that you will only create one PivotTable. You may have multiple PivotTables to support different views or perspectives on an issue.
- Turn on "show items with no data" for row and column dimensions.

- We are seeking predictability, we want to the PivotTable to always be the same size regardless of what the PageField filters are.
- Place all the dimensions that aren't used as rows or columns in the PivotTable as page fields. Every dimension should have a home.
- Set all PivotTables to not store data and refresh on open.

- Set all PivotTables to not store data and refresh on open.
Present
The Presentation page copies data from the Transform page(s) and formats it for display. It also allows users to control what data is being displayed.
- Build a user interface to interact with your data. There are many ways to let people interact with your data, but one of the easiest is to use a PivotTable as your interface. This is described below.
- We use an in-house style guide for graphs that you can see in our Chart Chooser.
- If the Presentation page is likely to be printed, preset the print range.
- When copying data from the transformation page to the presentation page, blank values will come out as zeros. We use a simple formula,
=if('Transform!A2'<>"",'Transform!A2', ""), to ensure that blanks remain blanks.
Using a PivotTable as your interface
A simple way to let people manipulate your data is place a PivotTable containing only PageFields but no data on the presentation sheet. A Visual Basic macro triggered to run whenever the PivotTable changes then pushes out any changes to the master PivotTable to all the PivotTables on your Transform sheet.
Here is the code to make this happen.
This drives our PivotTables in concert and ensures they stay in sync.
That's a basic overview of our DTP technique. You can try a simplified version of DTP here.
We'll be back soon to talk through this example.
Bubble, bubble toil and trouble
By Chris Gemignani
January 15, 2009
Find more about:
visualization
nodebox
bubblechart
concentrate
excel
Recently we wanted to show how Concentrate, our new long-tail search analytics tool, could give you a view of search patterns across travel websites. As political junkies, we were inspired by this chart from our friends at the NY Times.
The first tool we tried, simply on principle, was Excel 2003. As expected, making a NY Times quality bubble chart in Excel 2003 is a hard problem. Here's a draft of how far I got before giving in to label fatigue.

The bubbles themselves aren't tough, but getting the labels right is hard. I'd love to see a solution, so if any reader wants to tackle it eternal fame can be yours. Here is a CSV if you want to try.
Another of the tools we use at Juice is NodeBox, which we used to make this:

Here's the code that made the graph.
The power of a programmatic approach like this is that by changing a line or two, you can get the following. Click for a larger version. Click the text for the code..
With great power comes a great need to exercise restraint. Otherwise you end up like these poor chaps. Must... flex... restraint... muscles...
17 comments | Show all comments only the last 5 are shown
Scott Sanders said:
It's far from a full solution, but the XY Chart Labeler might help you get part of the way there: http://www.appspro.com/Utilities/ChartLabeler.htm - I find it indispensable for my bubble and scatter charts.
Clint said:
Chris, it's easy enough to mimic the NYT chart with your data in Excel. A couple of custom series to represent labels (e.g. site name and terms), some custom formatting and you've got a decent representation. The problem of labels in this case I think is the relative smoothness of the data. the NYT data ranges from 1 to 21 which compared to your sample data which runs 0-100% is a relatively tight grouping so the bubble scales are much more similar. In other words, it was a lot easier for the NYT team to fit labels on their bubbles than it is to do on your data. I'll post a graphic later on my blog but, I think the acceptable solution is to create a size legend series...
Clint said:
Scott, in my attempt, I did use x/y chart labeler (a must-have for excel) but that was for custom row and column labels, I don't think it will really solve for getting the labels onto the buttons.
Chris Gemignani said:
I'm looking forward to seeing your solution, Clint at blog.instantcognition.com. The challenge I found, was labeling the X and Y axes at all, not to mention tastefully. I'll be curious to see what you come up with.
Clint said:
Hey Chris,
I was about half way through writing the blog post and encountered a lovely BSOD. Too tired to start the post over tonight so here is the URL to the screenshot of my solution:
<img src="http://blog.instantcognition.com/wp-content/uploads/2009/01/juice_concentrate_bubble.png" width="350px" height="287px" border="0" alt="Clint's Excel Bubble Chart Solution" title="Clint's Excel Bubble Chart Solution"/>
Clint said:
Wow, didn't think that would actually work.
In any case, to follow up on my original comment about the smoothness of the data...
The area of the bubbles in the NYT graphic scale from 1 to 21 which means the diameter has to scale 21x from smallest to largest. For the concentrate data the diameter scales from .5% to 100% - a 200x increase so it's much harder to make data labels fit into/onto the bubbles and be readable and preserve the integrity of the graphic itself
Clint said:
one last note ... don't do math when you're tired. As the above shows (still true, math is just wrong)
Robert said:
Just wanting to throw SAS' name in the hat, as another software that can do this kind of "bubble table" graph (with a bit of programming).
Here's an example of the output. I'm using gif here, which everyone should be able to easily view. SAS can also output other formats such as svg and 1200-dpi tiff, which make the bubbles look smoother (more "publication quality").
http://robslink.com/SAS/democd33/death_odds.gif
Chris Gemignani said:
Very nice, Robert: It's a good tip in general about exporting. If your graphing solution doesn't output antialiased (i.e. lines look smooth rather than chunky) charts, create a higher resolution version then scale it down. A good, free tool on the Windows side for image manipulation is: http://www.irfanview.com/
Fabrice said:
I just posted an Excel version of the chart based on the UDF I created.
Here : http://sparklines-excel.blogspot.com/
Andy Cotgreave said:
I put this together in Tableau:
http://tinyurl.com/9skr3h
Wasn't too sure what your colour was measuring, so I based it on size.
I tried it with text labels, but the text was too big relative to the bubbles. And I couldn't force the size variance to be any bigger. I would have liked to be able to make the biggest bubble bigger.
However, I think it's okay for 5 minute's work!
Clint said:
Andy,
yes, it's much simpler in Tableau and while I like your color as value indicator, you could just as easily have the values displayed next to the bubbles instead of 'on' them and Tableau keeps it clean.
chip said:
Rob Bovey has an xy chart labeler that may have helped on the original Excel version. I use it a lot and it provides a good degree of flexibility on placement.
http://www.appspro.com/Utilities/Utilities.htm
The labels are not dynamic which is a drawback. It works on other types of charts too.
Andy Cotgreave said:
Hi Clint,
Yes, I did initially add the text. However, in Tableau it somewhat overwhelmend the circles. I did try to format the text to grey and shrink it, but the text only served to confuse things.
Chandoo said:
Hi Chris,
Good stuff...
I have tried the same in excel while keeping the labels right (I guess so). You can take a look at the chart and downloadable excel here: http://chandoo.org/wp/2009/01/19/excel-bubble-chart/
Let me know your comments
David Franta said:
Didn't really find another place to post this, but interesting article posted by Cringely (ZDnet fame) about how JP Morgan mangled a bubble chart recently -
http://blog.cringelysmortgage.com/2009/01/29/whats-wrong-with-wall-street/
Mike Chelen said:
How about using the Google Charts API scatter plot? http://code.google.com/apis/chart/types.html#scatter_plot
It allows variable bubble sizes, and has been used in some similar charts such as http://www.xefer.com/twitter
Add a comment
Decorating data
By Chris Gemignani
November 29, 2008
Find more about:
data
analytics
excel
An early Christmas present has arrived from the DabbleDB team for the 100 million or so of us that have to work with data on a day to day basis.
They’ve created a do-what-I-mean web tool that lets you show how you want data to be restructured and bang! it’s done. Check out the video.
It’s a great idea and a elegant, easy to use interface. There are so many directions I’d love to see them take this tool.
Cleanupdata is a great name, but they’re really giving you better ways to restructure data. This tool won’t help you find and fix errors and anomalies in data. At least not yet.
I also hope they extend cleanupdata to let people automate these data restructuring operations. If only you could apply a cleanup created in cleanupdata.com to 1,000 Excel spreadsheets or to a database table.
If you like this, it’s worth checking out DabbleDB. They have rethought the database with a database/spreadsheet/web forms/visualizer platypus of a tool. It lets your data be pliable in ways that databases don’t allow, while retaining structure that spreadsheets don’t recognize.
Added: Avi Bryant, one of the authors of the cleanupdata.com service notes that the example in the screencast is motivated by this post on cleaning data in Excel. Compare and contrast. I know most people would prefer to avoid ="("&MID(H2,1,3)&") "&MID(H2,4,3)&"-"&MID(H2,7,4) in order to format a phone number.
7 comments | Show all comments only the last 5 are shown
Avi Bryant said:
Yes, allowing this to be automated is definitely in our future plans. We just wanted to get something out quickly for people to start playing with so we could see how well it worked in the wild first.
Minor nitpick: I'm not the sole author of this service, everyone on the Dabble DB team worked on it to some degree.
Tom said:
Looks great, but I'm not sure about the prospect of sharing all of my data. Both from the perspective of sharing it with Dabble as a data processor, but also the need to transmit it over the internet.. less good!
Bjoern said:
I agree with Tom. I'd love this to run locally, but transmitting the data to dabble db and then have it mailed back to me. I don't know about US regulations but in Europe I would get seriously smacked on the head for that, especially for data like social security numbers, names and contact data that was shown in the example video.
Chris Gemignani said:
Thanks for the clarification, Avi. I've updated the text. Congrats to the DabbleDB team. We love it.
Michel Guillet said:
Hi Chris,
I've been using birst, a comparable product, with some success.
Mike Chelen said:
Is the free version always available or only for a 30 day trial? Also, it would be nice if more that 15000 rows were supported, overall looks like a pretty cool site and services :)
Ted said:
Sorry, I much prefer a spreadsheet and formulas. They give you complete control (the code in this product is going to make assumptions that are wrong at times), they are very repeatable (i.e. if I do the same thing every day, I shouldn't need to repeat my actions in a browser and run batch jobs), and you can look back at a later point in time and see what you (or soemone else) did very easily. Maybe MID is complicated, but I don't think so.
Add a comment
Analytics Roundup: TIps for showing, sharing, communicating
By Chris Gemignani
December 6, 2007
Find more about:
Business_Intelligence
analytics
business
charts
excel
google
graphics
graphs
powerpoint
presentation
- Developer's Guide - Google Chart API - Google Code
- Beautiful stuff, particularly the Venn diagram.
- Align Journal - BI Worst Practices
- We often see articles on BI "Best Practices" here is an article telling us what NOT to do.
- flot - Google Code
- Attractive Javascript plotting for jQuery.
- ongoing ยท On Communication
- Interesting blog post about how different forms of communication rank for immediacy, lifespan, and audience reached.
- The Excel Magician: 70+ Excel Tips and Shortcuts to help you make Excel Magic : Codswallop
- SlideShare
- Source for presentation ideas.
Introducing Chart Chooser
By Zach Gemignani
November 20, 2007
Find more about:
charts
excel
powerpoint
tools
Find and Download Great-Looking Excel and PowerPoint Charts
Chart Chooser is an online tool that answers two questions we commonly get:
- What type of chart should I use to show my data?
- How can I make good looking Excel or PowerPoint charts?
Chart Chooser is easy:
- Check the boxes on the left that best describe your objective
- Select the chart that you want to use
- Choose from Excel or PowerPoint downloads to get a formatted chart template
A few notes about Chart Chooser:
- Thanks to Andrew Abela of Extreme Presentations for inspiring Chart Chooser with his “Choosing a Good Chart” post and for working with us to put this tool together.
- We’ve tried to make the charts both Tufte-compliant (i.e. minimal chart-junk) and visually attractive (thanks to Google for the color scheme).
- Feel free to suggest other types of charts that you’d like to see in the Chart Chooser. Send an example to chartchooser@juiceanalytics.com.
- If you’d like a customized version of Chart Chooser for your organization, write us at chartchooser@juiceanalytics.com or call me at 202.251.7750.
23 comments | Show all comments only the last 5 are shown
Clint said:
Not bad for a v1 guys - I especially like the waterfall chart as a funnel visualization. The comparative column and bar charts seem a bit noisy though - might be as simple as using thicker bars and columns, then again might not.
aaron said:
very cool!
why did you exclude the basic line chart from the 'relationship' category but include the two-axis line-column chart?
Chris Gemignani said:
Clint: Sounds like someone's volunteering...
Aaron: In my experience, the basic line charts are used to show the performance a bunch of similar series over time. The line-column is typically used in business to show two aspects of the same thing over time, sort of like showing prices and volumes in stock charts.
MikeW said:
Excellent tool, thanks guys! I've sent it round the office so now everyone can create clear, uncluttered charts. RIP grey background!
grossu said:
Awesome. It would be great if you add Numbers for ac format for uploading.
Tony said:
Guys, great tool much like the Chart Cleaner! I also understand that you aren't going to be able to please everyone.
The good: Excel downloads that show the table templates (specifically for the waterfall chart, which many people have trouble with), color schemes and formating are excellent, interactive selecting is great, inclusion of bullet charts and pushing less chartjunk is a big plus.
Opportunity: I see some downfalls with the options that are presented to me. Much like some of the dashboard tools, it gives me the option of what's available, but does not indicate what is optimal. For example, by selecting Composition, I get everything from bar charts to waterfall charts, to pie charts, to tables. It may be helpful to possibly rank which are the most effective. Why use a pie chart when a bar chart is a better choice. Or, use a stacked bar chart (I find very ineffective) when a line graph is probably better.
I think the objective here is to show what's possible and appeal to the masses. I just question your design when you have previously voiced that some of these break the fundamental rules of data visualization (pie charts).
Jesse Robbins said:
Outstanding work!
Joe said:
This is awesome! Great time saver. Thanks for the great posts and tips.
zaxl said:
Great tool! With Chart Cleaner you open mi mind to a whole new world of graphics. This is an excel-ent addition to my arsenal. Many thanks!
govi said:
Great!
Using it right now for my scorecards!
arun said:
I really appreciate your work... the charts are really cool... I am going to use them at work... thanks guys
Ed O'Loughlin said:
I'm sure it's a great tool, but http://chartchooser.juiceanalytics.com/ makes Firefox on XP fall over.
Stef said:
Hmmm... is it only available for US users? Doesn't work from my place - Switzerland. Neither in Safari nor in Firefox. Just some kind of legal note appears...
Chris Gemignani said:
Stef: The problems you're having relate the DNS propagation. We only set up the chartchooser.juiceanalytics.com address earlier this week and it takes a while for all the far corners of the Internet to know about that address. Not that Switzerland is that far away these days.
If you wait a few days, it should work for you.
Ed: I'll give it a try on Firefox with XP.
Michael Vu said:
thank you for this! it's going to be very, very useful for entrepreneurs and business owners.
Kelly O'Day said:
Nice job!
The line chart uses a legend to identify the 4 data series in the example. Legends add an extra step for chart readers, they have to move their eye back and forth between the lines and the legend. This can interfere with quick, easy chart interpretation.
Why not use series labels to make it easier for the reader? I've modified your line chart file to add a procedure which adds series labels instead of the legend.
You can see it <ahref="http://processtrends.com/toc_chart_doctor.htm#Replace_Legend_with_Series_labels"> here</a>.
Kelly
Kelly O'Day said:
Here's the link again. It didn't work when I used <a href="... <.a>
http://processtrends.com/toc_chart_doctor.htm#Replace_Legend_with_Series_labels
Tony said:
Kelly - Nice job! The link in your post above didn't work, so I just went to your site to find the example. I am a big fan of your changes. I would always opt for series labels versus a legend. They take up less space and make it more visually appealing.
Kelly O'Day said:
This is my 3rd and hopefully final try at getting the link to work.
http://processtrends.com/toc_chart_doctor.htm
Stef said:
Hey there, the website is still not visible from Switzerland! Gush....
Mike said:
Hi guys!
This is fu&%$ awesome! Thank you very much for this!
Tom said:
I love your site and have used several graphs to make myself 'look good' at work. Thanks.
I want to use the Waterfall chart but for the life of me I can not figure out how you remove/hide the color fill from the data points after the first one and leave it in for this one.
Thanks.
Priya said:
Hey thanks for this useful site... I was wondering if there is a write up for different type of charts displayed here, as in what type of data or steps / FAQs etc.
If I am missing something here, let me know
Add a comment
Earlier writing







14 comments | Show all comments only the last 5 are shown
Jperez said:
Hi. I want to share a trick I added to your repertoire. Besides separating the dimension from the metrics, I added some columns of labels built by concatenating two or more dimensions of interest. For example, I could have a "year_month" column with data such like "2008_12". Then I can do a sumif(year_month column, "2008_12", metrics column) and sumarize on one cell all metrics for December.
Bill D said:
I use basically the same construct, but I use the GETPIVOTDATA function with data validation dropdowns and INDEX-MATCH lookups.
I think GETPIVOTDATA can be slightly more flexible, identifies errors more readily, and does not require macros to recalculate automatically. Also, showing all items for all dimensions is not very practical for larger data sets with many dimesions.
JJD@IBM said:
I have developed a similar approach to my Excel report applications, especially when I have multiple data sources, but with the following adaptations:
If I'm not allowed to customize the source data report, I tend to do some row-level transform formulas on the data worksheet to the right of the data itself. It could be as simple as parsing month, date, year from a text field, and calculating the quarter. Or I may be comparing multiple columns using nested IF statements and providing some preliminary "decision" about that data at the row level. When I paste in the updated source data, I just have to copy down that block of cells with formulas to match the length of the data. Dynamic ranges and pivot tables do the rest.
Even if I've done some Row-level Transforming on the Data worksheet(s), I still create a Table-level Transform worksheet with a pivot table to establish the dimensions for my data and perform consolidation -- often that's just subtotaling rows at some meaningful detail level.
Finally, I use a combination of data filters and vlookups/etc to pull the Transform layer onto the Present worksheet instead or macros. I avoid macros because most of my financial report users panic when they open an Excel file and the macro Enable/Disable warning pops up!
paresh said:
On a more philosophical level......
If it is a imperfect tool why waste the time learning it. And if experts are having problems staying off the underbrush the novices/intermediate users have no chance.
I personally feel excel is getting a bad name because users who have not been fully trained try to do too much - I have seen the same thing with SAP users and even developers.
I believe that what excel needs is structured training - intermediate users can do their own analysis but any formal reports should be done by power users and only these reports should be used in the business.
Incidentally if you have minimised the use of excel what are you using instead. What are your views on excel friendly olaps?
Jonah Feld said:
If you trade a dynamic range for Excel 2007's Tables, or write a procedure to resize a static range, you'll get much better performance.
Offset is a volatile function, and using it in a dynamic range makes all dependent formulas (but not those sourced from a PivotTable result, like GETPIVOTDATA) recalculate constantly. With a large workbook, it can get pretty slow.
Sandro said:
Hi there,
Are there members of juiceanalytics on Twitter?
Regards.
Sandro.
Chris Gemignani said:
Sandro: A number of us are on Twitter. You can try chrisgemignani and zachgemignani to get started.
cgraves said:
Long time listener, first time caller. I must say, you guys have created quite a wealth of information around the face of business intelligence. THANK YOU for sharing all of this great information and example templates. This might be a bit of a nitnoid question as you guys are tool agnostic but have you run across Excel Services in your travels. I think you have some really cool charts but they don't travel well to Excel Services and was wondering if you had aligned some of your graphs and graphics to take advantage of that tool set for end user reporting.
lucy said:
waaooo, what a great thing it is !
Jacob said:
I am having trouble getting rid of zeroes from my charts. Where exactly does the formula =if('Transform!A2'<>"",'Transform!A2', "") go?
Thanks in advance
nicholas said:
You say that you don't use Excel that often anymore to create dashboards. What tools do you use or recommend these days to build dashboards?
Zach said:
Nicholas, Most of our dashboards are web applications using Flex and our open-source visualization library JuiceKit (www.juicekit.org).
Patrick said:
Wow - Thanks so much, I love it and this make life with Pivottables so much easier! Goes right into our weekly reports!
One question: I always thought I know Pivottables pretty good - but how do I add Pagefields without Data so that the blue frame does not show up like in the example file? Thanks for your help! I love your tools and have been an avid user of the Chart Cleaner for years now. :-)
shawnify said:
Typo in third paragraph: "DTP" (Data Tansform Present)
said:
Add a comment