Chart Selection, Art and Science
By Zach Gemignani
February 16, 2010
Find more about:
reporting,
charts
Choosing the right chart for data presentation isn't easy -- even if you do it for a living. For those with less practice, it may resemble the flash of confusion I experience when my wife asks "Which of these outfits looks best on me?"
"...uhhhhhhh, both?"
And like that answer, there isn't any safety in sitting on the fence.
Wouldn't it be nice if there was a formula for choosing the right chart? The fact that there isn't suggests it is a mix of art and science. There are plenty of examples of people who have taken a crack at this problem:
- Andrew Abela created a diagram that categorizes chart types.
- In Stephen Few's book Show Me the Numbers, Chapter 5 provides an overview of graph fundamentals. Bonus: I received the following Graph Selection Matrix (PDF) from Steve.
- In Stephen Kosslyn's book Graph Design for the Eye and Mind, Chapter 2 is entitled "Choosing a Graph Format"
- Sanket Nadhani shared this short tutorial which tackles the basic choices.
- From NC State, a flow diagram for chart selection
- An Oracle-financed white paper entitled: "Selecting the Best Graph Based on Data, Tasks, and User Roles" (PDF)
- BonaVista Systems has an Excel add-in for choosing the right chart.
(If you know of any others, put them in the comments and I'll add to this list.)
While these are all great resources, I thought it could be instructive to walk through a sample chart selection process, starting simple then gradually adding more complex requirements. The focus of this post is on 'wireframing' the correct presentation techniques; in a follow-up we'll replicate these same charts noting best practices with refined aesthetics and layout.
I typically ask four questions in choosing how to present data:
1. What data is important to show? Specifically, which dimensions and metrics need to be shown at the same time.
2. What do I want to emphasize in the data? For example, do I want to compare different values, show relationships, or present changes over time? What story am I trying to tell?
3. What options do I have for displaying this data? Your Excel chart menu is a start, but don't forget options such as tables, sparklines, small multiples, and advanced visualizations like treemaps. Many Eyes' list of visualizations can spark additional ideas.
4. Which option is most effective at communicating the data? Which chart or visualization emphasizes what's important in the most direct and readable way?
Imagine a sales organization where two metrics matter most: activity (as measured by call volume) and sales (as measured by dollars sold). The simplest place to start with this data is to present aggregate performance for those two measures. Even with this most basic situation, you have a few options:

Conclusion: Data doesn't always need visualizing. The common and dreadful example of this mistake is when people use a speedometer-style gauge to show a single number (option 3). It is a lot of work, pixels, and distraction for no user value. In this example, we have just a single data point for each measure and no comparisons (e.g. to goals, to last year's performance, the values against each other), so it's best to keep things clean with option 1.
Next, let's look at options for showing activity and sales data by product. In this case, the emphasis should be on the relative performance of each product.
Conclusion: Option 1 is the winner. We prefer a vertical layout of labels (bar chart) to a horizontal one (i.e. column chart - not shown) because the labels are more readable and the horizontal layout can suggest a time element in the graph. As has been thoroughly documented, a pie chart doesn't allow you to see differences in values as effectively as a bar chart.
What if we wanted to understand these two metrics by time?
Time needs to be displayed horizontally. We've seen ambitious examples from Trend.ly and Axiis that attempt to break this mold, but they more often confuse than enlighten.
Conclusion: I've backed away from using dual axis charts after experiencing too many situations where people are confused by which line goes with which axis, no matter how clearly labeled. Because the emphasis for the data needs to be the trend over time, I would recommend option 2 over option 3's sparklines.
Now it gets interesting: What if we wanted to understand these two metrics by product and by time?

Conclusion: The best option for this case depends on the importance of clearly communicating the detailed trend for each product. In most cases, the "essence" of the trend is good enough, i.e. Is the trend up? Down? Erratic? Smooth? Under that assumption, option 3 provides a nice comparison of the relative product performance and trend.
A few final observations:
- Labeling matters. How labels are laid out in a chart can make a big difference in readability. It is almost always better if the label text can be written horizontally and be closely tied to the value (rather than in a disconnected legend).
- Multiple areas of emphasis. There will be compromises when you need to emphasize two things simultaneously (trend, relative values). Pick which one matters most.
- Know your options. The more types of charts you know and understand how to apply, the better the set of options you'll be able to come up with.
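The walkthrough above can be boiled down to a rough lookup. This is only a sketch of my reading of the four cases, not a formula: the `DataShape` class, the `suggest_presentation` function, and the wording of each suggestion are hypothetical, and the suggestions for the two time-based cases are assumptions because they depend on how much trend detail matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataShape:
    """Hypothetical description of what has to be shown at the same time."""
    by_category: bool = False  # e.g. broken out by product
    by_time: bool = False      # e.g. broken out by day or month


def suggest_presentation(shape: DataShape) -> str:
    """Return a starting point for wireframing, not a final answer."""
    if shape.by_category and shape.by_time:
        # Assumption: the "essence" of each trend is good enough.
        return "relative values per category with a compact trend for each"
    if shape.by_time:
        # Assumption: separate lines per metric instead of a dual axis.
        return "line chart per metric, time on the horizontal axis"
    if shape.by_category:
        return "bar chart with horizontal labels"
    # A single value per measure with nothing to compare against.
    return "plain numbers, no chart"


print(suggest_presentation(DataShape(by_category=True)))
```

The point of writing it down this way is the order of the checks: decide what has to be shown together first, then pick the simplest presentation that emphasizes it.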
Five Features of Effective Filters
By Zach Gemignani
June 5, 2009
Find more about:
interface,
reporting
I've developed a bit of a penchant (obsession?) for decomposing the pieces of analytical applications and framing the good and the bad characteristics. So far I've taken on treemaps, real-time dashboards, alerts, composite measures, and success metrics.
Next up: the poor, neglected, and taken-for-granted filter. For such a common and essential component, it seems rare that designers take a moment to consider how to make the best possible filtering mechanism. Here are the five elements I consider critical to a good filter selector along with examples from exemplary interface designs.
- Selections
- Impact
- Context
- Persistence
- Short-cuts
Selections
Good filters make it obvious to users what has been selected. That might seem like an obvious necessity, but consider what happens when you filter in an Excel list. The filter selection, even if it is a single item, is immediately hidden from view.
Jonathan Harris' frequently referenced We Feel Fine visualization offers one of my favorite filtering examples. Notice how the selected items are highlighted and the non-selected items are de-emphasized. The bar at the top clearly shows what has been selected, even after the filter selector is "put away."

Impact
The best filtering mechanisms also give instant feedback about the impact of your filters. This can be as simple as a subtle indicator that the filters are being applied. Even better, as demonstrated in The New York Times' Rent or Buy site, the graph animates in real-time as filters are applied. This creates a very tangible connection that helps the user understand the impact of the filtering choices.

Context
Filters should provide information around the items being selected. What does it look like? How many are there? Take the simple font selector in Office applications: Isn't it a no-brainer that the names of the options are shown in the actual typeface? Here are a couple of other fine examples of context:
Click shirt is Bret Victor's brilliant t-shirt design interface. In it, he offers an elegant filter implementation where all the selections show images of what you are about to select.

Elastic lists is one of the most innovative approaches to filtering. The height of individual blocks in the selectable stack shows the frequency of the items, an embedded sparkline shows the trend, and brightness indicates "weight of the metadata value compared to the overall distribution" (a bit too ambitious/confusing, in my view).
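The simplest piece of context, "How many are there?", is cheap to compute. Here is a minimal sketch that labels each filter option with its frequency. The function names and the row format (a list of dicts) are assumptions for illustration, not how Elastic lists is actually built.

```python
from collections import Counter


def option_counts(rows, field):
    """Count how many rows fall under each value of `field`."""
    return Counter(row[field] for row in rows)


def labeled_options(rows, field):
    """Build filter labels like 'blue (2)', most frequent first."""
    counts = option_counts(rows, field)
    # Sort by descending count, then by name so ties are stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{value} ({count})" for value, count in ordered]


# Hypothetical data: three shirts in two colors.
shirts = [{"color": "blue"}, {"color": "red"}, {"color": "blue"}]
print(labeled_options(shirts, "color"))  # ['blue (2)', 'red (1)']
```

Recomputing these counts against the currently filtered rows would also cover the "Impact" element, since the labels change as soon as another filter is applied.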

Persistence
Given the importance of filters to most information applications, it is surprising how often the interface makes them hard to find. As I mentioned in an earlier post, the failure of many analytical and reporting applications is that "they assume users know precisely what they need before they’ve begun the analysis." Filtering shouldn't be a one shot deal; the functionality should always be accessible.
Kayak, a travel site, integrated the selection filters into the results so users can easily change their trip criteria without having to start a new search.

Short-cuts
Finally, filters should make it easy to apply common selections (All, None) or complex sets (My Saved Filters, Northwest Region).
Moodstream by Getty Images recognizes that users aren't always going to want to configure a bunch of filters individually. The presets wheel solves this problem by offering a series of pre-defined "filter sets."
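To make the short-cut idea concrete, here is a small sketch of a filter that supports All, None, and named presets, and can always report what is selected. The `FilterSelector` class and the example states are hypothetical and are not taken from any of the sites mentioned.

```python
class FilterSelector:
    """Tracks the selected options for one filter (illustrative sketch)."""

    def __init__(self, name, options, presets=None):
        self.name = name
        self.options = list(options)
        self.presets = dict(presets or {})
        self.selected = set(self.options)  # start with everything selected

    def select_all(self):
        self.selected = set(self.options)

    def select_none(self):
        self.selected = set()

    def apply_preset(self, preset_name):
        # Ignore preset values that are not valid options for this filter.
        wanted = set(self.presets[preset_name])
        self.selected = {option for option in self.options if option in wanted}

    def summary(self):
        """One-line description that stays visible when the selector is closed."""
        if len(self.selected) == len(self.options):
            return f"{self.name}: All"
        if not self.selected:
            return f"{self.name}: None"
        chosen = [option for option in self.options if option in self.selected]
        return f"{self.name}: " + ", ".join(chosen)


state = FilterSelector(
    "State",
    ["CA", "ID", "OR", "WA"],
    presets={"Northwest Region": ["ID", "OR", "WA"]},
)
state.apply_preset("Northwest Region")
print(state.summary())  # State: ID, OR, WA
```

The `summary` method doubles as the "Selections" element: the same object that applies a preset can tell the user what that preset actually selected.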

Finally, I'd be remiss if I didn't mention the sophisticated and powerful filtering functionality delivered in Tableau. In addition to providing filtering by selecting graphs (i.e. in context filtering), the application allows for multiple selector types, wild-carding, conditional filters, top/bottom filters, and on and on. If you want a comprehensive catalog of potential ways to offer filtering, watch the Filter Data video here.
Why Analytical Applications Fail
By Zach Gemignani
July 7, 2008
Find more about:
analytics
reporting
Many analytical applications fail for a simple reason: they assume users know precisely what they need before they’ve begun the analysis. There are cases where this assumption holds and the user has a specific end-point in mind. But more often, users depend on the tool to track down an answer with only a vague idea of where to start. The exploratory analysis that follows can feel like swimming upstream when the application isn’t designed to facilitate the journey.
The source of this mismatch is partly rooted in the technical perspective of database developers. The simplest path to providing data access is to let users fill out a form to define a SQL query. It is a linear mindset that isn’t well-suited to ambiguous problems.
I’d like to offer a couple of examples that illustrate the difference between the common, form-based approach and a more dynamic, interactive approach. Then I’ll explain the implicit assumptions behind the different models and why it matters.
At its heart, Travelocity is a travel analysis tool intended to help you find the best flight (or hotel, car rental, package, etc.) given a complex set of parameters. The relative importance of each of these parameters (departure day/time, return day/time, airports, connections, preferred airlines, price, etc.) is a personal preference… but not one that is explicitly or fully known even to the user. For example, it would be hard for me to say exactly how much more I would pay for a non-stop flight or what is the relative value of a more convenient airport versus a more reliable airline. These preferences are hard to understand prior to seeing specific trade-offs.
Travelocity approaches this complex problem in the way that so many analytical applications do: it asks for all your preferences first, then offers a static list of results for the specified query.

A few things to note about this search results page:
- On a busy web page, “Change Your Search” is not emphasized.
- The “tracker” across the top shows a linear five-step process. The user is expected to flow through this sequence in order.
- Getting results for a new search takes more than ten seconds.
I’ve been a loyal Travelocity user for years, and I don’t want to imply that this site is poorly designed or difficult to use. The problem is more subtle than that.
By way of comparison, let’s take a look at a more recent entrant to the online travel business, Kayak. This site is designed with a different usage model in mind. Kayak starts by asking for the same information as Travelocity, but the results page is designed to support further analysis:

The biggest difference is the prominent filtering functionality on the left side of the page. The filters allow users to narrow down their original search without leaving the results page (it takes less than a second to view refreshed results after changing a filter—no “run report” button required). In addition, Kayak places more emphasis on the start-over option. The designers of this site did not assume your first search would be enough to get you to the perfect flight option. Finally, notice the different “views” of the data that are available for a given result set. The views help support different types of decisions based on the same search parameters.
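One way to get this kind of responsiveness is to fetch a broad result set once and narrow it locally, so changing a filter never triggers a new search. The sketch below assumes the results are already in memory. The `Flight` record and the `narrow` function are hypothetical, and I am not claiming this is how Kayak implements it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Flight:
    airline: str
    stops: int
    price: float


def narrow(
    flights: Iterable[Flight],
    max_stops: Optional[int] = None,
    max_price: Optional[float] = None,
    airlines: Optional[Set[str]] = None,
) -> List[Flight]:
    """Filter an already-fetched result list; None means 'no restriction'."""
    kept = []
    for flight in flights:
        if max_stops is not None and flight.stops > max_stops:
            continue
        if max_price is not None and flight.price > max_price:
            continue
        if airlines is not None and flight.airline not in airlines:
            continue
        kept.append(flight)
    return kept


# Hypothetical results from a single broad search.
results = [
    Flight("AirA", 0, 420.0),
    Flight("AirB", 1, 310.0),
    Flight("AirA", 2, 250.0),
]
print(len(narrow(results, max_stops=1)))  # 2
```

Because every filter is optional, the user can loosen a constraint as easily as tighten one, which is what lets them discover their own trade-offs instead of declaring them up front.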
Analytical applications for business have similar underlying structures and usage models. The analysis process in Omniture SiteCatalyst, the leading web analytics platform for large sites, offers a typical example:

This application offers lots of functionality, and it feels like featuring functionality is the primary purpose of the start page. If you want to get to useful data rather than view an advertisement for Omniture products and events, you can start by selecting the “Report Builder:”

Now, it is form-filling time. As with Travelocity, the user is expected to choose the precise parameters before they get to see anything. The resulting report requires a 10-second wait, and the result is static. Any additional filtering will require you to run a new report.
Now let’s look at how Google Analytics chooses to structure the user experience:

In contrast to SiteCatalyst, Google Analytics shows you results immediately—no defining or configuring a report before you can get started. Similar to Kayak, the application offers a bunch of options on the report results page to refine parameters (e.g. data ranges, metrics, comparisons).
Travelocity and Omniture make a few assumptions common to analytical applications:
- Users can accurately define their need (i.e. they already know what they are looking for).
- Users can precisely define their need (i.e. they know all the relevant parameters).
- Users’ workflow will follow a linear sequence of events. Going back to the beginning is a failure of the process or user.
More effective analytical applications like Kayak and Google Analytics make different assumptions:
- Users have a general question, but do not necessarily know details about what they're looking for.
- Users need to see results before they can ask better, more detailed questions. These feedback loops provide critical learning.
- Users need to get to data as quickly and easily as possible. A screen without data is delayed progress.
- Different views of the data can provide different insights about results.
- Users want the application to keep up with their trains of thought. Speed and responsiveness matter. Here’s a framework from Jakob Nielsen’s blog about response time:
0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.
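Those three limits translate directly into a small classifier. This is a sketch only: the function name and the category labels are mine, and since the quote says each limit is "about" right, the cutoffs should be treated as approximate.

```python
def perceived_delay(seconds: float) -> str:
    """Map a response time to the experience described by the three limits."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds <= 0.1:
        return "instant"          # feels instantaneous, no special feedback
    if seconds <= 1.0:
        return "flow_kept"        # delay is noticed, train of thought survives
    if seconds <= 10.0:
        return "attention_kept"   # user stays focused on the dialogue
    return "attention_lost"       # user wants to do something else while waiting


# Illustrative values only: a sub-second filter refresh versus a
# search that takes more than ten seconds.
print(perceived_delay(0.8))   # flow_kept
print(perceived_delay(12.0))  # attention_lost
```

Logging this category for each interaction, rather than the raw time, makes it easier to see which parts of an application break the user's train of thought.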
In my experience, making the right assumptions about user behavior makes all the difference between an application people enjoy and depend on and an application people dread using.
10 comments
matt said:
Thanks for this analysis. It's very useful to feature real world example followed by your pro view on the topic. More on usability for analytical app, please! :)
Galen @ Estately said:
Awesome analysis - I'm not sure I've ever seen it broken down this way. As a non-stop changer of my preferences, the need to constantly tweak my search / preferences makes Kayak and Google Analytics ideal.
Simon White said:
Hello,
Nice to read your article. Complex travel queries are indeed a hard problem, and balancing user's knowledge (they have, after all, had up to 10 years of experience in online flight queries and booking with Travelocity) and a way to convert the less well versed in just what constitutes a flight / hotel booking is the crux of the problem. Equally, the databases behind such queries are highly optimised and Kayak is piggybacking on top of that as an affiliate, rather than a vendor in itself. Affiliates are possibly best placed to try new paradigms and then drive change in the actual vendor sites, and it's useful to note that. You also might want to look at the two separate flexible dates options on Travelocity's site, which you have to activate before searching, but which provide different price comparison options / views. Perhaps a way of switching to those views would be a good addition, but the interface, as you noted, is already quite busy.
As an aside, Omniture have launched a v14 which gives a slightly different approach to reporting, it would be fairer with a current article to look at their latest UI.
-Simon (disclaimer: I work in the Travelocity group)
Anthony said:
I enjoyed your article very much and agree with it wholeheartedly...
Have you seen QlikView? It's a platform that works in much the way you describe. There are some good examples at http://demo.qlikview.com
FYI, I work there, so I'm biased.
Brian said:
Hi Anthony. How is Qlikview different than the other bi tools like hyperion, cognos, tableau, excel, etc? Going to the website, I saw mention of in-memory processing, 64 bit architecture, etc (like I read in a Gartner (i believe) report) but I don't see how they are incorporated nor what makes QlikView the best option. Thank you
Robbin said:
I have to change you guys into lovers of Farecast (even though it was purchased by MS.) I met the Kayak guys at SES NY and they tried to convince me to do the Kayak thing, but nothing is as flexible as Farecast. When it comes to travel, that is.
Jurgen said:
Very nice article, thanks for writing it.
I personally like the idea of being able to play with the results and explore further. However, I am sympathetic towards the Travelocity site as it is definitely less-busy (translated: "less-confusing") to a new user that simply wants to find the best flight prices between two dates. I'm certainly no expert in travel systems, but I have seen many operational implementations fail because of over-complicating stuff. In my view Kayak has too many options on there. If they prioritised a little they could reduce the number of options yet still offer enough flexibility to differentiate from Travelocity and be innovative. All this without possibly confusing anyone in the process.
Zach said:
@Simon: Obviously I have no insight into the backend of this application. That said, the filtering that Kayak does may be / could be done in the browser, requiring no changes to the original data fetching. At least, that is how we've done it previously: fetch a broad result, narrow within the browser. As for SC14, I don't have access to it (and have heard lots of griping from those who do), so I made the assumption it wasn't game changing.
@Robbin: The Kayak and Farecast experiences are freakishly similar to my eye. Perhaps you can point out where they diverge.
@Jurgen: I'd blame the clutter more on the prominently placed ads rather than the filtering/views functionality.
AnalyticsWorks said:
Very informative post. Simple yet insightful nuggets for defining user interaction and usability of any tool. I would like to know what users think of the Pentaho Mondrian OLAP tool that allows users to drill down/ drill through on reports. We are using their engine in our product and would like to know if there are things that can be improved for a more complete user experience.
Thanks,
http://www.analyticsworks.com
Mobius View said:
Nice piece. I like the fact that you liken travel search engines to analytics apps, which they indeed are.
Microsoft's Executive Dashboard... Magnifying Glass Required
By Zach Gemignani
June 26, 2008
Find more about:
dashboard
reporting
microsoft
Organizations have a personality, and it bleeds into everything from executive reporting to product offerings. A recent Fortune article entitled Microsoft without Gates offers this wonderful tidbit about Steve Ballmer, CEO of Microsoft:
Even though he never was a serious computer programmer, by all accounts Ballmer is just as good at math as Gates is. He lives and breathes data. “Steve has a computer in his head,” says Bob Muglia, a 20-year company man who heads the Server and Tools division. Ballmer expects his subordinates to be adept in math as well. He distributes 11-by-17 sheets filled with numbers detailing the progress of various operations. The numerals are so small that executives use transparent magnifier rulers to see them. But there are never any columns showing percentage changes. Ballmer believes people ought to do that in their heads. It saves space on the paper for more numbers.
Wow. If it is as bad as the author describes, Ballmer has designed the anti-dashboard.
The Presentation Zen blog offers another great example of organization culture as displayed in business artifacts:

Gates here explaining the Live strategy. A lot of images and a lot of text...Good graphic design guides the viewer and has a clear hierarchy or order so that she knows where to look first, second, and so on. What is the communication priority of this visual? It must be the circle of clip art, but that does not help me much.

Does it get more "Zen" than this? "Visual-Zen Master," Steve Jobs, allows the screen to fade completely empty at appropriate, short moments while he tells his story.
3 comments
superdaz said:
What is the communication priority of the Apple visual? Making things look pretty because they simply do not and will never experience the amount of work put in by companies like Microsoft over such a range of products.
Demerzel said:
What each of the backgrounds really mean:
Bill Gates: I just bought all these industries and copyrighted them.
Steve Jobs: This is the sum of all my life's work--nothing.
Max said:
@superdaz: I don't quite understand what you mean, considering how notorious Apple as a company is for working its employees to the bone. Anyway, I cite an old Unix mantra here to make my point for me: The ideal program does one thing extremely well, and nothing else. I don't think you could say that for either Microsoft or any of its products. What's the fuss about doing lots of things if you don't do any of them particularly well?
Mashing Google Analytics With External Data
By Sal Uryasev
June 9, 2008
Find more about:
googleanalytics
reporting
google
A couple months ago, we put together a Greasemonkey tool that sucked data out of Google Analytics, and after mining it for trend information, integrated it back into the GA interface. This week's tool combines and extends Google Analytics with data from an outside source.
Here is a quick alpha of our Greasemonkey integration of external data reporting into Google Analytics for Kampyle, a "feedback analytics service."
Clicking on the 'Kampylize' tab queries the Kampyle site in real-time to populate the standard GA data table.
Our friends at Kampyle run a service that allows website owners to put a feedback button on individual pages of their website. All information submitted by the user is uploaded to a central Kampyle database that compiles the user feedback with web page url and standard internet statistics such as the name of the browser. Website owners can access a server-end service that consists of a reporting site complete with summary data tables, graphs, and charts.
Since both sites are web-based reporting suites segmented in a similar fashion (individual website, date, web browser, etc.), they integrate together naturally. There is a lot of value in placing related data side by side, allowing users to get a more holistic picture of web site performance. If you have other ideas of data sources that would fit neatly with Google Analytics, let us know and we'll consider building the integration.
If you're interested in technical details, continue to Open Juice to see how this is all accomplished...
12 comments
Hadley said:
I think you've missed the importance of iteration - the first chart you create will never be the most revealing or informative, but it will suggest where to go next (or at least what not to do!). It may take many iterations (and a few bad starts) before you hone in on the best plot for your goal.
Zach said:
Hadley, I think your point is true in an analysis context where you are looking to explore the data. I could have been more explicit that I was talking about reporting.
Andy said:
Edward R. Tufte has written several books on this subject. His "Visual Explanations" and "The Visual Display of Quantitative Information" are excellent. His website is edwardtufte dot com.
Zach said:
I wasn't able to find any thorough attempt to solve for the chart selection question in "The Visual Display of Quantitative Information." I can't speak to "Visual Explanations." He is more focused on the appropriate design of a chart once you have selected one (along with the introduction of charting models like sparklines and small multiples).
Andreas said:
Hi Zach,
We created an Excel add-in that implements exactly what you are asking for: a formula for choosing the right chart:
http://www.bonavistasystems.com/Products_ChartTamer_Overview.html
A dialog box guides you through the chart selection process, asking you questions like: What relationship do you want to show? What do you want to feature?
This PDF explains the concepts behind Chart Tamer:
http://www.bonavistasystems.com/Download2/Chart%20Tamer%20Introduction.pdf
Andreas
Peter O'Neill said:
I faced a similar problem yesterday trying to think how best to visualise activity (visits) and sales data by product. My solution was to display both on a single bar chart with visits and sales each represented as a proportion of the total for that metric. While this didn't give you any absolute numbers, it clearly displayed the most popular products in terms of traffic and sales as well as which products had an unusual skew (high/low sales per visit). I sorted the products alphabetically as there were 20 of them, this appeared to work better.
syntaxfree said:
In the "by product" comparison, side-by-side bars like that are classic optical illusion fodder. As for pie charts, the debate is still up; there are people doing experimental psychology regarding them. A table with #sales and $revenue/#sales ratio would be best. Before you choose your chart, you have to choose your metrics, methinks.
For the "by product and time", I'd consider a lower frequency (monthly instead of daily) and use candlesticks. Candlestick charts are a mainstay of the financial world and there's three centuries of accumulated wisdom on how to trend-spot by eye.
James said:
Zach's response to Hadley interests me. Is there a difference between analysis and reporting? And if so, wouldn't the term analytics refer to analysis?
Zach said:
James, I'm glad you asked. My view is that analytics covers the full spectrum of reporting to analysis, where those terms are explained as follows: Reporting is used to track and evaluate the performance of an understood process. Analysis helps develop an understanding of new processes, erratic and shifting behaviors. See this blog post: http://www.juiceanalytics.com/writing/business-intelligence-isnt-a-technical-problem/ In large part, the difference is based on flexibility and repeatability. Analysis is about being able to rapidly iterate on views of data to explore and find answers (what Hadley was looking for). For reporting, it is more critical to find and stick to particular views of the data.
Jen said:
Sorry for a silly comment on another seriously great post, but ... someone seems like an Archie McPhee fan! ;D
Dr House said:
For the dual axis option, using a bar graph and a line graph for the secondary axis works well instead of using two line graphs. It's still not effortless to figure out which graph is for which axis but it's much easier to see a trend in the relationship between the two metrics.
John B said:
I know this post is rather old, but here is a tip that i use on dual axis line charts. Change the font color of the axis values to the color of the line in the chart, I never have a problem anymore with people wondering which axis which line belongs to.