sparklines

New Year’s Resolution: Tufte and the iPhone

Edward Tufte has produced an illuminating video tour of the user interface of the iPhone. The video illustrates Tufte’s struggles to come to grips with the difference between dynamic screen resolution and the resolution of printed paper. Tufte is prone to grandiose pronouncements, like this one:

All history of improvements in human communication is written in terms of improvements in resolution: to produce, for viewers of evidence, more bits per unit time, and more bits per unit area. Slideware is contrary to that history. Trading in reductions in resolution for user convenience or for pitching may be useful in mass market products or in commercial art, but not for technical communications. The solution is not to rescue slideware design; the solution is to use a different, better, and content-driven presentation method. On this solution, see our thread PowerPoint Does Rocket Science—and Better Techniques for Technical Reports — Tufte Nov 10 2006

Somehow, I don’t think the importance of the Gutenberg Bible related to it showing “more bits per unit area.” Quick, count the “bits per unit area.”

Gutenberg bible courtesy of Wikipedia
Illustrated bible courtesy of Wikipedia

It didn’t take bits per unit area to revolutionize communication in the past and it won’t in the future either. The iPhone is a tremendously engaging information device and points the way forward for information displays. Here’s what the iPhone does well:

Maximize screen real estate: Controls are only visible when needed, fading away gently when you are concentrating on content. Tufte furiously neologizes, calling this “computer information debris.” Control junk is more apt, more terse, more Tuftian.

Direct manipulation: As Tufte says: information is the interface. Filtering and choosing should take place in the context of direct manipulation. A good essay on the possibilities of direct manipulation can be found here.

Fun: Above all, information can be fun and engaging to navigate. Tufte condemns Apple’s stock ticker for having “cartoony” and PowerPoint-like displays and offers an improved version (with 5 digits of precision). Apple’s cheery display offers a more entertaining, usable interface for day-to-day usage.

With our empathy for the day-to-day troubles of the business person seeking insight in data, it’s frustrating listening to Tufte. He is clearly an academic, with academic interests and academic timeframes. As much as his work is respected and inspirational within business circles, he makes little effort to enable his message to be implemented.

Good Tufte: Clutter and overload are not an attribute of information, they are failures of design. If the information is in chaos, don’t start throwing out information, instead fix the design.

Bad Tufte: “…the conclusion of sparkline analysis in Beautiful Evidence, where the idea is to make our data graphics at least operate at the resolution of good typography (say 2400 dpi).” http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msgid=0002NC&topicid=1 *Ed: At least 2400 dpi? Orly?

Mostly right Tufte: “Thus the iPhone got it mostly right.”

Mostly wrong Tufte: “Adobe Illustrator is a big serious program that can do almost anything on the visual field (other than Photoshop an image). Most of my sparkline work was done in Illustrator. Fortunately all graphic designers and graphic design students have the program and know how to use it, so find a colleague who knows about graphic design.” http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msgid=0000Jr&topicid=1&topic=Ask%20E%2eT%2e

It is heartening to see Tufte engage and connect his mental frameworks to our modern, screen-oriented, graphics-accelerated, not graphics-designed world. But the future of information design and interaction belongs to the iPhone, not the printed page.

Excel 2007 and the Lie Factor

“The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.”

Edward Tufte calls violation of this principle the “Lie Factor”. The implementation of in-cell data bars in Microsoft Excel 2007 is a big offender.

Almost a year ago, I was surprised to discover that the Microsoft Excel 2007 development team didn’t understand what zero means. Their implementation of in-cell data bars showed a bar in a cell, even if the cell had a zero or very low value.

Data bars in the Excel 2007 prototype

That was in the Excel 2007 Beta. Things haven’t improved in the current version of Excel 2007. The default setting for data bars in Excel 2007 is to scale the bars so that the smallest bar is based on the smallest value in the selected range and the largest bar is based on the largest value. It still appears that the smallest bar will be no smaller than five or ten percent of the width of the cell. Here’s a sample:

Sample data bars in Excel 2007

So, if you select a range that has values between 600 and 700, the 600 would have a little bitty bar and the 700 would have a full-width bar. Based on the bars, it would look like the 700 is ten to twenty times larger than 600. Outside of Redmond, this is generally regarded as untrue.
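To put a number on it, here is a rough Python sketch of Tufte's Lie Factor (the size of the effect shown in the graphic divided by the size of the effect in the data) for the 600 versus 700 case. This is not Excel's actual algorithm: the function names, the linear min-to-max interpolation, and the 10 percent floor for the smallest bar are all assumptions based on the behavior described above.

```python
def minmax_bar_width(value, lo, hi, min_fraction=0.10):
    """Bar width as a fraction of the cell, scaled between the range's min and max.

    min_fraction is an ASSUMED floor for the smallest bar (the post estimates
    five to ten percent); the linear interpolation is also an assumption.
    """
    if hi == lo:
        return 1.0
    return min_fraction + (1.0 - min_fraction) * (value - lo) / (hi - lo)


def lie_factor(value_a, value_b, width_a, width_b):
    """Effect shown in the graphic divided by effect in the data."""
    effect_in_graphic = (width_b - width_a) / width_a
    effect_in_data = (value_b - value_a) / value_a
    return effect_in_graphic / effect_in_data


w600 = minmax_bar_width(600, lo=600, hi=700)
w700 = minmax_bar_width(700, lo=600, hi=700)

print(f"bar widths: {w600:.2f} vs {w700:.2f} of the cell")
print(f"bars differ by {w700 / w600:.1f}x, values by {700 / 600:.2f}x")
print(f"lie factor: {lie_factor(600, 700, w600, w700):.1f}")
```

Under these assumptions the bars differ tenfold while the values differ by about 17 percent, which works out to a lie factor of roughly 54. A bar anchored at zero would score 1.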

What’s more, if you create two sets of data bars side by side, each group of data bars scales itself independently even though they look the same. Take a look at this screenshot:

Sample data bars from two different conditional formats in Excel 2007

Notice the top seven cells have data bars that have one set of scaling and the bottom data bars have a different scaling. However, they look identical, and users should generally expect these bars to have the same scale.
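The effect is easy to reproduce in a few lines of Python. The sketch below uses made-up numbers (one group is 100 times the other) and the same assumed min-to-max scaling with a 10 percent floor; it illustrates the problem and is not a reconstruction of Excel's code. The `shared_widths` helper is a hypothetical alternative that assumes positive values.

```python
def independent_widths(values, min_fraction=0.10):
    """Min-to-max scaling within one group (an ASSUMED approximation of the default)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [min_fraction + (1 - min_fraction) * (v - lo) / (hi - lo) for v in values]


def shared_widths(groups):
    """Zero-based scaling against the largest value across all groups.

    Assumes every value is positive.
    """
    top = max(max(group) for group in groups)
    return [[v / top for v in group] for group in groups]


# Hypothetical data: the second group is 100 times larger than the first.
small = [1, 2, 3, 4, 5, 6, 7]
large = [100, 200, 300, 400, 500, 600, 700]

same_bars = all(
    abs(a - b) < 1e-9
    for a, b in zip(independent_widths(small), independent_widths(large))
)
print("independent scaling draws identical bars:", same_bars)

shared_small, shared_large = shared_widths([small, large])
print("shared scale, widest small bar:", round(shared_small[-1], 2))
print("shared scale, widest large bar:", round(shared_large[-1], 2))
```

With independent scaling the two groups are indistinguishable; with one shared, zero-based scale the smaller group's bars nearly vanish, which is the honest picture.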

Here are the rules:

  1. Defaults matter! It doesn’t matter that you can do data bars correctly in Excel. The default should be to do it right and it should be hard to do it wrong.
  2. The “right way” to make data bars is to make the length of the data bar directly proportional to the value in the cell. If one cell has a value twice another it should have a bar that is twice as long.
  3. Remove the default gradient shading. The gradient makes it hard to tell where the bar ends, obscuring what you’re trying to show.
  4. Contiguous cells with data bars should all use the same scale. Use different colors to indicate ranges that have different scales.

Excel 2007 supports at least twenty-five different combinations of ways of specifying the length of the data bar.

Five different ways of setting data bars

Exactly one of those ways is correct. Base the shortest bar on the number 0. Base the longest bar on the highest value. Turn off the gradient. If you want to see bars based off percentile or some custom formula, then be explicit. Create a new column, create your formula, create bars on that column.

Please, guys, this isn’t rocket science. This is plain common sense. You would not ship Microsoft Word with a glaring bug in the way text renders. You would not ship Excel with a broken statistical function that people use every day. Delivering deceitful-by-design infographics betrays your central role in democratizing the analysis of data. Until you fix this, in-cell ASCII art remains the best way to explore data visually.
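For the curious, the "one correct way" fits in a handful of lines. Here is a minimal Python sketch of ASCII-art bars: the shortest possible bar sits at zero, the longest at the highest value, and length is directly proportional to the value. The function name, the 20-character width, and the sample values are our own choices for illustration.

```python
def text_bar(value, max_value, width=20, char="|"):
    """Return a bar whose length is proportional to value, anchored at zero.

    Zero and negative values get no bar at all.
    """
    if max_value <= 0 or value <= 0:
        return ""
    return char * round(width * value / max_value)


# Hypothetical values, including the 600 and 700 from the example above.
values = [600, 650, 700, 0, 350]
top = max(values)
for v in values:
    print(f"{v:>5} {text_bar(v, top)}")
```

Here 350 gets a bar half as long as 700, zero gets nothing, and 600 and 700 look like what they are: close.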

A disclosure: We do not currently use Excel 2007 at Juice Analytics. This is not due to a high-minded sense of moral outrage but is merely a reflection of our clients’ environments.

And Ye Shall Know Me By What I Do Not Do

Jeremiah McNichols writes about one of our posts on sparklines with a critical eye on the limitations of this idea.

None of this is intended to suggest that Sparklines are not useful; indeed, their overextension may be the natural result of the obvious, instinctual, and dramatic utility that accompanies truly innovative ideas. But I do believe that when the dust settles from this discovery, they will be implemented with greater care.

This is a great point tactfully raised. Any approach, technique, or technology is good on certain problem domains and bad on others. Commercial vendors and sellers of BI tools would like you to believe their product works great in all cases all the time.

Wouldn’t it be great if vendors spent as much time talking about where their product should not be used as where it should be used? I can see the marketing now: "Only a COMPLETE IDIOT would use Microstrategy to solve X", "If you want to use Business Objects to support Y, it would be quicker to just throw your money away because it isn’t going to work."

A Missing Link in Business Analytics (Part 2)

In a previous post, we described how NFL coaches and players use film study as their approach to analysis. We argue that "slicing and dicing" statistics doesn’t help much when deciding on a game plan. Business intelligence tools can explain the size of the problem (how good is the opponent?) and trends (what are their preferred offensive weapons?). These same tools do not, however, provide real perspective on customer behaviors or insights that give your organization data-driven direction.

The question remains: How do we bring the value of film study to business intelligence? The solution we’ve used is inspired in equal parts by Edward Tufte and Malcolm Gladwell.

Deficit Sparkline

Tufte is a well-known expert in presentation of informational graphics. Among numerous concepts, he popularized the idea of sparklines: "data-intense, design-simple, word-sized graphics." Here is a sparkline showing deficit spending from 1983 to 2003:

What if there was a way to create data-intensive pictures that represent customer behaviors? They could draw on customer usage of a product, marketing touch-points, service calls, and any number of other relevant interactions. The goal: create a simple representation that quickly shows customer behaviors that matter to your business. Here are a few examples from our work:

Customer Sparkline Examples

These pictures are intriguing, but can they be useful? In his book Blink, Malcolm Gladwell introduces the idea of thin slicing: "the act of relegating the decision-making process to the adaptive unconscious by focusing on a small set of pertinent key variables, as opposed to consciously considering the situations as wholes over much longer periods of time." He explains how people become experts at quickly evaluating the relevant data and arrive at a rapid understanding of a situation.

We want to give business people a sense of their customers in a blink of an eye. To do so, customer sparklines need to be intuitive and easy to learn. Success is the ability to show these pictures to anyone in the organization—from senior executives to front-line customer service reps—and have them grasp what they are seeing with just a few minutes of explanation.

A Missing Link in Business Analytics (Part 1)

Insightful analysis of data is important whether you are in business or sports. However, the approaches used in these arenas couldn’t be more different.

Film Study

Take the NFL, for example. Coaches and players spend hours analyzing film to identify the strengths and weaknesses of opposing teams. As much as in any professional league, game planning can make the difference in setting up a team for success. For players, film study is the mental foundation that lets instinct take over on the field. Given the importance of this raw data analysis, it is worth tapping into my ESPN Sunday morning education (thank you, EA Sports NFL Matchup) to consider the techniques involved:

Get granular: Examine the raw data like where players are positioned, who gets the ball from different formations, what plays are called at different field positions, and even what techniques are used by individual players.

Use your eyes: Rely on your brain’s powerful ability to recognize patterns (or "tendencies" in football-speak). Record these patterns by player, by formation, by down and distance.

Create a common context: Given the volume of data, it is important to focus on the differences while holding everything else constant. For the Tampa Bay Buccaneers (and presumably most NFL teams): "every game and practice session - every step, block, throw, kick, zig and zag - is captured on film from two bird’s-eye views: sideline and end zone. The tapes then are intercut so each play can be seen from both angles."

Group common patterns; highlight anomalies: The building blocks of analysis become the common patterns. How frequently does the team behave in a certain way? Does this occur more often in the red zone? Also, after watching enough film, unusual actions on the field pop out quickly.

Bottom-up strategy: Finally, use this deep understanding of the opposing team, its players, and its patterns as the foundation for the game plan.

This type of approach is a far cry from the types of drill-down analyses most common in businesses. Imagine if an NFL team depended on business-type statistics and OLAP reporting. How valuable would it be to look at:

  • Average yards per carry,
  • Trends in passing versus rushing yards,
  • Distribution of touches by player,
  • Individual player statistics?

This type of "slicing and dicing" of the statistics can show the size of the problem (how good is the opponent?) and trends (what are their preferred offensive weapons?), but it wouldn’t get a coach much closer to figuring out what to do about it.

And that typifies the stuck state of analytics in business today. Business intelligence tools are good at reporting and showing trends. These same tools are not good at understanding customer behaviors or complex processes--the types of understanding that provide a solid foundation for marketing, operations, and even corporate strategy. The tools don’t help with the one question that matters whether you are in business or in sports: what should I do based on the evidence?

In Part 2, we’d like to share one idea of how to get deeper into the data so you can get smarter: an approach we call "customer sparklines."

Sparklines in Excel: Simplicity Itself

Sometimes the best ideas are the simplest ones. Sparklines are little, word-like graphics. A sparkline can show a single time series or the occurrence of events. The idea is that you can pick up the gist of the data in the flick of an eye. This lets you say things like:

The New Jersey Nets have been streaky all year while the Boston Celtics have been the picture of consistency--consistent mediocrity.

A note on interpretation: green upward whiskers are wins, red down whiskers are losses. So how can ordinary business folks make these things? Until now, sparklines have been the domain of programmers and graphic artists.

Thankfully, Bissantz, a German company, had an elegant idea. They created a set of special sparkline fonts and an easy-to-use tool that you can use to build sparklines in Excel using their fonts. The tool looks like this.

Sample sparklines look like this:

It works in Excel and it really is fun and easy.
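If you are not in Excel, the font trick has a rough text-only cousin. The Python sketch below maps a time series onto the eight Unicode block characters to get a word-sized graphic. It has nothing to do with the Bissantz tool: the function name and the sample series are hypothetical, and scaling from the series minimum to its maximum is our own assumption.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # Unicode block elements, shortest to tallest


def sparkline(values):
    """Render a sequence of numbers as a one-line, word-sized text graphic.

    Heights run from the series minimum to its maximum, so the line shows
    shape rather than absolute magnitude.
    """
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BLOCKS[0] * len(values)
    top_index = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - lo) * top_index / (hi - lo))] for v in values)


# Hypothetical monthly figures, for illustration only.
series = [3, 4, 6, 5, 8, 12, 11, 9, 14, 15, 13, 17]
print("sales", sparkline(series), "this year")
```

Because the result is just characters, it can sit in the middle of a sentence, which is the whole point of a sparkline.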

If you want to learn more about sparklines and see some beautiful examples, the canonical page for sparkline theory and discussion is here. Edward Tufte provides a chapter on sparklines from his newest book followed by a back-and-forth discussion with practitioners in the field. Lots of great examples!

Restoring romance to the sports page

Why do our sports pages look like this?

Instead of this?

Eastern Conference

  • Atlantic: Nets, 76ers, Celtics, Raptors, Knicks
  • Central: Pistons, Cavaliers, Bucks, Pacers, Bulls
  • Southeast: Heat, Wizards, Magic, Hawks, Bobcats

Western Conference

  • Pacific: Suns, Clippers, Lakers, Warriors, Kings
  • Southwest: Spurs, Mavericks, Grizzlies, Hornets, Rockets
  • Northwest: Nuggets, Timberwolves, Jazz, SuperSonics, Trail Blazers

Those green and red lines are "sparklines"--a term invented, I believe, by Edward Tufte. They are little, word-size graphics that show a trend more quickly and clearly than one could describe it. In this case, each sparkline shows an NBA team’s record throughout the season; a green up bar is a win, and a red down bar is a loss.

In less space than a standard standings listing, we see the sustained excellence of the Pistons, the steadiness of the Spurs and Mavericks, the Raptors recovering from their awful start, the wheels falling off the Pacers, the mystery that is the Nets. These large multiples of small graphics recover some of the romance and drama that is a season.

For a really beautiful example of sparklines applied to sports, look to Tufte’s professional example here. If you know Python, Grig Gheorghiu has written a simple tool for generating sparklines.
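To show how little code the win/loss idea needs, here is a minimal Python sketch of our own. It is not Gheorghiu's tool and uses none of its code: the function names and the sample record are hypothetical, and since plain text has no green or red, an upper half block stands in for a win and a lower half block for a loss.

```python
WIN, LOSS = "▀", "▄"  # upper half block = win, lower half block = loss


def winloss_sparkline(results):
    """Turn a string of 'W' and 'L' results into up/down whiskers.

    Any other character raises KeyError.
    """
    marks = {"W": WIN, "L": LOSS}
    return "".join(marks[r] for r in results.upper())


def longest_streak(results):
    """Length of the longest run of identical results, wins or losses."""
    best = current = 0
    previous = None
    for r in results.upper():
        current = current + 1 if r == previous else 1
        best = max(best, current)
        previous = r
    return best


# Hypothetical record, for illustration only.
record = "WWLWWWWLLLWLWW"
print("Team", winloss_sparkline(record), f"(longest streak: {longest_streak(record)})")
```

Stack one such line per team and the streaks and slumps become visible without reading a single number.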