On misrepresenting data

Quick! What are three things wrong with this graph?

Troubled graph

Easy, right? First, nobody can eat that much broccoli. Second, the graduated shading makes it hard to see where the bars end. Third, what’s going on with those Brussels sprouts? Zero pounds of sprouts were consumed, but the bar shows a value.

The Brussels sprouts badness is based on Microsoft’s implementation of databars in the upcoming version of Excel. To quote the Excel 2007 blog:

The answer is that when we were doing usability testing of this area in Excel, we found that users preferred not to see blank data bars, so Excel’s default was set to a 10% minimum width.

Here’s a picture showing Excel 12 databars.

Troubled databars

The databar for 170 shows that the minimum databar size is about 10% of the width of the cell. This makes the 170 value look like it’s about a quarter of 170,000 rather than 1/1000th.
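To make the distortion concrete, here is a rough sketch of what a minimum-width floor does to bar lengths. It is a simplified model, not Excel’s actual sizing algorithm: the function name `bar_fraction` and the rule "take the larger of the proportional width and the floor" are my assumptions for illustration. Only the 10% default and the 170 and 170,000 values come from the example above.

```python
def bar_fraction(value, max_value, min_fraction=0.0):
    """Fraction of the cell width a bar occupies (hypothetical model).

    min_fraction is an assumed floor: the bar is never drawn narrower
    than this share of the cell, even when the value is zero.
    """
    if max_value <= 0:
        return min_fraction
    proportional = value / max_value
    return max(min_fraction, proportional)


values = [0, 170, 170_000]
largest = max(values)

for v in values:
    honest = bar_fraction(v, largest)
    floored = bar_fraction(v, largest, min_fraction=0.10)
    print(f"{v:>7}: proportional {honest:.3f}, with 10% floor {floored:.3f}")
```

Under this model the bar for 170 is drawn a hundred times wider than its proportional width, and a value of zero still gets a visible bar, which is exactly the sprouts problem.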

Misrepresenting data by default is like shipping Excel with broken statistical functions: it’s something that should never have been considered. Hopefully these examples help illustrate just how misconceived this idea is. It’s discouraging to hear that this is justified by user preferences. Gotta get some principles, guys.

On the other hand, the Excel 2007 team is doing a great job explaining the new features in Excel 2007 on their blog. If you use Excel a lot and expect to move to Excel 2007 (one great reason to do so is that the new Excel breaks the 65,536-row barrier), then you need to check out their blog. The Excel 2007 team is also reading and reacting to feedback, which is great. They’ve got plenty of time to fix this databar problem before release. In the meantime, the rest of us can use in-cell graphing to do everything databars can do and more.
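In-cell graphing isn’t spelled out here, so take this as a sketch under an assumption: that it means building a bar out of a repeated character, the way a worksheet formula using Excel’s REPT function does. The helper name `text_bar`, the `|` character, and the 20-character width are all made up for illustration.

```python
def text_bar(value, max_value, width=20, char="|"):
    """Build a text bar whose length is proportional to value.

    Hypothetical helper: width is the number of characters used for
    the largest value. A zero value yields an empty string, so nothing
    is drawn when nothing was measured.
    """
    if max_value <= 0 or value <= 0:
        return ""
    length = round(value / max_value * width)
    return char * length


# Made-up sample data, in pounds.
pounds = {"Broccoli": 100, "Carrots": 50, "Brussels sprouts": 0}
largest = max(pounds.values())

for name, amount in pounds.items():
    print(f"{name:<17}{amount:>4} {text_bar(amount, largest)}")
```

The point of the design is that there is no floor: a zero gets no bar at all, and very small values round down to nothing rather than being inflated.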

While on the topic of Microsoft blogs, if you’re a typography geek (Jules/Jon, I’m talking to you), check out fontblog from the Microsoft typography team, which has done excellent work (ClearType, web fonts, Consolas). It’s interesting, quirky and honest, and a good read.

Finally, here’s a picture of the graph with all three problems fixed.

Fixed graph