
Quick! What are three things wrong with this graph?

Troubled graph

Easy, right? First, nobody can eat that much broccoli. Second, the graduated shading makes it hard to see where the bars end. Third, what’s going on with those Brussels sprouts? Zero pounds of sprouts were consumed, but the bar shows a value.

The Brussels sprouts badness is based on Microsoft’s implementation of databars in the upcoming version of Excel. To quote the Excel 2007 blog:

The answer is that when we were doing usability testing of this area in Excel, we found that users preferred not to see blank data bars, so Excel’s default was set to a 10% minimum width.

Here’s a picture showing Excel 12 databars.

Troubled databars

The databar for 170 shows that the minimum databar size is about 10% of the width of the cell. This makes the 170 value look like it’s about a quarter of 170,000 rather than 1/1000th.
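To put numbers on that distortion, here is a small Python sketch. It assumes the bar grows linearly from the minimum width up to the full cell width. That scaling rule and the `bar_fraction` helper are illustrative assumptions, not Excel’s documented behavior. The values 170 and 170,000 come from the picture above.

```python
def bar_fraction(value, max_value, min_fraction=0.0):
    """Return the fraction of the cell width a data bar fills.

    Hypothetical model: the bar grows linearly from min_fraction
    (for a value of zero) to the full cell width (for max_value).
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    share = min(max(value / max_value, 0.0), 1.0)  # clamp to [0, 1]
    return min_fraction + (1.0 - min_fraction) * share


honest = bar_fraction(170, 170_000)          # strictly proportional bar
padded = bar_fraction(170, 170_000, 0.10)    # bar with a 10% minimum width

print(f"proportional: {honest:.2%} of the cell")
print(f"10% minimum:  {padded:.2%} of the cell")
print(f"drawn about {padded / honest:.0f} times too long")
```

Under this model the 170 bar fills roughly 10% of the cell instead of 0.1%, so it is drawn about a hundred times longer than the data supports.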

Misrepresenting data by default is like shipping Excel with broken statistical functions: it’s something that should never have been considered. Hopefully these examples help illustrate just how misconceived this idea is. It’s discouraging to hear that this is justified by user preferences. Gotta get some principles, guys.

On the other hand, the Excel 2007 team is doing a great job explaining the new features in Excel 2007 on their blog. If you use Excel a lot and expect to move to Excel 2007 (one great reason to do so is that the new Excel breaks the 65,536-row barrier), then you need to check out their blog. The Excel 2007 team is also reading and reacting to feedback, which is great. They’ve got plenty of time to fix this databar problem before release. In the meantime, the rest of us can use in-cell graphing to do everything databars can do and more.

While on the topic of Microsoft blogs, if you’re a typography geek (Jules/Jon, I’m talking to you), check out fontblog from the Microsoft typography team, which has done excellent work (ClearType, web fonts, Consolas). It’s interesting, quirky and honest: a good read.

Finally, here’s a picture of the graph with all three problems fixed.

Fixed graph

  • Andrew

    There’s one last problem with that graph – something is up with the leeks; somehow or another 250 lbs of leeks seem to weigh more than 300 lbs of spinach. Just a nit-picky sort of thing.

  • http://www.emergentchaos.com Adam

    Andrew,

    That’s because they’re assuming that spinach includes some substantial weight of E. coli, which shouldn’t be shown in the graph.

  • Chris

    Thanks, Andrew. I fixed the leek–or rather, the spinach.

  • Maxwell

    Great article. There’s one more problem with the chart: the order of the vegetables. Ideally, they should be ordered by the amount consumed–or at least in alphabetical order–to make the chart easier to read…

  • Chris

    Maxwell,

    That’s what Zach said, too. But I insist they’re ordered by beta carotene content per unit volume.

  • http://www.dbmforum.nl Robin

    My experience is that, in general, companies do not like to see line graphs representing their profits going downwards. I would therefore like to advise the Excel 2007 team that line graphs should always have an upward trend, even when the numbers do not support this trend. This can easily be done by reversing the scale in the case of a declining trend.

    Brilliant move to let users’ preferences lead. Why use common sense?

  • mikael

    It is also poorly labeled (or perhaps, just out of context). What is the period measured? What is the unit of consumer? It could be total weight of veggies consumed in the state of Texas during the month of July, or the average dinner plate in Boulder, Colorado. Without some additional description it is impossible to know if one thousand pounds of broccoli is unreasonable (Hey, man, I just really like broccoli!).

    I am astonished that the Excel team would even consider exaggerating the scale of small numbers. I am also astonished that some users would actually prefer misrepresented data.

  • Henk

    Well, showing a zero value isn’t that strange. The dot and line we use in mathematics have no thickness, and still we show them. But of course in data representation they distort the message, i.e. the feel for relative values.
    Having said this, it might be good to suggest to Redmond that they implement a thin line in a different color to show it is a DELIBERATE zero value, and not simply a forgotten value. It could also be used to distinguish between #N/A and a real zero.
    As I am now in obnoxious mode: Chris, this beta carotene you want to show can be in the use of colours for the bars (hues of yellow/orange for different amounts of beta carotene), so another parameter in the chart. It can be defined in Excel as conditional formatting, so the colour automatically adjusts to the amount of beta carotene inserted in the parameter cell (Hint: how to do this can be a great next blog!). So no need to insist – you can have it both ways. Excel is awesome – don’t we all agree?

  • http://maurus.net/weblog/2006/09/29/the-new-microsoft-excel-misrepresents-data-by-default/ Jonas Maurus’ maurus.net » The new Microsoft Excel misrepresents data by default

    [...] The default graphs were hideous at best, but now, thanks to a focus-group-tested and user-centric decision probably made by marketing drones without a brain: Microsoft Excel deliberately misrepresents data, because it turns out, users didn’t like empty cells in bar-graphs… idiots :-) . Update: to change this non-sensical idiotic default: select the range, open VBE and type “activecell.formatconditions(1).percentmin = 1” in the immediate window Permalink [...]

  • Don Parish

    FYI, the Excel team blog (http://blogs.msdn.com/excel/) is seeking guidance on fixing some of the issues you raised here at: http://blogs.msdn.com/excel/archive/2007/10/01/data-bars-feedback-please.aspx

  • http://juiceanalytics.com Chris Gemignani

    Thanks, Don!