Survey Results: Are the Viz-Pundits Really Helping?

A few weeks ago Juice asked our readers to give us a few insights into whether or not we and other info-viz sites are actually helping them and their organizations be more effective at communicating information.

Well, the time has come to take a look at the results (oooh - pins and needles). The survey was way more popular than we expected, receiving well over 500 responses.

We had a few questions of the form "select the answer that best describes you" but, for the most part, we focused on text-based answers so that we could avoid directing the answers and could demonstrate some non-traditional visualization styles for exploring results. As a side note, the open-ended answers to the text-based questions were truly intriguing to read - hopefully the presentation of the results below will give you a small insight into what we learned.

So, here are the results.


Survey Results

The first section of questions dealt with getting some context about our readers. Since the questions were multiple choice, we're showing the results in traditional bar chart format.

Question 1

In terms of size, which of the following is your company most like?

  • A one man band
  • The Dirty Dozen
  • The University of Rhode Island
  • Microsoft

Q1: Company Size

Question 2

In terms of information presentation expertise, who do you see yourself as?

  • The Excel Chart Wizard incarnate (I'm happy with the quickest route)
  • Harold and the Purple Crayon (I'm pretty good, but not too finicky)
  • A Tufte clone (every chart is carefully and lovingly crafted with intention)

Q2: Expertise

Question 3

If your company were stuck on Gilligan's Island, would you be able to use information presentation to get rescued?

  • No, Gilligan keeps using our Tufte books to prop up the break room table.
  • Maybe. The Skipper rigged up this island beacon system using coconuts, vines, and tiki torches.
  • You betcha! The Professor could build a huge island sized information display that could be seen, understood, and acted upon by the astronauts on the International Space Station.

Q3: Escape from Gilligan's Island

Question 4

What two information sources do you most frequently use for information presentation tips, trends, and best practices?

  • BI Vendor's website (e.g., Business Objects, Tableau, Cognos, etc.)
  • The Dashboard Spy
  • Dashboards by Example
  • FlowingData
  • Infographic News
  • Information Aesthetics
  • Jorge Camoes' Charts
  • Juice Analytics
  • Junk Charts
  • Tufte's web Site
  • Visual Business Intelligence (Stephen Few's site)
  • VizThink
  • Other

Q4: Popular Sites

However, what we really want to know is which sites are most closely related. So we tried looking at them with a phrase net from Many Eyes:

Q4: Phrasenet

( You can experiment with it yourself here. )

This is a great way to demonstrate how sites are "connected". We see a very strong relationship between Juice and the other non-Juice sites, but not a strong relationship between the non-Juice sites, themselves. In retrospect, the question would have been more effective had we asked respondents for their "top three or four" sites (approximately: total number of options ÷ 3).
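For readers who want to try the "connectedness" idea on their own survey data, here is a minimal sketch of the underlying count: how often two sites were picked by the same respondent. It is not how Many Eyes builds a phrase net; the function name `site_pair_counts` and the sample responses are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def site_pair_counts(responses):
    """Count how often each pair of sites was chosen by the same respondent."""
    pairs = Counter()
    for chosen in responses:
        # Sort so ("A", "B") and ("B", "A") are counted as the same pair.
        for a, b in combinations(sorted(set(chosen)), 2):
            pairs[(a, b)] += 1
    return pairs


# Made-up responses for illustration only; this is NOT the survey data.
sample = [
    ["Juice Analytics", "FlowingData"],
    ["Juice Analytics", "Junk Charts"],
    ["FlowingData", "Juice Analytics"],
]

counts = site_pair_counts(sample)
for pair, n in counts.most_common():
    print(pair, n)
```

With only two picks per respondent, every answer contributes exactly one pair, which is part of why asking for three or four sites would probably have given a richer picture.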


The next group of questions was crafted to help us understand the problems our users and their organizations encounter when presenting information to stakeholders and users. For most of these questions we broke the number one rule in surveys: stay away from text-based answers.

Question 5

Using one word for each, list three things that you most frequently find useful from these sources?

Q5: Tag Cloud

( You can experiment with it yourself here. )

This was one of the most useful result sets and clearly shows that people like examples and new ideas for visualizations, followed by tips on how to get it done. (I'm hoping this post meets all of those criteria to some level.)
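Since people asked for tips on how to get it done: a tag cloud is, at heart, a word frequency count with the font size tied to the count. Here is a rough sketch of the counting step, assuming the answers arrive as plain strings. The helper name `word_counts` and the sample answers are hypothetical, not the survey data.

```python
import re
from collections import Counter


def word_counts(answers):
    """Lower-case every answer, split it into words, and tally the words."""
    words = []
    for answer in answers:
        # Keep letters and apostrophes; drop punctuation and digits.
        words.extend(re.findall(r"[a-z']+", answer.lower()))
    return Counter(words)


# Made-up answers for illustration only.
sample_answers = [
    "Examples, ideas, tips",
    "examples tips tools",
    "Ideas examples inspiration",
]

counts = word_counts(sample_answers)
for word, n in counts.most_common(3):
    print(word, n)
```

The resulting counts could be pasted into a tag cloud or Wordle-style tool to do the actual drawing.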

Question 6

Within your organization, would you say the understanding of information visualization best practices is:

  • Staying the same
  • Improving

Q6: Improving?

Question 7

What one word describes the biggest barrier to improved information presentation at your company?

I selected a Wordle (as opposed to a tag cloud) for questions 7 and 8 because I wanted to see the results in a way that would give me the general feeling of the barriers and benefits - I wanted the answers to spur some sort of emotive response. I think a Wordle does this better than a tag cloud.

Q7: Barriers

( You can experiment with it yourself here. )

Question 8

What one word describes the biggest boon to improved information presentation at your company?

Q8: Benefits

( You can experiment with it yourself here. )

While the "barriers" answers were interesting, there are some real nuggets hidden in these "benefits" results.

Question 9

Finish this sentence: "My company would be oh so much better at information presentation if we just had..."

What we really want to know is what patterns and relationships exist between the words. Having said that, the most common words are still interesting to see:

Q9: What would be better?

( You can experiment with it yourself here. )

But we are really interested in the word patterns. So, we used Concentrate, the Juice search patterns tool, to identify them. The top patterns were:

Pattern              Count
more X               76
more time X          30
better X             29
X data               15
X time               15
more time to X       14
time X               12
a better X           11
X data.              9
X more time          9
people X             8
more people X        7
more resources X     6
the right X          6
more people who X    5
people who X         5
time to X            5
more time and X      4
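To give a feel for what a pattern like "more X" means, here is a simplified sketch that counts fixed word sequences with a single open slot before or after them. This is an assumption about the general idea, not Concentrate's actual algorithm, and the function names and sample answers are invented for illustration.

```python
from collections import Counter


def slot_patterns(answer, max_fixed=3):
    """Return the set of patterns such as 'more X' or 'X data' in one answer."""
    words = answer.lower().split()
    found = set()
    for n in range(1, max_fixed + 1):
        for i in range(len(words) - n + 1):
            fixed = " ".join(words[i:i + n])
            if i + n < len(words):  # some word follows the fixed part
                found.add(fixed + " X")
            if i > 0:  # some word precedes the fixed part
                found.add("X " + fixed)
    return found


def count_patterns(answers):
    """Count in how many answers each pattern appears."""
    counts = Counter()
    for answer in answers:
        counts.update(slot_patterns(answer))
    return counts


# Made-up answers for illustration only.
sample = ["more time to think", "more time", "more people who care", "better data"]

counts = count_patterns(sample)
for pattern, n in counts.most_common(5):
    print(pattern, n)
```

Each pattern is counted once per answer, so a long answer that repeats itself does not inflate the totals.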

Now, if we look at how the "non-common" words relate visually, here's what we get:

Q9: Phrasenet

( You can experiment with it yourself here. )

Question 10

Finish this sentence: "If I were to advise someone on how to best improve your capability to create really useful information presentation solutions, I'd say don't forget..."

Again, it's interesting to see the most commonly used words:

Q10: How to improve

( You can experiment with it yourself here. )

But the most value again comes from looking at the phrase net:

Q10: Phrasenet

( You can experiment with it yourself here. )

Question 11

Finally, we're going to post results on our blog for free download. However, if you want us to notify you when the report is ready, please provide your email address below. (And because we have a large international following, please add your country as well, if you don't mind. Why? 'cuz we're just curious. Thanks!)

So, we're going to show only the countries here, no email addresses (whew!). Let's start with looking at the standard distribution:

Q11: Respondent Countries

And here's the geographic representation from Many Eyes:

Q11: Many Eyes Map

( You can experiment with it yourself here. )

But, having looked at that, I thought it might be a little more interesting to look at the country locations like this (text size based on number of participants):

Q11: Country Cloud


Additional Insights

And that was all of the questions in the survey. However, I thought some of the multiple choice "context" questions required just a bit more analysis; I still had some questions that weren't yet answered. So, I loaded the data into Tableau Public to get a little more analysis flexibility. Here is the dashboard I created to better understand expertise:

What this shows is that organizations that are more capable of responding to tough information presentation challenges have a substantially higher ratio of "Tufte Clones".

And this made me wonder how the skill base might be impacting companies of different sizes:

A pretty nice linear correlation between company size and improvement trends, don't you think?
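If you don't have Tableau handy, the comparison behind those dashboards is a simple cross-tabulation: for each answer to one question, what share of respondents gave each answer to another? Below is a minimal sketch under that assumption. The field names (`rescue`, `expertise`) and the rows are hypothetical, not the survey export.

```python
from collections import Counter, defaultdict


def share_by_group(rows, group_key, value_key):
    """For each group, return the share of rows holding each value."""
    tallies = defaultdict(Counter)
    for row in rows:
        tallies[row[group_key]][row[value_key]] += 1
    shares = {}
    for group, counter in tallies.items():
        total = sum(counter.values())
        shares[group] = {value: n / total for value, n in counter.items()}
    return shares


# Made-up rows for illustration only.
rows = [
    {"rescue": "You betcha", "expertise": "Tufte clone"},
    {"rescue": "You betcha", "expertise": "Purple Crayon"},
    {"rescue": "No", "expertise": "Chart Wizard"},
    {"rescue": "No", "expertise": "Chart Wizard"},
    {"rescue": "No", "expertise": "Tufte clone"},
    {"rescue": "No", "expertise": "Purple Crayon"},
]

shares = share_by_group(rows, "rescue", "expertise")
for group, breakdown in shares.items():
    print(group, breakdown)
```

Swapping `group_key` for a company size field would give the breakdown by company size in the same way.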


You made it to the end!

This post turned out to be much longer than I wanted it to be, but hopefully you found it interesting and learned a few things about your fellow readers and how to display different kinds of survey responses. If you have other insights you think you see, please comment below! Thanks for participating!

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

1 comment


March 3, 2010
derek said:

Have you considered one spot matrix as an alternative to the two stacked barcharts in "where are the experts?" and "what is the expertise blend in companies?"




Chart Selection, Art and Science

Choosing the right chart for data presentation isn't easy -- even if you do it for a living. For those with less practice, it may resemble the flash of confusion I experience when my wife asks "Which of these outfits looks best on me?"

"...uhhhhhhh, both?"

And like that answer, there isn't any safety in sitting on the fence.

Wouldn't it be nice if there was a formula for choosing the right chart? The fact that there isn't suggests it is a mix of art and science. There are plenty of examples of people who have taken a crack at this problem:

  • Andrew Abela created a diagram that categorizes chart types.
  • In Stephen Few's book Show Me the Numbers, Chapter 5 provides an overview of graph fundamentals. Bonus: I received the following Graph Selection Matrix (PDF) from Steve.
  • In Stephen Kosslyn's book Graph Design for the Eye and Mind, Chapter 2 is entitled "Choosing a Graph Format"
  • Sanket Nadhani shared this short tutorial which tackles the basic choices.
  • From NC State, a flow diagram for chart selection
  • An Oracle-financed white paper entitled: "Selecting the Best Graph Based on Data, Tasks, and User Roles" (PDF)
  • BonaVista Systems has an Excel add-in for choosing the right chart.

(If you know of any others, put them in the comments and I'll add to this list.)

While these are all great resources, I thought it could be instructive to walk through a sample chart selection process, starting simple then gradually adding more complex requirements. The focus of this post is on 'wireframing' the correct presentation techniques; in a follow-up we'll replicate these same charts noting best practices with refined aesthetics and layout.

I typically ask four questions in choosing how to present data:

1. What data is important to show? Specifically, which dimensions and metrics need to be shown at the same time?

2. What do I want to emphasize in the data? For example, do I want to compare different values, show relationships, or present changes over time? What story am I trying to tell?

3. What options do I have for displaying this data? Your Excel chart menu is a start, but don't forget options such as tables, sparklines, small multiples, and advanced visualizations like treemaps. Many Eyes' list of visualizations can spark additional ideas.

4. Which option is most effective at communicating the data? Which chart or visualization emphasizes what's important in the most direct and readable way?


Imagine a sales organization where two metrics matter most: activity (as measured by call volume) and sales (as measured by dollars sold). The simplest place to start with this data is to present aggregate performance for those two measures. Even with this most basic situation, you have a few options:

Step 1, All Options

Conclusion: Data doesn't always need visualizing. The common and dreadful example of this mistake is when people use a speedometer-style gauge to show a single number (option 3). It is a lot of work, pixels, and distraction for no user value. In this example, we have just a single data point for each measure and no comparisons (e.g. to goals, to last year's performance, the values against each other), so it's best to keep things clean with option 1.


Next, let's look at options for showing activity and sales data by product. In this case, the emphasis should be on the relative performance of each product.

Step 2, Option 1 Step 2, Option 2

Conclusion: Option 1 is the winner. We prefer a vertical layout of labels (bar chart) to a horizontal (i.e. column chart - not shown) because the labels are more readable and the horizontal layout can suggest a time element in the graph. As has been thoroughly documented, a pie chart doesn't allow you to see differences in values as effectively as a bar chart.


What if we wanted to understand these two metrics by time?

Time needs to be displayed horizontally. We've seen ambitious examples from Trend.ly and Axiis that attempt to break this mold, but they more often confuse than enlighten.

Step 3, Option 1 Step 3, Option 2 Step 3, Option 3

Conclusion: I've backed away from using dual axis charts after experiencing too many situations where people are confused by which line goes with which axis, no matter how clearly labeled. Because the emphasis for the data needs to be the trend over time, I would recommend option 2 over option 3's sparklines.


Now it gets interesting: What if we wanted to understand these two metrics by product and by time?

Step 4, Option 1 Step 4, Option 2 Step 4, Option 3

Conclusion: The best option for this case depends on the importance of clearly communicating the detailed trend for each product. In most cases, the "essence" of the trend is good enough, i.e. Is the trend up? Down? Erratic? Smooth? Under that assumption, option 3 provides a nice comparison of the relative product performance and trend.


A few final observations:

  • Labeling matters. How labels are laid out in a chart can make a big difference in readability. It is almost always better if the label text can be written horizontally and be closely tied to the value (rather than in a disconnected legend).
  • Multiple areas of emphasis. There will be compromises when you need to emphasize two things simultaneously (trend, relative values). Pick which one matters most.
  • Know your options. The more types of charts you know of and understand how to apply, the better the set of options you'll be able to come up with.

11 comments (only the last 5 are shown)


February 17, 2010
syntaxfree said:

In the "by product" comparison, side-by-side bars like that are classic optical illusion fodder. As for pie charts, the debate is still up; there are people doing experimental psychology regarding them. A table with #sales and $revenue/#sales ratio would be best. Before you choose your chart, you have to choose your metrics, methinks.

For the "by product and time", I'd consider a lower frequency (monthly instead of daily) and use candlesticks. Candlestick charts are a mainstay of the financial world and there's three centuries of accumulated wisdom on how to trend-spot by eye.


February 18, 2010
James said:

Zach's response to Hadley interests me. Is there a difference between analysis and reporting? And if so, wouldn't the term analytics refer to analysis?


February 18, 2010
Zach said:

James, I'm glad you asked. My view is that analytics covers the full spectrum of reporting to analysis, where those terms are explained as follows: Reporting is used to track and evaluate the performance of an understood process. Analysis helps develop an understanding of new processes, erratic and shifting behaviors. See this blog post: http://www.juiceanalytics.com/writing/business-intelligence-isnt-a-technical-problem/ In large part, the difference is based on flexibility and repeatability. Analysis is about being able to rapidly iterate on views of data to explore and find answers (what Hadley was looking for). For reporting, it is more critical to find and stick to particular views of the data.


February 22, 2010
Jen said:

Sorry for a silly comment on another seriously great post, but ... someone seems like an Archie McPhee fan! ;D


February 24, 2010
Dr House said:

for the dual axis option using a bar graph and a line graph for the secondary axis works well instead of using two line graphs. It's still not effortless to figure out which graph is for which axis but it's much easier to see a trend in the relationship between the two metrics.





S. Few Renounces Dual-Axis Graphs; Juice Ups Ante

After deep introspection, Stephen Few has determined that graphs with dual-scaled axes are fundamentally flawed. Rather than risk the potential for confusion, he believes that there are superior graphing approaches for situations where related data series have different units or magnitudes. His measured and thorough analysis concludes:

“It is inappropriate to use more than one quantitative scale on a single axis, because, to some degree, this encourages people to compare magnitudes of values between them, but this is meaningless.”

I commend Stephen for the courage to start down this path, but he hasn’t gone far enough. Here at Juice, we must often take controversial positions. You may remember that we were among the first to criticize Microsoft’s “databars”, the first to take on the powerful Dashboard Gauge lobby, and the first to challenge the applicability of Tom Davenport’s “Competing on Analytics” sales machine.

While it is true that the second axis can be deceptive, let’s not let the first axis off without asking some tough questions. It is the confusion—nay, the collusion—of the two that causes trouble—who is to say which is the bad seed? We must ask ourselves, do not axes belong in the “Axis of Evil”?

The problem is broader than Stephen suggests: axes are just the tip of the iceberg when it comes to graphic bling that can distract or confuse readers:

Take data labels, for example. They encourage users to consider specific values rather than focusing on relative sizes or placement of graph lines or bars.

Legends draw the reader’s eye away from the central storyline of a graphic.

Gridlines… please don’t waste my time with these flat faux-series. One wouldn’t put pinstriping on a Ferrari.

Place your graph in proper context and titles become redundant.

Minimalism is in. Extraneous graph decoration is out. Look no further than Tufte’s sparkline: no excessive graph decoration there.

sparkline

The world cries out for a new charting aesthetic. One that champions elegance and casts down gaudiness. Let us evoke the pure visual essence of the data. Let us find a pure form to evoke the emotion and hidden meaning of the data. Now is the time for Naked graphs—stripped to the essentials (TM).

Our argument is simple: the visualization of information is the message. The data is but an intermediary form of that visualization. Therefore, any residue from the raw data should be scrubbed from your final graph. Only when you achieve this unadulterated state will the meaning of the graphic burn its way into your consciousness.

Here’s an example of an analysis that casts light on the relationship of the Fed to hedge funds while simultaneously answering your question about what happened with last month’s sales in the Newark division.

naked analysis

Truly here we see the words of Mark 9:43 made real:

If your hand causes you to stumble, cut it off; it is better for you to enter life crippled, than, having your two hands, to go into hell, into the unquenchable fire.

Gaze in awe, viewers, and find wisdom on this very foolish day.

8 comments (only the last 5 are shown)


April 12, 2008
Jeff said:

A S. Few article reference, the ever gratuitous Tufte mention AND a verse from the bible - all in one article, talk about data density...


April 12, 2008
dave said:

I generally agree with your philosophy of minimizing "chart junk" but I think you may be going to the extreme here.

Most consumers of chart information are not analytics professionals and need help to interpret.

- Data labels: my users demand them. They want to know the value. I don’t think they have a negative impact on “… than focusing on relative sizes or placement of graph lines or bars”

- Legends: are you kidding? Really?

- Gridlines: absolutely necessary for bar charts. I won’t speak for you, but gridlines help my brain orient the chart information.

- Titles: Chart titles and axis labels are necessary and not at all distracting.

Is there not a comfy middle ground here?

| ^
| ^
| ^
| ^
| ^
| ^
| ^^
| ^ ^
| ^
_____________________________________

Dave


April 12, 2008
derek said:

Dave, check the date that article was posted :-)


April 13, 2008
dave said:

I'm a moron...

thanks derek!


April 18, 2008
tao said:

The root of the problem with visualization is that you are using an organ that is simply not meant to understand so much. The only solution to fully understanding data is to not visualize at all but use a direct neural implant into the brain that allows you to quickly grasp all aspects of the data. All this visualization and introspection about visualization is just trying to improve the horse buggy when the automobiles are coming.





The Colbert Bump is Real, Colbert’s Nation Not What He Thinks it is

Stephen Colbert has mentioned that he’s having trouble getting guests during the writers’ strike. We find this puzzling, given the supposed benefits of the Colbert Bump. Does being on the Colbert Report really provide a bump, a critical leap that vaults a writer or a politician to superstardom?

We know that Colbert isn’t a big fan of “facts,” and only needs his gut to tell him the Colbert Bump is real. At Juice, we let the data decide what’s real or not, so our apologies to Stephen for not taking his word for it. Intrigued, Juice Analytics set out to find out the truth. We gathered data about Amazon sales rank for 20 authors that appeared on his show in recent months. How did those ranks change in the days immediately before and after the authors’ appearance on the show?

Amazon Sales Rank of Colbert Guests

Hmmm, there might be something there but those sales ranks don’t tell us much. Fortunately for Stephen, some “eggheads” have worked out roughly how Amazon sales rank corresponds to actual book sales. We calculated the sales, and normalized the data so that the week prior to appearing on the Colbert Report was equal to 1.0. Here’s a picture.

Projected Sales of Colbert Guests

That looks like a bump, Conan. In fact, being on the Colbert Report increases sales by 10 times on average. That bump doesn't last forever, but, let's face it, what does?
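For the curious, the normalization step is easy to reproduce. Here is a minimal sketch, assuming you already have a list of estimated daily sales for one author (the rank-to-sales conversion is not shown) and know which day the appearance happened. The function names, the seven-day baseline window, and the sample numbers are illustrative assumptions, not our actual script or data.

```python
from statistics import mean


def normalize_to_baseline(daily_sales, appearance_index, baseline_days=7):
    """Scale a sales series so the average of the days before the show is 1.0."""
    start = max(0, appearance_index - baseline_days)
    # Raises statistics.StatisticsError if there are no days before the show.
    baseline = mean(daily_sales[start:appearance_index])
    return [value / baseline for value in daily_sales]


def peak_bump(normalized, appearance_index):
    """Largest multiple of the baseline reached on or after the appearance."""
    return max(normalized[appearance_index:])


# Made-up estimated daily sales: a flat week, then the show on day 7.
sales = [10, 10, 10, 10, 10, 10, 10, 100, 60, 30]

normalized = normalize_to_baseline(sales, appearance_index=7)
print(normalized)
print(peak_bump(normalized, appearance_index=7))
```

Normalizing each author this way lets a best-seller and an obscure title sit on the same chart, since both start from 1.0.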

We also wanted to know: what kinds of books is Colbert’s audience going crazy for? After all, Colbert is well known as a rock-solid conservative. He’s tight with the Bush Administration. Even though he debates a few liberal (“pinko”) authors now and then, most of his guests are writers of pop-intellectual studies of the Gladwellian persuasion.

Here are the authors and how we categorized them:

Pinkos: Jessica Valenti, Full Frontal Feminism: A Young Woman’s Guide to Why Feminism Matters; Wesley K. Clark, A Time to Lead: For Duty, Honor and Country; Robert Shrum, No Excuses: Concessions of a Serial Campaigner

‘Publicans: Tom DeLay, No Retreat, No Surrender: One American’s Fight

Pop Essayists: Daniel Gilbert, Stumbling on Happiness; Daniel B. Smith, Muses, Madmen, and Prophets: Rethinking the History, Science, and Meaning of Auditory Hallucination; Michael Gershon, The Second Brain: A Groundbreaking New Understanding of Nervous Disorders of the Stomach and Intestine; John J. Mearsheimer, The Israel Lobby and U.S. Foreign Policy; Thomas L. Friedman, The World Is Flat: A Brief History of the Twenty-first Century; Frank J. Sulloway, Born to Rebel: Birth Order, Family Dynamics, and Creative Lives; Jared Diamond, Guns, Germs, and Steel: The Fates of Human Societies; Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable; Richard Preston, The Wild Trees: A Story of Passion and Daring; Malcolm Gladwell, Blink: The Power of Thinking Without Thinking; Bjorn Lomborg, Cool It: The Skeptical Environmentalist’s Guide to Global Warming; Andrew Keen, The Cult of the Amateur: How Today’s Internet is Killing Our Culture; Michael Wallis, The Lincoln Highway: Coast to Coast from Times Square to the Golden Gate

Popular: Stephen Colbert, I Am America (And So Can You!); John Grisham, Playing For Pizza: A Novel; Tina Brown, The Diana Chronicles

How much of a bump did each of these groups receive?

Colbert Bump by Category of Guests

It’s a shock! Liberals and high-minded eggheads do better than popular or conservative books. I’m not sure if Colbert knows this, but his audience isn’t who he thinks they are.

Here are all the authors and their normalized sales around the time of their appearance on the Colbert Report.

Valenti Clark Shrum DeLay Gilbert Smith Gershon Mearsheimer Friedman Sulloway Diamond Taleb Preston Gladwell Lomborg Keen Wallis Colbert Grisham Brown

This post was a collaborative effort of the entire Juice team. Pete Skomoroch concocted the idea, wrote copy, and found the study linking Amazon Sales Rank to actual sales. Zach data mined. David May whipped up elegant, instant visualizations. Sal Uryasev munged data.

28 comments (only the last 5 are shown)


April 28, 2008
mike said:

oops already mentioned, perhaps the suggestion of a control group would be best, comparing a media blitz without Colbert Report to those that appear on the show. it would be difficult to separate out the other factors though, like maybe someone that chooses to go on the CR is also more effective in their other promotions. possibly if there were enough data points, then other effects would be insignificant?? ;)
or maybe find someone that ONLY goes on the Colbert Report, a clean sample sort of :D


May 30, 2008
Aaron Deyfer said:

great article!
one question: how did you manage to get the historical sales rank data? Did you gather the data "manually" using AWS over time or do you use another service?


March 4, 2009
Pete Skomoroch said:

Aaron,

I described the data gathering process in a post at the Data Wrangling blog: http://www.datawrangling.com/the-colbert-bump-in-amazon-data I used a python script and http://www.titlez.com/welcome.aspx

-Pete


March 12, 2009
John said:

Seems very truthy


August 22, 2009
kw said:

colbert's audience are those who are liberal minded, and a lot of college/uni students... I believe it's quite similar to the left leaning crowd who watches jon stewart's "the daily show." colbert is very much aware of this, hence his choosing of guests that not only please the audience (even notice how often the studio audience cheers the guest) but allow his conservative pundit character (the conservative colbert is only a tv persona) to verbally spar with the guests. if the guests know to play along, the results are usually comical... if they aren't aware of colbert's character, sometimes the interviews just turn awkward.





Analytics Roundup: Tips for showing, sharing, communicating

Developer's Guide - Google Chart API - Google Code
Beautiful stuff, particularly the Venn diagram.

Align Journal - BI Worst Practices
We often see articles on BI "Best Practices"; here is an article telling us what NOT to do.

flot - Google Code
Attractive Javascript plotting for jQuery.

ongoing · On Communication
Interesting blog post about how different forms of communication rank for immediacy, lifespan, and audience reached.

The Excel Magician: 70+ Excel Tips and Shortcuts to help you make Excel Magic : Codswallop

SlideShare
Source for presentation ideas.
