
Our Blog

Hey all – we have developed a great relationship with John Stasko, Associate Chair of the School of Interactive Computing program at Georgia Tech and the General Chair of the upcoming IEEE VIS 2013 conference. As we’ve talked with John, our conversations seem to always come around to the need for a tighter connection between academia and industry. As a result, we thought it’d be great to introduce John to our tribe through a guest post. Below are just some of the ways John is working to bring academia and industry together. Enjoy! 


Hello – I’m a professor at Georgia Tech and I’ve been working in the data visualization research area for over 20 years. My friends at Juice asked me to write a short guest blog entry providing perspectives from the academic data visualization community and exploring ways to foster more industry-academia collaboration. I’ve found that we don’t work together often enough, which is too bad because each side has a lot to offer to the other.

I personally have benefited from business collaborations in many ways. Since data visualization research is so problem-driven, industrial interaction provides an excellent way to learn about current problems and data challenges. In my graduate course on information visualization, student teams design and implement semester-long data visualization projects. I encourage the teams to seek out real clients with data who want to understand it better. Some of the best projects over the years have resulted from topics suggested by colleagues working in industry. Additionally, I often invite guest lecturers, such as the guys at Juice, to come and speak with my students and provide their own insights about creating visualization solutions for clients.

I hope that in some ways my class is benefiting industry as well and helping to train the next generation of data visualization practitioners. Students learn about all the different visualization techniques and their particular strengths and limitations. They also get hands-on practice both designing visualizations for a variety of data sets and using current “best practice” tools and systems. The course has become a key piece of the Master’s degree in Human-Computer Interaction here at GT.

Another opportunity for interaction is academic research forums such as conferences and workshops. Coming up this October in Atlanta is IEEE VIS, the premier academic meeting for data visualization research. VIS consists of three conferences: Information Visualization (InfoVis), Visual Analytics Science & Technology (VAST), and Scientific Visualization (SciVis). Last fall, the meeting drew over 1000 attendees for the first time. VIS is an excellent forum to learn about the state of the art in data visualization research, see the latest systems from commercial vendors, and just rub elbows with like-minded friends and colleagues. Recent papers at VIS presented tools such as Many Eyes and D3, introduced techniques such as Wordles and edge bundling, or just pondered topics such as storytelling and evaluation. And the meeting has much more than just research papers: it also includes numerous workshops, tutorials, panels, and posters. This year for the first time we have added an “Industrial and Government Experiences Track”. This program is designed to highlight real-world experiences designing, building, deploying and evaluating data visualizations. The presentation mode for this track will be posters on display throughout the meeting with multiple focused interaction sessions. Each submission should include a 2-page abstract about the project and a draft of the poster. Submissions are due on June 27th. More details about the track can be found on the meeting home page.

I hope to see many of you at VIS in October here in Atlanta!




Is the Score or the Rainbow More Memorable?

A cool afternoon rain was the only thing damper than the spirits of the 12-year-olds who shuffled off the field. With the score still lit up on the wooden scoreboard, the coaches yelled to the boys as they struggled to lift their heads so they might catch a glimpse of a rainbow as it rose from the fence in front of them.

The players of both the winning and losing teams stood there on the wet, steamy grass, frozen in place, in awe of the sight of a rainbow that mystically appeared as if painted on the sky just for them to see. For them, it was an attaboy, a pat on the back, a perfect way to wrap up a hard-fought doubleheader in which the score had not quite represented the effort that the losing team had given, where the stats failed to tell the tale that brought these two teams together on the hallowed Cooperstown soil.

That’s the thing about numbers. When left to their own devices, they can feel as cold as digits on a lonely scoreboard. They say nothing of the teams who trained for months, played together game after game, relinquished their Saturdays and played nearly perfect seasons just to get to the tournament.

Numbers alone tell us nothing of context. When we have something particularly meaningful to say, images help us share it best. Dashboards and data visualizations bring presentations to life, letting us engage in two-way conversations with our audience and making the story around our data more memorable, impactful and effective than any spreadsheet or table of numbers we can put in front of them.

What will your audience remember? The numbers, the final score? Share visually, and they will remember the rainbow and the sunshine that most certainly will follow.

Special thanks to Peter Bielan, my significant other, for inspiring this blog by sharing this photo that he shot during his son’s baseball team’s pilgrimage to Cooperstown, NY this week.




Have you noticed that sometimes it’s hard to get your point across? Do you find you’re trying to do the right thing with your information, but the organization just won’t cooperate?

We think this is happening too often in the companies the Juice Community lives and works in. And we want to do something to change it.

We want to better understand if we’re helping you be more effective in your workplace as an information evangelist. To make this possible, we’d like to ask (yea, even beg) you to complete a short 10-question survey about how information presentation is making progress in your company and whether you feel alone or supported by the info-viz pundits out there.

But you might ask “what’s in it for me?” Well, to begin with, we’re going to demonstrate how to summarize qualitative survey information. You’ll get some great examples of how to apply non-traditional charting styles to problems within your organization.

However, we can’t do it alone; we need you to complete the survey. And if we don’t get enough respondents, the results won’t lend themselves to what we have planned.

So what are you waiting for? Fill out the survey here and help us help you help us. And what does Gilligan’s Island have to do with information presentation? Well, you’ll just have to take the survey to find out!

Update: The survey is now officially closed. Thanks to all who responded. We’ll have the results out in a few days.




Seems like everyone’s trying to understand the future with respect to their customers. We see companies like Google, LinkedIn, and Facebook using predictive analytics to predict user behavior. Even Stephen Few is giving predictive analytics air time, in classic Fewian cut-right-to-the-core style. So, when we recently had the opportunity to get some predictive analytics insights from one of the industry’s thought leaders, we just couldn’t pass it up.

Eric Siegel, Ph.D., is an expert in predictive analytics and data mining, and is the Conference Chair at the Predictive Analytics World 2010 conference. This is the premier predictive analytics conference and is the “business-focused event for predictive analytics professionals, managers and commercial practitioners.” We asked Eric some questions about the trends he’s seeing in this field and wanted to share them with our community.

(Also, don’t miss out on the Predictive Analytics World discount code at the end.)

Juice: BI visualization has certainly started to become more mainstream in the past few years. Where is predictive analytics on this maturation/adoption curve?

Eric: Predictive analytics has crossed the chasm and hit mainstream in many sectors, such as credit scoring for financial institutions, response modeling for large direct mail houses, fraud detection, and others. And it is mature in that most large and many mid-tier businesses have employed it in one way or another, if only in a first-stage fashion. All industry verticals are replete with success stories.

Juice: Would you say predictive analytics is used more for understanding or for action?

Eric: I’ve always had the impression that predictive analytics is employed with action more the central objective than understanding, although understanding is usually also enjoyed, at least as a “side effect.” A predictive model’s scores drive operational decisions for each customer – that’s the action for which it’s designed. But by taking a gander into the rules or patterns embedded in the predictive model, strategic insights are often also gained.

On the other hand, the results of the Predictive Analytics Survey put the two benefits as a near tie. This may be because, while fewer projects put insights ahead of action, those with action first also typically include insights as well (the pertinent survey question was a check-all-that-apply).

Juice: What are some of the best examples you’ve seen of predictive analytics applications that are designed for the “non-analysts”?

Eric: Well, there are two sorts of “action” that can be driven by predictive analytics: decision automation and decision support. In almost all cases of the latter, where staff “in the field” are provided additional information in order to make their decisions more informed – such as customer service agents providing cross-sell offers based in part on system recommendations, or consumer banking branch managers greeting their clients most at risk of churn – it is a non-analyst who “consumes” the predictive scores output by the analytics system.

Juice: How important is real-time to predictive analytics results and resulting actions?

Eric: This depends entirely on the application: what actions or decisions are being driven by predictive scores? So, no knowledge of analytics is required in order to answer this question. The good news is, when the predictive scores output by a predictive model are required in real-time – such as for selecting the optimal ad to serve to a user based on her profile and behavior – predictive models themselves operate quite quickly. They may involve sophisticated math, but they almost never have any iterative/repetitive “loops” in their programming, so they can turn a customer’s data into that customer’s predictive score very very quickly. It is the derivation of the predictive model in the first place, the application of predictive modeling over historical customer data, that may take hours or days, depending on the analytical method and analyst’s process employed; once you have the model, you are ready to fly.

Juice: How does scenario analysis fit into predictive analytics? What are some of the best practices around scenario analysis?

Eric: Predictive analytics generally works at a lower level than standard scenario analysis. It is doing such an analysis at the individual customer level, predicting the probability the customer will exhibit a certain behavior, such as a response, purchase, or defection. So, when considering a prospective predictive analytics initiative, its potential benefits could be put into a scenario analysis. For example, if predictive analytics is to be used to target a retention campaign, its target benefit of decreasing churn by, say, 10% more than current retention efforts could be plugged into a scenario analysis in order to calculate project ROI and gain further traction for the project.

For more information about predictive analytics, see the Predictive Analytics Guide

More information about the upcoming Predictive Analytics World Conference, Feb 16-17 in San Francisco.

(And finally, here’s the 15% off discount code for the upcoming conference: JUICE010.)




Analyticstime!

Chris Gemignani

If you struggle to legitimize analytics within your organization, you can’t touch this video for a powerful explanation of the impact of analytics:

MC Hammer at the AlwaysOn/STVP Summit at Stanford, “Music Artists Go Entrepreneurial.” Around minute 24:00.




After deep introspection, Stephen Few has determined that graphs with dual-scaled axes are fundamentally flawed. Rather than risk the potential for confusion, he believes that there are superior graphing approaches for situations where related data series have different units or magnitudes. His measured and thorough analysis concludes:

“It is inappropriate to use more than one quantitative scale on a single axis, because, to some degree, this encourages people to compare magnitudes of values between them, but this is meaningless.”

I commend Stephen for the courage to start down this path, but he hasn’t gone far enough. Here at Juice, we must often take controversial positions. You may remember that we were among the first to criticize Microsoft’s “databars”, the first to take on the powerful Dashboard Gauge lobby, and the first to challenge the applicability of Tom Davenport’s “Competing on Analytics” sales machine.

While it is true that the second axis can be deceptive, let’s not let the first axis off without asking some tough questions. It is the confusion—nay, the collusion—of the two that causes trouble—who is to say which is the bad seed? We must ask ourselves, do not axes belong in the “Axis of Evil”?

The problem is broader than Stephen suggests: axes are just the tip of the iceberg when it comes to graphic bling that potentially distracts or confuses readers:

Take data labels, for example. They encourage users to consider specific values rather than focusing on relative sizes or placement of graph lines or bars.

Legends draw the reader’s eye away from the central storyline of a graphic.

Gridlines… please don’t waste my time with these flat faux-series. One wouldn’t put pinstriping on a Ferrari.

Place your graph in proper context and titles become redundant.

Minimalism is in. Extraneous graph decoration is out. Look no further than Tufte’s sparkline: no excessive graph decoration there.

sparkline

The world cries out for a new charting aesthetic. One that champions elegance and casts down gaudiness. Let us evoke the pure visual essence of the data. Let us find a pure form to evoke the emotion and hidden meaning of the data. Now is the time for Naked graphs—stripped to the essentials (TM).

Our argument is simple: the visualization of information is the message. The data is but an intermediary form of that visualization. Therefore, any residue from the raw data should be scrubbed from your final graph. Only when you achieve this unadulterated state will the meaning of the graphic burn its way into your consciousness.

Here’s an example of an analysis that casts light on the relationship of the Fed to hedge funds while simultaneously answering your question about what happened with last month’s sales in the Newark division.

naked analysis

Truly here we see the words of Mark 9:43 made real:

If your hand causes you to stumble, cut it off; it is better for you to enter life crippled, than, having your two hands, to go into hell, into the unquenchable fire.

Gaze in awe, viewers, and find wisdom on this very foolish day.




Stephen Colbert has mentioned that he’s having trouble getting guests during the writers’ strike. We find this puzzling, given the supposed benefits of the Colbert Bump. Does being on the Colbert Report really provide a bump, a critical leap that vaults a writer or a politician to superstardom?

We know that Colbert isn’t a big fan of “facts,” and only needs his gut to tell him the Colbert Bump is real. At Juice, we let the data decide what’s real or not, so our apologies to Stephen for not taking his word for it. Intrigued, Juice Analytics set out to find out the truth. We gathered data about Amazon sales rank for 20 authors that appeared on his show in recent months. How did those ranks change in the days immediately before and after the authors’ appearance on the show?

Amazon Sales Rank of Colbert Guests

Hmmm, there might be something there but those sales ranks don’t tell us much. Fortunately for Stephen, some “eggheads” have worked out roughly how Amazon sales rank corresponds to actual book sales. We calculated the sales, and normalized the data so that the week prior to appearing on the Colbert Report was equal to 1.0. Here’s a picture.

Projected Sales of Colbert Guests

That looks like a bump, Conan. In fact, being on the Colbert Report increases sales by 10 times on average. That bump doesn’t last forever, but, let’s face it, what does?
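The normalization step described above (the week before the appearance set to 1.0) can be sketched in a few lines of Python. This is only an illustration: the function name, the seven-day baseline window, and the sample numbers are assumptions, and the sketch starts from already-estimated sales rather than modeling the conversion from Amazon sales rank, which the post does not spell out.

```python
from statistics import mean


def normalize_to_baseline(daily_sales, appearance_index, baseline_days=7):
    """Scale a sales series so the mean of the pre-appearance window is 1.0.

    daily_sales: estimated unit sales per day (hypothetical input).
    appearance_index: position in the list of the day the author appeared.
    baseline_days: days before the appearance used as the baseline
                   (assumed to be one week here).
    """
    start = max(0, appearance_index - baseline_days)
    baseline = daily_sales[start:appearance_index]
    if not baseline:
        raise ValueError("need at least one day before the appearance")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline sales are zero; cannot normalize")
    return [value / base for value in daily_sales]


# Made-up numbers for illustration only (not the data behind the charts).
example_sales = [10, 10, 10, 10, 10, 10, 10, 100, 60, 30]
normalized = normalize_to_baseline(example_sales, appearance_index=7)
peak_bump = max(normalized[7:])  # size of the bump relative to the baseline
```

Once every author is on the same 1.0 baseline, the series can be averaged or compared across categories even when their absolute sales differ by orders of magnitude.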

We also wanted to know, what kinds of books is Colbert’s audience going crazy for? After all, Colbert is well known as a rock-solid conservative. He’s tight with the Bush Administration. Even though he debates a few liberal (“pinko”) authors now and then, most of his guests are writers of pop-intellectual studies of the Gladwellian persuasion.

Here are the authors and how we categorized them:

Pinkos: Jessica Valenti, Full Frontal Feminism: A Young Woman’s Guide to Why Feminism Matters; Wesley K. Clark, A Time to Lead: For Duty, Honor and Country; Robert Shrum, No Excuses: Concessions of a Serial Campaigner

‘Publicans: Tom DeLay, No Retreat, No Surrender: One American’s Fight

Pop Essayists: Daniel Gilbert, Stumbling on Happiness; Daniel B. Smith, Muses, Madmen, and Prophets: Rethinking the History, Science, and Meaning of Auditory Hallucination; Michael Gershon, The Second Brain: A Groundbreaking New Understanding of Nervous Disorders of the Stomach and Intestine; John J. Mearsheimer, The Israel Lobby and U.S. Foreign Policy; Thomas L. Friedman, The World Is Flat: A Brief History of the Twenty-first Century; Frank J. Sulloway, Born to Rebel: Birth Order, Family Dynamics, and Creative Lives; Jared Diamond, Guns, Germs, and Steel: The Fates of Human Societies; Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable; Richard Preston, The Wild Trees: A Story of Passion and Daring; Malcolm Gladwell, Blink: The Power of Thinking Without Thinking; Bjorn Lomborg, Cool It: The Skeptical Environmentalist’s Guide to Global Warming; Andrew Keen, The Cult of the Amateur: How Today’s Internet is Killing Our Culture; Michael Wallis, The Lincoln Highway: Coast to Coast from Times Square to the Golden Gate

Popular: Stephen Colbert, I Am America (And So Can You!); John Grisham, Playing For Pizza: A Novel; Tina Brown, The Diana Chronicles

How much of a bump did each of these groups receive?

Colbert Bump by Category of Guests

It’s a shock! Liberals and high-minded eggheads do better than popular or conservative books. I’m not sure if Colbert knows this, but his audience isn’t who he thinks they are.

Here are all the authors and their normalized sales around the time of their appearance on the Colbert Report.

Valenti
Clark
Shrum
DeLay
Gilbert
Smith
Gershon
Mearsheimer
Friedman
Sulloway
Diamond
Taleb
Preston
Gladwell
Lomborg
Keen
Wallis
Colbert
Grisham
Brown

This post was a collaborative effort of the entire Juice team. Pete Skomoroch concocted the idea, wrote copy, and found the study linking Amazon Sales Rank to actual sales. Zach data mined. David May whipped up elegant, instant visualizations. Sal Uryasev munged data.




Tipped over: social influence “tipping point” theory debunked
Gladwell’s model posits that a few hyperconnected “influentials” are the key to the runaway viral spread of fads, fashions, ideas, and behaviors. What turns out to be the deciding factor is not the “influentials,” but the people who are easily influenced.
Information Architects Japan » Blog Archive » Web Trend Map 2008 Beta
Map of the internet using the Tokyo-area subway map as the charting coordinates.



Edward Tufte has produced an illuminating video tour of the user interface of the iPhone. The video illustrates Tufte’s struggles to come to grips with the difference between dynamic screen resolution and the resolution of printed paper. Tufte is prone to grandiose pronouncements, like this one:

All history of improvements in human communication is written in terms of improvements in resolution: to produce, for viewers of evidence, more bits per unit time, and more bits per unit area. Slideware is contrary to that history. Trading in reductions in resolution for user convenience or for pitching may be useful in mass market products or in commercial art, but not for technical communications. The solution is not to rescue slideware design; the solution is to use a different, better, and content-driven presentation method. On this solution, see our thread PowerPoint Does Rocket Science—and Better Techniques for Technical Reports — Tufte Nov 10 2006

Somehow, I don’t think the importance of the Gutenberg Bible related to it showing “more bits per unit area.” Quick, count the “bits per unit area.”

Gutenberg bible courtesy of Wikipedia

Illustrated bible courtesy of Wikipedia

It didn’t take bits per unit area to revolutionize communication in the past and it won’t in the future either. The iPhone is a tremendously engaging information device and points the way forward for information displays. Here’s what the iPhone does well:

Maximize screen real estate: Controls are only visible when needed, fading away gently when you are concentrating on content. Tufte furiously neologizes, calling this “computer information debris.” Control junk is more apt, more terse, more Tuftian.

Direct manipulation: As Tufte says: information is the interface. Filtering and choosing should take place in the context of direct manipulation. A good essay on the possibilities of direct manipulation can be found here.

Fun: Above all, information can be fun and engaging to navigate. Tufte condemns Apple’s stock ticker for having “cartoony” and PowerPoint-like displays and offers an improved version (with 5 digits of precision). Apple’s cheery display offers a more entertaining, usable interface for day-to-day usage.

With our empathy for the day-to-day troubles of the business person seeking insight in data, it’s frustrating listening to Tufte. He is clearly an academic, with academic interests and academic timeframes. As much as his work is respected and inspirational within business circles, he makes little effort to enable his message to be implemented.

Good Tufte: Clutter and overload are not an attribute of information, they are failures of design. If the information is in chaos, don’t start throwing out information, instead fix the design.

Bad Tufte: “…the conclusion of sparkline analysis in Beautiful Evidence, where the idea is to make our data graphics at least operate at the resolution of good typography (say 2400 dpi).”

http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msgid=0002NC&topicid=1

*Ed: At least 2400 dpi? Orly?

Mostly right Tufte: “Thus the iPhone got it mostly right.”

Mostly wrong Tufte: “Adobe Illustrator is a big serious program that can do almost anything on the visual field (other than Photoshop an image). Most of my sparkline work was done in Illustrator. Fortunately all graphic designers and graphic design students have the program and know how to use it, so find a colleague who knows about graphic design.”

http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msgid=0000Jr&topicid=1&topic=Ask%20E%2eT%2e

It is heartening to see Tufte engage and connect his mental frameworks to our modern, screen-oriented, graphics-accelerated, not graphics-designed world. But the future of information design and interaction belongs to the iPhone, not the printed page.




Back in April, I sat down with Jon Udell to record a podcast about visualization, lightweight data exploration, and the future of Excel. This was the start of a weekly series of podcasts that Jon has been holding with technology innovators.

Recently, Jon has been posting transcripts of these interviews created using the excellent CastingWords transcription service. This is an interesting experiment. We’ve recently been thinking that a key problem with data is not so much accessibility, but usability. Most business folks can access more data than they can handle, but finding insights in the data or transforming the data into a different usable form is really hard. One solution is tools that can take a stream of data and multicast it into multiple different usable forms, like mp3 and the printed word.

Having these transcripts available lets me explore different Chris-centered hypotheses around Jon’s podcasts. For instance, am I funnier than Jon’s average guest? Did Jon Udell talk more than usual during our conversation? Answers: I’m slightly funnier than the average guest (4 laughs during our conversation), but nowhere near as funny as Gary McGraw (19 laughs). Yes, Jon did have a lot to say during our conversation about data visualization—he spoke about 1/3 of the words, which is higher than his average.
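For the curious, here is a rough sketch of how questions like these could be answered from a transcript in Python. It is not the script used for this post: it assumes each line looks like "Speaker: text" and that laughter is marked with a bracketed tag such as "[laughter]", neither of which is confirmed by the actual transcripts, and the sample dialogue is invented.

```python
import re
from collections import Counter


def transcript_stats(lines):
    """Count words spoken per speaker and the number of laughter tags.

    Assumes (hypothetically) that each line has the form "Speaker: text"
    and that laughs appear as bracketed tags like "[laughter]".
    """
    words = Counter()
    laughs = 0
    for line in lines:
        speaker, sep, text = line.partition(":")
        if not sep:
            continue  # skip lines without a speaker label
        laughs += len(re.findall(r"\[laugh\w*\]", text, flags=re.IGNORECASE))
        spoken = re.sub(r"\[[^\]]*\]", " ", text)  # drop bracketed tags
        words[speaker.strip()] += len(spoken.split())
    return words, laughs


# Invented sample lines, for illustration only.
sample = [
    "Jon: So what does Excel get right?",
    "Chris: Almost everything, by accident. [laughter]",
    "Jon: Fair enough.",
]
words, laughs = transcript_stats(sample)
jon_share = words["Jon"] / sum(words.values())  # fraction of words spoken by Jon
```

Running the same function over every transcript gives per-episode word counts and laugh totals that can be compared against the average.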

Here’s a table showing some information about Jon’s podcasts.

A conversation with… Words spoken (Jon is red)
Frank Martinez about governance and tolerance |||||||||||||||||||||||||||
Andy Singleton about building global teams ||||||||||||||||||||||||||||||||||||
Anders Hejlsberg about the May 06 preview of LINQ |||||||||||||||||||||||||||||
Harnessing collective intelligence: Nathan McFarland and Benjamin Hill |||||||||||||||||||||||||||||||||||||||||||||||||||||
Kingsley Idehen about open source Virtuoso |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Chris Gemignani about data analysis and visualization |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Steve Burbeck about multicellular computing ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Gary McGraw about software security |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Peter Rodgers about the 1060 NetKernel ||||||||||||||||||||||||||||||

The bar charts in this table were created using a variation of the REPT tip that I’ve discussed as a lightweight data exploration method for spreadsheets. The tip involves repeating the bar character “|” a certain number of times.

This approach also works great in HTML. In the chart above, I plot a “|” for each 100 words that a person speaks. The bars are formatted to 8 point Times New Roman and colorized using a span element. A title in the span element also gives you the ability to hover over any bar and see who was speaking.

The table was generated by pulling down the interview transcripts and parsing them in Python. Here’s what the html looks like for one of the bar graphs:

<td style='font-family: Times New Roman; font-size: 8pt'>
<span style='color:red' title='Jon'>||||||||||</span>
<span style='color:blue' title='Peter'>||||||||||||||||||||</span>
</td>
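A table cell like the one above could be generated along these lines. This is a minimal sketch rather than the original script: the function name, the default color, and the word counts are assumptions chosen to reproduce the example cell, with one "|" per 100 words as described.

```python
from html import escape


def bar_cell(word_counts, colors, words_per_bar=100):
    """Build one <td> containing a colored run of '|' for each speaker.

    word_counts: list of (speaker, words_spoken) pairs, in display order.
    colors: mapping of speaker name to a CSS color; others default to blue.
    words_per_bar: how many words one '|' character represents.
    """
    spans = []
    for speaker, count in word_counts:
        bars = "|" * (count // words_per_bar)
        color = colors.get(speaker, "blue")
        title = escape(speaker, quote=True)  # safe inside the title attribute
        spans.append(f"<span style='color:{color}' title='{title}'>{bars}</span>")
    return (
        "<td style='font-family: Times New Roman; font-size: 8pt'>"
        + "".join(spans)
        + "</td>"
    )


# Hypothetical counts: 1,000 words for Jon and 2,000 for his guest.
cell = bar_cell([("Jon", 1000), ("Peter", 2000)], colors={"Jon": "red"})
print(cell)
```

Integer division means a speaker with fewer than 100 words gets no bar at all; rounding up instead is an easy change if every speaker should be visible.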
