Data Storytelling

The 2020 Twitter Election: Explore the 20+ Democratic Candidates

Our goal at Juice is to give everyone the ability to design and share compelling data stories. We're always inspired by The New York Times information design group (and many other data story authors). We want to bring this kind of data communication to every organization.

However, we sometimes forget to share publicly all the cool stuff we can do. We’re going to fix that, starting with this data story, an exploration of Democratic presidential candidates and their influence on Twitter. It was crafted as a passion project by our very own Susan Brake.

She set out to answer a few key questions:

How do the candidates compare in the reach of their Twitter audience?

Who has Twitter momentum?

What are the candidates saying on Twitter that is drawing the most attention?

Give it a try. I expect you’ll learn something and enjoy the journey.

If you like it, keep in mind that we work with organizations of all types — start-ups, non-profits, large enterprises, and the public sector — to help them tell the stories in their data.

Your Data Story Needs More Than Data

Data stories use the techniques of traditional storytelling — narrative flow, context, structure — to guide audiences through data and offer flexibility to find insights relevant to them. Data may be the star, but your data story won’t cohere without a mix of additional ingredients.

There are at least four things that you’ll want to incorporate into your data story that go beyond the data visualizations:

1. Context

The first step in a data story is to set the stage. You want to explain to your readers why they should care about what you’re going to tell them. This is also an opportunity to let your reader customize the data they are seeing to make it more relevant to them. A couple of good examples:

2. Educate your readers

Before plunging your audience into a complex or innovative visualization, you want to take some time and space to explain how that visualization works. Tooltips and gradual animation can help the user absorb how to read the visualization. Try these examples out:

3. Explanation of insights, notes, help

Data stories shouldn’t create more questions than they answer. In some cases, you may want to be explicit about what meaning a reader should take from a visualization.

4. Actions and recommendations

A data story should lead to action. Make some space to explain what recommended actions your readers might take based on the results.

The difference between visualization and data storytelling

25senior2-articleLarge-v3.jpg

Preet Bharara, former United States Attorney for the Southern District of New York, shared some thoughts on writing his new book "Doing Justice" (from the New York Times Book Review podcast — starts at 21:30). These lessons mirror the challenges we see when we think about the distinction between making good visualizations and crafting data stories.

Anything of length is difficult in a different way.

Lots of restaurants have good appetizers, but to sustain a great meal through the appetizer then the main course then dessert is more difficult.

Lots of people can write an article, but to sustain a book is difficult.

Lots of movies have great opening scenes, but to sustain it for 2 hours in an arc that is paced properly is a much different thing.

I also struggled with figuring out which stories to tell and which stories not to tell...it was too much.

Also the difficulty for me was wanting to write a book that wasn't for lawyers...that is [a book that is] page turning.

Those are three succinct messages about why crafting data stories is different from making good visualizations:

  1. It takes a lot more effort to tell the whole story — with a beginning, end, and narrative flow.

  2. Don’t share all your data, just the most important stuff.

  3. Communicating to another data analyst isn’t the goal. You need to be able to communicate to people who don’t have the same foundation of understanding.

Hilburn's Law of Data Intentionality

Hilburn's Law of Data Intentionality identifies the existence of a positive correlation between the intentionality of data collection and the intentionality of data communication [citation needed].

When a person or organization makes deliberate and purposeful choices about the data gathered, the person or organization tends to place similar weight and effort into the presentation of that data. The negative relationship is also true. Data that is not well-considered or valued is typically presented in ways that show little consideration of a specific message or purpose. 

The following diagram represents this relationship between intentional data collection and intentional data presentation.

Hilburns_Law.png

The four quadrants represented above can be explained as follows:

(A) LOW intentionality of data collection and presentation.

It is common for large volumes of data to be gathered without premeditated thought about how the data will be used. As a result, the presentation of this data often lacks a specific purpose. An example of this scenario is web analytics platforms that gather a broad array of measures about visitors to a website without a focus on specific hypotheses or behaviors. This general dashboard or analytics tool approach asks the data audience to find their own intentionality within the data.

Web Analytics Dashboard

(B) HIGH intentionality of data collection but LOW intentionality of data presentation.

We must also consider the exceptions that prove the rule. In this quadrant are the researchers who have invested time, money, and effort into gathering valuable data but neglect to carry that effort forward into data presentation to their audience. Some syndicated research studies, for example, are presented as long written reports with appendices of data tables. Healthcare analytics start-ups and data scientists can find themselves in this quadrant when they lack the time, resources, or training to properly communicate their hard-earned insights.

(C) HIGH intentionality of data collection and presentation.

This quadrant represents the scenarios when there is consistent and persistent effort to extract value from data. Data experts consider what data they want to collect, the message they find in their cultivated data, and to whom they want to communicate the results. Creators of market or customer surveys, consultants, and analytics businesses often fall into this category.

Data-story.jpg

(D) LOW intentionality of data collection but HIGH intentionality of data presentation.

Finally, this quadrant is another uncommon scenario. Data visualization students and practitioners will sometimes use standard data sets (e.g. US Census data, import-export data) as an easily accessible raw material for elaborate data presentation. 

It is important to note that every situation calls for its own level of effort, intentionality, and purposefulness, so there are legitimate reasons why someone would choose to invest or not invest in either intentional data collection or presentation.

What does 'Bandersnatch' teach us about data storytelling?

netflixbandersnatch-800x358.jpg

“TV of tomorrow is now here.” So says The Guardian. The TV show that brought us into the future is Bandersnatch, the recently released interactive television show from the Black Mirror anthology. Bandersnatch is a sort of modern reincarnation of the Choose Your Own Adventure books of your childhood. Some reviewers raved about the new experience:

"This is what Bandersnatch gave me that no other movie had ever been able to. Before you say it, I know it’s just an illusion of choice and I had no real control over what played out on screen, but it still provided me with more influence on a narrative that wasn’t my own than I’d ever had before, and I revelled in the possibilities. To me, Bandersnatch is both a movie and a game and something entirely new. It’s a lesson in human psychology, a thesis on the illusion of free will, and one hell of an entertaining few hours all rolled into one. And, perhaps most importantly, it’s only the beginning.” Lauren O’Callaghan, Gamesradar

For me, my intrigue with Bandersnatch relates to its similarities with our Juicebox-created interactive data stories. I thought it was worth examining some of the technical and narrative challenges faced by this TV experiment to find lessons for data storytelling.

The very first lesson: Be careful about using the term “Choose Your Own Adventure.” Netflix is getting sued. The other lessons fall into a couple of categories: 1) functionality for interactive storytelling; 2) understanding the audience's needs.

Essential Functionality for Interactive Storytelling

Netflix appreciates the necessity of teaching their audience how to use the interactive experience. You are confronted with the selection mechanism even before the story begins. To validate that the audience is learning how it works, your next couple of choices are trivial ones, i.e. which type of cereal or music our character would like. These interactions build a sense of comfort before some of the more dreadful decisions arrive.

bandersnatch_frosties.jpg

In Bandersnatch, as in most analytical exercises, it is possible to make choices that result in a dead end to your story. For example, when our hero Stefan decides to produce his game with a game company, we quickly learn that this choice won’t result in a realization of his gaming vision. Netflix provides a quick mechanism to teach you that this was a wrong turn and gets you back to a place in the story where you can pick a better option. Imagine the same careful handholding in an Excel PivotTable: at the moment when you choose an option that creates one of those 200-column nightmares, Excel instantly asks if you’d like to make a better choice. If only.

Perhaps Netflix’s greatest accomplishment is with their underlying technology for Bandersnatch that enables a seamless video experience. As a viewer, you make a choice and the video immediately launches you into the next chapter of the story. The same seamless flow needs to exist in data stories; make a selection, immediately see the results. It is a requirement to make users excited to explore the world you are creating.

The Challenges of Interactive Storytelling

Choose Your Own Adventure-style storytelling comes with its own challenges. The author needs to define many endings. Each permutation needs to find a way to connect — all while still developing characters and themes. In fact, one theme of Bandersnatch is that the complexity of this type of storytelling may drive a person to madness. Here’s a look at the decision-tree behind the show:

Redditor u/alpine’s flow chart of the Bandersnatch decision tree

What else do we need to consider in telling stories of this form?

To start, we might consider how different audiences react differently to the injection of choice into their entertainment. For every person who is intrigued by asserting more control over the story, there will be others who are looking for passive entertainment.

"Even the most complex, arresting, emotionally draining show is essentially escapism because all the work is done for us.” …do people want to be the decision makers? It is hard work.

As a viewer, I will wave from the shores of traditional TV, happy to be spoonfed my entertainment and hoping that the young folk are having fun.” — The Guardian

Secondly, we have to ask: Does choice come at the expense of characters, coherence, and clarity of storytelling? Bandersnatch struggled to build interesting characters. Traditional stories control the audience’s perception of a character at all times, and therefore can build a foundation of what makes that character work, layer by layer. By ceding control of that process to the audience, the author provides a collection of character “bricks” that haven’t been constructed into anything.

"It rarely deviated from the expected deviance, rarely landed in an unexpected place or – and this was where it most resembled its videogaming ancestry – had energy to spare to make the characters much more than ciphers.” — The Guardian

Finally, the audience of an interactive story has to ask themselves (just as Stefan asks): Are we really in control of our choices, or is there a hidden power that is flipping the switches? Are we only getting the illusion of choice?

Bandersnatch is mostly satire, too, but the "gameplay" jumps around a confusing timeline, making you repeat past scenes with different decisions. How you interact with your therapist, whether you agree to take drugs, and if you manage to open a secret safe, for example, all bring you down different paths and to several Game Over screens. These soft endings then send you back to earlier scenes, so you can choose the "correct" choice to further your progress. You have to do this numerous times to eventually receive a true ending. The concept of "right" and "wrong" choices bothered me and cornered me into decisions I didn't want to pick. — Elise Favis, Game Informer

For TV, this is an evolving entertainment form. In the world of data, we are also creating an evolving communication form. How do we find the balance of choice and flexibility with message? How can we engage and entertain without heaping the burden of authorship on an audience? These are questions we’ll need to continue to explore.

Specificity is the Soul of Data Narrative

The folks in the front of the room stared with a forced intensity at (what must have been) the 23rd straight slide showing data about website performance. Their glazed eyes would have been entirely evident if the speaker wasn’t so intently focused on pointing out the change in bounce rate between August and July. In the back of the room, Brian wasn’t able to summon the energy to care. The gentle hum of laptops, dim lighting, and endless onslaught of data practically begged his mind to wander...

specificity.jpg

Specificity is the soul of narrative

This is a frequently-repeated lesson from John Hodgman's excellent podcast Judge John Hodgman. His fake Internet courtroom demands that its litigants share specific information and stories to bring their arguments to life.

Unfortunately, this lesson is often lost when people use data to communicate. Specificity should not be confused with detail. Detail, at least in the data communication context, simply means access to more and more granular data. Specificity requires something more: delivering information that is familiar to your audience, letting them connect with the subject matter at a more personal level. The data is no longer an abstraction; it is something tangible and real.

How do we deliver more specificity in our data stories? Here are three ideas:

  1. Remind your audience of the people behind the data

  2. Begin with an individual story

  3. Explore individual patterns and behaviors

1. Remind your audience that we are talking about individual people or things.

Data is an imperfect reflection of activity in the real world. You want to find ways to emphasize the connection between real people and the data points shown on the screen. A few examples:

Use icons as a subtle reminder that we are talking about people

Use images of people to humanize the data

Use individual components (people) to compose the visualizations. A traditional bar chart is transformed into a stack of the individual units.

In one memorable meeting, I was demonstrating our workforce analytics solution to a prospective client. I was showing the distribution visualization (above) and was careful to roll over individual people to help explain its meaning. As I was highlighting an employee with 40 years of experience at their company, an executive burst out: “Wait a second, that woman was my elementary school teacher.” The data came to life for him that day.

2. Begin with individual stories before showing the big picture.

One of the all-time best specificity-is-the-soul-of-narrative visualizations is the Gun Deaths visual created by Periscopic. Take a moment to experience it.

To create emotional impact from the data, the designer starts this visual by showing one gun death at a time.

Gradually the animation speeds up until the viewer understands the terrifying weight of the many lives cut short.

Your data story may be on a more banal topic, but there are still ways to show the individual stories. What does a prototypical conversion in your sales pipeline look like? What is the financial impact of an individual patient going to an abnormally expensive healthcare provider?

3. Provide your audience with the ability to dive into many individual patterns and behaviors.

One compelling anecdote may hook your reader; the ability to see many stories can provide a powerful tool for analysis.

A long time ago we introduced the concept of customer flashcards: visualizations that tell the story of individual people or things, create a language for reading behavior patterns, and offer the opportunity to flip through many of these visuals. Finding patterns doesn’t have to be the exclusive domain of machine learning; as humans, we are pretty good at seeing and interpreting patterns ourselves.

Here’s an example from a project we did to see patterns of online learning. Once we found an effective way to show how students took courses, we quickly identified common behaviors that would have been lost in the typical summarization of data. 

flashcards.png

Data storytelling is still finding its fundamental principles and discovering how to effectively impact readers. Bringing specificity into these data stories may just be a bedrock principle that we can adopt from a wise Internet judge.

Education Leaders Embrace Data Storytelling

STAT-conference.png

The Data Storytelling Revolution is coming to the K-12 Education world -- in its own unique way. Two days at the annual National Center for Education Statistics STATS DC Data Conference in Washington DC gave me an up-close view of how education leaders were using data to drive policy and understand school performance. This insider's view was thanks to an invitation by our partners at the Public Consulting Group, one of the leading education consulting practices in the country.

After attending a handful of presentations and hanging out with industry experts, here are a few of my impressions:

Education leaders have a fresh energy about data visualization and data storytelling.

To start with, the conference was subtitled: “Visualizing the Future of Education through Data”. To back this up, the program featured more than a dozen presentations about how to present data to make an impact. There was good-natured laughing and self-flagellation about poor visualizations, and oooh's and aaah's at good visualizations. There was also a genuine appreciation for how important it is to “bridge the last mile” of data to reach important audiences.

IMG_20180730_112920.jpg

Unsurprisingly, Educators understand the need to reach and teach their data audiences.

For many of the attendees, their most important data audiences (teachers, parents, school administrators) are relative novices when it comes to interpreting data. There was a general appreciation that finding better ways to communicate their data was paramount. The old ways of delivering long reports and clunky dashboards weren’t going to suffice. The presenters emphasized “less is more” and the value of well-written explanations. I even ran into a solution vendor committed to building data fluency among teachers. This sincere sensitivity to the needs of the audience isn’t always so prevalent in other industries.

Data technologies and tools take a backseat to process, people, and politics.

On August 20th and 21st, I’ll see you at the Nashville Analytics Summit. When I do, I bet we’ll be surrounded by vendors and wide-eyed attendees talking about big data, machine learning, and artificial intelligence. Not in the Education world. After the lessons of No Child Left Behind and years of stalled and misguided data initiatives, Education knows that successful use of data starts with:

  1. Getting people to buy in to the meaning, purpose, and value of the data;

  2. Establishing consistent processes for collecting reliable data;

  3. Navigating the political landmines required to move their projects forward.

The Education industry is more focused on building confidence in data than on performing high-wire analytical acts.

Education has not yet found the balance between directed data stories and flexible guidance.

I sat in on a presentation by the Education Department where they shared a journalism-style data story that revealed insights about English Learners. Their website was the first in a series of public explorations of their treasure-trove of data.

Our_Nation_s_English_Learners.png

On the other extreme, the NCES shared a report-building engine for navigating another important data set. On one extreme, a one-off static data story; on the other, a self-service report generation tool. The future is in the middle: purposeful, guided analysis complemented by customization to serve each individual viewer. The Education industry is still finding its way toward this balance.

self-service-bi.jpg

 

Every industry needs to find its own path to better use of data. It was enlightening for me to see how a portion of the K-12 Education industry is evolving on this journey.

Data Storytelling: What's Easy and What's Hard

Putting data on a screen is easy. Making it meaningful is so much harder. Gathering a collection of visualizations and calling it a data story is easy (and inaccurate). Making a data-driven narrative that influences people...hard.

Here are 25 more lessons we've learned (the hard way) about what's easy and what's hard when it comes to telling data stories:

Easy: Picking a good visualization to answer a data question
Hard: Discovering the core message of your data story that will move your audience to action

Easy: Knowing who your target audience is
Hard: Knowing what motivates your target audience at a personal level by understanding their everyday frustrations and career goals

Easy: Collecting questions your audience wants to answer
Hard: Delivering answers your audience can act on

Easy: Providing flexibility to slice and dice data
Hard: Balancing flexibility with prescriptive guidance to help focus on the most important things

Easy: Labeling visualizations
Hard: Explaining the intent and meaning of visualizations

Easy: Choosing dimensions to show
Hard: Choosing the right metrics to show

Easy: Getting an export of the data you need
Hard: Restructuring data for high-performance analytical queries

Easy: Discovering inconsistencies in your data
Hard: Fixing those inconsistencies

Easy: Designing a data story with a fixed data set
Hard: Designing a data story where the data changes

Easy: Categorical dimensions
Hard: Dates

Easy: Showing data values within expected ranges
Hard: Dealing with null values

Easy: Determining formats for data fields
Hard: Writing a human-readable definition of data fields

Easy: Getting people interested in analytics and visualization
Hard: Getting people to use data regularly in their job

Easy: Picking theme colors
Hard: Using colors judiciously and with meaning

Easy: Setting the context for your story
Hard: Creating intrigue and suspense to move people past the introduction

Easy: Showing selections in a visualization
Hard: Carrying those selections through the duration of the story

Easy: Creating a long, shaggy data story
Hard: Creating a concise, meaningful data story
 
Easy: Adding more data
Hard: Cutting out unnecessary data

Easy: Serving one audience
Hard: Serving multiple audiences to enable new kinds of discussions

Easy: Helping people find insights
Hard: Explaining what to do about those insights

Easy: Explaining data to experts
Hard: Explaining data to novices

Easy: Building a predictive model
Hard: Convincing people they should trust your predictive model

Easy: Visual mock-ups with stubbed-in data
Hard: Visual mock-ups that support real-world data

Easy: Building a visualization tool
Hard: Building a data storytelling tool

What's in a Juicebox: Connected Visuals

The ability of an Excel novice (i.e. me) to use a pivot table is basically naught. My ability to manipulate data does not exist, and yet I work for one of the most forward-thinking data presentation companies! Never mind why I was hired, I quickly learned how to use a Juicebox application because Juicebox is designed with the everyday end user in mind. We have tackled the problem of data delivery to both analytical and non-analytical groups. In this post, I want to chat about one of the features that make that possible: connected slices. What is a slice? A slice is a Juice term for a data visualization within a section of a Juicebox application.

Screen Shot 2018-03-14 at 11.57.23 AM.png

I have mentioned before that Narrative Flow is important to Juicebox. Our applications are web-based and users expect to move and navigate from top to bottom, like when interacting with a webpage. Part of that movement from top to bottom in Juicebox means that as the user is making selections within the application, those selections should not only carry down the page but that they should also inform the visuals that follow.

We strive to be the world's best platform for telling data stories, and because of that, connecting our visuals together is vital. When someone makes a selection in the topmost slice, that selection places a filter on the data shown in the slices below. This filter helps the user narrow down their selection and drill into the data.
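
As a rough illustration of the idea (not Juicebox's actual implementation), connected visuals can be sketched as a filter that is built from selections made higher on the page and applied before each lower visual is computed. The records, field names, and the functions `apply_selections` and `summarize` below are all hypothetical, made up for this sketch.

```python
from collections import defaultdict

# Hypothetical records behind a two-slice story (illustrative values only).
RECORDS = [
    {"region": "East", "product": "A", "sales": 10},
    {"region": "East", "product": "B", "sales": 5},
    {"region": "West", "product": "A", "sales": 7},
]


def apply_selections(records, selections):
    """Keep only records that match every selection made higher on the page."""
    return [
        r for r in records
        if all(r[field] == value for field, value in selections.items())
    ]


def summarize(records, dimension, metric):
    """Total a metric by a dimension: the data behind one simple slice."""
    totals = defaultdict(int)
    for r in records:
        totals[r[dimension]] += r[metric]
    return dict(totals)


# The topmost slice shows every region.
top_slice = summarize(RECORDS, "region", "sales")

# Selecting "East" in the top slice filters the slice that follows it.
selections = {"region": "East"}
next_slice = summarize(apply_selections(RECORDS, selections), "product", "sales")
```

With no selections, `apply_selections` returns every record, so the lower slice falls back to the full picture until the reader chooses something.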

Much of the problem with static reports and dashboards is that they only give the user a top-level view of his or her data. Traditional solutions do not provide the ability to drill further to discover what factors could be driving the data. In essence, today's charts, dashboards, reports, and BI solutions give the user a snapshot and not the whole story.

Curious to see what else is included in Juicebox? Check out some of these posts highlighting other unique features:

Getting unstuck: Give your data a jumpstart

It’s a predicament that we’ve seen many times over: your data is stuck. You’ve tried some reporting through some Excel pivot tables, or you’ve messed around with a Tableau trial, but felt like there wasn’t enough engaging content to get your users excited. Rationalizing why you can’t get your data to be impactful for your business, you think things like, “maybe my users are talking about the data but I just don’t know about it” or “maybe the data isn’t structured in a way that allows for valuable insights to be extracted from it."

If you’re sitting there thinking that your mind is being read by our artificial intelligence, you’re wrong. It's because at Juice we have seen this scenario played out too many times and we’ve made it our mission to make these issues a thing of the past. What you need to do is give your data a jumpstart.

Here’s our suggested plan of action for getting your data unstuck and giving it the jumpstart it needs:

1. Get your data into a readable structure.

  • The first row of your data should always contain the column titles

  • Each column should contain values of a single type

  • Each row should represent a case or a single instance within the data and should contain a date of when that data was collected. This means that two different rows in the data can represent the same entity with data collected for it at different points in time.

  • As a consequence of the rule above, the data should include an identifier column whose values can repeat, indicating that different rows of data represent the same entity.

  • Make yourself a metadata sheet (also commonly known as “data definitions”) that you and other users of the data can refer to.

Here are some simple example data & metadata using the principles above.  
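
To make the rules above concrete, here is a minimal sketch using only Python's standard library. The store data, the column names, and the `load_rows` helper are hypothetical and invented for illustration; the metadata dictionary stands in for a "data definitions" sheet.

```python
import csv
import io
from datetime import date

# Hypothetical data: a header row first, then one observation per row, each
# with an entity identifier (store_id) and the date the data was collected.
RAW = """store_id,collected_on,visits,region
S01,2018-01-31,120,East
S01,2018-02-28,135,East
S02,2018-01-31,80,West
"""

# Hypothetical metadata sheet: one definition and one value type per column.
METADATA = {
    "store_id": {"parse": str, "definition": "Store identifier; repeats across dates"},
    "collected_on": {"parse": date.fromisoformat, "definition": "Date the row was collected"},
    "visits": {"parse": int, "definition": "Customer visits during the month"},
    "region": {"parse": str, "definition": "Sales region of the store"},
}


def load_rows(text, metadata):
    """Read the data, checking that every column is defined and consistently typed."""
    reader = csv.DictReader(io.StringIO(text))
    if set(reader.fieldnames or []) != set(metadata):
        raise ValueError("columns and data definitions do not match")
    # A value of the wrong type (e.g. "n/a" in visits) raises ValueError here.
    return [
        {col: spec["parse"](raw[col]) for col, spec in metadata.items()}
        for raw in reader
    ]


rows = load_rows(RAW, METADATA)

# The repeated identifier ties together rows that describe the same store.
history = [r for r in rows if r["store_id"] == "S01"]
```

Keeping the definitions next to the parsing rules is one design choice among many; the point is that the structure can be checked before anyone builds a visualization on top of it.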

2. Present your data in a hierarchical manner catered to specific audiences.

  • Give your audience a call to action: let them know why the data is important and why they should care.

  • Begin with presenting high-level key metrics. Think about what the most important numbers are to your intended audience(s).

  • Give your audience the option to select a few different categories in which to segment and parse out those important numbers. Doing this will allow your audience to drill down in the data to get from a high level to a granular level.

  • Allow your audience to take the data they have drilled down to with them. This could be one row of data out of the thousands they started with at the high level.

Here’s an example of this data presentation flow.
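
For readers who think in code, the flow above (key metrics, then segments, then a small take-away extract) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the customer rows, the field names, and the helper functions `key_metrics`, `segment_by`, and `export_csv`.

```python
import csv
import io

# Hypothetical customer data (illustrative values only).
ROWS = [
    {"customer": "C1", "segment": "Retail", "plan": "Basic", "revenue": 100},
    {"customer": "C2", "segment": "Retail", "plan": "Pro", "revenue": 300},
    {"customer": "C3", "segment": "Wholesale", "plan": "Pro", "revenue": 500},
]


def key_metrics(rows):
    """High-level numbers shown first."""
    return {"customers": len(rows), "revenue": sum(r["revenue"] for r in rows)}


def segment_by(rows, category):
    """Group rows by a category the reader chooses."""
    groups = {}
    for r in rows:
        groups.setdefault(r[category], []).append(r)
    return groups


def export_csv(rows):
    """Let the reader take the drilled-down rows with them (expects at least one row)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


overview = key_metrics(ROWS)                               # start high-level
by_segment = segment_by(ROWS, "segment")                   # first drill-down
retail_pro = segment_by(by_segment["Retail"], "plan")["Pro"]  # second drill-down
take_away = export_csv(retail_pro)                         # one row out of many
```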

3. Engage your audience in data discussions.

  • This one is self-explanatory: talking about the data with others is the best way to squeeze the value from it.

Here’s an example of effective data discussions.

Sounds like a good plan of action, right? If you're still not sure what your next steps should be, we’re here to help.

We’ll work with you to get your data in a structure that makes it valuable, or even create data for you. We’ll build you a data story with that data that helps you and your users understand the data so that you can turn data insights into business actions. We’ll get your users engaged in data discussions and app design feedback so that you know they’re engaged with the data and you know how valuable they perceive the app to be. So drop us a line, we’re here to help.