data fluency

2018 Data and Visualization Gift Ideas

We’re continuing our tradition of the annual data gift guide. These are some of our favorite books and gift ideas for the data scientist, designer or analyst in your life.

While you’re here, take a look at the Juicebox product page to see what it looks like unwrapped.

Happy Holidays!


New Books We Love

Books we read in 2018


Classic Data Books

We’re a little biased in this category, but these are the books on our desks that we refer to all the time.

Data Fluency - Thinking about changing how your team or organization works with data? This is the book for you.

Storytelling with Data - This one already feels like a classic. It provides simple, clear guidance on chart usage and storytelling. Hard not to reference it in the midst of a project.

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy - This is the book that keeps us grounded. As much as we think data is delicious and fun, it’s serious too.

The Man Who Lied to His Laptop: What We Can Learn About Ourselves from Our Machines - A seminal read on learning about interactions between humans and machines.

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics - Nathan Yau’s book that teaches us something new every time we pick it up.

The Truthful Art: Data, Charts, and Maps for Communication - We love all of Alberto’s books, but this one is our favorite. Wonderful examples throughout the book.


Art & Posters

Infographics, Maps, Data Art & More


Data Nerds

This is a term of affection during the holidays.

10 Screenwriting Lessons for the Aspiring Data Author

The art of data communication is in its infancy. Fortunately we can learn from other forms. Photography, cartoons, literature, painting, poetry, graphic design -- these are all about using language (visual, aural, written, etc.) to capture attention, convey information and ideas, and move an audience in some way. (In fact, helping organizations understand the power of data communication was the goal of our book Data Fluency.)

When I came across John August’s blog post about how to write a scene, I saw parallels with dashboard and visualization design. John is an accomplished screenwriter (Big Fish, Charlie and the Chocolate Factory, Frankenweenie) and popular blogger and podcaster.

His first piece of guidance: “What needs to happen in this scene? ...The question is not, ‘What could happen?’ or ‘What should happen?’ It is only, ‘What needs to happen?’”

This is the critical concept in all of information design. It isn't a question of what data can you show, it is a question of what data you need to show. How do you need to propel your users forward in their role? Give your audience data that they can use to be better at what they do.

Next, he asks the screenwriter: “What’s the worst that would happen if this scene were omitted?...One thing you learn after a few produced movies is that anything that can be cut will be cut, so put your best material into moments that will absolutely be there when it’s done.”

Like a movie audience, your audience has a limited attention span (unfortunately the data presentation business has fewer built-in constraints than the movie business). What data can you remove from the report that won't leave decision-makers misguided or confused? In our work, we always ask: What action is someone going to take when they see this data? If there isn't a clear answer, then leaving it out will help the reader focus on things that are more important.

John emphasizes the importance of choosing your setting..."A father-and-son bonding moment at a slaughter house will play differently than the same dialogue at a lawn bowling tournament."

It is no different for considering how information is presented to your audience. Information designers may overlook the different ways for presenting and wrapping context around the data. A daily email report, a printed slide deck, or an interactive dashboard will have very different impacts on your target audience.

"What’s the most surprising thing that could happen in the scene?"

In other words, what options do you have for grabbing the attention of your audience? Great data visualizations do this by making data emotionally resonant. A couple of good examples are The Fallen of WWII and US Gun Deaths (both grim data stories). In a more mundane example, we designed a data app that showed the costs of training programs in hospitals. By putting a dollar figure on this everyday investment, we were able to capture attention in a new way.

"Is this a long scene or a short scene?"

Edit yourself, show less data, and say more. We have all experienced the scourge of the never-ending PowerPoint deck or the Excel report with endless sheets. Extraneous content comes at a high cost.

"Brainstorm three different ways it could begin."

Dashboards seldom consider a beginning or an end. But your audience will, one way or another, find a starting point and explore data in a sequence. Will you help them with this path? I believe it is crucial to offer an obvious place to begin and useful end-points. It is a feature we've baked into the fundamental design of our Juicebox platform.

"Play it on the screen in your head."

I love this advice as applied to information design. Imagine your visualizations with different amounts of data, different values, different results and insights. Pretty soon you'll find the weaknesses. This is my first critique of the pretty dashboards designed on Dribbble. The data will never look so pretty as this in real life and the design will become incomprehensible.

Finally, John ends with advice on the writing process: 1. Outline; 2. Write the full scene; 3. Repeat 200 times. He wants screenwriters to start with the bones of the story, fill in the flesh, then iterate — without fear of tearing the whole thing down if it isn’t working.

Every form of communication has its challenges. Films face constraints and audience expectations, and yet have creative breadth in what can be put on the screen. Communicating data also has an interesting challenge for data authors. It takes a rigorous, analytical mind to understand the data and its meaning, but also requires the artistic skills of a screenwriter. It is a rare combination that needs to be taught and cultivated. If you don’t fit in the slim overlap of this Venn diagram, there is more to learn.

4 game changing strategies for information discrimination

We’re pretty excited about the upcoming Women’s World Cup as well as all the soccer (football) games we’ll get in Atlanta and Nashville this summer.

All these matches made us think about how much authoring data for an audience can be like preparing for a game or a PK (penalty kick). Distractions and extra information are your enemy. As a data author intent on having your audience understand (get) what you’re doing, you need to prioritize what information really matters. Here are some thoughts on staying focused and having the biggest possible impact on your audience:

1. Find the heart of the issue - your data product should have a core theme which is based on the essence of the issue. For the sales team the big question might be “How can we generate more leads into our pipeline?” Homing in on that core question can help you eliminate information that isn’t helpful.

2. Ask a better question - “What would you like to know?” might generate a long list of responses. To help narrow down the list, follow that question with “What would you do if you knew this information?” This second step will help you decide what data is actually needed.

3. Push to the appendix - Of course there will still be times when you are required to include all the data people might want to see. Utilizing an appendix can ensure the information is available but doesn’t detract from the data product’s main purpose.

4. Separate reporting from exploration - Reporting and exploration are two separate processes. Know which purpose you are designing for. Just remember, tools designed for reporting should address specific questions and stay on topic. On the other hand, tools designed for exploration or analysis will provide a broader palette for users to explore a variety of data.

Staying focused and incorporating these strategies will help you create data solutions that are useful, productive and interesting. After all, isn’t that the goal :-) ?  Enjoy the matches this summer!

Find out more on effective data presentation strategies from our book, Data Fluency.

Get a free excerpt from the book!


Excerpted with permission from the publisher, Wiley, from Data Fluency: Empowering Your Organization with Effective Data Communication by Zach Gemignani, Chris Gemignani, Richard Galentino, Patrick Schuermann.  Copyright © 2014.

Creating Tools for Data-Informed Decisions

Data-based decisions. It is a phrase that has become dull with overuse. It even suggests that choices are made obvious if you have the right data.

The phrase “data-based decisions” seems to ignore the fact that decisions, for the most part, are still made by people: people who have colleagues and bosses and customers and stakeholders to balance. Results shown in data often run counter to long-held assumptions or exclude important, unmeasured factors. Data-based decisions aren’t made in a vacuum; they are made in our social, political workplaces.

Perhaps we’d be better off focusing on a similar phrase: data-informed discussions. Before a decision can be made, there is a discussion. Data can, and should, influence these discussions. It is left to humans to sort out options, weigh outside factors, and evaluate risks that are not evident in the data.

Consider any discussion you’ve had recently that should be informed by data…a political discussion about climate change; optimization of your marketing tactics; education options for your children. Why is it so hard to bring data into these discussions in ways that enlighten and encourage smart decisions?

A few of the common problems include:

  • Misalignment about the nature of the subject, the decision to be made, or the meaning of concepts;
  • Failing to make assumptions explicit;
  • Different conceptual models of how the world works;
  • Different sources of information, leading to conflicting results.

Ideally, these issues would be addressed ahead of a discussion. What if every difficult discussion could be framed by data so the dialogue could focus on the most important things?

A couple of years ago we tried to answer that question for one specific discussion that happens all the time. The tool we created (inelegantly called the “Valuation Analyzer”) facilitates discussions between start-up founders and investors. The debate centers around financial projections, the value of the company, and the ultimate payback for equity holders. Our tool is intended to get these two parties on the same page.

Here’s what it looks like:

The Valuation Analyzer provides common and explicit ground rules for estimating the future value of a company. Users can define a small set of assumptions and instantly see how those inputs impact valuation and return on investment.

Each of the items underlined in blue can be adjusted by clicking and dragging left or right (thanks to Bret Victor’s Tangle library for this input mechanism). Most of the important assumptions are in the lower half of the screen: Do you want to calculate valuation based on revenue or EBITDA? What multiple will you use? How will revenue/EBITDA change over the coming years?


As you make adjustments, the values and visualization update instantly. As a special treat, we added a “founder ownership” option (see the button under the title) to answer the most urgent question for any start-up founder: what’s my potential financial outcome?
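To make the arithmetic behind this kind of tool concrete, here is a minimal sketch of a multiple-based valuation. It is an illustration under our own simplifying assumptions, not the Valuation Analyzer's actual code: the `Scenario` fields, the constant annual growth rate, and the function names are all hypothetical, and the sketch ignores dilution, debt, and liquidation preferences.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One set of assumptions; every field name here is hypothetical."""
    base_metric: float        # current annual revenue or EBITDA, in dollars
    annual_growth: float      # assumed constant growth rate, e.g. 0.5 for 50%
    years: int                # number of years until the valuation date
    multiple: float           # valuation multiple applied to the metric
    founder_ownership: float  # founder's equity share at exit, 0.0 to 1.0


def projected_metric(s: Scenario) -> float:
    """Grow the base metric at a constant compound rate."""
    return s.base_metric * (1 + s.annual_growth) ** s.years


def valuation(s: Scenario) -> float:
    """Company value = projected revenue (or EBITDA) times the multiple."""
    return projected_metric(s) * s.multiple


def founder_outcome(s: Scenario) -> float:
    """The founder's share of the company value."""
    return valuation(s) * s.founder_ownership


scenario = Scenario(
    base_metric=1_000_000,
    annual_growth=0.5,
    years=2,
    multiple=4.0,
    founder_ownership=0.25,
)
print(f"Valuation: ${valuation(scenario):,.0f}")
print(f"Founder outcome: ${founder_outcome(scenario):,.0f}")
```

Keeping every assumption in one small record is the point: two people can disagree about the multiple or the growth rate, change that one number, and see the effect on the outcome right away.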

Finally, a scenario can be saved and shared. The save button will keep all the assumptions you’ve entered and create a unique URL for that scenario.

This tool sets the parameters for a productive data conversation. The dynamic sentence across the top explains the meaning of the content. The assumptions are flexible, obvious, and explicitly stated. Two people sitting across a table can focus on the important values that impact equity owners’ outcomes. The discussion can happen in real time and the results can be saved.

It would be impossible to have a data discussion tool for every situation; there are too many unique circumstances. Nevertheless, this tool provides a model for those opportunities when a discussion happens time and again.

At Juice, we’ve found that organizations are so caught up in the race to capture and analyze data that they’re rushing past the most critical component – the end user. The best data in the world is useless if the everyday decision maker can’t understand it and interact with it. We’ve created a solution. Juicebox [www.juiceanalytics.com] delivers a more thoughtful approach to data visualization. We think about data conversations, not just presentations. Instead of just presenting data, we create more ways for people to interact with, socialize, and act on the data.

9 Reasons We Resist Making Data-Driven Decisions

If the goal is more informed decisions, better tools to analyze and present data just scratch the surface of the solution. There are many cultural and personal reasons why people struggle to rely on data to improve their work. Here are nine common barriers to data-driven decisions -- as illustrated by my 6-year-old daughter:

1. Head in the Sand

The truth can be painful, especially if knowing that truth means letting go of long-held assumptions. Analyzing data holds the risk of revealing new insights that are contrary to someone’s experience about how the world works. One symptom of this type of data resistance is described in a Harvard Business Review article about big data and management:

"Too often, we saw executives who spiced up their reports with lots of data that supported decisions they had already made using the traditional HiPPO approach. Only afterward were underlings dispatched to find the numbers that would justify the decision."

2. Aversion to Math

"Twelve years of compulsory education in mathematics leaves us with a populace that is proud to announce they cannot balance their checkbook, when they would never share that they were illiterate. What we are doing—and the way we are doing it—results in an enormous sector of the population that hates mathematics. The current system disenfranchises so many students." -- Teaching Math to People Who Think They Hate It (The Atlantic)

This subsegment of our society is immediately resistant when presented with numbers. Their reaction may have very little to do with the message and everything to do with the medium.

3. Analysis Paralysis

Some people may embrace data-based decisions…a little too much. Because data is often incomplete or insufficient to draw firm conclusions, it can be easy to keep searching and analyzing in hopes of more clarity. 

When is good enough good enough? RJMetrics suggests that “data driven thinkers avoid analysis paralysis by sorting out when it’s worth taking action now, and when it’s better to pause and collect more data.” 

4. Fear

If the decisions are based on data, why am I necessary?

The fear of displacement can animate some people who resist using data. In their mind, they were hired for their experience, expertise, and gut instinct. These people may not appreciate the important synthesis of data and business understanding that is required to make analytics useful. 

In an American Banker article entitled Bank CEOs Fear the Data-Driven Decision, an experienced banker explains: “…most bankers got where they were using their ability to 'read' the situation, a relationship, a deal or a market opportunity based on their gut and their personal skills and experience.”

5. Uncertainty and Doubt

Inexperienced users of data will often question their own ability to understand what the data means. They wonder if their interpretation is right and how exactly to read data visualizations.

Sometimes these questions are turned outwards. Can I trust what this data is telling me? Do I feel comfortable with the sources of the data? Or most cynically, do I trust the motivations of the person who provided the data?

6. Preference for Stories

Narratives are easily digestible. The lessons are often clear, as are the heroes and villains. Audiences love them. In an effort to commandeer a bit of stories’ attraction, the data analytics industry has focused on the concept of data storytelling. Even so, for many executives, telling a story unencumbered by the facts is a more compelling approach than being tied to the data.

7. Unable to Connect the Dots

Data decision-makers need to make the link between the data they see and the actions they can take. Sometimes this is an organizational problem: the data insights are being generated in a data science team while the people at the front lines are some distance away. Another disconnect may be between the presentation of data and the audience’s ability to absorb the message.

8. Impatience

Relying on data can mean taking the time to find the right data, test hypotheses, and evaluate results. In our fast-moving world, who’s got the time to do the analysis before making every decision?

In response to a Quora question 'What are executives' biggest unanswered questions about data in decision making?’, one respondent noted: "Someone once told me they'd rather rely on heuristics because data analysis is laborious, time consuming, expensive, noisy."

Clearly, data doesn’t need to drive every decision, and making smarter decisions will always save time and resources in the long run.

9. Lost in the Weeds

Pablo Picasso said “Computers are useless. They can only give you answers.”

It isn’t hard to find yourself surrounded by numbers from reports and dashboards, and in the process lose a sense of what it all means. The numbers often don’t help you understand what the right questions are, or what you should do with the answers. People can become fixated on the details and lose the ability to pull themselves up to a level where they can appreciate the implications of those details.

 

These nine problems -- and many more that you may have seen -- are more emotional than technical and depend more on mind-set than skill-set. Overcoming them requires executive leadership, clarity of message in data communications, explicitly linking data to actions, and a collaborative, pro-data environment. These are a few of the topics we explore in our book Data Fluency.

4 Ways Companies Struggle with Data Fluency

Our new book Data Fluency is about the individual skills and organizational capabilities necessary to communicate effectively with data. We are fascinated by the interplay and interdependence between the two. That is, it takes people who are skilled with presenting data to enable the sharing of insights; equally important, data fluency requires an organizational culture that values decision-driven discussions.

In fact, the framework we introduce in our book is one step more complex. We present the distinction between data consumers and data authors: those who use data to inform their work versus those whose work it is to inform people with data. This distinction applies both at the individual level and at the organization level, where we consider the data fluent culture (how do people consume and make use of data?) and the data product ecosystem (what capabilities, processes, and tools are in place to produce effective data products?). As a result, we end up with four building blocks that compose a data fluent organization.

However, it is rare to find a company that is strong in all four of these quadrants. Chapter 3 of our book identifies some of the common challenges we see as companies stumble in their efforts to make use of their data. Below are four situations we’ve seen in our experience working with dozens of companies trying to build analytics into how they run their business:

Report Proliferation

Reports have a way of multiplying like rabbits. Start with a perfectly useful and important report: a monthly sales report with product enhancements and utilization metrics sent to strategic accounts to make them aware of improvements coming and past usage. Customers see the information and want to know more. The report grows. A missing metric is added along with a detailed breakout. New reports are spawned, but the old ones don’t go away. Someday, somebody might still find them useful. The general thinking is: “If we report on everything, surely the right information will exist somewhere in a report.” Perhaps they’re right, but if no one can find what they need, everyone’s left sorting rabbits.

Balkanized Data

Departments in an organization can easily become independent silos, operating with their own set of norms, conventions, and terminology. This impacts what you can do with your data and what you can understand. You’ve experienced this problem if you’ve ever been on a customer service call where you give all your personal information at the start of the call and then have to give it all again every time you’re transferred. Each organizational department may use different data systems and terminology, processes, and conventions in data conversations and products.

Data Elitism

Working with data can require a lot of technical skill. And data can tell stories and reveal truths that an organization may not want to share broadly. Why not centralize your efforts and limit access to data to the highly trained few who can be trusted to bring order to chaos?

Like an over-eager police force hunting down deviants, this IT-led vision of business intelligence focuses on control, consistency, and data management. An extreme approach, however, comes at the expense of the individuals who use the data. Distancing analysis from the people who must use it results in data producers and products that are disconnected from the decision-making process. Data products aren’t trusted and they often aren’t useful. All the problems of a command-and-control economy emerge.

In Search of Understanding

An organization’s capability to make fluent decisions from data depends on how well the organization knows itself. Self-awareness helps you answer the difficult questions: What does success look like? Are we moving in the right direction? Who should we compare ourselves to?

For a new organization—especially one in an emerging market—it takes time to figure out what matters most. These organizations often lack focus in their data analysis, measurement, and communication while on the path of discovery. Even with the best intentions, organizations can struggle to make good use of their data as they search for the information and metrics that will align with their emerging strategy.

 

Each of these areas is a failure of one or more of the quadrants in the data fluency framework. It may be a lack of leadership, an organizational culture that prefers anecdotes to data, or sparse skills for delivering data in ways that are easy to digest.

For more, buy our book — or download chapter 1 to see if it seems like something that might be useful in your work.

Our new book Data Fluency may not be for you

...but you probably know someone who should read it.

If you've been following Juice for a while, you know what it is to effectively communicate with data. You know that sweating over a finely-honed analysis is of little use if your audience misses your message. You understand the frustration of presenting a beautifully-designed dashboard only to have the discussion derailed by a debate about what a metric means or how to read a chart. You might agree that the challenges in analytics are less about technology, and more about people, culture, and shared understanding. You've seen that healthy discussion about data is as important as "Big Data" or creative visualizations. 

Sorry, we didn't write Data Fluency for you. We didn't need to.

We wrote the book for those around you to expand appreciation of these principles to others. We believe that data fluency (which we define as the ability to use the language of data to fluidly exchange and explore ideas within your organization) needs to be pervasive to be truly effective. Being the only one in your organization who is great at communicating with data is like being the first fax machine.

How do you expand these pockets of data fluency to entire organizations? How do you create a culture that encourages effective data communication? What capabilities and skills are necessary to put data at the center of decision-making and discussions?

We wrote Data Fluency for the people who can be part of this change. Perhaps it is your boss who needs to better understand the untapped potential of data. Or a colleague responsible for reporting, but who is still learning how to communicate with data. Or your sales team who may be both data-starved and a little data-phobic. 

When Nathan Yau and Wiley approached us about writing a book, we knew that the world didn't need another guide to dashboard design (we'd already written a white paper that did a pretty good job), more lessons in visualization fundamentals (Stephen Few has that covered), or a practical guide for visualization practitioners (Nathan's done that with Visualize This and Data Points).

We wanted to provide a fresh perspective that answered a different question: How can organizations more effectively incorporate data into their decision-making?

Data Fluency is intended as:

  • A roadmap for transforming an organization with a lot of data to one that uses that data to share ideas and knowledge.
  • Practical advice for both consumers and producers of data products (reports, dashboards, analyses). It takes both an effective presenter and a willing audience for the data to flow freely.
  • A guide for executives who are energized by the opportunities to make a smarter organization, but puzzled by their organization's struggle to be more data-driven.
  • An inventory of the skills and capabilities needed to be data fluent, and an opportunity to see where you stand.

At the core of the book, we've provided a framework for thinking about all the parts that need to come together to build a data fluent organization. We have identified the four elements you need to build a data fluent organization. They are:

  • Engaged and educated data consumers;
  • Skilled authors of data products;
  • A culture that encourages communication with data;
  • An ecosystem of people, processes, and tools that supports the production of quality data products.

Our Data Fluency Framework

Our hope is that this book starts a new kind of conversation in the analytics field -- one that incorporates the people side as much as the tools, techniques, and technologies. We hope it spurs individuals and organizations to start on a journey toward making data a more useful tool for sharing ideas.

Nathan Yau is making a chapter from the book available on his site Flowing Data. Or skip straight to Amazon to buy the book Data Fluency.

Fantasy Football is Teaching Data Fluency

 

Fantasy football season is here again (along with the actual NFL season). I thought it a good time to share a section from our upcoming book Data Fluency, scheduled to be published in October through Wiley and with Nathan Yau of FlowingData as editor. In this excerpt, we suggest that Fantasy Football has taught an enormous audience to understand the language of data:

It may not be a stretch to say more Americans have learned about data and statistics through fantasy football than every college statistics course in the country. Each week, some 19 million NFL football fans spend their Sundays meticulously setting team line-ups based on statistical projections, historical patterns, and analysis of week-to-week variance. The couch potatoes who once relished on-field hits and in-game strategies now spend an average of more than eight hours a week diving into the data of the sport.

For the uninitiated, fantasy sports let fans play the role of team owners and managers by picking players for their own fantasy team and making weekly roster decisions. As the action plays out each week on the field, fantasy owners collect points against other competitors within their fantasy leagues. To win, fantasy owners quickly realize that success often depends on studying player and team performance data closely.

Here are a few ways that NFL fantasy players incorporate data into their thinking:

Variation in Player Performance

The best fantasy owners understand the nature of week-to-week variance and its relationship to earning points. For example, touchdowns generally earn a fantasy owner six points; but touchdowns occur rarely and can fluctuate wildly. In contrast, the number of touches players receive may be a better indicator of how much the team is using them and their opportunity to provide the owner with points. Because consistent performance matters, successful owners often focus on players with more stable predictors of success (for example, touches) versus more sporadic events (for example, touchdowns).
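One way to put a number on "consistency" is the coefficient of variation: the standard deviation divided by the mean. The sketch below is only an illustration of the idea. The weekly figures are made up for the example, and the function name is our own.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation relative to the mean; higher means less consistent."""
    average = mean(values)
    if average == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / average


# Hypothetical weekly numbers for one running back over six games.
touches = [18, 20, 19, 21, 22, 20]
touchdowns = [0, 2, 0, 1, 0, 3]

print(f"Touches CV:    {coefficient_of_variation(touches):.2f}")
print(f"Touchdowns CV: {coefficient_of_variation(touchdowns):.2f}")
```

In this made-up sample the touchdown counts swing far more, relative to their average, than the touches do, which is the pattern the paragraph above describes.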

Rankings Can Be Misleading

Fantasy football cheat-sheets offer rankings of players in every position. These rankings mask the differences and dispersion of expected performance. For instance, the top running back may be expected to perform 20 percent better than the second-rated running back, who in turn is only expected to score 5 percent more points than the third- through sixth-rated running backs. The data shows that players often cluster into tiers of performance. This statistical understanding was publicly explained by Boris Chen, who stated that “players within a tier are largely equals. The amount of noise between the ranks within a tier and actual results is high enough that it is basically a dice roll in most situations.” This concept has been widely adopted by fantasy owners as a player drafting strategy.
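A rough sketch of how a ranked list can be broken into tiers follows. It uses a simple gap rule that we made up for illustration (start a new tier whenever the drop from the previous player exceeds a threshold); it is not a description of the method Boris Chen used. The player labels and projected points are hypothetical, chosen to mirror the 20 percent and 5 percent gaps in the example above.

```python
def group_into_tiers(projections, max_gap=0.10):
    """Group players into tiers by projected points.

    Players are sorted best first. A new tier starts whenever a player's
    projection is more than max_gap (as a fraction) below the projection
    of the player ranked immediately ahead of them.
    """
    ranked = sorted(projections.items(), key=lambda item: item[1], reverse=True)
    tiers = []
    for name, points in ranked:
        previous_points = tiers[-1][-1][1] if tiers else None
        if previous_points is not None and points >= previous_points * (1 - max_gap):
            tiers[-1].append((name, points))  # close enough: same tier
        else:
            tiers.append([(name, points)])    # big drop: start a new tier
    return [[name for name, _ in tier] for tier in tiers]


# Hypothetical season projections for seven running backs.
projections = {
    "RB1": 300, "RB2": 250, "RB3": 238, "RB4": 236,
    "RB5": 235, "RB6": 234, "RB7": 200,
}

for number, tier in enumerate(group_into_tiers(projections), start=1):
    print(f"Tier {number}: {', '.join(tier)}")
```

With these numbers the ranking 1 through 7 collapses into three tiers: the top back alone, a cluster of five near-equals, and a final player well behind them.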

The Only Constant Is Change

The worst fantasy football owners are stuck in the past and pick players and teams that they have relied on in the past to generate points. That is, they fail to update their assumptions about the best teams, players, and trends. Following the data closely reveals when certain players have gone past their prime and when teams that once had high-scoring offenses can no longer put up big points. Clinging to past success may be a formula for disaster because the only constant in fantasy football is change.

Context Fills Out the Picture

Data viewed in isolation can be deceiving. Say, for example, that your top wide receiver scored only one-half the number of points that he scored on average in a season. Is this a new and troubling trend? Should you trade? A little research might reveal that he matched up against one of the league’s top cornerbacks, or his quarterback was knocked out of the game, or perhaps he tends to perform poorly in cold-weather away games. These environmental factors make a difference with respect to outcomes. Performance data cannot be understood in isolation: context matters.

So how did fantasy football create legions of fans who have developed a specialized dialect of data fluency? It has been a combination of education, effective data presentation, common data conventions, and incentives. Fantasy football owners have been taught how to use data to their advantage through the efforts of the NFL, ESPN, Yahoo!, and a cloud of other websites dedicated to football analyses. Organizations like Football Outsiders built new media businesses around data modeling and projections of player performance. 

Leading online fantasy football sites like ESPN and Yahoo! have been aggressive in pushing data and data visualizations to their users. These sites include trend charts for every player, drive charts, player comparison graphics, and predictive models for estimating game outcomes.

The educated fantasy football community is also highly engaged with the sport. The community loves football! The fantasy league has provided a whole new (and rewarding) dimension to its fandom. No longer is it tied down to rooting for a single team—instead, the whole league becomes fodder for its attention as it picks and chooses players from each of the 32 NFL teams. In addition, the fantasy football industry has coalesced around consistent formats for leagues, points, and key metrics. Terms like PPR, running back by committee, waiver wire, and flex are well understood, facilitating conversations among league owners. And with $1.18 billion bet in fantasy football leagues annually and a passionate fan base, fantasy owners have huge incentives to make informed decisions. When money or bragging rights are on the line, individuals invest time and energy into developing the skills and abilities to become data fluent.

In short, these factors have brought data fluency to the masses. Millions of fans have learned how to read charts, grasp basic data concepts, and allow deeply embedded data to inform how they make decisions—all critical skills associated with quadrant one in our framework.