Consider data set extremes


There are a few ways to think about data set extremes. The most obvious is to look at the outliers in a data set. Whether you have a little or a lot of data, chances are some data points won’t line up neatly with everything else. You may have one big cluster and a few stray points that throw everything off, including the scale. It’s important to consider those outliers when thinking about your design or chosen visualization. How will you handle them? Will you leave them out of some views, but provide other views where they can be seen?
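One way to make the outlier question concrete is to separate the typical points from the extreme ones before deciding which view shows what. The sketch below is only an illustration: it assumes the common Tukey rule (1.5 times the interquartile range) as the definition of an outlier, and the function name and sample values are made up for the example.

```python
from statistics import quantiles

def split_outliers(values, k=1.5):
    """Split values into (typical, outliers) using Tukey's IQR fences.

    Assumption: a point is an outlier if it lies more than k * IQR
    outside the first or third quartile. Other definitions are valid too.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    typical = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return typical, outliers

# Hypothetical sample: one value dwarfs the rest and would flatten the scale.
data = [12, 14, 15, 15, 16, 18, 19, 21, 240]
typical, outliers = split_outliers(data)
print("default view:", typical)
print("shown on demand:", outliers)
```

A default view could then be scaled to the typical values, with the outliers called out separately or revealed in a second view.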

Another important part of data extremes is the granularity of your data and how much of it you actually want to show. If time is available all the way down to the second, ask yourself whether that level of detail is really valuable for your purposes before including it. Also look at the length of time your data spans and how you choose to show it. If you show data from a long period of time in a small amount of space, the granularity that someone will be able to see may be minimal.

Comparisons are also quite important. If two charts are being compared but their minimum and maximum values are quite different, scaling the axes consistently is vital for quick, accurate comparison. If the axes do not share a comparable scale, people may misread the data at first glance.
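A simple way to keep two charts comparable is to compute one axis range from all of the data and apply it to both. This is a minimal sketch; the helper name, the padding value, and the sample series are assumptions for illustration.

```python
def shared_domain(*series, pad=0.05):
    """Return one (low, high) axis range that covers every series.

    pad adds a small margin, as a fraction of the full range, on each side.
    """
    low = min(min(s) for s in series)
    high = max(max(s) for s in series)
    margin = (high - low) * pad
    return low - margin, high + margin

# Hypothetical values for two charts that will sit side by side.
chart_a = [3, 8, 12]
chart_b = [40, 95, 130]

domain = shared_domain(chart_a, chart_b)
print("use this range for both y-axes:", domain)
```

Many charting libraries offer a built-in equivalent, for example the sharey option of matplotlib's subplots function. Whether a shared range is the right call depends on whether the charts are meant to be compared by magnitude or only by shape.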

Other extremes to consider are text-based. If some lists will be really long and others short, how should you handle that? Word labels will vary in length, and some may be really long. Think about your approach. Should you show part of the label and reveal the full label on rollover? Or should the label wrap to two or three lines if necessary? Is there a better solution? Each scenario is different, so think about what works for what you are doing.
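The two label strategies mentioned above, truncating with the full text available on rollover and wrapping onto several lines, might be sketched like this. The function names, the 20-character limit, and the sample labels are assumptions chosen for the example.

```python
import textwrap

def truncate_label(label, max_chars=20, ellipsis="..."):
    """Shorten a label for display; keep the original for a rollover tooltip."""
    if len(label) <= max_chars:
        return label
    return label[: max_chars - len(ellipsis)].rstrip() + ellipsis

def wrap_label(label, width=20):
    """Break a label into lines no wider than `width`, splitting on spaces."""
    return textwrap.wrap(label, width=width)

# Hypothetical labels: one short, one long.
labels = ["Sales", "Research and Development Expenditure"]
for full in labels:
    print(truncate_label(full), "| tooltip:", full)
    print(wrap_label(full))
```

Testing both helpers against your longest real label is a quick way to see which approach the layout can tolerate.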

Also, be sure to account for perceived color differences if you are choosing colors to represent your data. The more categories you have, the trickier this becomes. If you think two colors may be a little close, find different colors. You may know that they are different, but others may not see those differences if they aren’t pronounced enough.
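If you want a rough numeric check rather than relying on your own eyes, you can estimate how different two colors look by measuring their distance in the CIE L*a*b* color space (the CIE76 "Delta E" formula). The sketch below is an approximation under stated assumptions: colors are sRGB tuples, the white point is D65, and the minimum distance of 20 is an arbitrary illustrative threshold, not an established standard for categorical palettes.

```python
import math
from itertools import combinations

def _linear(channel):
    """Undo the sRGB gamma curve for one 0-255 channel."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rgb_to_lab(rgb):
    """Convert an sRGB (r, g, b) tuple to CIE L*a*b* using a D65 white point."""
    r, g, b = (_linear(c) for c in rgb)
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29

    return (116 * f(y) - 16, 500 * (f(x) - f(y)), 200 * (f(y) - f(z)))

def delta_e(rgb1, rgb2):
    """CIE76 color difference: Euclidean distance in L*a*b* space."""
    return math.dist(rgb_to_lab(rgb1), rgb_to_lab(rgb2))

def too_similar(palette, min_delta=20.0):
    """Return every pair of palette colors closer than min_delta (assumed threshold)."""
    return [(a, b) for a, b in combinations(palette, 2) if delta_e(a, b) < min_delta]

# Hypothetical palette: two blues that are close together, plus an orange.
palette = [(31, 119, 180), (40, 125, 185), (255, 127, 14)]
print(too_similar(palette))
```

A check like this does not account for color vision deficiencies, small mark sizes, or display differences, so it complements looking at the chart with other people rather than replacing it.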


Poor usage:
This Network of Data Visualization references does indeed have lots of useful information; however, the sheer volume of data combined with a network diagram produces the classic hairball, an example of mismatched design and purpose. There is an attempt to remedy this with a fisheye interaction, but it fails to solve the problem and makes selecting an item more difficult.


Good usage:
Gapminder World exemplifies one good way to deal with its extremes: it presents an optional toggle directly on the axis to switch between a linear and a logarithmic scale.
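To see why such a toggle helps, compare where the same values land on a linear axis and on a logarithmic one. This is a generic sketch, not Gapminder's implementation; the function name, the axis length in pixels, and the sample values are assumptions.

```python
import math

def make_scale(domain, pixels, kind="linear"):
    """Return a function mapping a data value to a position along the axis.

    kind is "linear" or "log"; a log scale needs a strictly positive domain.
    """
    lo, hi = domain
    if kind == "log":
        if lo <= 0:
            raise ValueError("log scale requires positive values")
        lo, hi = math.log10(lo), math.log10(hi)
        transform = math.log10
    else:
        transform = float

    def scale(value):
        return (transform(value) - lo) / (hi - lo) * pixels

    return scale

# Hypothetical values spanning three orders of magnitude on a 300-pixel axis.
values = [100, 1_000, 10_000, 100_000]
linear = make_scale((100, 100_000), 300, "linear")
log = make_scale((100, 100_000), 300, "log")
for v in values:
    print(v, round(linear(v), 1), round(log(v), 1))
```

On the linear axis the three smallest values crowd into the first tenth of the space, while the log axis spaces them evenly. Offering both lets the reader choose between absolute differences and proportional ones.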


Related Principles
Communicate true scale
Time contextualizes
Choose the right chart
Focus with filters