Analytics Roundup: Visualization goodies
By Chris Gemignani
May 28, 2007
Find more about:
analytics
business
charts
economics
gis
it
management
mapping
powerpoint
presentation
semantics
visualization
- The Problem With Presentations
- Don't let presentation software keep you from getting your story across,
- Webfoot's Mapeteria: Map Colouring
- Want to make a choropleth thematic map (i.e. coloured based on your data) for Canadian provinces, U.S. states, or French départements?
- The Econ 101 Management Method - Joel on Software
- Instead of having smart people figure out how to train their frontline customer service workers to serve customers well and profitably, they make up metrics that sound good and let the low wage, high-turnover customer service people come up with their own.
- The rise and fall of IT | Perspectives | CNET News.com
- Scrap IT? A well-reasoned argument for scrapping the term "IT."
- Gallery of Data Visualization
- Summize - Summarized product reviews
- A nice visualization for showing rankings.
- The Extreme Presentation(tm) Blog: Choosing a good chart
GIS for the rest of us... or Another Excel-based Mapping Tool
By Zach Gemignani
May 17, 2007
Find more about:
geocoding
gis
googleearth
mapping
A client recently asked us if we could help him find the five locations that have the most customers nearby. I dug into the annals of our blog to find two tools that together ("wonder twin powers activate... shape of mini-GIS solution") could solve this problem:
1. Juice geo-coding tool lets you get the precise latitude and longitude for a list of addresses and plot these locations in Google Earth.
2.ZIP code finder lets you enter a US address and returns zip codes within a certain number of miles of that address.
Led by our resident Excel guru David, we combined these features into an Excel tool that lets you answer common location-based questions such as:
- How many libraries are within 10 miles of Worchester, MA?
- Which cities have the most libraries within 10 miles? 20 miles?
- Could I see the library locations on a map, please?
Here's how it works:
1. Drop your location information into the "Geocode data" tab. We are using the Yahoo geocoding service.

2. Go to the results tab and fill in the table with your selected location addresses and distances. Press "Calculate."

Download the Excel file here: Juice Distance Tool
We had a couple other features in mind, but thought it would be better to get some reader reaction before we loaded it up.
This tool is released under the MIT license.
10 comments | Show all comments only the last 5 are shown
Bazily said:
Cool tool. I built a geocoding tool for the web off of Google Maps and Yahoo Geocoding API, hosted here:
http://www.bazily.com/freegeo
It'll do thousands of geocodings in a snap!
Recently I was asked to use that experience to figure the distance between 5,000 retail locations and 30,000 competitors. Awesome.
Hagge said:
Doh.
F-Secure thinks the Excel-file contains an unknown virus.
Hagge said:
I submitted the file for whitelisting, or whatever it is that they do to it, and the next f-secure database-update will fix the erroneous detection.
Chris said:
This is an excellent tool that offers me a quick way to visualize my data. However, the location names do not seem to load into the Google Earth file. I'm using Excel 2003 on XP.
Aran said:
Cool file. I have a long list of zip codes that I would like to experiment with. I wonder if there is a way to tweak the file so that it will geocode a zip code if the other address cells are left blank.
Zach said:
Last I tried, Yahoo's geocoding service (the one baked into this tool) returns a lat/long value for just a zip code.
kelvin said:
This is a GREAT TOOL! but ya, I can't see the site names from the JUICE tool...however, geocode 3.1 xls works, BUT it ends with a debug error after it completes the macro. the end result file provides the site names tho, which is VERY HELPFUL for me! is there a way to plugin the address for each site name/location? that would be ideal.
I use this tool to help me assess the number of apartment complexes in a given area. VERY USEFUL for me to get a quick idea of competition in a particular area i'm looking to invest in!
Jerry said:
Chris (#4) It has nothing to do with Excel 2003. The outputKML macro is referring to wrong column numbers. I changed my Geocode tab to make the Name/Description column into two separate columns & then changed two lines to reflect it:
name = CStr(ActiveSheet.Cells(r, 7))
description = CStr(ActiveSheet.Cells(r, 8))
Jerry said:
If you're behind a firewall, the calls to http.send will fail ....
Function yahooAddressLookup(addr As String, Optional Row As Long) As String
' perform RESTian lookup on Yahoo
Dim response As String
URL = "http://local.yahooapis.com/MapsService/V1/geocode?appid=yahoo_test&location=" & addr
'Create Http object
If IsEmpty(http) Then Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
'Send request To URL
http.Open "GET", URL
http.send
....
....
Only way around it is to add code for http options to work with proxy authentication
Eric said:
Do you guys know how I can implement something similar to your Excel geocoder into MS Access where I can use a table of address records instead of a sheet of address rows in Excel.
Add a comment
Analytics Roundup: Naming matters
By Chris Gemignani
May 14, 2007
Find more about:
branding
display
howto
juice
marketing
powerpoint
presentations
productivity
tagging
visualization
- Igor | Naming companies, naming products
- Definitive essay on naming from the company that inspired our name.
- Vox Populi: Best practices for file naming | 43 Folders
- One approach to a tough problem that we all have.
- Seth's Blog: Worst powerpoint slide ever used by a CEO
- Pretty bad, but surely not the worst.
- HARLEM-13-GIGAPIXELS.COM
- Extreme data presentation in a different realm.
Why make 100 charts when one will do?
By Chris Gemignani
May 10, 2007
Find more about:
dashboard
excel
tools
tutorial
Charts are a great way to explore data. Here is some American baseball data showing player salaries over a five year period.

Charting this data with a line chart would allow us to see trends in salaries by team. However, when we use Excel’s default chart, we get something that looks like this:

That’s quite a mess. It would be a lot easier if we could create one chart for each row.
The OFFSET function is going to help. In its simplest form the OFFSET function works like this:
OFFSET(anchor, rows from anchor, columns from anchor)
That is, OFFSET will start with the anchor cell, go down a number of rows from that anchor and over a number of columns and return the value it finds.

We can use the OFFSET function to create cells that pull a single row of data out of the table dynamically. We create a new row atop of our data and create a series of OFFSET functions that all rely on a single cell (the big yellow one) for their row offset. So changing one cell will pull different rows of data into our fixed location.

Now, chart the data that doesn’t move.

After fixing the chart, we’d like to make it easy to change the value in the big yellow cell.

We can use Excel Forms to build a lightweight user interface. Bring up the Excel forms toolbar by rightclicking on any toolbar and choosing Forms. Place a scrollbar beside the chart.

Right clicking on the scrollbar allows you to Format Control. Link the control to the cell that is controlling all the row offsets. Now, moving the scrollbar will update the chart.

Now, the scrollbar controls the chart. Here is the baseball spreadsheet for you to play with: Baseball_offset.xls Have fun!

Note: this post is adapted from a presentation I gave at eMetrics 2007 in San Francisco.
34 comments | Show all comments only the last 5 are shown
derek said:
Very nice, but don't be too quick to discard the value of all the other lines. Why not present them as in the original mess, but in the thinnest strokes of identical lightest grey, against which your chosen line can stand out in bold blue?
Chris Gemignani said:
Thanks, Derek. Man, sometimes the posts just write themselves.
I've used the technique you describe and it can work very well indeed. In the past, I've done this this VBA, but probably a clever use of offset could make this work without the need for all the coding. We'll write this up sometime. Thanks!
derek said:
Clever use of offset? I added the whole lot in with Add Data, turned one line grey, used F4 and the up-arrow key to repeat thirty one times (you soon get into the rhythm :-) ) brought the blue offset line back up to the top of the stack ("Series 33") and finally, fixed the chart title link, that got destroyed by the Add Data stage. <a href="http://www.branta.demon.co.uk/infographics/Baseball_offset_2_derek.xls">See example</a>
Were you thinking that the grey line representing the selected data must be excluded? No need, it's hidden under the blue line.
Chris Gemignani said:
Beautiful. For the record, I was going to hide the grey line under the blue too. ;-)
Madan said:
Brilliant! I've been struggling with understanding the offset function for weeks, this made it finally make sense.
pg said:
Beautiful! And can I copy or export this little lovely Format Control box into PowerPoint, say?
derek said:
Sadly no. If you want to go to a live Excel application in the middle of a presentation, I suggest you design a hyperlink (Ctrl-K in Powerpoint) to a file that you already have open in Excel, and the Excel formatted to be as blank as possible (Full Screen display, Tools>Options>View set to remove all possible Excel features like gridlines and column headers etc.)
Josh said:
Yet again...SWEET tip. Thanks Chris.
ross said:
Yeah real nice,
The only thing i might do differently is place the scroll bar next to the data list, rather than the chat, might be a bit more intuative? line the way the lines is highlighted, good stuff.
Jeff said:
I might be missing something but I cant figure out who you get the chart title to change dynamically with the selection? I can trick it by labeling the series with the team name (and repositioning) but you have me stumped on the header function?
Darrell said:
Jeff, I know what you mean, its tough to now what team you're selecting. This is a minor usability issue, and not what Chris was attempting to demonstrate. The Offset function is mucho powerful for selecting different data sets in a big data table.
It would be better to have the user select from a list of the team names, rather than clicking sequentially through each one. This can be done with either a data validation, combined with a match() function, or a pick-list form. Incidentally I often use the cell underneath the pick-list form so it doesn't show when viewed / printed.
Another Tip:
I noticed that the Yankees 2005 number exceeded the manually set scale (223 vs. scale max of 200). This is a nuisance to detect/correct on one or two charts and a maintenance hassle if you have many charts.
One way that I found to have Excel automatically set a uniform scale among different data sets is to use the Min() & Max() functions over the entire data set and then Plot a fake data series (or two) with the min & max data.
In order to hide the fake data from displaying, you format the fake series by setting the Line & Marker to None (Format Data Series, Pattern: Line: None Marker: None). and then set the Value Axis Scale set to Auto. Excel accounts for the min / max data series when it calculates the Auto Scale setting and will always choose the same scale (from what I tested).
In this manner, the data will have a uniform scale among the different teams, and you don't have to worry about data exceeding the chart scale, or the Arizona Diamond backs looking equally profitable as the New York Yankees :)
Darrell said:
Jeff, a correction. I don't think I answered your question: you wanted to know how the chart title changed.
When you chart something in Excel, Excel guesses the chart title by picking the name of the first data series (the first argument of the first data series). In this case it is the
'Dynamic Charts'!$b$4 that instructs Excel what to call series One and the chart title.
The series in the spreadsheet defined as:
=SERIES('Dynamic Charts'!$B$4,'Dynamic Charts'!$C$3:$G$3,'Dynamic Charts'!$C$4:$G$4,1)
=Series(Arg1, Arg2, Arg3, Arg4)
where:
Arg1 is Series name, and default chart title if its Series #1
Arg2 is the names assigned to the category / X-axis
Arg3 is the data you want to chart, and
Arg4 is the number of the series
darrell said:
Jeff,
I was looking at Chris's file, not Derek's. I know the trick Derek used.
This tip works for any of the text boxes in a chart (that includes Chart Title, or even individual Data Labels). You can customize the text, either by overtyping it (Select box, hit F2 and type text), OR specifying a cell link.
To do this is a minor trick. You select the box, pause, Hit F2 and type the cell link into the formula box. I find it easiest to use the formula bar, rather than the text box. If you do it wrong, it creates an extra text box in the middle of the chart, if you do it right, the text is whatever is in the cell.
In this case, he selected the Chart Title, did the F2 - edit trick, and entered "='Dynamic Charts'!$B$4" .
Note that you cannot do a formula in the text box, you can only do a cell reference or a named range to a single cell. BUT the cell you reference can use any combination of formulas. Its especially useful to use concatenate() to incorporate figures or amounts into a title or data label.
Example:
in cell $b$4, you could enter
="New York Yankees - 5Yr total $"&sum(c4:g4)
and the chart will update have the title:
New York Yankees 5Yr Total $865.936
Better formatting:
="New York Yankees - 5Yr total $"&fixed(sum(c4:g4),0)&"MMs"
New York Yankees 5 Yr Total $866MM
That type of simple stuff wows my boss everytime :)
Tony Rose said:
Impressive! There has been a lot of great comments to help improve and build upon Chris' post. I do think there is some value to showing all of the basic line charts grouped together so you can visually see the differences in one glance, like Derek's example. Also, you may add a 5-year average column to sort from highest to lowest versus alpha. Keep up the great work!
Anyone else shake their head at the Yankees payroll?
derek said:
A recent discussion <a href="http://www.perceptualedge.com/blog/?p=116">referred to in Stephen Few's blog</a> led me to the pdf he wrote (because he couldn't put the graphics he needed to refer to in the comments box) about using Spotfire to create the (stupidly named, but actually quite clever) "parallel coordinates" graph type. This has inspired me to think of using Chris's baseball spreadsheet as the basis for a "Poor Man's Spotfire" using Excel.
I've made a start creating a view that "brushes" (fancy word for "selects") "The Highest/Lowest N baseball salaries in the year nnnn" and highlights those in thick blue against the thin grey of the rest of the teams. I used INDEX/MATCH instead of OFFSET. The next thing I want to do is try to highlight the "N teams whose pay profile looks most like Team T", using a least-squares fit.
When and if I make it a little more robust and add some more of the Spotfire functions Few uses in his article, I'll post it to Chris. Or y'all can start picking up the challenge on your own. How far can you push Excel to mimic exploratory data analysis with parallel coordinates and brushing? :-)
Chris Gemignani said:
Great idea, Derek. We were playing around the other day with another concept in Spotfire "heatmaps" for displaying tabular data. These turn out to be very easy to recreate in HTML tables. We'll post a solution sometime soon.
Fumiko said:
Thank you so much for sharing great ideas as always. It's clean and elegant, and overcomes the mess we could easily fall by using standard chart alternatives provided in original Excel.
I already have a data that I could use this solution. Once again,thank you for sharing this tip with us!!
craig said:
how can you switch the x and y axis when using Add Data?
thanks
Michael said:
How do you dynamically highlight the row item as the graph changes?
Chris Gemignani said:
Michael, I use a formula-based conditional format that looks roughly like: "ROW()=$A$2"
Sasikumar said:
It is really helpful to us to make a good presentation. It is very intresting also. let me know these type of other features in excel graphs
Mark said:
NICE IDEA!!! This would be REALLY useful for comparing multiple scenarios, which would mean I could select up to say 3 or 4 to plot at the same time. I will give that a shot, and thanks for the great start.
N Shivkumar said:
Excellent tip as to how to use control. Thanks a lot and keep posting new ideas.
derek said:
<a href="http://peltiertech.com/">Jon Peltier</a> has created a <a href="http://junkcharts.typepad.com/junk_charts/2008/01/football-rank-2.html">parallel coordinates tool in Excel</a> here, to explore quarteback preformance data. The two chart controls choose which player to highlight in red and which in blue. the rest of the players are pushed into the background as gray lines.
Randy said:
This is great. Is their a way to use the same scrollbar on multiple figures. Specifcally if I want to pictures side by side that flip as I scroll.
andrew said:
I never usually comment on stuff like this (I know, I should)...but this is AMAZING. Thanks!
Andrew said:
Very useful - thanks for sharing this.
Adam said:
This is helpful. I am trying to do something similar, and maybe someone here can help. I would like to create a bar + line graph and have the date range (x-axis) be controlled by a drop-down menu. Ideally, I would like to be able to select a beginning date and an end date and have the graph adjust automatically. Is this possible?
Thanks
Bill said:
This is truly extraordinary. I will build from this and wow everyone. Thank you for sharing the file.
ray said:
great and simply outstanding!!!
Tim said:
This is great!! This is such a time saver. One question, how do you auto scale the axis in case you have a very wide range of data. For example in one graph I need a to scale to 200, in the next one I need to scale to 5000.
Thanks and keep up the great work you're doing.
Sean said:
It sounds like an amazing concept, however, whilst using Excel 2003. It seems that the 'Control' tab on the 'Format Control...' window is missing. Would you be able to provide a sample excel document so that I could look at the VB code behind the scroll bar? or provide the code on here?
Many thanks!
Sean said:
Oops, I didn't see the sample document mid way down the page!
Many thanks!
Yan said:
Love it! Very easy to understand. Thanks!
Add a comment
The Google Analytics relaunch
By Chris Gemignani
May 9, 2007
Find more about:
analytics
bi
innovation
interface
tools
Google Analytics has been rebuilt and the result redefines the frontiers of doing analytics on the web. Avinash Kaushik has the definitive early review.
I had the privilege of attending the launch and playing with the early release. Here are a few things I noticed.
- Speak my language: Google has put a lot of effort into replacing specialized terms with everyday ones. This makes the application usable by a broad base of people and is one way to fight GUI Jock-itis.
- Speed kills: The interface is easily reconfigurable and fast. I've long argued that interface speed is a substitute for configuration options. I'm curious to play with the tool and get a better sense if this is true.
- Flex rules: Much of the componentry for viewing data in Google Analytics is built in Adobe Flex. This is similar to Google Finance, and not at all like GMail or Google Reader, which use the GWT. We believe this has profound implications for analytical tools on the web and will dig into this in later posts.
2 comments
FM said:
Nice and must say timely review, I've been using Analytics.google.com and found it good, some time.
some time it's not an average site owner would like to look at, i mean you may lost your way through it.
Talking about Speed, it's been major sat back, till the day. however if it's improved in relaunch, it's great.
Lynn Cherny said:
I had lunch with a friend recently who told me that the Mindmaps fellow who built the new Google Analytics was just more used to Flex. So it may not be sinister or deep, although I agree with you on the overall benefits of the feel and design :-)
Add a comment
Dashboard Storytelling
By Zach Gemignani
May 7, 2007
Find more about:
dashboard
infovis
interface
reporting
visualization

Everyone wants a dashboard and the promise of a world in which the intricacies of your business are clearly laid out on a single page. Dashboards can make running your business as easy as driving a car, where slight adjustments and careful attention to warnings mean smooth sailing on the road to success.
I'm not so convinced. For someone who is, check out the mysterious Dashboard Spy. He/she has a massive collection of dashboard screenshots and describes these precious morsels as "simple to understand and impressive to look at, these scorecards are becoming 'must-haves' for all enterprises."
If we already live in a dashboard-centric world, we might as well do them right. I see at least three areas where dashboards need improvement: depth, information display, and storytelling.
Depth. Stephen Few makes a worthwhile distinction between dashboards and something he calls "faceted analytical displays" (FADs):
- A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.
- A faceted analytical display is a set of interactive charts (primarily graphs and tables) that simultaneously reside on a single screen, each of which presents a somewhat different view of a common dataset, and is used to analyze that information.
We might consider dashboards a static version of FADs (or we could consider FADs a versatile dashboard). If that's true (and I'm sure Stephen will step in to correct me), then who wants a plain dashboard? Why build something that only raises questions but doesn't give the user any ability to drill down, explore, tweak parameters, or otherwise try to answer those questions?
Information display. Like most reporting, dashboards suffer from poor information design. Here's our list of blogs that preach the right way and highlight the offenders. Here are two particularly misguided design approaches that I've seen recently...
Just because it is called a dashboard doesn't mean you need to take the concept literally (via Dashboard Spy)
Just because you can make it shiny doesn't mean you should. Crystal Xcelsius not only vigorously embraces pie charts, but they add a "reflective kidney bean" to further derail the information display.
Storytelling. Most dashboards are loose affiliations of charts—a hodgepodge of graphics on the same topic intended to offer a full view of a situation. It is the same problem so many people run into in creating PowerPoint presentations.
You want the information to easily slide into the viewer's brain and stick when it gets there. The best dashboards have story-like features such as:
- Set the stage. What is the context? Who are the characters?
- Focus on only the important elements and themes; don't try to be a comprehensive account of everything that happened. Ruthlessly cut extraneous content.
- Offer recognizable characters to spare the reader's precious attention. There is a high cost to asking readers to learn from scratch. For dashboards this means terms, metrics, graphics, and metaphors that are familiar within the organization.
- Create flow and cohesiveness from chapter to chapter. Themes and characters reappear chapter after chapter. A good dashboard isn't a bunch of disjointed charts, but a logical flow from one analytical examination to the next.
- Levels of detail. Some elements of the story span the entire experience; other details provide the insights and seasoning to keep your interest.
Here's a good example of a dashboard (perhaps FAD) from Visual I-O that has many of these storytelling elements.

In contrast, the following dashboards (courtesy of Dashboard Spy) don't attempt to explain anything to the reader:


If you've seen a worse dashboard, sent it our way and we'll put together a gallery of the worst of the worst. Please redact any company-specific information.
9 comments | Show all comments only the last 5 are shown
nixnut said:
According to Steve dashboards and FADs serve two distinct purposes. In Steve's words:
The greatest clarification that is needed today is a distinction between dashboards, which are used for monitoring what’s going on, and displays that combine several charts on a screen for the purpose of analysis.
So dashboards are for monitoring and FADs for analysis.
A dashboard displays predefined measures that may come from a multitude of data sets.
A FAD will usually display different views of one dataset.
A dashboard is for displaying answers to existing questions.
A FAD is for discovering new things of interest in a data set. Manipulating a FAD leads to new questions and (hopefully) answers to these questions.
So why would one be satisfied with a mere dashboard? A dashboard can display measures covering a great number of areas of interest using data from a large number of sources. The value of having all the measures that are important for your work available in one display may outweigh the value of being able to directly manipulate the display to slice, dice, drill up/down/left/right etc. I think having a seperate tool for analysis makes more sense then forcing your dashboard to do a job it is not designed for or trying to make your analysis tool be a dashboard as well. Use the right tool for the job.
Also analysis might not be the most effective way to spend your time on or even be your job. I reckon a senior manager is more likely to put an assistant onto finding an answer to a question raised by a dashboard measure than performing the analysis himself. In operational monitoring you may just need to keep track of things and inform the relevant team if the value of a measure reaches a threshold. Finding out why the threshold is reached would not be your job.
Zach said:
Putting aside qualifiers like "it depends on the situation", I don't think I agree with the distinction as you've described it. A couple reasons:
* Senior managers should be willing to spend some time examining data beyond looking for a warning light. That isn't to say they should be running logistic regressions, but it is worth the mental effort to discover which division is causing a deviation or understanding sales variance by day of the week. In my experience, understanding nuance is what separates the good leaders (enter political joke here: _____)
* More generally, I think it is artificial to make a strong distinction between raising questions and answering questions. Granted it may be difficult or impossible in some situations to have a tool that does both. Why not make that the goal? The Visual I-O tool shown above will highlight strong and weak performance, and the ability to cut the data will help answer some of the questions about this performance.
* FADs need not be constrained to a single data source. Getting the full picture of a situation usually requires tapping into multiple systems.
Michael said:
Nice blog post. I think you made a pretty strong point. There is another angle here but I'm not sure how it fits in. The niche that I am personally interested in is the fact that many dashboards as well as FADs are backwards looking. They rely on the individual to extrapolate what they see, into where sales might end up for the month or the direction sales are headed for the year. With the amount of predictive modeling and simulation analysis that is available, I would really like to see dashboards incorporate much more sophisticated analysis. This doesn't mean that the display of information should be less intuitive, but that the underlying drivers that create the information to be displayed, could be so much more useful.
For example: I know that as of May 7 we have closed 20k in sales. I may also know that historically, in May, we have had 30k in sales booked. I also know that our plan says we should have 35k in sales by the 7th. Of course, there is always uncertainty in any forecast and sales could pick up significantly over the next couple of weeks. Maybe a monte carlo simulation could be run in the background and surfaced to the dashboard in such as way that the indicator tells us that based on current data our expected month end results are:
85% chance of hitting 35k
90% chance of hitting 30k
98% chance of hitting 25k
etc.
Kevin Hillstrom said:
Dashboards are fun to implement when you have a company with 60,000 employees and 130 executives who have differing ideas of what is important.
But if you can get past that, you can really teach leaders what is important via a dashboard.
nixnut said:
Hello Zach,
Perhaps my example of senior managers is a wrong example. I was merely trying to point out that there are groups of dashboard users that would use a dashboard for monitoring and not for analysis. I think that the difference between monitoring and analysis is large enough that the principles of perception and cognition would lead to different designs.
Having a tool that is capable of designing visual displays to cover the full range from monitoring KPI's to sophisticated analysis is indeed something to aim for. Alas such tools are not available yet (at least none that I know of).
While such a tool would allow you to build both dashboards and FAds these are still different things imho. The dashboard would still display the KPI's (or their derived metrics) and the FAD would let you look at the numbers behind the values for these metrics, generate and look at different perspectives, possibly do some simulation etc.
I didn't say FADs are restricted to one data source. I said they would usually work with just one dataset. I suppose my wording wasn't too clear. By dataset I mean a set of data about one subject area. In a dashboard it's natural to display metrics from several subject areas that are not related (other than by organisation) or not related in ways that are useful for filtering. Filtering or brushing as you could do in a FAD would update all the facets in the FAD to reflect the selection made. But in a dashboard that would only update the relevant metrics and leave the rest untouched. If you are interested in playing around with a metric (or a set of related metrics) it would make more sense to me to design a FAD for that purpose and use the dashboard as a starting point to drill down to the FAD from the metric on the dashboard. More sense than trying to force every dashboard to be a FAD that is.
I hope this post makes more sense to you than my first :-)
ltweedie said:
Michael
I entirely agree that dashboards/FADs showing predicitive models are a whole area that is very unexplored!
I did something very similar to this in my thesis in 1997 "The Influence Explorer" (a quick websearch will bring up the relevent papers) where we sampled a response surface model (Nelders generalised linear models to be precise) and then visualised it using interactive histograms, scatterplots and various other tools.
My experience in showing it to users was that it was immensely powerful as a tool to communicate a model. Suddenly analysts were able to make their models real. So that these models became real shared problem representations.
You could explore the relationship between inputs and outputs fluidly and easily.
I still haven't really seen this done in many other places and yet it is such a simple idea to put into practice. Has it?
Lisa
ltweedie said:
Zach/nixnut
Surely adding interactivity is about how much complexity you add to a tool.
I would say the important thing is not whether to add interactivity to a dashboard but whether it is justified in the context. So there will be situations in which allowing a user to interactively track back through time or drill down on a piece of data would be central to the dashboard design.
My hunch is that Responsive (dynamic) interaction enables a user to quickly compare many graphs in a way that is just not the same in static view. I guess we need some research to back this up - anyone know of any?
In an analysis situation we want to provide unlimited freedom. In a dashboard situation we want to provide key information quickly and clearly.
Thus I argue that in a dashboard design one should consider what key activities are going on and assess whether interactivity is appropriate. I believe there will be situations when it is very pertinent.
Lisa
Ted Cuzzillo said:
I like the idea of a “faceted analytics display.” I’m sure FADs are important. I just hate to see Stephen Few opt for this term because inept designers have spoiled “dashboard.”
Dashboard is a valuable metaphor and should be defended. I’m afraid FAD will be forgotten.
Isn’t a FAD just a dashboard with extra features? When they added tachometers to auto dashboards, did dashboards become something else? If you add new software or a new peripheral to your computer, isn’t it still a computer?
Perhaps we could think of it as a simulation dashboard--but still a dashboard. Calling it a simulation dashboard is still stronger than calling it a FAD.
Using “simulation” might force a modifier onto dashboards that don’t interact. How about “dumb”? Then we’d call the really bad dashboards just “dumber.”
Zach said:
Ted, I agree completely. FAD isn't likely to stick. Simulation (or perhaps interactive) dashboard is a good modifier.
Add a comment
Analytics Roundup: Fixing Excel and dashboards
By Chris Gemignani
May 4, 2007
Find more about:
analytics
excel
graphics
humor
precision
tufte
visualization
- Ask E.T.: Cleaning up Excel's poshlust graphics
- Discussion of the Chart Cleaner along with other approaches to make Excel look good.
- The Universe of Discourse : Excessive precision
- Humorous take on one of the ways excessive precision can creep into reports.
- Data Visualization: Intelligent Dashboard Design
- The third in a series of columns that feature the winners of DM Review's 2005 data visualization competition.
ANDs, ORs, and IFs: Comparing big lists in Excel
By Chris Gemignani
May 1, 2007
Find more about:
data
excel
One problem we face when manipulating large amounts of data in Excel is checking to see if two lists of the same length contain the same items. For instance, we might be given a list of products that a company has for sale this month, running to thousands of items, then the following month, we get another list of products for sale and we need to see if there has been any change between those two lists. This isn’t too hard to deal with when you only have a hundred or so items, but it gets a little thorny when your list runs to tens of thousands.
What we do is line the two lists up, side by side, in sorted order.

Use the simple “A1=B1” formula to compare pairs of items in the lists.

If the pairs are the same, this will be true, otherwise they’ll be false.

Copy this formula down for all your rows. Then use the AND function and give it the entire range of comparison formulas.

This will only be true if every single one of the values in your list are exactly matches. If even one comparison is false, this big AND statement will evaluate to false.
This is a quick and dirty approach. For tougher problems, we use a slightly more complicated formula in the comparison where we evaluate it to 1 if the value is true, 0 if the value is false. This gives us more flexibility to combine comparisons, but that’s a topic for another post.
19 comments | Show all comments only the last 5 are shown
Tre said:
Are you serious? This is an extremely rudimentary tutorial for this blog.
EK said:
I find these quick snapshot overviews very useful. After 20 years in analytics I am never too old to learn or be reminded of tips and techniques
derek said:
I appreciated it, although I prefer to use an array function to do the same thing. jeez, I didn't even know you could do that with Excel Boolean functions, I thought they only took parameters separated by commas, like CONCATENATE.
If only CONCATENATE did the same thing, it would actually have a use. Seriously, does anyone know what you can do with CONCATENATE(a,b,c...) that you can't do with a&b&c...?
(the first version of thos comment died without error message and was lost, so that I thought it had just gone somewhere for moderation; this may just be a problem at my end, or it may have been my failure to type "juice")
derek said:
I appreciated it, although I prefer to use an array function to do the same thing. jeez, I didn't even know you could do that with Excel Boolean functions, I thought they only took parameters separated by commas, like CONCATENATE.
If only CONCATENATE did the same thing, it would actually have a use. Seriously, does anyone know what you can do with CONCATENATE(a,b,c...) that you can't do with a&b&c...?
(earlier versions of this comment died without error message and were lost, even though I typed "juice"; it may just be a problem at my end)
derek said:
Wonderful. The first attempt disappears permanently, even though I waited for it, so the second attempt only temporarily fails to appear, so that I double-post with the third attempt :-(
James B. said:
Great new design. I love it.
I noticed that the links to state data are not working on this page:
http://juiceanalytics.com/writing/2006/03/census-data-in-google-earth/
Specifically, I'm looking for the Pennsylvania data.
Adam said:
I often use a similar technique to make sure that all the items in one list are in both. The problem is that if I just do =a2=b2, if I insert cells to add items from one list to the other, Excel automatically adjusts the references in the boolean cells and they get all screwed up. The solution I've found is to use "OFFSET" -- even if you insert cells, the references stay static, so the boolean variable is looking in the right place.
Scott H. said:
I have this problem all the time as well. Any suggestions on how to line up the rows so that they then match? I usually have to do this by hand .. inserting cells and moving the text in order to manually line the rows up.
Jason said:
Oh man, I'm the newest employee and my cube was recently moved across from the GUI jock (frequent audible sighs, grumbles and guffaws, smokers cough/cackle). And I've been learning how to run various sales reports from various databases on various days to send to various people.
At first I was happy to be learning something new, and I was feeling more like a productive member of the team, so thanks for clueing me in/bursting my bubble. But I've just finished my first year of my MBA, so there may be hope.
David said:
Juice- yeah, I've been doing this for years, but it is still news to some folks. There are some neat tricks with ones and zeroes, and even pivot tables to find missing items.
Derek- CONCATENATE() and the ampersand have very similar functions... the only benefit that I have seen to the CONCATENATE command is that it makes it easier to just click away at cells, as opposed to interrupting with an ampersand.
Jason- It's all over for you, man.
(just kidding)
David G. said:
Great tip, but it's by no means fool-proof. If just one pair doesn't match it could cause the rest of your list to come up as false - which makes it difficult to figure out which items are new.
Example:
1001 1001
1002 1002
1003 1004
1004 1005
... ...
1999 2000
2000 3000
Your entire list will be off, but you won't be able to provide a listing of what has changed - which is what I would expect the point of this exercise to be.
Wouldn't it be better to insert a column and use a combination of INDEX and MATCH to give a true comparison list that you could sort on?
ponyfizz said:
If it's just numbers, I would subtract one from the other and sort by the formula column.
All of the non-matching numbers would be grouped (at the top or bottom of the list). Simple
Rolen said:
I usually use the function, countif to handle this..
Countif(B:B, C3)
Chris Gemignani said:
3. Sorry to hear about the commenting woes. We'll fix comments in the future to appear faster. If only CONCATENATE worked on ranges!
Robbin Steif said:
To the person who wrote, this is too easy a post for Juice: you probably have a strong command of the topic, but not everyone does. And like someone else wrote: it's nice to be reminded of tricks again.
To the person who complained that then you have an error somewhere and you can find it: all you need to do is sort all the rows by the column with the proof in it, so that you can see where the problem is. You can sort Ascending, since F comes before T (how's that for easy advice?)
And finally, you can use a similar trick to dedupe email addresses, or other lists where the info will be identical. I am sure you Juicers are all over this one. But since the data are on two lines, you have to do it like this =a1=b1 And copy all the way down. Then you copy the results and repaste them right where they were, but PASTE SPECIAL - Values. Then you sort on the IF column
mike harding said:
well in the world of statistical genetics, excel doesn't have enough room in one tab. it would be useful to know how to do this in R or SAS.
AW said:
Mike,
In SAS, you'd want to use a sort-sort-merge or some inner joins.
R is on my schedule to learn in the next 2 months; so I got nothing for you there...
Brenda said:
Looks like it works for text string comparison as well!
Will said:
I think the use of MATCH is a much better option. Using the data as you have displayed it, in cell D3, you would have =IF(ISNA(VLOOKUP(C4,B$3:B$xxxxx,2,FALSE)),1,0), and then in cell D4 on downwards you would have =IF(ISNA(VLOOKUP(C4,B$3:B$xxxxx,2,FALSE)),D3+1,D3) where xxxxxx is the final row in your set. In cell D2, you have =MAX(D3:Dxxxxx). This will then tell you how many new datapoints you have.






0 comments | Add a comment
said: