Python Geocoding Help
By Chris Gemignani
February 28, 2006
Find more about:
googleearth
juice
python
Yahoo recently released a nifty geocoder API that's free for small (<50,000 lookups per day), non-commercial applications. Rasmus Lerdorf (Yahoo's PHP king) has written a nice introduction to using this geocoder in your PHP apps. In that spirit, here's a cheap and cheerful Python class that we use to geocode addresses.
from xml.dom.minidom import parse
from urllib.parse import urlencode
from urllib.request import urlopen


class GeocoderError(Exception):
    """Raised when the lookup or the XML parsing fails."""


class Geocoder:
    """
    Look up a location using the Yahoo geocoding api

    Requires a Yahoo appid which can be obtained at:
    http://developer.yahoo.net/faq/index.html#appid

    Documentation for the Yahoo geocoding api can be found at:
    http://developer.yahoo.net/maps/rest/V1/geocode.html
    """

    def __init__(self, appid, address_str):
        self.address_str = address_str
        self.addresses = []
        self.result_count = 0
        parms = {'appid': appid, 'location': address_str}
        try:
            url = 'http://api.local.yahoo.com/MapsService/V1/geocode?' + urlencode(parms)
            # parse the xml contents of the url into a dom
            dom = parse(urlopen(url))
            results = dom.getElementsByTagName('Result')
            self.result_count = len(results)
            for result in results:
                # start each match with the attributes of the Result element
                d = {'precision': result.getAttribute('precision'),
                     'warning': result.getAttribute('warning')}
                for itm in result.childNodes:
                    # if precision is zip, Address childNode will not exist
                    if itm.childNodes:
                        d[itm.nodeName] = itm.childNodes[0].data
                    else:
                        d[itm.nodeName] = ''
                self.addresses.append(d)
        except Exception as exc:
            # wrap network and parsing failures in a single error type
            raise GeocoderError(str(exc))

    def __repr__(self):
        s = "Original address:\n%s\n\n" % self.address_str
        s += "%d match(es) found:\n\n" % self.result_count
        for addr in self.addresses:
            s += """Match precision: %(precision)s
Location: (%(Latitude)s,%(Longitude)s)
%(Address)s
%(City)s, %(State)s %(Zip)s
""" % addr
        return s


if __name__ == "__main__":
    sample_addresses = ['555 Grove St. Herndon,VA 20170',
                        '1234 Greeley blvd, springfeld, va, 22152',
                        '50009']
    for addr in sample_addresses:
        g = Geocoder('YahooDemo', addr)
        print('-' * 80)
        print(g)
All you need to use this is a Yahoo application id.
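To see what the class collects without calling the live service, the parsing loop can be tried against an inline response. This is only a sketch under stated assumptions: the sample XML is invented (shaped after the fields the class reads: Latitude, Longitude, Address, City, State, Zip), and `parse_results` is a hypothetical helper, not part of the class above.

```python
from xml.dom.minidom import parseString

# Invented sample response; the values are placeholders, not real geocoder output.
SAMPLE_XML = """<ResultSet>
  <Result precision="zip" warning="">
    <Latitude>38.77</Latitude>
    <Longitude>-77.23</Longitude>
    <Address></Address>
    <City>SPRINGFIELD</City>
    <State>VA</State>
    <Zip>22152</Zip>
  </Result>
</ResultSet>"""


def parse_results(xml_text):
    """Return one dict per <Result>, mirroring the loop in Geocoder.__init__."""
    dom = parseString(xml_text)
    matches = []
    for result in dom.getElementsByTagName('Result'):
        d = {'precision': result.getAttribute('precision'),
             'warning': result.getAttribute('warning')}
        for itm in result.childNodes:
            # Skip the whitespace text nodes that sit between elements.
            if itm.nodeType != itm.ELEMENT_NODE:
                continue
            # An empty element (Address on a zip-level match) has no children.
            d[itm.nodeName] = itm.childNodes[0].data if itm.childNodes else ''
        matches.append(d)
    return matches


matches = parse_results(SAMPLE_XML)
print(matches[0]['precision'], matches[0]['City'], repr(matches[0]['Address']))
```

Unlike the class, this helper skips the whitespace between elements, so the resulting dict holds only the named fields.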
You now have four different ways to geocode your company's vital address. If you have suggestions or improvements, let us know. This code is public domain.
Go with the flow in data display
By Zach Gemignani
February 27, 2006
Find more about:
visualization
We spent the last couple of days working with a client on displaying data for real-time dashboards. It got me thinking: are there implicit assumptions and mental habits that people bring to data interpretation? And if so, are there some basic practices to consider for visualizing data?
Which isn't to say there is one right and perfect way to display any particular data; there is room for both creativity and structure. (Check out Information Aesthetics for examples of creative data visualization.) But in the world of management communication, it can't hurt to be aware of your audience's ingrained assumptions. You want the smoothest path to your important points. The risk is in missing your tiny window to focus a frazzled executive's mind on your point--and finding your carefully constructed analysis sidetracked.
Here's a starter list of these embedded assumptions:
1. Axes are often the last thing people look at in a chart.
They expect time to progress from left to right and linear scales that start at zero. If two charts are adjacent, they will probably assume the axes and scales are the same. When it comes to the famous two-by-two consulting matrix, good things happen in the upper-right; bad things are in the lower-left. That said, I'm mystified that the famous BCG growth/share matrix insists on rejecting my new rule.
2. Fluff. Dressing up your display implies you aren't comfortable with the data's ability to stand on its own or you don't have much to say. This can include clip art, data incorporated into pictures, and animation. USA Today is particularly good at this. Check out a couple of examples from their Snapshots section. They have fewer than three numbers to communicate, but fill the space with eye-catching graphics.


3. Point of focus. Most data displays have a clear point of focus for the viewer, whether the presenter intends it or not. It could be the peak in a line chart, values crossing over zero, or a sudden change in values. In a chart like this (below), your intention may be to highlight the general growth trend -- but you can't avoid the inevitable questions about the drop after 2000. You can short-circuit these off-the-topic questions with an explanatory footnote or annotation. Ask yourself: what is the main point I want the reader to get, and what else will my data presentation imply?

4. Proximity and size. Placing information close together suggests a connection. Sometimes accidental proximity can cause confusion. You might present two unrelated phenomena next to each other and the audience will automatically try to draw a connection (e.g. dogs have big teeth; teeth are good for crunching carrots. Audience thinks: dogs must like to crunch carrots). I just ran across Live Plasma, a great site that lets you enter a musical artist (or band, movie, director, or actor) then shows you related artists. The designers of this data visualization do a great job of building on our data display expectations by using size and proximity to show related artists.

3 comments
Robbin Steif said:
Live Plasma looks cool but is not intuitive enough. You point out that the axes are the last-looked-at, but here I found myself desperately searching for a legend to understand if size or proximity or color matters.
Robbin Steif
LunaMetrics (http://www.lunametrics.com/?source=blog&segment=other)
Zach said:
Good point, Robbin. It is hard to find the meanings for the size, color, and proximity on the site.
Mary said:
Speaking of mental habits, as you were at the beginning of this article, I am wondering if you have spent any time reading about Art Costa's Habits of Mind ideas. It is, of course, education, not business, oriented.
(Re) Introducing Absolutely Google Earth
By Juice Alumni
February 27, 2006
Find more about:
googleearth
juice
google
A while back we released a collection of tools and resources for Google Earth. We've restructured the page a bit and added a few new links. Check out the new version and make sure to let us know if you have anything to add.
Scaring Your Users
By Juice Alumni
February 24, 2006
Find more about:
design
interface
With the release of screenshots of Microsoft Office 12, I started thinking about how abrupt a change it is from their current interface design. Aren't they worried about scaring away the users who are so comfortable with the current design? Probably not:
- Microsoft Office has no real competitors and isn't too worried (yet) about losing users.
- Microsoft has to justify to their users that spending a few hundred dollars on an upgrade is worth it. The best way to do that is to make it look very different.
- New Interface = Users that need new training = $$$ for Microsoft
So Microsoft isn't in trouble then. But not all products have that luxury. Being a recent graduate, I was not immune from the poker virus that hit college campuses. Every once in a while I play online at PartyPoker. The other day I logged in, approved the mandatory software upgrade, and fired it up ready to play. When I opened it up, I almost gasped.
Old Interface


- Poker sites have a lot of competition and high turnover
- Most poker sites pretty much have the same features, games, and functionality.
- Users that need more training = Users that switch to another site
Every week, poker sites have promotional bonuses to try and drive people from one site to another. The only thing keeping users from switching is that they are comfortable with a site's look and feel. If you take that away from them, you're making it easier for them to switch to a site with a nice promotion. PartyPoker would have been a lot better off gradually adding in their new features and making sure that their users absorbed the changes as they came.
ESPN is a great example of understanding the user interface. They are constantly adding new features (like streaming video) and changing the look and feel of their site, but in a controlled, conservative way.
Illustrating Imprecision with Excel
By Chris Gemignani
February 21, 2006
Find more about:
analytics
excel
screencast
A few days ago Zach made a nice point about Zillow. It's oh-so-easy to produce numbers that are precise but are not accurate. Here's a quick screencast to show you one fun way to draw the distinction in Excel using number formatting.
Click picture to view video.
Note: In the screencast, I say precision when I mean to say accuracy no fewer than *four* times. Sorry.
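The same distinction can be sketched outside Excel. The snippet below is not the technique from the screencast; it is a rough Python analogue, assuming a hypothetical helper `round_sig` and an invented estimate, that displays a number with only as many significant figures as the estimate plausibly supports.

```python
import math


def round_sig(value, sig=2):
    """Round value to `sig` significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit decides how many places to keep.
    digits = sig - int(math.floor(math.log10(abs(value)))) - 1
    return round(value, digits)


estimate = 487213.37  # hypothetical model output

# Full precision suggests the figure is good to the cent.
print(f"{estimate:,.2f}")
# Two significant figures is a more honest display for a rough estimate.
print(f"{round_sig(estimate, 2):,.0f}")
```

How many significant figures to keep is a judgment about the accuracy of the underlying estimate, which no formatting function can decide for you.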
Video of Excel 12 Business Intelligence Inaction
By Chris Gemignani
February 21, 2006
Find more about:
excel
screencast
Here's video of the new analytics capabilities coming in Excel 12, including the revisions to PivotTables. Microsoft is pushing hard to weave Excel, SQL Server, and SharePoint into an integrated system.
It's early, but I'm concerned that analysts will have to know even more to get useful work done. Analysts would benefit from PivotTables that are easier to use rather than PivotTables that require knowledge of SQL Server, SharePoint, Unified Data Models, etc.
If you're an analyst, check out the video and let us know if the new Excel approach would work in your organization. The video is 50 minutes long. Jump to 9 minutes in if you want to get past the intro chitter-chatter.
What is analytics?
By Zach Gemignani
February 20, 2006
Find more about:
analytics
bi
A reader wrote to us today:
I seem to have spent the last few days (not including the week-end I must add) trying to get to grips with 'Analytics'. If [my boss] comes in wanting a 5 word answer to his question "what exactly is an analytic?" I think I'd still be at a loss as to how to define it.
It's a great question. Analytics (along with its sister/twin term Business Intelligence) gets thrown around without much clarity as to its meaning. You might think with the word in our name, that we'd have long ago nailed down a definition. Not so. (Although we do have a good understanding of what "Juice" means?)
Below is my take on a "map" of the analytics world.
I used a couple of dimensions to help frame all the parts and pieces:
- Purpose. A concept of "exploration vs. control" highlights the difference between analysis and reporting. Analysis is about digging deep into data to discover relationships, find causation, and describe phenomena. Reporting, in contrast, is used to track performance and identify variation from goals.
- Timing. Most analytics is backward looking -- in an attempt to understand what has happened, and therefore be equipped to make better decisions in the future. Alternatively, analytics can focus explicitly on predicting future performance or, in a few cases, provide information to support decisions in real-time.
I'd really appreciate any comments on this map -- whether I've missed/misgrouped/misrepresented concepts or alternative dimensions to describe the space. The more clarity we can provide in describing "what is analytics" the more palatable the concept will be.
14 comments
DEVI THIRUPATHI said:
Very good "map" of the analytical world. I am interested to associate the
phases of various analytical steps as shown in the map here to the
software tools that are available.
Analytic Process --> Software Tool
Is there any resource or web site that provides this.
Thanks and Regards
Devi
Zach said:
Great question. I haven't run across a good resource that lays out software tools by analytical process. We have used or checked out a bunch of different tools through the course of our work (e.g. Excel, Access, SAS, JMP, Tableau, GIS tools a-plenty, Business Objects, Cognos). Many of them can be stretched to cover different parts of this analytics landscape; few of them are very well targeted to solve a specific piece of the picture. Shoot me an e-mail if you have particular areas where you cannot find an appropriate tool.
KP said:
There is another classification to look at the analytical space
A classical 2X2 matrix with dimensions
Prescriptive vs Descriptive
Orgnl internal vs external
This kind of ties in with your dimensions
Anand said:
Very good presentation. I feel that the Target analysis and Top down analysis seems perfectly assigned to Exploration, but modelling/forecasting & scenario analysis/simulations should be shown under forecasting and not under exploration as these use the data which has been already explored.
Balaji Arun said:
Can someone suggest books on Analytics? (preferably covering the basics of analytics)
Zach said:
I haven't come across a book that I'd consider required reading for analytics, but here are a few that may be on point:
* I've read some good stuff about this book: Hard Facts, Dangerous Half-Truths And Total Nonsense: Profiting From Evidence-Based Management
* General guidance on thinking analytically: The Thinker's Toolkit : 14 Powerful Techniques for Problem Solving
* From our friend Stephen Few: Show Me the Numbers : Designing Tables and Graphs to Enlighten
* Finally, you may just want to get more adept with Excel using a book like this: Data Analysis and Decision Making with Microsoft Excel
Anthony Arrott said:
I would add two books to the list for Balaji Arun:
for clarity of thinking:
Gerd Gigerenzer's
"Calculated Risks - How to know when numbers deceive you"
for practical applications:
Bill Jelen's
"Guerilla Data Analysis Using Microsoft Excel"
Devi Thriupathi said:
Dear Mr. Zach,
In continuation to your email reply on analytic process --> software tool. You have mentioned that there are different tools ranging from Excel, Access, SAS....., B.O., Cognos. Please mentioned the tools against the following applications:
Sales Forecasting
Account Management
Activity Based Costing
Capacity Planning
Inventory Management
Marketing
May I have your email address.
Thanks
Devi
Zach said:
My e-mail address: zach.gemignani@juiceanalytics.com
I wish there was an easy answer to your question about matching software tools to business applications...I don't think there is. Many of those items in your list require first a "business process" software application, then an associated reporting capability.
In many cases, you are talking about modeling activities. I haven't seen a better package for general purpose modeling than Excel.
When it comes to analyzing a database of customer interactions (like with Account management), we are intrigued by Tableau.
sudharshan sundarrajan said:
A pretty good diagram. I would like to add a new dimension (or maybe an implied one!) to the purpose. We normally classify analytics into 'Market analytics' and 'Risk analytics' in our organisation. Intelligent 'Market analytics' aids brilliantly in marketing and pro-active customer care. 'Risk analytics' deals with identifying potential risks, their 'riskiness' over a period of time, risk mitigation strategies and their effectiveness etc. 'Risk analytics' is slowly moving a lot of business decisions in a lot of organisations from being affected by judgemental bias.
Mohan said:
I am not against analysis as a tool but there is far too much of analysis thinking that it will solve all business problems. Many managers feel that real life business needs "Synthesis" more than analysis. All the parameters of business environment can't be quantified and many important ones are soft ones or intangibles difficult to quantify. I subscribe more to Alexander Christopher's philosophy where more important than analysis is synthesis of which unfortunately there is very little talk and even lesser training of managers. Our Management Institute has gone to the extent of even introducing a full fledged MBA i.e. Masters in Business Analysis. I am afraid too much of analysis may lead to paralysis. In the end no mathematical model can replace human decision making for which as yet no effective replacement has been found.
Deven said:
Hi Mohan,
Masters in Business Analysis sounds interesting. Can you please share more details of your Management institute?
Harry said:
Would you consider Predictive Analytics to cover any of the "risk analytics" that Sudharshan is talking about? Does it cover more than just the market side ?
Sateesh tadur said:
Going by the terminology used in business analytics, are there any statistical techniques that are used in the commercial context? I would like to know specific multivariate techniques applied in this area.
Zillow's challenge: precision implies accuracy
By Zach Gemignani
February 17, 2006
Find more about:
analytics
Zillow released its home value assessment tool recently. It is a tantalizing concept: they claim to have put a dollar value on over 40 million homes across the country. I rushed to the site and was satisfied with the results for my house. Then I was overjoyed to find that the new bathroom we are adding in the basement will increase our home value by $85,000. Nice! Better yet, I found that if I just add five more bathrooms, I can double the value of my house. I guess buyers would agree with me: it is nice to have a bathroom nearby when you need it.
Numbers like these have made some people suspicious. A recent article in the Washington Post criticized Zillow for its inaccuracies:
Offering automated property valuations via the Internet turns out to be much harder than it seems -- especially if you expect them to be accurate. But after running extensive tests on this ambitious national real estate service, I found it to be so inaccurate that it's not useful.
The founder, Lloyd Frink, fully acknowledges the problems, but believes more information is better. It can only help, he argues, to give people more information in the confusing home buying or selling process.
Here's the problem (one I've run into many times in the world of analytics): if you present something with precision, your audience will believe your numbers are accurate. Particularly if you are backing it up with language like:
We compute this figure by taking zillions of data points — much of this data is public — and entering them into a formula...[it] is incredibly robust and sophisticated...Hundreds of home details feed into the formula and the home characteristics are given different weights according to their influence in a given geography and over a specific period of time.
There is a related phenomenon in software development -- The Iceberg Secret -- described by Joel Spolsky:
If you show a nonprogrammer a screen which has a user interface which is 100% beautiful, they will think the program is almost done.
If the front end looks nice, most people assume everything behind the scenes works well.
I feel for the statisticians at Zillow. Creating a database with a majority of home values within 10 or 20% of reality is a monumental task. Unfortunately, even that isn't good enough. It doesn't take many wildly inaccurate estimates to undermine the credibility of the whole tool.
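To make that point concrete, here is a minimal sketch of how a set of estimates can look good on a "share within 20%" measure and still contain a credibility-destroying miss. The figures are invented for illustration and `error_summary` is a hypothetical helper; nothing here reflects Zillow's actual data or method.

```python
def error_summary(estimates, actuals, tolerance=0.20):
    """Return (share of estimates within tolerance, worst relative error)."""
    errors = [abs(e - a) / a for e, a in zip(estimates, actuals)]
    share_ok = sum(err <= tolerance for err in errors) / len(errors)
    return share_ok, max(errors)


# Invented figures: five homes, four estimated well and one badly mispriced.
actuals = [300000, 450000, 250000, 600000, 400000]
estimates = [315000, 430000, 262000, 570000, 800000]

share_ok, worst = error_summary(estimates, actuals)
print(f"{share_ok:.0%} within 20%, worst miss {worst:.0%}")
```

A summary that reports only the share within tolerance hides the one estimate that a homeowner will remember.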
I'm reminded of a story passed around in the consulting business: Imagine sitting down in your seat on a flight and noticing that the seat belt sign above your head doesn't work. The fact that some little light isn't working doesn't imply there is anything wrong with the airplane's engines, navigation system or anything that truly could impact your likelihood of arriving at your destination. But that little failure can make you nervous.
9 comments
precision is notthesame as accuracy said:
The difference between precision and accuracy is actually quite simple to explain using archery and a bullseye target as an illustration. If I can put all my arrows into the same point every time, I am very precise. However, I am only accurate if that point happens to be the bullseye. If not, then I am just precise but not accurate. If I am scattered, I am neither precise nor accurate. That's all there is to it. Accuracy and precision are two different things, mutually exclusive.
Chris said:
I like the analogy.
The difference between these two concepts can be confusing (if you watch my latest screencast, I say them backwards a number of times). The point of Zach's post is that if you report numbers with high precision, you may mislead people into thinking those numbers are accurate.
For instance, imagine you come back from the archery range and say, "I put all my arrows in a three inch diameter". I might be misled to think that you are an accurate archer. Your statement doesn't guarantee that.
HomePriceMaps said:
If you checked out Zillow and weren't happy with their tax assessed "zestimates" check out http://www.HomePriceMaps.com
HomePriceMaps.com integrates Home Sale prices pulled from public records with google maps.
Bill Williamson said:
Million dollar homes for $500,000. Where do I buy one of these?
Looks like all they did was take publicly available tax records and then calculate a "market" price based on the taxing authority's formula. Older homes tend to be treated more favorably for property taxes than ones recently purchased. You certainly couldn't figure out what to pay for a home, or what your selling price should be, using this information. Using Zillow's numbers I can see that what you need to do is buy a very expensive older home that hasn't been sold very often.
I'll take a pass on this site.
bill said:
We can simply describe all the zestimates,zindexes or what not as simply Voodoo real estate - with a touch of voodoo economics.
Glyn Morgan said:
This site is totally inaccurate. It has my house down as 1120 sq ft 2 bed 1 1/2 bath when in actual fact it is 2300 sq ft 4 bed and two full baths. Both my neighbors have smaller houses than mine but show up as bigger. I tried to contact them about this but they totally ignore my e-mails. I have tried to change it but it just doesn't happen.
Frank said:
I think their numbers are skewed. I've owned my home for over 10 years, so I have low property taxes even though I've remodeled extensively. The house next door sold 2 years ago and is 25% smaller than mine and not remodeled. Zillow shows them as being worth about 10% more than mine. A block away, a home sold late 2005 as a tear down. Half the size of mine, but Zillow values it as about 25% more than mine. Great for some snooping as long as you don't put too much faith in the numbers.
Solver said:
Speaking of false precision, the numbers that go into the "Failed States Index" come from an automated new-analysis program that seems to do some amazing things:
"Using Boolean logic, the CAST software analyzes tens of thousands of articles and reports to determine the relationship of the content to the indicators and to the core institutions." (from Fund for Peace, http://www.fundforpeace.org/programs/fsi/castsoftware.php)
Ray said:
Zillow listed my home as 2 bed 1 bath. It hasn't been that since the early 80's. It is three bed 2 bath! Zillow GET YOUR ACT TOGETHER. This causes problems when people refinance, sell, etc.
Budget Rent a Car Commits Brand Suicide; Did Analytics Supply the Gun?
By Chris Gemignani
February 14, 2006
Find more about:
analytics
Ripped from the headlines:
"To help offset gasoline prices, Budget Rent a Car is imposing an additional $9.50 charge on all vehicles driven fewer than 75 miles..."
"The new charge is aimed at renters who drive short distances and don't fill up their tanks before they return because the gas gauge still reads "full," even though the tank is a few gallons short. In the past, Budget filled the tank and billed the customer the highest rate. But now, Budget will impose the $9.50 charge even if the renter tops off the tank before returning the car. The charge will be removed only if customers show their gas receipt to a Budget agent, one traveler has already reported, slowing travelers often rushing to catch flights."
"This is a convenience and time-saver for our customers," said Susan McGowan, a spokeswoman for Cendant Corp., Budget's parent company. "This is being done to recoup the cost of lost fuel."
Tom Asacker's definition of brand is "the expectation of someone or something delivering a certain feeling by way of an experience." What feelings are Budget customers going to have about their experience? Four-letter feelings.
Budget's misstep here feels like analytics gone wrong--a case where a spreadsheet exercise says "go, go, go!" while any sensible person would say "stop!". As we wrote earlier today, focusing excessively on analytics means you focus less on customer service, innovation, and branding.
10 Ways Not to Build an Analytics-based Business
By Zach Gemignani
February 14, 2006
Find more about:
analytics
Thomas Davenport published an article in Harvard Business Review entitled "Competing on Analytics." He concludes the article with a checklist of ten key points he feels are important to creating an analytics-based business.
We disagree with quite a few of these points and, even where we agree, we want to add real-world nuance.
The challenge of analytics is communication and creating a shared understanding. It's about focusing on high impact areas, moving forward one step at a time, being skeptical, being creative, searching for the truth. Any company can compete on analytics, and you certainly don't need to satisfy a checklist to do so.
Here's Davenport's checklist, with Juice commentary. We're putting together a list of practical steps anyone can take.
1. You apply sophisticated information systems and rigorous analysis not only to your core capability but also to a range of functions as varied as marketing and human resources.
Analytics is hard. Analytics takes resources. It takes effort for an organization to create and assimilate learnings from analytics. You need to focus your analytics at the key leverage points of your business. As Davenport points out in the HBR article, UPS focuses their analytics on knowing where packages are, Marriott focuses on revenue management. If you try to do everything, you won't do anything well.
2. Your senior executive team not only recognizes the importance of analytics capabilities but also makes their development and maintenance a primary focus.
Of course analytics are good. But so are branding, innovation, operational excellence, and customer focus. Companies are defined by what they don't do just as much as what they do. If you're going to make analytics a primary focus, you will need to make sacrifices elsewhere. Which of the above are you willing to de-emphasize?
Capital One, oft cited as the credit card king of analytics, aren't customer service champions nor are they particularly innovative.
3. You treat fact-based decision making not only as a best practice but also as a part of the culture that’s constantly emphasized and communicated by senior executives.
This is hard to argue with. However, it's easier said than done. In our experience, getting to a culture of decision making requires your business to have real, solid wins using analytics to make people care from top to bottom.
4. You hire not only people with analytical skills but a lot of people with the very best analytical skills—and consider them a key to your success.
The problems raised by the Mythical Man Month apply to analytics. Just as doubling the number of programmers on a project won't halve the time it takes to complete a project, doubling the number of analysts won't make your company twice as smart.
What you need are well placed and versatile analysts - analysts that are in constant communication and debate with key decision makers.
5. You not only employ analytics in almost every function and department but also consider it so strategically important that you manage it at the enterprise level.
What does this mean?
One thought: This refers to having a Chief (Analytics|Knowledge|Data) Officer. This may be a good idea. Here's an interesting interview with Usama Fayyad, Yahoo's Chief Data Officer, about the value of having a chief data herder at a data intensive company.
If, on the other hand, this means centralizing analytics and building a single data warehouse, we disagree. For most companies, building a big "atomic baloney slicer" for analytics is not going to work out. These approaches take too long, are inflexible, and don't adapt to your business.
6. You not only are expert at number crunching but also invent proprietary metrics for use in key business processes.
Why is "proprietary" a good thing? What you do want is to develop a few metrics which are core to the success of your business. If you are in a well established industry, it's likely those metrics have been defined and are well understood. There's a lot of value in well understood metrics that everyone in your business understands. The challenge with analytics is communication and creating a shared understanding.
7. You not only use copious data and in-house analysis but also share them with customers and suppliers.
Insight is not measured by volume. As for sharing with customers and suppliers, it's a rare company that has evolved that far (e.g. Toyota). Focus analytics where you have the most leverage to change your business.
8. You not only avidly consume data but also seize every opportunity to generate information, creating a “test and learn” culture based on numerous small experiments.
There are lots of ways to build insight from data. It can be test and learn, it can be customer visualization, it can be scoring systems.
9. You not only have committed to competing on analytics but also have been building your capabilities for several years.
Yes. Analytics is a learning process - a journey, not a destination. The best companies have been working on learning for a long time. You can compete on analytics without having worked on it for years. Just get started.
10. You not only emphasize the importance of analytics internally but also make quantitative capabilities part of your company’s story, to be shared in the annual report and in discussions with financial analysts.
You risk hypocrisy if you follow this advice. Culture starts with internal stories. External stories will arise naturally and organically from internal stories. If you focus on external stories the best you can hope for is to find yourself in a Harvard Business Review article.
10 comments
Jim Novo said:
Sorry Juice, I ain't buying what you are selling.
Most of the 10 counterpoints made above focus on one of three things:
1. Reporting versus analysis - reporting and reacting, reporting and reacting, (repeat cycle) often does not result in "root cause", real analysis.
Let's say marketing reporting shows customers generated by a certain campaign are of low quality. Marketing starts tweaking the campaign so it generates higher quality customers, but it doesn't work; they waste a lot of time and money.
Over in customer service, they are doing their own reporting showing that this same campaign generates a ton of customer service problems because of the way it is worded; they use this reporting to defend additional requests for staff.
This is analytical failure due to "reporting" without any real analysis. Both silos waste time and money, and nobody gets to "root cause" because there has been no true analysis. Low value customers continue to be generated and costs spiral up. I've seen this exact scenario repeated over and over.
2. Micro versus macro analysis - if a silo wants to keep an analytical "lead" in its own little box to do the navel-gazing, silo-focused analysis that impacts its own little box, then that's OK. Just know that this analysis, while meaningful to the little box, cannot be used or trusted anywhere else in the company and so is of very little value in a macro way. But it's safe; the silo can proceed with the $10 "micro tweaks" and have full accountability while the competition is making macro process changes worth millions using centralized analytics.
3. Centralizing analytics will be "hard" - Sure, change is hard. New thinking is hard. Staying inside the box is easy; silo thinking is easy.
Reporting on one's own little domain so one can control their own accountability is not only a structurally weak approach prone to data torture, it is also wildly inefficient from a corporate perspective.
How is it possible to "focus your analytics at the key leverage points of your business" when the analytics are coming from a micro perspective, or worse, are not really analytics at all but simply silo-based reporting? Meaningful, truly impactful "leverage" comes with cross-functional analysis, not silo-based reporting.
Jim
Zach said:
Jim,
Thanks for your comments. I doubt we disagree on the application of analytics as much as it may appear. I've enjoyed your web site and newsletter and know that you are very interested in practical approaches to making better decisions. Let me see if I can give a fuller view of our perspective in response to your comments:
1. Reporting versus analysis.
Coincidentally, I was just working on a blog post about the over-emphasis on reporting (executive dashboards, KPIs, balanced scorecards, etc.) vs. digging into data to understand root causes and finding opportunities. As far as we're concerned, reporting should be focused on situations where a process or system is well understood and "under control." I agree with you that hypothesis-driven analysis is far more likely to lead to substantial innovations than gazing at the weekly report for your division.
2. Micro versus macro analysis
We didn't mean to imply that silo'd analytics is the way to go; I think every organization faces an important balancing act. On the one hand, centralized analytics ensures a multi-faceted view of the problem (to your point about marketing vs. customer service misalignments). This is critical to fully understanding your business. At the same time, there are sacrifices to centralizing analysts. First, they will lack a deep understanding of the problems. Many times I've seen lack of intimacy with the data, processes, and unique issues of a particular business area undermine the success of an analysis. More importantly, good analytics is an evolution of thinking and deciding. We've seen much more success when an executive responsible for something has analysts nearby who can help them make data-driven decisions. The typical asymmetric communication -- i.e. presenting a fully-baked analysis/recommendations to executives -- is far less effective than the continuous, informal questioning and answering between managers and analysts. All that said, I'm not certain about the precise balance that works best.
3. Centralizing analytics will be "hard" - Sure, change is hard. New thinking is hard. Staying inside the box is easy; silo thinking is easy.
Here's what we are reacting to: companies recognize the need for more data-driven decision making, then embark on a crusade to make it happen. They bring in a technology-focused consulting company which promises to build them the business intelligence system to end all BI systems. This approach is: 1) too big; 2) too slow; 3) too expensive; and 4) neglectful of the organizational mindsets that have to change. I know from your work that you recognize that analytics isn't about the tools. I just haven't seen a successful version of this "big bang" approach in any of the companies I have worked with.
Thanks again for your comment. I'd love to hear more of your thoughts.
Chris said:
Jim,
You might not be buying what we're selling, but I don't think you're reading what we're writing either.
I prefer the tone and scope of your more reasoned argument at http://insideanalytics.blogspot.com/2006/02/research-competing-on-analytics.html#comment-114035810728586370 and recommend anyone who's gotten this far to check out the thread at insideanalytics.
<blockquote>
"The VP marketing presents analytics based on the company having 10,000 customers. The VP product area presents analytics based on the company having 11,500 customers. The VP customer service presents analytics based on the company having 9,500 customers..."
"If corporate life & death decisions are being made based on the analytics, the above situation is outrageous, deadly."
</blockquote>
Sure, there are cases where analytic discrepancies can be dreadful (M&A for instance). We believe that many times plurality of results is real and helpful rather than harmful. Most of the time, analytics is not about drama (life and death decisions!), but about understanding enough to make a directionally correct decision. "Siloed" analytics--now that's a loaded term--can be about people better understanding their world. It's cynical to believe that the reason people would resist centralized analytics is to evade responsibility.
Nishith said:
I've been reading and then re-reading the discussion here and have been trying to reconcile the two opposing viewpoints.
While I am inclined to go with Zach's views on how Analytics ought (not) to be done, I also agree with some of the points raised by Tom in his article and post.
If I visualize a mature organization that has been doing analytics for years now, it is quite likely that what they have are multiple silos that do not collaborate or cooperate. In such an organization, probably the CEO or the CFO would (and should) stand up one day and force the silos to merge by aligning their strategic objectives into a coherent organizational strategic objective. The end result would probably be some kind of an Analytics Center of Excellence, which makes sense.
On the flip side, if I look at another organization that does not have such a long history of analytics (or has never done it), then getting the CEO/CFO to set up a single team and force BI downwards onto the businesses might be a recipe for disaster. The business heads would feel threatened by a change that they do not understand and that is taking place outside of their control. Such an initiative is very likely to get sabotaged and die a silent death after some time. It might be better for such an organization to first do Analytics at a departmental level, and then merge them into an organization-wide initiative when there is momentum and demonstrated value.
Maybe both the approaches are valid, and we need to choose one depending on who we are looking at.
Zach said:
I just ran across a very similar discussion over here: http://customer.corante.com/archives/2005/10/30/competing_on_analytics.php
Neil Raden also seems to have a similar queasy feeling about this high-level, out-of-touch analytics talk. He wrote up a complete retort to Davenport (http://www.hiredbrains.com/davenport01.htm). I like this point in particular: "as an organization becomes more “agile,” which is a definite trend, decision-making, even for the big decisions, will become more decentralized. Imagine how difficult it will be to buy or sell pieces of a company if the “brain,” the centralized analytical capability, stays with the parent and there is no local expertise?"
Priya said:
At the end of the day, Analytics is a "decision support" function. It can only SUPPORT decisions / not support them on the basis of data. While one can find "Insights" hidden in numbers, it's not simple to find them - as it is critical for an analyst to also understand the market reality like a sales/marketing person does. Putting together the jigsaw pieces is a function of numbers + gut-feel + pulse of my customer (which comes from my front end teams).
Zach said:
I agree with the notion that analytics is a part of a jigsaw puzzle as we try to piece together reality. And there are definitely limitations in the ability to drive decision-making through data.
That said, I'm a little uncomfortable with the notion of "decision-support" function. I think most companies place analytics out to the side as a separate shared service to inform decisions -- then are happy to disregard the data when it contradicts their "gut-feel" or "pulse of the customer." We have run into many instances where those gut-feels are really just deeply ingrained and untested assumptions about how the business works. Faulty assumptions can become a crutch that decision-makers lean on way too much.
Ajay Kelkar said:
This is a fascinating discussion. As an executive who is in the thick of trying to make analytics happen in a leading Indian bank, I can empathize with almost all the comments being made. My only addition would be that, in my view, to truly drive this capability, you need to invest serious $ in change management. Without changes in process, incentive and structures, competing on analytics would be an impossible hurdle!!
» What about the (Analysis) Grunts? - Juice Analytics said:
[...] Meanwhile, Davenport minion Jim Novo responded to our criticism by stating: “if a silo wants to keep an analytical “lead” in it’s own little box to do the navel-gazing, silo-focused analysis that impacts it’s own little box, then that’s OK. Just know that this analysis, while meaningful to the little box, cannot be used or trusted anywhere else in the company and so is of very little value in a macro way. But it’s safe; the silo can proceed with the $10 “micro tweaks” and have full accountability while the competition is making macro process changes worth millions using centralized analytics.” [...]
Competing on Analytics Webinar with Tom Davenport today at 1:00 PM EST | Open Source Analytics said:
[...] Tom’s HBR article titled “Competing on Analytics” is based on his profiling of early adopters of Analytics that compete today based on data driven strategies. The research is also expected to be published in a book format in spring 2007. Tom’s article led to a fairly strong debate in the blogger community. Some see it as learning from the successful experiences, while others point out that while it may be true for the organizations that have been doing analytics for long, it may not factor in some other realities. You can see some of the interesting discussions here: Juice Analytics: 10 Ways Not to Build an Analytics-based Business, Juice Analytics: The Heart of the “Competing on Analytics” Matter, and Neil Raden: Power to the People: Analytics for the Masses. [...]
Know your customers
By Zach Gemignani
February 12, 2006
Find more about:
customeranalytics
marketing
The best businesses connect with their customers. They build intimate relationships, learn, and extend their products using this knowledge. After Apple learned that customers were using iPods to save addresses and data, they incorporated this feature into their next release. Intuit heard their small business customers saying, “I need to keep the books without the complexities of accounting” and QuickBooks was born.
Many companies have a different story. For them, technology has been a killer app—it’s killed the ability of individuals in the company to see their customers as individuals. Customers are a list to be manipulated, a total in a spreadsheet. They aren’t seen as people, much less as potential innovators. Dependence on big information systems is a source of the problem. These technology solutions are built to be comprehensive; built for speed; built for anywhere, anytime access. They aren’t built to understand individuals one at a time.
Sometimes the inability to understand customers stems from a business' impatience and short-term focus on ROI. Tom Asacker pulls out an early marketing guru to make his point:
Abraham Lincoln on chopping down a tree: "If I had six hours to chop down a tree, I'd spend the first four hours sharpening the axe".
Instead, what do most marketers do? They take a whack at the tree, put down the axe, measure the cut, pick up the axe, whack the tree in a different spot, and repeat ad nauseam. Exhausting, to say the least.
If you are in an information-rich business with many customer interactions, you can know your customers intimately. You can look at individual customer behaviors and start to recognize important and startling patterns. It will take some time, but Abe would say it is time well spent.
Visualizing dynamic data
By Zach Gemignani
February 9, 2006
Find more about:
reporting
visualization
Here's a common problem we run into: An organization wants to understand the dynamics of their customers as they interact with marketing channels, change products, and move between active and inactive. Presenting this type of information is tricky and cumbersome, despite the light it can shed on how a business works.
In systems dynamics-speak, this is the world of "stocks and flows." As entities move through a system, they are either in a particular stock (aka bucket, status, state of being) or flowing to another stock. By measuring the speed of flows and levels of the stocks, you can begin to understand how to manage and optimize a system.
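To make the vocabulary concrete, here is a minimal sketch of stock-and-flow bookkeeping in Python. The stock names, starting levels, and flow rates are all invented for illustration, and the assumption that each flow moves a fixed fraction of its source stock per period is a simplification, not a claim about how any real customer base behaves.

```python
# Hypothetical stocks and flow rates, invented for illustration only.
stocks = {"prospect": 1000.0, "active": 200.0, "inactive": 50.0}

# Assumption: each flow moves a fixed fraction of its source stock per period.
flows = {
    ("prospect", "active"): 0.10,
    ("active", "inactive"): 0.05,
    ("inactive", "active"): 0.02,
}


def step(stocks, flows):
    """Advance the system one period and return the new stock levels."""
    # Size every flow from the starting levels so the order of flows doesn't matter.
    moved = {pair: stocks[pair[0]] * rate for pair, rate in flows.items()}
    new_stocks = dict(stocks)
    for (source, target), amount in moved.items():
        new_stocks[source] -= amount
        new_stocks[target] += amount
    return new_stocks


def simulate(stocks, flows, periods):
    """Return the stock levels for period 0 through `periods`."""
    history = [dict(stocks)]
    for _ in range(periods):
        history.append(step(history[-1], flows))
    return history


if __name__ == "__main__":
    for period, levels in enumerate(simulate(stocks, flows, 3)):
        print(period, {name: round(level, 1) for name, level in levels.items()})
```

Because every unit that leaves one stock lands in another, the total across stocks stays constant from period to period, which is a handy sanity check on real data too.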
For us, the challenge is in finding an elegant way to visualize this dynamic data. I haven't seen an easy or established way to handle this problem. Excel isn't very good at it (though we wrote about how it can do the job if pressed). Here are a couple more examples of ways we tackled the problem:
- You can show stocks and flows in a simple and intuitive way if you are willing to constrain yourself to a couple snapshots in time. The graphic below was a way we displayed the inflow, outflow, and flow between products for a client. The visual language is straightforward: size of balls represents the number of customers, size of arrows shows the magnitude of the flows.

- On another project, we tried something completely different: we created a movie (Windows Media only) of the movement of customers into, within, and out of the business. To make the movie, we represented each customer as a point, then took daily snapshots of each customer's "location" (with a little extra marching between locations to make the flows come alive). It was a fun way to show dynamic data, if nothing else.
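Both approaches start from the same raw material: snapshots of where each customer sits at a point in time. As a rough sketch (not the code behind either project), this is one way to turn two snapshots into ball sizes and arrow sizes. The customer ids, product names, and the `OUTSIDE` label are hypothetical.

```python
from collections import Counter

# Hypothetical snapshots: customer id -> the product they held on that date.
# Assumption: a customer missing from a snapshot is outside the business.
january = {"c1": "Product A", "c2": "Product A", "c3": "Product B", "c4": "Product B"}
february = {"c1": "Product A", "c2": "Product B", "c3": "Product B", "c5": "Product A"}

OUTSIDE = "(outside)"


def count_flows(before, after):
    """Count customers moving between each pair of locations across two snapshots."""
    flows = Counter()
    for customer in set(before) | set(after):
        source = before.get(customer, OUTSIDE)
        target = after.get(customer, OUTSIDE)
        flows[(source, target)] += 1
    return flows


def stock_sizes(snapshot):
    """Count the customers in each location for a single snapshot."""
    return Counter(snapshot.values())


if __name__ == "__main__":
    print("Ball sizes:", dict(stock_sizes(february)))
    for (source, target), n in sorted(count_flows(january, february).items()):
        print("%s -> %s: %d" % (source, target, n))
```

Pairs where source and target match are the customers who stayed put; the rest are the inflows, outflows, and product switches that would become arrows.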
These were each custom solutions. I've been looking around for analytical tools that address this problem. No luck. Here are a few interesting things I found along the way:
- Visitorville is a web analytics tool that shows data in the context of a virtual city with people (site visitors) moving around between buildings (web pages).
- Information Aesthetics is a great blog to see innovative examples of data visualization. This post references Chaomei Chen, information visualization guru, and his Top Ten Unsolved Information Visualization Problems: "Number 8: paradigm shift from structures to dynamics: towards time-varying datasets, data streams & immediate trend-detection"
- Processing is "an open source programming language and environment for people who want to program images, animation, and sound." We've played around with the idea of using this as a way to visualize dynamic data.
- System dynamics software like iThink provides mechanisms for modeling stocks and flows. In my experience, these packages are more about creating simulations than reporting historical information.
2 comments
Kelly O'Day said:
Your post reminded me of some dynamic, interactive charting that I have done with Excel. I have posted a short video on my site that shows how you can add a drop-down box to choose what to plot, plus zoom and scroll capabilities, to enhance the visual analysis capabilities of an Excel chart.
Take a look at the video and workbook.
http://processtrends.com/Files/Interactive_dynamic_excel_charts_wmv.zip
Thanks and keep up the good work...Kelly
koday@processtrends.com
mb said:
One thing I was looking at doing for a job was using Python, PyGame, and the Ming library [1] to create movies that showed data over time. My goal was to take data from web server logs and display a day/week/month's activity.
Once I had the prototype working correctly, I was going to use Blender's [2] Python scripting ability to create nice 3D renderings of the same thing.
[1] References to Ming library
http://www.libming.org/
http://en.wikipedia.org/wiki/Ming_library
[2] http://www.blender.org/
A Script for Misunderstanding
By Chris Gemignani
February 8, 2006
Find more about:
analytics
management
FADE IN:
EXT CENTRAL IOWA.
A CURVING COUNTRY ROAD. AT FIRST GLANCE
A TYPICAL RURAL SCENE, KNEE-HIGH CORN
RUSTLES IN ROLLING FIELDS. IN THE DISTANCE,
WE SEE CONSTRUCTION EQUIPMENT, WE NOW
SEE WE'RE AT THE JUNCTION BETWEEN RURAL
AND SUBURBAN, THE BEGINNINGS OF DEVELOPMENT.
A MINIVAN SWEEPS BY.
INT MINIVAN
CHRIS:
(lazily looking out window, spots a hay bale
in a trashcan, starts with surprise)
Hale of bay? Why are they throwing
out that hale of bay?
JENNIE
(puzzled)
Why's the "of" in it?
CHRIS
What are you talking about, "of in"?
JENNIE
Why are you calling me "oven"?
FADE OUT
That's a real conversation. Thankfully, my wife and I aren't verbally dysfunctional all the time. My personal pet peeve is meetings that exhibit a similar sort of verbal confusion. Does this sound familiar?
JIM
We need X.
AMY
You can't have Y.
JIM
X is really important.
AMY
We'll never be able to get Y done in time.
This is a great way to blow half an hour before Jim and Amy discover that they aren't even talking about the same thing.
This language barrier is particularly acute when business folks try to talk to IT folks. We've run into this problem a number of times. Here's a good conversation on the topic. No solutions today, just venting... and laughing.
1 comment
Mary said:
I've heard this story before and it always makes me laugh. It is so often true.
Geocoding tool roundup
By Chris Gemignani
February 8, 2006
Find more about:
excel
geocoding
googleearth
tools
Batchgeocode.com has put together a clean-looking front end to Yahoo's geocoding service if you need to map small numbers of addresses.
Here is our Excel-based tool that performs similar duty; it takes a list of addresses, geocodes and maps them. We've used this tool to map thousands of names. It's quite a bit quicker than the batchgeocode web tool, though not as flexible in its mapping output. Enjoy.
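For readers who would rather script the mapping half, here is a rough Python sketch. It is not the Excel tool's code. It assumes the addresses have already been geocoded into latitude/longitude pairs and that Google Earth is the mapping target, so it writes a minimal KML document. The labels and coordinates are made up, and `to_kml` is a hypothetical helper name.

```python
from xml.sax.saxutils import escape

# Hypothetical, already-geocoded rows: (label, latitude, longitude).
points = [
    ("Office & Warehouse", 38.9696, -77.3861),
    ("Customer 1", 38.7893, -77.1872),
]


def to_kml(points):
    """Build a minimal KML document with one Placemark per point."""
    placemarks = []
    for label, lat, lon in points:
        # KML lists coordinates as longitude,latitude (not latitude,longitude).
        placemarks.append(
            "  <Placemark>\n"
            "    <name>%s</name>\n"
            "    <Point><coordinates>%s,%s</coordinates></Point>\n"
            "  </Placemark>" % (escape(label), lon, lat)
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n%s\n</Document>\n</kml>\n" % "\n".join(placemarks)
    )


if __name__ == "__main__":
    print(to_kml(points))
```

Saving the output with a .kml extension should give a file Google Earth can open. Labels are escaped so that characters like "&" don't break the XML.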
Note: I've just posted an update of the Excel geocoding tool here.
4 comments
Jim Burnett said:
Excellent tool. Job very well done. We appreciate your work very much. Thanks!
Phillip said:
Very nice tool! Much faster than www.batchgeocode.com.
I noticed at the bottom left of the spreadsheet that it was "looking for" only the 5 digit zip when I put the zip+4 in the field. Does anyone know why the Yahoo search engine did not use the zip+4 data?
Alan said:
Awesome tool...thanks so much! This will really simplify my GIS mapping of epidemiologic data and greatly simplify my life.
Thanks again!
Chris said:
Phillip,
That was my code that restricts Yahoo search to only use the 5 digit zip. I've fixed it and will have a new version out today.
Thanks
Chris
After all these years, you still don't understand me
By Zach Gemignani
February 6, 2006
Find more about:
analytics
datamining
This past week, I caught an interview on The Daily Show with Robert O'Harrow Jr., author of "No Place to Hide." The book is a potentially frightening report on personal information collection by corporations and the federal government. Mr. O'Harrow offered a scary description of the dangers that await ordinary citizens caught in this shadowy experiment (e.g. jailed for a crime you have yet to commit, a la Minority Report).
As is typical, Jon Stewart asked a very insightful question (I paraphrase):
Amazon is always wrong with their recommendations; what makes us think that the government will be able to do anything with all this data?
That's precisely the question that comes to my mind when I hear stories of data collection. From what I've seen, gathering data is easy enough. It is making sense of the data that is hard. The challenge is to find relevant patterns of behavior, then determine their causal link to important outcomes.
Jeff Jonas, now chief scientist at IBM Entity Analytics, invented a data-mining technology used widely in the private sector and by the government. He sympathizes, he said, with an analyst facing an unknown threat who gathers enormous volumes of data "and says, 'There must be a secret in there.' "
But pattern matching, he argued, will not find it. Techniques that "look at people's behavior to predict terrorist intent," he said, "are so far from reaching the level of accuracy that's necessary that I see them as nothing but civil liberty infringement engines." - from Hagerstown Free Army blog, Intercepting Irony
Getting beyond gathering data to actual insight is a surprisingly common problem in the corporate world. There is a common progression that I've seen:
- company wants to be data-driven,
- company puts hooks into its customer-facing systems to gather data,
- data piles into data warehouse,
- first generation data warehouse proves unusable,
- new, better data warehouse is commissioned,
- new, better data warehouse comes online (a year and a few million dollars later),
- value of new data warehouse is diminished by new business direction,
- money runs out for analytics projects
Even the companies that have the stamina to squeeze value from their customer data aren't quite as sophisticated as we imagine (fear?) them to be. The reigning king of data-driven decision making, Capital One, drops a credit card mailing to me on a weekly basis even though I haven't responded in 10 years.
Mr. O'Harrow is probably right to sound the alarms about what could be accomplished with the growing mounds of personal information -- or how personal data may be misinterpreted. That said, I'm skeptical that any organization, in particular the US government, is likely to effectively use such a big pile of data.
UI: Hot or Not?
By Juice Alumni
February 3, 2006
Find more about:
design
interface
The Wisdump blog recently did a design critique of Odeo. They made some good points but specifically thought that the sign-up form was too simple. 37signals did their own critique of the site but arrived at the opposite conclusion.
These are two intelligent and experienced teams (it's not like any schmuck straight out of college can get his own blog) with an above average sense of what makes a good user interface for a website. But they both saw the same site and disagreed. I think a big reason why this happens is that it's hard to separate the elements of design related to organization and the elements related to aesthetics.
Joel makes a point about this in his series on good design. However, I disagree with his point that aesthetics can only enhance a design and not take away from it (imagine if your iPod was puke green). He's on the right track: good design is a two-dimensional problem. One dimension is related to organization and engineering and the other is aesthetics.
Shouldn't the engineering aspect of it be more objective? If UI is engineering, then it should be more than just a variation of "hot or not".
One of the main elements that leads to good design is the issue of prominence: which parts of a website are the most eye-catching to a user, and which elements belong in those prominent locations? As I see it, there are four elements that go into how prominent something should be:
- Value to the user: How much does this feature enhance the user's experience and interaction with the site?
- Value to the site: How much does the site need this feature to function properly?
- Simplicity: How simple is it for the user to learn or use this feature?
- Attractiveness or convenience: How much does this feature engage the user with the site?
Next step: Is there a way to quantify these factors in order to look at UI in a more objective way?
The tantalizing feel of a good data-bashing tool
By Zach Gemignani
February 1, 2006
Find more about:
analytics
tools
We do a lot of data "bashing" around here. That's our preferred term for cleaning, manipulating, matching, and analyzing large chunks of data. Such a macho word is no doubt an attempt to compensate for our soft, white-collar work.
Even so, there is something to be said for tangibility in your work. I like the feeling of fashioning finely honed insights from the raw material of ones and zeros. Unfortunately, the tools for data bashing frequently lack a hands-on feel. (This desire for substance may be the reason why so many people are inclined toward 3-D charts.)
This desire for tangibility crystalized for me when I was introduced to a mapping tool by the folks at GeoWise. Their InstantAtlas geographic presentation software does a great job of making data easy to explore and tangible for the user. In the screenshot below, I can click towns (in red) and the data shows up in the graphs and map. That is tangible, easy-to-use data exploration.

Another data tool I like to use is called JMP. It is an intuitive and straightforward statistics/analysis package that handles larger data sets (south of 500k records). One of its best features is the ability to quickly see frequency distributions of variables - and relate them to distributions of other variables. In the image below, I've clicked on "very satisfied" in the first distribution (dark green) and I can see where those data points show up for the other survey categories.

As with InstantAtlas, you can pick a subset of data and see the characteristics of that data. This simple capability can make the sometimes dreary business of exploring numbers much more engaging.
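The idea behind that kind of linked selection is simple enough to sketch in a few lines of Python. This is only an illustration of the concept, not how JMP or InstantAtlas work internally. The survey rows and field names are invented, and `linked_distribution` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical survey responses, invented for illustration.
responses = [
    {"overall": "very satisfied", "support": "good", "price": "fair"},
    {"overall": "very satisfied", "support": "good", "price": "high"},
    {"overall": "satisfied", "support": "good", "price": "fair"},
    {"overall": "dissatisfied", "support": "poor", "price": "high"},
    {"overall": "very satisfied", "support": "poor", "price": "fair"},
]


def distribution(rows, field):
    """Frequency distribution of one field across the given rows."""
    return Counter(row[field] for row in rows)


def linked_distribution(rows, select_field, select_value, other_field):
    """For each value of other_field, return (count within the selection, total count)."""
    selected = [row for row in rows if row[select_field] == select_value]
    totals = distribution(rows, other_field)
    highlighted = distribution(selected, other_field)
    return {value: (highlighted[value], total) for value, total in totals.items()}


if __name__ == "__main__":
    # "Click" on very satisfied and see where those rows land in another question.
    for value, (hit, total) in linked_distribution(
        responses, "overall", "very satisfied", "support"
    ).items():
        print("%-5s %d of %d" % (value, hit, total))
```

Each pair is the highlighted portion of a bar against the full bar, which is all a chart needs to draw the dark green overlay described above.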






