Billions and Billions of Reports (a la Carl Sagan)

I recently came across a white paper on the "five styles of BI" and thought it would be an interesting read. As it turns out, it was more interesting than I expected. In this paper, the vendor (in order to protect the innocent, we’ll just call them MacroTactics) made a statement regarding the performance capacity of its solution: 72,000 reports per hour. Let’s see, 72,000 reports per hour... that would be 576,000 reports in an 8-hour day... and 149,760,000 reports per year. Wow. Who’s reading that stuff?
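The math is worth a quick sanity check. Here’s a back-of-the-envelope sketch (the 8-hour day and the 260 working days per year are my assumptions, not the vendor’s):

```python
# Sanity-check the vendor's throughput claim.
reports_per_hour = 72_000
hours_per_day = 8        # assuming an 8-hour working day
workdays_per_year = 260  # assuming 52 weeks x 5 working days

reports_per_day = reports_per_hour * hours_per_day
reports_per_year = reports_per_day * workdays_per_year

print(f"{reports_per_day:,} reports per day")    # 576,000
print(f"{reports_per_year:,} reports per year")  # 149,760,000
```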

Now, I fully buy into the fact that applications that deal with lots and lots of data need to be hugely scalable, but what I don’t buy is how this is in any fashion a measure that anyone can use to figure out whether a particular BI solution is right for them. I can just imagine the requirements spec for that solution: "15.1.182.f - Solution must be capable of creating 70,000 reports per hour. Alternatively, solution will be able to generate 140,000,000 reports per year." 140 million reports! Incredible. (Now, what did I do with my mini-me?)

Seriously, here’s the thing. More reports is rarely the answer. We already have plenty of data and plenty of reports. What buyers and users really want is fewer reports and more information that helps them get their jobs done better and faster.

We’d encourage business intelligence vendors to think of themselves more as data storytellers than data factories churning out generic report widgets…even if they can do it at incredibly high speeds. From this perspective, you wouldn’t want to hear Steven Spielberg bragging about his ability to pump out a dozen movies a year or J.K. Rowling trumpeting her ability to write 1000 pages a year (hmm, wait a sec).

How to Feel Better About Your Data Warehouse Fiasco

Here’s a little predictive analytics:

About a year ago, I took a swipe at the “$80 million supercomputer to analyze NYC student achievement.” It smelled more like a super sales job than a super useful analytical tool.

At the time I had said:

Teachers are underpaid, hardly appreciated, and overworked. I can only wonder what the half-life is of a system that asks teachers to log on to get information delivered by the “chief accountability officer.”

Well, it appears that things haven’t gone that smoothly with the supercomputer. Today, I received a link from Leonie Haimson, a NYC education advocate, to a story entitled SCHOOLS COMPUTER AN $80M ‘DISASTER’.

Not only has the supercomputer struggled to gain much traction with users (“The school system’s new $80 million computer super system to track student performance has been a super debacle, teachers and principals say”), it has coincided with severe budget cuts.

We see these data warehousing problems all the time with our clients, and the NYC supercomputer displays all the hallmarks:

  • Delivery delays: Nearly six months after the Department of Education unveiled the “first of its kind” data-management system, the city’s 80,000 teachers have yet to log on because of glitches and delays.

  • Bad user experience: Many principals have complained that it runs slowly, lacks vital information, and is often too frustrating to use.

  • Complicated training and set-up: School officials were hoping to have everyone hooked up and trained within months, but there have been delays in creating IDs and passwords for teachers.

  • Trying to do too much, delivering too little: The principal added that she preferred to get student information from a combination of old data systems “rather than wait for ARIS to churn and churn and churn and maybe give me half the report I need.”

  • Massive cost: Complaints about the expensive system—on which nearly $35 million has been spent so far—have gotten louder since the city unceremoniously chopped $100 million from individual school budgets last month.

  • And yet, few success anecdotes to justify the investment: “ARIS had already enabled her data team to analyze the performance trends of the school’s many English-language learners.”

It does offer one thing that I haven’t seen before: a Chief Accountability Officer.

Microsoft says: “BI is Really Hard to Use”

We don’t tend to agree with Microsoft when it comes to data analysis and presentations. In fact, we’ve even been critical of them for misrepresenting data, excessive visual “flair”, missed opportunities to improve Excel, forgetting their power users, subpar presentation tools, and wasteful slide masters.

With all these past differences, I was a little surprised to find that we do share some common ground. Check out the comments (from an article in Internet News) by Peter Klein, CFO of Microsoft’s Business Division, describing the world of business intelligence:

“I’ve talked to a lot of customers about business intelligence and the one thing that they tell me is it’s really hard to use,” said Klein, speaking at the Credit Suisse conference.

“‘I’m not getting the value out of the investment that I made,’” Klein said customers had complained. “‘I have invested a lot in my back-end systems, and today 10 percent or less of my employees actually touch it, or get access to the data. I’ve got six different BI solutions across multiple different departments, none of which talk to each other. And they’re hard to use, so I’ve got to send people to training for two weeks to learn how to use it.’”

Finally, we are speaking the same language. Now, I’m curious to see what they are going to do about it.

Do you have Insurgent Data?

There are two kinds of people in this world: those who put things into two categories and those who don’t. Maybe this isn’t the best representation of the complexities of the human race, but it does give me a cheap lead-in to compare two types of problem solutions: “high tech,” focused on tools, and “high touch,” focused on interpersonal communications.

I was reminded of these two approaches by an interesting recent article in Wired that expresses an opinion about why America’s performance in Iraq has been disappointing. The basic premise of the article is that America has entered into this engagement in a “technology networked” fashion, drowning it in technology; the more, the better.

The article suggests that US forces would make more progress if they were to spend more time on a “socially networked” approach. For instance, instead of remote-controlling a drone from 100 miles away, spend more time drinking chai with local leaders. Not the absence of technology, but the incorporation of technology into a socially based environment.

“If I know where the enemy is, I can kill it. My problem is I can’t connect with the local population.” This was a quote from one division commander. Change a couple of words and you end up with a statement that many of us would find all too familiar:

“If I know where the inefficiency is, I can fix it. My problem is I can’t connect with my data.”

Aren’t we witnessing this in spades right now in the BI space? There’s no shortage of tools, or of features in those tools. The challenge is figuring out who the real insurgents are and how you deal with them. If you’ve been reading the Juice blog for very long, you have a pretty good feel for how we approach what we believe is a social problem (high touch) and not a technical one (high tech).

The good news is that the US forces are changing their approach to socialize more with the Iraqi people—hopefully leading to a better Iraq. Is there good news for the BI space? We’d like to hear from you on how you’re making sure you focus enough on the social “high touch” aspects of our space. What’s your insurgent data? How can you get to know it better?

The Last Mile of Business Intelligence

“The last mile” is a term often applied in the telecom industry in reference to “the final leg of delivering connectivity from a communications provider to a customer.” It is an expensive and complex step because of the challenge of pushing information from centralized, high-capacity channels to the many diverse end-points where information is ultimately used.

We think there is a “last mile” problem in business intelligence too. This critical bridge between data warehouses and communication of insights to decision-makers is often weak or missing. Your investments and meticulous efforts to create a central infrastructure can become worthless without effective delivery to end-users. “But how about my reporting interface?” you wonder. That’s a creaky and narrow bridge to rely on for the last mile of business intelligence.


Listening to our clients, we are confident the last mile is a real problem. The ultimate source of this failure is less clear. Here are a few theories:

1. The engineers who built the data warehouse also build the interface. No offense to the talented individuals who can push around, clean, normalize, and integrate data—but they may not be ideally suited to designing a user interface for non-technical users. A designer wouldn’t create charts that look like this (our favorite example of chart-based encryption):

Chart-based encryption

In the worst case, developers are dismissive of user experience. I’ve met with IT folks who felt confident that providing a massive data table would provide a suitable solution for delivering information to users. “Hey, they’re getting their data. Is there a problem?”

2. Reporting is considered the fundamental mechanism for working with data. Here’s a framework we’ve started to consider in thinking through the multiple approaches for getting value from data:

Last mile triangle
  • Reporting lets you monitor things that are well-understood and relatively predictable.
  • Exploration or analysis helps you understand new processes and erratic and shifting behaviors.
  • Presentation is about communicating insights and understanding, often building on both reporting and analysis.

Many people assume that a reporting tool is sufficient to do in-depth analysis and communicate results. That’s like trying to build a deck with a screwdriver.

3. Poor fundamentals in information display. Despite the efforts of folks like Edward Tufte and Stephen Few, general literacy in this area is still low. Shiny, 3D pie charts are still acceptable, even desirable in some places. Particularly disturbing is the persistence and pervasiveness of this problem in Excel where there still remains some confusion as to why this is bad information display:

Excel data bars

You don’t have to go any further than the Dashboard Spy to find examples of the visual muck that is commonplace.

Centralized Confusion

Today’s post is brought to you by Andrew White of Gartner, from an article in their 2007 CRM conference brochure:

What’s the single biggest benefit of practicing MDM?

There are multiple drivers that help enterprises decide to embark on an MDM [1] program. Implementing a CDI-focused [2] MDM program will help implementations of CRM [3] achieve a higher return by enabling better cross-marketing and selling.

Implementing PIM [4] within MDM will help supply chains fulfill orders more timely [sic] and introduce new products more quickly. Embedding MDM in an SOA [5] environment contributes to business (process) agility through support of more rapidly developed composite applications; and others help cut costs by supporting better procurement practices.

Way to cut through to the heart of the issue, guys. Let’s see if we can decode what they’re saying:

Knowing more about your customers will help you find more products that existing customers want. It will help develop those products too. And let’s not forget your web apps. They’ll be easier to develop and easier for other companies to integrate with if you have your data well organized.

It’s nice to be able to decode this, but semantically, there’s nothing there. This response amounts to "Trust us, it’s great!"

[1] Master Data Management is another salvo in the eternal battle between centralization and decentralization in organizations. The wheel turns; today it’s MDM, in 5 years it will be called Centralized Metadata Integration.

[2] Customer Data Integration means centralizing how you track customer-related information.

[3] Customer Relationship Management systems track interactions with your customers.

[4] Product Information Management is CDI for products. See how easy this is getting?

[5] Service Oriented Architecture is a way of building computer services as little pieces rather than big integrated applications.

What is Worse Than a “Super Mugging”?

I don’t know what you call it, but I know it when I see it. A couple months back I wrote about IBM’s sweet $80 million contract to develop ARIS (Achievement Reporting and Innovation System) for the New York City public schools. At the time I used some harsh words to describe this fleecing: swindle...preying on clients’ lack of expertise...Dr. Evil...wasted time and effort.

News comes to me from Leonie Haimson, Executive Director of Class Size Matters, that the $80 million price tag is, well, a starting point. She pointed me to a recent article that describes the creeping costs:

The education department’s new $80 million student-tracking computer system just got more expensive - and some parents are questioning whether that’s the best use of the money.

To ensure that children’s test scores and other private data don’t get into the wrong hands, the city began accepting bids this week from companies that specialize in safeguarding information, which experts say could add several million dollars to the system’s price.

"What’s not lost on parents of kids in overcrowded schools is that with the money being spent on this, we could build and staff several more schools," said Tim Johnson, president of the Chancellor’s Parent Advisory Council.

Parents are also wondering whether the system’s mounting cost is worth it - and why education officials didn’t anticipate the extra cost sooner. —New York Daily News

It does seem odd that an $80 million system wouldn’t come pretty well stocked with security, particularly from a blue-chip vendor like IBM. On top of that, Leonie hints at other costs that aren’t being directly counted toward the implementation of this system:

This initiative has mushroomed into a huge expense that threatens to overwhelm the entire school system, with all the SAFS, data inquiry teams, tests, and even the community district superintendents gobbled up to interpret and try to "coach" schools in the use of the massive data that will be spewed out. The DOE wants to charge much of this to the "contracts for excellence" and our CFE dividend, though it’s a real stretch to see if any of this falls under the specific programs outlined by the state.

Good luck to Leonie, Patrick Sullivan and the others who are stepping up to question this white elephant project.

The Google Analytics relaunch

Google Analytics has been rebuilt and the result redefines the frontiers of doing analytics on the web. Avinash Kaushik has the definitive early review.

Google Analytics v2

I had the privilege of attending the launch and playing with the early release. Here are a few things I noticed.

  • Speak my language: Google has put a lot of effort into replacing specialized terms with everyday ones. This makes the application usable by a broad base of people and is one way to fight GUI Jock-itis.
  • Speed kills: The interface is easily reconfigurable and fast. I’ve long argued that interface speed is a substitute for configuration options. I’m curious to play with the tool and get a better sense if this is true.
  • Flex rules: Much of the componentry for viewing data in Google Analytics is built in Adobe Flex. This is similar to Google Finance, and not at all like GMail or Google Reader, which use the GWT. We believe this has profound implications for analytical tools on the web and will dig into this in later posts.

An $80 Million Super-Mugging

Ah, the sweet smell of a swindle. Don’t you just hate it when consulting companies cajole their way into deals with hand-wringing about technology and, especially, by preying on clients’ lack of expertise?

I’ve seen some of these situations up close, but nothing so ugly as this story.

$80 million supercomputer to analyze NYC student achievement

March 6, 2007, 7:58 AM EST NEW YORK (AP) — To understand student performance, the city will spend $80 million on a massive supercomputer that will crunch huge amounts of data and offer up-to-the-minute reports to teachers, principals and eventually parents, the Daily News reported Tuesday.

One million students and no high-volume transactional data? That might be huge to Dr. Evil, but even by late-’90s standards that’s not huge. The system that was sold to New York is more along the lines of a CRM system for a medium-sized insurance company.

The “super” reference here is pure drive-through mentality. In the same way that we are a nation that’s overfed and undernourished, this is about a super-sized services contract that sits atop something that could be handled by a regular-sized computer.

The information fed into the IBM-designed system, called Aris, or “Achievement Reporting and Innovation System,” could include existing data on students—such as gender, race and any disabilities—along with new data from incremental testing.

Some aren’t so pleased with the system’s price tag.

“You can lower a lot of class sizes with that money—or buy a lot of supplies,” teachers union President Randi Weingarten said in a statement obtained by the Daily News.

Mayor Michael Bloomberg told the tabloid the cost was worth it.

“Every child in this city deserves a quality education and we will spare no expense,” he said.

This is where the sweet smell of swindle comes in. There is a difference between being willing to make the investment and having a no-bid contract.

Jim Liebman, the Education Department’s chief accountability officer, also lauded the system.

“Aris will bring together every bit of learning information that we have on every one of our 1.1 million students,” Liebman said. “Now, school professionals will be able to slice and dice that data to see what’s wrong.”

Teachers are underpaid, hardly appreciated, and overworked. I can only wonder what the half-life is of a system that asks teachers to log on to get information delivered by the “chief accountability officer.”

And from an article in InformationWeek, we’re enthralled by a description of the system capabilities:

“Think of a teacher trying to help a student struggling with geometry,” says Michael Littlejohn, VP of public sector for IBM global services. “The teacher could tap into the system and search for best practices on geometry instruction, and get contact information for teachers identified as having strong skills in that area.”

Sometimes it’s good to reinvent the wheel - usually when you’re trying to learn about wheels. But not when you’re drawing away cash from an entity that doesn’t have it to spare. Something like this could be built with off-the-shelf, mature products for a fraction of this wasted time and effort.

Sure, a fully-integrated, one-stop solution is going to run up the price, but the functionality doesn’t sound particularly whiz-bang. Best practices for teaching geometry can be found at Curriki or Edutopia or Wikiversity or Openplanner or Learnamic.

The real shame is not allowing such a system to connect more than just the overworked NYC school system teachers. But what would we call such a thing? An inter-net, perhaps?

Nah, that would never catch on.


Esurance–Competing on Analytics

Recently I caught up with my college friend John Swigart, who now runs the marketing organization at Esurance. When the conversation inevitably drifted to business, I asked how Esurance was using data to make decisions. I expected to hear the same old story—big failed data warehouse projects, piles of underutilized reports, frustration about not being able to understand how the business was performing. I was way off.

It seems that John works for the rare company that has managed to live the analytics dream. Esurance competes on analytics—not in the idealistic model highlighted by Tom Davenport, whose “full-bore” analytics competitors are defined by:

“Top management had announced that analytics was key to their strategies; they had multiple initiatives under way involving complex data and statistical analysis, and they managed analytical activity at the enterprise (not departmental) level...

...Employees hired for their expertise with numbers or trained to recognize their importance are armed with the best evidence and the best quantitative tools. As a result, they make the best decisions: big and small, every day, over and over and over.”

That’s window-dressing. John didn’t make any grandiose pronouncements of Esurance’s analytical achievement or talk of the best tools and most complicated models. He simply stated that data-based decision-making has been part of the culture from the very beginning and that he considers it essential to running a smart business. A few points that he emphasized:

  • Clear linkages between metrics. There needs to be a well-understood hierarchy that has important financial measures at the top (i.e., revenue) and connects to the underlying drivers.
  • Frequent reviews of reporting. Senior managers get together on a regular basis to look through the core reporting. These meetings are detailed, but somehow useful enough that people stay committed to the process.
  • Learning takes time. John recognized that Esurance could not be as evolved in its understanding of the business without a commitment to this approach from the very beginning.

After getting off the phone, I sent John a few questions so our readers could get a taste of their approach:

How has Esurance managed to develop a culture that embraces decisions using data?

We don’t make decisions based on “I think we should do this.” We look at data to find out what we know, then decide what to do based on the facts. We identify expected outcomes up front and determine how we are going to measure the change before we implement something. Also, a data-driven culture starts at the top of our organization.

What processes do you have in place to get the right data in front of the right people?

We have a centralized data warehouse and reporting structure. Everyone gets their data from the same place and the metrics are universal. It took 3-4 years to get right, and we built it from scratch. It takes a substantial commitment to pull off.

What is the role of the analyst in your organization? What tools do they use?

We have technical analysts and DBAs in our business intelligence group who deal with the more technical issues. In Marketing, then, we have analysts on the individual marketing teams who work closely with the business people. They use some basic tools, nothing terribly fancy.

From an analysis perspective, what do you do when you are testing new marketing opportunities?

All tests are done in as much of a controlled environment as possible. With so many moving parts, this can be difficult, but it is important.

How has analytics contributed to the success of Esurance?

It is truly one of our competitive advantages. We would not be where we are today without great data and a dedication to using it through all levels of the organization.