Franken-measures...or How to Construct a Useful Composite Measure

Sometimes a simple metric isn’t enough. It can’t fully describe the behavior or performance of a system. That’s when you need a Franken-measure: a made-up metric monster that stitches several measures into one composite to capture a complex concept.

Franken-measures go by many names—indexes, scales, ratings, composite or compound measures—and show up in all sorts of places:

Web analytics has an ongoing discussion about a measure of visitor engagement; the famous Google PageRank measures the “importance” of sites using a complex and mysterious algorithm.

Sports have embraced Franken-measures to evaluate player and team performance, e.g. passer ratings, Rating Percentage Index for college basketball, and judging of Olympic events like gymnastics, ski jumping, and ice dancing.

Economists love indexes, e.g. Consumer Price Index, Consumer Confidence Index, Gross Happiness Index.

Marketers use “scores” to simplify their lives, e.g. Q scores measure the familiarity and appeal of popular culture entities and credit scores judge your value as a human being.


Why would I want a Franken-measure?

You are probably already up to here with measures, so why would you want another one—much less one that is going to need extra effort and explanation? Here are a few things Franken-measures can offer:

A short-hand way to communicate about a complex concept. For example, a concept like customer loyalty may encompass everything from share-of-wallet to frequency of interactions to average sales amount.

A mechanism to operationalize a complex concept. Systems can take action on a single number more easily than an array of variables.

A definitive weighting of factors. Rather than constantly bickering about the relative importance of various measures, a Franken-measure can lock down the weighting, avoiding individual biases (in exchange for a systematic bias).

A balance of components. By combining multiple measures, variation in one measure doesn’t unduly bias the results.
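To make the weighting idea concrete, here is a minimal sketch of a weighted composite in Python. The component names follow the customer loyalty example above, but the weights, the value ranges, and the function names are illustrative assumptions, not a recommended model.

```python
from typing import Dict, Tuple

# Hypothetical fixed weights for a "customer loyalty" score (they sum to 1).
WEIGHTS: Dict[str, float] = {
    "share_of_wallet": 0.5,
    "interactions_per_month": 0.3,
    "average_sale": 0.2,
}

# Hypothetical (low, high) range used to put each component on a 0..1 scale.
RANGES: Dict[str, Tuple[float, float]] = {
    "share_of_wallet": (0.0, 1.0),
    "interactions_per_month": (0.0, 20.0),
    "average_sale": (0.0, 500.0),
}


def rescale(value: float, low: float, high: float) -> float:
    """Map value onto 0..1, clamping anything outside [low, high]."""
    clamped = min(max(value, low), high)
    return (clamped - low) / (high - low)


def loyalty_score(customer: Dict[str, float]) -> float:
    """Weighted sum of rescaled components, reported on a 0..100 scale."""
    total = sum(
        WEIGHTS[name] * rescale(customer[name], *RANGES[name])
        for name in WEIGHTS
    )
    return 100.0 * total


customer = {"share_of_wallet": 0.4, "interactions_per_month": 10, "average_sale": 250}
print(round(loyalty_score(customer), 1))  # prints 45.0
```

Rescaling every component to the same range before weighting is what keeps one large-valued measure (such as average sale in dollars) from swamping the others. Once the weights are written down in one place, the argument about relative importance happens once rather than in every meeting.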


What does it take to design a useful Franken-measure?

Not all Franken-measures are effective at achieving these benefits. There are at least four elements that contribute to a good design: completeness, concision, measurability, and independence. These factors can be combined into the Franken-measure Effectiveness Index (FEI) using Juice’s proprietary weighting model.

Completeness. Modeling all relevant performance factors to provide a holistic measurement of the concept.

Concision. A calculation that is as simple and straightforward as possible, making it understandable and logical to users.

Measurability. Using direct performance data rather than relying too heavily on proxies or subjective measures. And from a practical perspective, if you can’t reliably gather valid data, the exercise is futile.

Independence. The components of the measure need to be independent so that variation in one component doesn’t directly drive another.
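One rough way to screen for the independence problem is to check how strongly the candidate components move together. The sketch below uses Pearson correlation for that; the component names, the sample data, and the 0.8 threshold are all made up for illustration, and correlation only detects linear relationships, so a clean result is not proof of independence.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(
    components: Dict[str, List[float]], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return component pairs whose |correlation| meets the threshold."""
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(components.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical weekly observations for three candidate components.
components = {
    "visits": [1, 2, 3, 4, 5],
    "page_views": [2, 4, 6, 8, 10],  # always exactly twice the visits
    "avg_sale": [5, 1, 4, 2, 3],
}
print(correlated_pairs(components))  # prints [('visits', 'page_views', 1.0)]
```

A flagged pair suggests the composite is counting the same underlying behavior twice, which quietly gives that behavior more weight than the formula appears to.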


What can go wrong?

Finally, here are a few of the pitfalls to avoid when setting out to create your perfect Franken-measure:

Complexity. A complex calculation can confuse and infuriate your audience because it is hard to understand what is driving performance and why the measure is moving. Leigh Steinberg, famous NFL agent, said of the NFL passer rating: “Other than one attorney in our office, I am unaware of a single human being who has the capacity to figure a quarterback rating.” The formula isn’t quite as impenetrable as that, but it isn’t for the faint of heart:

[Image: NFL passer rating formula]
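For reference, the commonly published form of the NFL passer rating is four per-attempt components, each clamped to the range 0 to 2.375, then averaged and scaled. A minimal sketch follows; the function and parameter names are chosen here for illustration.

```python
def clamp(value: float, low: float = 0.0, high: float = 2.375) -> float:
    """Limit each component to the range the NFL formula allows."""
    return max(low, min(high, value))


def passer_rating(
    attempts: int,
    completions: int,
    yards: int,
    touchdowns: int,
    interceptions: int,
) -> float:
    """NFL passer rating from a quarterback's raw passing statistics."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    a = clamp((completions / attempts - 0.3) * 5)        # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)             # yards per attempt
    c = clamp((touchdowns / attempts) * 20)              # touchdown rate
    d = clamp(2.375 - (interceptions / attempts) * 25)   # interception rate
    return (a + b + c + d) / 6 * 100


# A stat line that maxes out all four components.
print(round(passer_rating(20, 20, 300, 4, 0), 1))  # prints 158.3
```

The clamping is why the scale tops out at the famously odd 158.3 rather than 100, which is a good illustration of how a defensible set of design choices can still produce a number nobody can interpret on sight.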

Changing the baseline. There will be inevitable pressure to change the Franken-measure formula, which automatically invalidates comparisons with historical performance.

In search of comprehensiveness. A desire to be comprehensive can hamstring the effort. Take Eric T. Peterson’s Engagement Model. He is clearly striving for completeness but at the risk of feasibility, in my opinion.

[Image: Eric T. Peterson's engagement metric]

Black box and credibility. For the people impacted by a Franken-measure, it is important to understand what is going on under the covers. And if it is impossible to share the algorithm or approach, credibility of the creator is all that remains. PageRank succeeds to the extent that people trust that Google has an objective, well-intentioned algorithm. A whiff of agenda or bias would undermine it in the eyes of the audience. Take the National Journal’s “Liberal Rankings” which have managed to label the last two Democratic Presidential nominees as the “Most Liberal Senators.” Coincidences like that can undermine credibility.


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

4 comments


April 13, 2008
Eric T. Peterson said:

Zach,

Got your email, thanks! I guess I understand what you're saying about Tufte's mastery of Adobe Illustrator but I suppose we'll have to agree to disagree on this point. Having done web analytics for a little while I have learned that there is simply no substitute for having the right tool for the job.

If you need to have excellent, beautiful graphs, you need to get AI and learn what Tufte already knows. If you need to make a nominally complex calculation based on multi-session visitor behavior, you need something more powerful than Google Analytics, HBX, or ClickTracks.

It's that simple.

Now, I certainly don't disagree with the vision of the engagement calculation available everywhere --- don't get me wrong! I'd love it if The Engagement Project were so successful that vendors large and small immediately deployed the metric as "standard" in their applications so that everyone could benefit from this new way of thinking ... but we're certainly not there yet so for the time being, visitor engagement (just like bounce rate, real visitor segmentation, and complex attribution models) will be available to some but not all.

Anyway, I hope to see you and Chris at Emetrics so we can continue the conversation. FYI, Carrabis, Gary Angel and I will be giving a presentation on this exact subject so hopefully you guys will be there to root us on.

All the best,

Eric T. Peterson
Web Analytics Demystified, Inc.
http://www.webanalyticsdemystified.com


April 13, 2008
Eric T. Peterson said:

Zach,

Interesting post. Thanks for including the engagement framework with other incredibly valuable and well known measures like PageRank, Consumer Confidence Index, and the all important Quarterback Rating!

Up until recently I would say "my framework is not worthy" but you may have noticed that Joseph Carrabis of NextStage Evolution has offered to help refine the mathematics to make it as complete, concise, measurable, and independent as possible. To this end we've established something I call "The Engagement Project" which we would love you guys to participate in if you're interested.

Our goal is to define a practical, extensible, and "extendable" measure of visitor engagement online, something as comprehensive as what I've described today yet mathematically precise. I love Joseph for this work as his credentials are impeccable.

One thing I suppose I do disagree with in your post above is the "feasibility" of the calculation I have described, but perhaps I don't understand what you're saying. Some folks have commented that they don't like my framework simply because you cannot make the calculation using Google Analytics, ClickTracks, etc.

I see this as kind of a weak argument --- there are obviously different levels of technology at our disposal today, some far more powerful than others. To say that this calculation/framework is impractical simply because a company doesn't have the right tool for the job is like complaining that you're unable to make a visually rich graph using a TI-81 calculator ...

The argument is similar to saying that "bounce rate" is impractical because a handful of popular applications still don't report on this very un-Franken-metric. Few would argue the utility of bounce rate, yet the feasibility of the metric depends 100% on which application you've deployed.

While the mathematics are getting a well-deserved refinement by Mr. Carrabis and others, the reality is that powerful tools like Visual Site, Coremetrics, WebTrends, and IndexTools are all capable of making the "Franken-measure" pretty much exactly as I have described it. Feasible, possible, and happening as we speak in some very large companies.

Anyway, I do consider it quite an honor to be cited in your blog and would be very excited if you guys would like to join Joseph and me in our work.

See you in San Francisco!

Sincerely,

Eric T. Peterson
Web Analytics Demystified, Inc.
http://www.webanalyticsdemystified.com


April 13, 2008
Jeff Hammerbacher said:

Hey Zach,

Great post as always. Having looked at composite measures in finance and now the web, I'd like to put extra emphasis on the "Black box and credibility" component.

It's important with a composite measure to get a good feel for its behavior under various states of the world. Having absolute transparency back to the data source for each measure is critical to develop this intuition.

I'd also add that producing easily understood examples of different factor levels and showing how they are scored by the composite measure will help develop intuition and confirm the utility of the single measure.

Regards,
Jeff


April 13, 2008
Zach said:

Eric, Thank you for the invitation to participate in "The Engagement Project." We'd love to be involved. I think it is great that there is some momentum behind this idea. I tackled it a few years back at AOL for some internal reporting and wondered at the time why the industry hadn't standardized. A little naive, I'm sure.
As for my comment about feasibility risk, I should probably wait to see how things evolve. However, I disagree with the assertion that the limitations of tools are unimportant. By analogy, the beautiful, multi-dimensional charts that Tufte likes to create in Adobe Illustrator are generally out of reach for ordinary analysts (both due to software and skills). His principles are valuable; his approach is impractical for everyday application. A central challenge for the engagement measure, in my view, is to find the right balance between reaching and persuading a broad audience while not sacrificing the core goals of the measure. I look forward to the continued discussion.
Jeff, I love your idea of demonstrating how different factor levels can impact the result.
