
Our Blog



Twitter’s wild popularity hasn’t obscured the fact that the service needs to eventually make money. The concept of “Twitter analytics” as a revenue stream has come up often enough to make my ears itch and my nose burn.

Twitter’s new business development lead explains that the company is “developing a range of analytics and metrics products and services built around the information contained in tweets”…and “trying to figure out what are the appropriate metrics around engagement and how to convey those.”

Web Strategist Jeremiah Owyang raises the concept of a Twitter CRM solution, in which Twitter would offer brands its own analytics system to help them track and manage conversations.

The Twitter ecosystem has responded with a wide range of tools for analysis of Twitter data. Web analytics behemoth Omniture recently announced the integration of Twitter data into their platform. At the same time, web analytics consultant Eric T. Peterson has been vigorously marketing Twitalyzer, a tool to evaluate individuals’ use of Twitter and metrics of influence. Google’s Chrome Experiments released a cool visualization tool called Social Collider that reveals cross-connections between conversations on Twitter. Here are a few more Twitter analytics tools that I’ve run across:

Despite all the activity, I haven’t yet seen a solution that offers the kind of valuable analytics that a company could use to understand the Twitter conversation relevant to their business. The applications above are either focused on the measurement of individual Twitter users or offer a high-level tracking of words and phrases in the general conversation. They treat tweets as transactions: How many? How valuable? Who’s listening? Who’s responding?

To me, the great and more rewarding challenge in Twitter analytics is to synthesize the substance of those conversations. Imagine if you went to a party and could overhear everything that everyone else was saying. Who talked the most and who had the greatest audience is less interesting than what topics people were discussing and what was said.

I wanted to take a shot at this type of Twitter analytics.


Analysis Approach

First I had to define a particular domain or topic area. For expediency, I focused on all the tweets that included the word “analytics.” Using the Twitter search API, I pulled the first 500 tweets for each day in March and parsed the results to pull out users, urls, and other characteristics of the tweets.
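The parsing step can be sketched in a few lines of Python. This is only an illustration of the idea, not the script I actually ran: the regular expressions and the parse_tweet name are assumptions, and real tweets have more edge cases than these patterns cover.

```python
import re

# Hypothetical patterns for illustration; real tweet parsing has more edge cases.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def parse_tweet(text):
    """Return the mentions, hashtags, and URLs found in one tweet."""
    urls = URL_RE.findall(text)
    # Remove URLs first so a '#fragment' or '@' inside a link is not miscounted.
    stripped = URL_RE.sub(" ", text)
    return {
        "mentions": MENTION_RE.findall(stripped),
        "hashtags": HASHTAG_RE.findall(stripped),
        "urls": urls,
    }


# A made-up tweet, not one from the dataset.
sample = "RT @some_user: Great post on web analytics http://example.com/post #wa"
parsed = parse_tweet(sample)
print(parsed)
```

Running something like this over every tweet gives per-tweet lists that can then be counted to find the most common users, links, and hashtags.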

To analyze the words and phrases being used, I uploaded the resulting 11,300 tweets into Concentrate, our search analytics tool. Concentrate is optimized for search query text (i.e. short phrases without a lot of punctuation). Nevertheless, it has a number of features that make text analysis easier, including breaking out the most common words, phrases and patterns. It also allows for filtering by words to create frequency statistics.

There were two main questions I wanted to address:

  1. What topics are people discussing?
  2. What is the structure of the conversation?

Topics of Conversation

The content of the Twitter conversation can be analyzed as words, sites/links, people/groups, and company/products.

Words

I used Concentrate to find the most common words, then I dumped those words into Many Eyes to create this “Wordle-brand” word cloud. Many Eyes has a nice feature that takes out the “common English words.” Clearly Google dominates the conversation, and I even had to artificially reduce its value to make the other words legible.

Word cloud

Below are the top 10 (non-common) words that show up in the analytics conversation

Top words
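If you don’t have Concentrate or Many Eyes handy, a rough version of the same word count is easy to sketch in Python. The stop-word list below is a tiny stand-in (I don’t know what list Many Eyes uses), and the sample tweets are made up.

```python
from collections import Counter
import re

# A tiny illustrative stop-word list; the list Many Eyes uses is not known here.
STOP_WORDS = {"the", "a", "an", "to", "of", "for", "and", "in", "on", "is", "with"}


def top_words(tweets, n=10):
    """Count lowercase word tokens across tweets, skipping stop words."""
    counts = Counter()
    for text in tweets:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up tweets for illustration only.
tweets = [
    "Google Analytics adds a new report",
    "New to analytics? Start with Google Analytics",
    "Analytics for the rest of us",
]
print(top_words(tweets, n=3))
```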

Twitter has become a mechanism for sharing interesting links (I’ll get to data on that in a bit). Looking at the most popular sites and specific links gives a sense for what people in this community are reading and talking about.

Top sites and links

People and Groups

Twitter users have a few conventions for connecting tweets to people or groups:

  • “#” (i.e. hashtag) associates the message with a group, topic or event.
  • “RT” (or “via”) is to repeat or “retweet” something someone else has said.
  • “@” associates a tweet with another user, whether retweeting their message or directing a comment to them.

Here are the most common groups and people referenced in the Twitter data.

Top people and groups

And the people with the most tweets using the word “analytics”

Top talkers

Companies and Products

I was also interested in what companies or products were referred to most frequently. It is no surprise that Google dominates the conversation. Microsoft gets on the board with the recent closing of their adCenter product. I think we can safely assume they won’t be showing up that often in the future.

Top companies


Conversation Structure

Beyond the specific content of the conversation, I was also curious about how people who are talking about analytics tend to use Twitter.

Types of Tweets

Eric T. Peterson has four things he considers “signal” (versus “noise”) in the Twitter conversation:

  • References to other people (defined by the use of “@” followed by text)
  • Links to URLs you can visit (defined by the use of “http://” followed by text)
  • Hashtags you can explore and participate with (defined by the use of “#” followed by text)
  • Retweets of other people, passing along information (defined by the use of “rt”, “r/t/”, “retweet” or “via”)

While I’m not fond of this definition, examining these different types of tweets (along with question-based tweets) provides a good lens into the nature of the conversation. The following chart shows the percentage of tweets that fall into each of those categories.

Tweet Types
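For anyone who wants to reproduce a breakdown like this, here is a minimal Python sketch. The regular expressions approximate the definitions quoted above (so only “http://” links count), the question test is my own guess at a reasonable rule, and the sample tweets are invented. A tweet can fall into more than one category, so the percentages need not add up to 100.

```python
import re

# Regexes approximating the quoted definitions; the exact matching rules
# behind the chart are not given in the post, so these are assumptions.
TYPE_PATTERNS = {
    "reference": re.compile(r"@\w+"),
    "link": re.compile(r"http://\S+"),
    "hashtag": re.compile(r"#\w+"),
    "retweet": re.compile(r"\b(rt|r/t|retweet|via)\b", re.IGNORECASE),
    # A '?' at the end of a word, so query strings inside URLs are mostly ignored.
    "question": re.compile(r"\?(?:\s|$)"),
}


def tweet_type_shares(tweets):
    """Return the percentage of tweets matching each type (types can overlap)."""
    if not tweets:
        return {name: 0.0 for name in TYPE_PATTERNS}
    return {
        name: 100.0 * sum(1 for t in tweets if pattern.search(t)) / len(tweets)
        for name, pattern in TYPE_PATTERNS.items()
    }


# Made-up tweets for illustration only.
tweets = [
    "RT @some_user: new analytics post http://example.com/a",
    "Anyone using analytics for email?",
    "Reading about #analytics today",
    "analytics is hard",
]
shares = tweet_type_shares(tweets)
print(shares)
```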

It would be all the more interesting if you could follow the types of tweets across time and compare against other topic areas. I suspect that the URL linking within Twitter is on the rise and is turning Twitter into a Delicious-style bookmark sharing service — without the functionality to save, tag, annotate, and view the bookmarks at your leisure.

Given all the sharing of links, I wanted to get a clearer picture of what happens when a link becomes popular. The graphic below shows some of the top links during the month and the amount they showed up in tweets by day. The red bars represent days where ten or more tweets included the link. A couple links demonstrated popularity over a week or so, but the rest sizzled then disappeared in a day or two.

Link Evolution

Activity Distribution

Finally, I took a look at the distribution of users by the number of tweets including the word “analytics.” It was no surprise that the vast majority of the 7,700 twitterers only used the word once in March (of course this doesn’t tell us about their other twittering activity). Obviously there is a small population of people at the core of the discussion.

Activity Distribution
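The distribution itself is a count of counts: first tally tweets per user, then tally how many users share each tally. A small Python sketch, with invented author names standing in for the real data:

```python
from collections import Counter


def activity_distribution(authors):
    """Map 'number of tweets' -> 'number of users with that many tweets'.

    authors is a list with one entry (the author's name) per tweet.
    """
    tweets_per_user = Counter(authors)
    return dict(sorted(Counter(tweets_per_user.values()).items()))


# Made-up authors for illustration only.
authors = ["ann", "bob", "ann", "cy", "ann", "dee", "bob"]
distribution = activity_distribution(authors)
print(distribution)
```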


While you’d have to go into more depth to answer detailed questions, there are a number of interesting take-aways for me, including:

  • “Analytics” means “web analytics”, not business intelligence or general reporting about sales, operations, or marketing.
  • Google Analytics is the star of the party. Of course, the fact that the brand name includes “analytics” is an advantage, but I didn’t see a giant “Juice” in the word cloud.
  • Twitter is an echo-chamber. The content clusters around particular subjects, with people retweeting and sharing links about the big news of the day. There are a dozen or so stories that dominated the conversation over this time period.

What’s next?

There are a lot more views of this data that could be enlightening for a company interested in having a real-time understanding of their marketplace. For example, it would be interesting to provide more insight into:

  • Who is at the center of these conversations?
  • What is the positive or negative tone of the discussion (Twitter actually offers this information as part of their API)?
  • How is the conversation changing over time?
  • What is the best way to define the boundaries of a domain-specific conversation?

These are the types of questions that I’d like to see addressed in a more complete Twitter analytics tool.







This is a follow up to “Target Long Tail Searches with Keyword Patterns”

To get a sense of the scale of the long tail in search, Dustin Woodard recently put together an analysis of U.S. search data collected by Hitwise over a 3 month period, during which they measured 14 million different search terms. How did these break down?

  • Top 100 terms: 5.7% of all search traffic
  • Top 500 terms: 8.9% of all search traffic
  • Top 1,000 terms: 10.6% of all search traffic
  • Top 10,000 terms: 18.5% of all search traffic

This means if you had a monopoly over the top 1,000 search terms across all search engines (which is impossible), you’d still be missing out on 89.4% of all search traffic. There’s so much traffic in the tail it is hard to even comprehend. To illustrate, if search were represented by a tiny lizard with a one-inch head, the tail of that lizard would stretch for 221 miles.
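The 89.4% figure is simply 100% minus the share captured by the top 1,000 terms. The same subtraction for each cutoff in the list above, as a quick Python check:

```python
# Cumulative share of search traffic (percent) captured by the top-N terms,
# as quoted in the list above.
head_share = {100: 5.7, 500: 8.9, 1_000: 10.6, 10_000: 18.5}

# Everything outside the top N is the tail.
tail_share = {n: round(100.0 - pct, 1) for n, pct in head_share.items()}

for n, pct in tail_share.items():
    print(f"Outside the top {n:,} terms: {pct}% of search traffic")
```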

Yesterday, we described the concept of search patterns and how you can use them to summarize this type of long tail text data. Today, we will walk through a case study we put together to explain how Concentrate’s pattern discovery feature will help you find new competitive insights.

You can replicate this study yourself by signing up for the Plus version of Concentrate and loading competitive search data from providers like Hitwise, Compete, Keyword Discovery, or comScore. The input search data used in our analysis consisted of a sample of unique queries leading to clicks on top travel domains during Spring 2006, along with their frequency of occurrence (the chart is truncated after the 20th query):

Raw search data: most frequent queries by site

unique search queries for travel sites

We loaded the full dataset of queries into Concentrate to generate summary patterns for each of 5 top travel sites. After each file of unique queries and associated metrics is loaded, the application generates reports which include summary statistics based on the head (top 50) and tail queries for each site. This is a good way to start looking at the data if we want to get a sense of each site’s long tail search strategy:

Head vs. tail queries for top travel sites

head vs tail for travel searches
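The head/tail split is easy to approximate outside of Concentrate. The sketch below is not Concentrate’s code; it assumes a simple mapping of query to frequency and treats the most frequent queries as the head. The sample numbers are invented and the head size is shrunk from 50 to 2 so the example fits on a screen.

```python
def head_tail_split(query_counts, head_size=50):
    """Split query frequencies into head and tail traffic shares.

    query_counts maps each unique query to its frequency. The head is the
    head_size most frequent queries; everything else is the tail.
    """
    ordered = sorted(query_counts.values(), reverse=True)
    total = sum(ordered)
    if total == 0:
        return {"head": 0.0, "tail": 0.0}
    head = sum(ordered[:head_size])
    return {"head": head / total, "tail": (total - head) / total}


# Made-up frequencies for illustration only.
sample = {
    "brand name": 60,
    "cheap flights": 20,
    "hotels in rome": 10,
    "things to do in austin": 6,
    "paris hotel near louvre": 4,
}
split = head_tail_split(sample, head_size=2)
print(split)
```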

It appears that the long tail makes up the overwhelming majority of traffic for the travel planning and review sites, but is a much smaller percentage for transaction focused sites like Expedia and Travelocity. Measuring the size of the head and tail gives us a rough idea what is going on, but we need to dig deeper if we want to benchmark where we stand in various categories and produce actionable insights. Inspired by a recent New York Times infographic “Words They Used”, our data visualization guru, Chris Gemignani, downloaded the Pattern CSV file that Concentrate generated for each of these sites and created the following view of competition in the travel search sector:

Comparing travel searches by pattern

long tail query patterns from Concentrate

This chart compares the proportion of searches that go to each travel site for the top 25 patterns in the travel sector. The site getting the most traffic for each pattern is highlighted. Only searches that wound up at one of these five travel sites are considered.

The difference in search pattern profiles for these sites is striking. TripAdvisor leads the pack in the long tail, which makes sense given the huge amount of long tail user generated content on the site. TripAdvisor owns most of the pattern categories, but Yahoo Travel and Hotel-Guides take the lead in niche areas like maps and hotels. Traffic to Expedia and Travelocity is largely composed of navigational and branded queries (not shown). The only long tail patterns they have significant share for are “[x] ticket” and “cheap [x]”.
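To make the share calculation concrete, here is a rough Python sketch that matches queries against a few fixed “[x]” templates and computes each site’s share per pattern. This is not how Concentrate discovers patterns (it finds them automatically); the function names, site names, and counts are all made up for illustration.

```python
import re
from collections import defaultdict


def pattern_to_regex(pattern):
    """Turn a template such as 'cheap [x]' into a whole-query regex."""
    parts = [re.escape(p) for p in pattern.split("[x]")]
    return re.compile("^" + "(.+)".join(parts) + "$")


def site_share_by_pattern(rows, patterns):
    """rows: (site, query, count) tuples. Returns pattern -> {site: share}."""
    compiled = {p: pattern_to_regex(p) for p in patterns}
    totals = defaultdict(lambda: defaultdict(int))
    for site, query, count in rows:
        for pattern, regex in compiled.items():
            if regex.match(query):
                totals[pattern][site] += count
    return {
        pattern: {site: n / sum(sites.values()) for site, n in sites.items()}
        for pattern, sites in totals.items()
    }


# Made-up sites and counts for illustration only.
rows = [
    ("site_a", "things to do in austin", 30),
    ("site_b", "things to do in rome", 10),
    ("site_c", "cheap flights", 45),
    ("site_a", "cheap hotels", 15),
]
pattern_shares = site_share_by_pattern(rows, ["things to do in [x]", "cheap [x]"])
print(pattern_shares)
```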

The input data we used reflects referrals to these sites from a sample population of users who clicked on search engine result pages. Factors which will affect the number and type of search referrals a site received in this data include: how representative the sample is of the population of U.S. searchers as a whole, how much relevant content a site has for a given query pattern, and how well that content ranks in Google and other search engines.

If a travel website repeated this study with Concentrate using current competitive data, then uploaded additional search data for their own site including other metrics beyond search frequency (see our demo using Google Analytics), the results might reveal that “things to do in [x]” queries lead to high quality visits and their site has a chance at winning more searches for that pattern. Based on this information they might decide to make a move on TripAdvisor in that content category. Mark Jackson describes some strategies to apply within the travel sector in an article at Search Engine Watch: “Should Your SEO Strategy Target the Head or the Long Tail?” Using Concentrate, a travel website could streamline the process by downloading thousands of real queries for this pattern sent to their competitor:

Some queries in TripAdvisor pattern: “things to do in [x]”

long tail travel search pattern

Take Action: Some ideas for next steps




Analyticstime!

Chris Gemignani

If you struggle to legitimize analytics within your organization, you can’t touch this video for a powerful explanation of the impact of analytics:

MC Hammer at the AlwaysOn/STVP Summit at Stanford, “Music Artists Go Entrepreneurial.” Around minute 24:00.




Here at top-secret Juice headquarters, some major new products are in the works, and we want to promote them with Google’s revenue powerhouse (also known as Google AdWords). Thus, after three weeks of self-imposed AdWords boot camp, I have emerged with a few scrapes and burns, along with some tips that I wish I had been armed with since the beginning.

The natural place to start learning about Google AdWords is the official Help Center, an expansive and neatly categorized resource. But what happens if your inhuman schedule or dwindling coffee supplies don’t allow you the luxury of navigating through the help center hierarchy or sifting through its search results? While you might be able to maintain a semblance of a campaign without answering those lingering questions, you run a high risk of letting potential viewers slip away, never seeing your ad, and wasting money on high CPCs (cost-per-click).

You are hereby invited to learn from my mistakes. I am forgoing the usual basic topics in favor of questions whose answers are more time-consuming and tedious to find. It took me a few weeks to get comfortable with AdWords and figure out these answers myself, but it will only take you a few minutes!

Read on to learn the answers to:

  1. How creative should I be with my ad text?
  2. How do I find out what keywords my competitors are using?
  3. Why has Google’s heartless algorithm condemned my keyword as inactive?
  4. How do I get bolded words in my ad?
  5. What is dynamic keyword insertion, and how do I use it?
  6. What is the difference between a campaign and an ad group?
  7. What is the difference between keywords and placements?

1. How creative should I be with my ad text?

When I was but an AdWords newbie, I held the misconception that creative ads were all that I needed to pull in clicks. Pop psychologists might credit my right brain, starved for attention in the left brain’s home turf (programming! algorithms! programming these algorithms!), for seizing upon the opportunity to design some artistic and imaginative ad copy:


The “Viva la Revolucion” ad was my baby. But it turned out to have a face only a mother could love, as evidenced by the zero people who clicked on it. To the stunned disappointment of my right brain, Google AdWords is just as algorithm-fueled as any of Google’s other products. In fact, Google AdWords runs much like the ubiquitous search engine does, treating your keywords, ads, and landing page much as it treats the 1 trillion pages it crawls while looking for content.

2. How do I find out what keywords my competitors are using?

Google won’t tell you; it’s in their privacy policy. But services such as KeywordSpy will. KeywordSpy not only gives you lists of your competitors’ (and your potential) keywords, but also provides other metrics for each keyword, including ROI, price per click, and number of competitors.

3. Why has Google’s heartless algorithm condemned my keyword as inactive?

Sometimes, Google will refuse to show ads for certain keywords unless you pay an absurdly large CPC. The large CPC is meant to discourage you from following any of these bad habits:

  • You dumped a lot of unrelated (or weakly related) keywords into one gigantic ad group.
    Try making many smaller ad groups, each with its own tightly connected set of keywords. Ideally, every keyword in a given ad group is a synonym for all the other keywords in the ad group. This also helps tremendously with writing ads that use dynamic keyword insertion (see question #5), since forcing ads to accommodate keywords covering a wide range of topics and/or parts of speech makes the ads vague and unspecific. To find keywords that deserve synonym status, use Google Sets. It’s like a thesaurus on steroids.

  • Your keyword, ads, and landing page aren’t “relevant” enough to each other.
    All members of the Holy Trinity of content (keywords, ads, and landing page) need to draw from the same words to be considered related. Try making sure that they line up.

  • The cost per click you set for that keyword falls below the minimum.
    This is the nicer way of saying that you have to spend more money.

4. How do I get bolded words in my ad?

You can’t designate specific words to be bolded (or formatted in any way, for that matter). You can, however, make sure to include keywords (words the user types in that you have selected for your ads) in your ad title and/or body. Just as it bolds keywords in search results, Google bolds keywords in ads. Your keywords do not have to be exact matches with the words in your ad. In the example below, a search for the keyword phrase “report automation” produces an ad that not only bolds “report” and “automation,” but also their variants “reports” and “automating.”

5. What is dynamic keyword insertion, and how do I use it?

This technique (sometimes known as “wildcards”) is how eBay and Target can pull off “Buy _____ now” for every conceivable adjective-noun combination. It allows you to make the same ad apply to multiple keywords. The format is:

The word immediately following the colon (no spaces) indicates the word you want to be shown when the keyword is too long to fit in the ad. Since I chose that word to be “executive dashboards,” the ad prompted by a too-long keyword would look like this:

Here is the same ad with other keywords swapped in, thanks to dynamic keyword insertion:



You can tweak the capitalization of the keyword with Google’s guidance, in the form of this handy table and more.
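To make the fallback behavior concrete, here is a small Python simulation of the substitution. It is only a sketch of the rule described above, not Google’s implementation: it assumes the placeholder is written as {keyword:default text}, ignores the capitalization variants, and takes the character limit as a parameter (the default of 25 is my assumption for a headline).

```python
import re

# Assumed placeholder syntax: {keyword:default text}. Capitalization variants
# are ignored in this sketch.
PLACEHOLDER_RE = re.compile(r"\{keyword:([^}]*)\}", re.IGNORECASE)


def render_ad_line(template, keyword, max_length=25):
    """Insert keyword into template, or the default text if it won't fit."""
    def with_text(text):
        # A function replacement avoids backslash handling in the inserted text.
        return PLACEHOLDER_RE.sub(lambda match: text, template)

    candidate = with_text(keyword)
    if len(candidate) <= max_length:
        return candidate
    default = PLACEHOLDER_RE.search(template)
    return with_text(default.group(1)) if default else template


template = "{keyword:executive dashboards}"
print(render_ad_line(template, "KPI reports"))
print(render_ad_line(template, "interactive executive dashboard software"))
```

The first call fits within the limit, so the searched keyword is shown; the second is too long, so the default text after the colon is used instead.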

6. What is the difference between a campaign and an ad group?

A campaign is made up of one or more ad groups. Each campaign has one budget (i.e., $10/day) that is shared between all of its ad groups. Each ad group can be customized with different ad variations, keywords, placements, days and times the ad is shown, etc. Therefore, most modifying and experimenting happens on the ad group level.

7. What is the difference between keywords and placements?

Keywords produce what people usually think of when they think of Google AdWords. When a user performs a Google search for a keyword you have selected, your ad appears on the side (or top, if your budget is very generous) of the results page. Placements occur in the “content network,” which is made of individual sites that get paid to show Google ads. If you sign up for a lot of placements, you’ll get a lot of clicks—but only because of the sheer volume of people seeing your ad. In some ways, placements are less targeted than keywords because people who clicked on your ad in the content network aren’t actively searching, as they are when they find your ad through natural searches. There are two types of placements:

  1. Placements You Select
     Google’s Placement Tool allows you to browse a gigantic list of sites organized by topic. Any of these sites could have your ad on it. The Placement Tool will also suggest sites and break down your potential audience by demographic.

  2. Placements Google Selects
     Google will select sites in the content network based on information from your current campaign. These sites may make up the bulk of your impressions and clicks on the content network and in general (in other words, clicks from Google’s selected placements may outnumber both clicks from your selected placements and clicks from organic searches).

This list is by no means a comprehensive examination of AdWords, but at least now you can consider yourself three weeks wiser and three weeks closer to writing one that is.




Many analytical applications fail for a simple reason: they assume users know precisely what they need before they’ve begun the analysis. There are cases where this assumption holds and the user has a specific end-point in mind. But more often, users depend on the tool to track down an answer with only a vague idea of where to start. The exploratory analysis that follows can feel like swimming upstream when the application isn’t designed to facilitate the journey.

The source of this mismatch is partly rooted in the technical perspective of database developers. The simplest path to providing data access is to let users fill out a form to define a SQL query. It is a linear mindset that isn’t well-suited to ambiguous problems.

I’d like to offer a couple of examples that illustrate the difference between the common, form-based approach and a more dynamic, interactive approach. Then I’ll explain the implicit assumptions behind the different models and why it matters.


At its heart, Travelocity is a travel analysis tool intended to help you find the best flight (or hotel, car rental, package, etc.) given a complex set of parameters. The relative importance of each of these parameters (departure day/time, return day/time, airports, connections, preferred airlines, price, etc.) is a personal preference… but not one that is explicitly or fully known even to the user. For example, it would be hard for me to say exactly how much more I would pay for a non-stop flight or what is the relative value of a more convenient airport versus a more reliable airline. These preferences are hard to understand prior to seeing specific trade-offs.

Travelocity approaches this complex problem in the way that so many analytical applications do: it asks for all your preferences first, then offers a static list of results for the specified query.

Travelocity Results

A few things to note about this search results page:

  1. On a busy web page, “Change Your Search” is not emphasized.
  2. The “tracker” across the top shows a linear five-step process. The user is expected to flow through this sequence in order.
  3. Getting results for a new search takes more than ten seconds.

I’ve been a loyal Travelocity user for years, and I don’t want to imply that this site is poorly designed or difficult to use. The problem is more subtle than that.

By way of comparison, let’s take a look at a more recent entrant to the online travel business, Kayak. This site is designed with a different usage model in mind. Kayak starts by asking for the same information as Travelocity, but the results page is designed to support further analysis:

Kayak Results

The biggest difference is the prominent filtering functionality on the left side of the page. The filters allow users to narrow down their original search without leaving the results page (it takes less than a second to view refreshed results after changing a filter—no “run report” button required). In addition, Kayak places more emphasis on the start-over option. The designers of this site did not assume your first search would be enough to get you to the perfect flight option. Finally, notice the different “views” of the data that are available for a given result set. The views help support different types of decisions based on the same search parameters.


Analytical applications for business have similar underlying structures and usage models. The analysis process in Omniture SiteCatalyst, the leading web analytics platform for large sites, offers a typical example:

Omniture start page

This application offers lots of functionality, and it feels like featuring functionality is the primary purpose of the start page. If you want to get to useful data rather than view an advertisement for Omniture products and events, you can start by selecting the “Report Builder:”

Omniture form

Now, it is form-filling time. Like Travelocity, SiteCatalyst expects the user to choose the precise parameters before they get to see anything. The resulting report requires a 10-second wait, and the result is static. Any additional filtering will require you to run a new report.

Now let’s look at how Google Analytics chooses to structure the user experience:

Google Analytics dashboard

In contrast to SiteCatalyst, Google Analytics shows you results immediately—no defining or configuring a report before you can get started. Similar to Kayak, the application offers a bunch of options on the report results page to refine parameters (e.g. data ranges, metrics, comparisons).


Travelocity and Omniture make a few assumptions common to analytical applications:

  • Users can accurately define their need (i.e. they already know what they are looking for).
  • Users can precisely define their need (i.e. they know all the relevant parameters).
  • Users’ workflow will follow a linear sequence of events. Going back to the beginning is a failure of the process or user.

More effective analytical applications like Kayak and Google Analytics make different assumptions:

  • Users have a general question, but do not necessarily know details about what they’re looking for.
  • Users need to see results before they can ask better, more detailed questions. These feedback loops provide critical learning.
  • Users need to get to data as quickly and easily as possible. A screen without data is delayed progress.
  • Different views of the data can provide different insights about results.
  • Users want the application to keep up with their trains of thought. Speed and responsiveness matter. Here’s a framework from Jakob Nielsen’s blog about response time:

0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.

1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.

10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.
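If you want to hold your own application to these limits, the three thresholds translate directly into a small check. A Python sketch, where the band labels are my paraphrase of the quoted text rather than Nielsen’s wording:

```python
def response_time_band(seconds):
    """Classify a delay using the three limits quoted above."""
    if seconds <= 0.1:
        return "feels instantaneous"
    if seconds <= 1.0:
        return "flow of thought uninterrupted"
    if seconds <= 10.0:
        return "attention still on the dialogue"
    return "user likely to switch tasks; show progress feedback"


for delay in (0.05, 0.8, 10.0, 12.0):
    print(delay, "->", response_time_band(delay))
```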

In my experience, making the right assumptions about user behavior makes all the difference between an application people enjoy and depend on and an application people dread using.




Updated October 21, 2009

Yesterday, Google released an update to their popular Google Trends tool. There are improvements over the previous version, but the biggest new feature is a new shiny button that lets you download all your data in the format of a CSV. This is a very cool enhancement. Where Google Trends was a geeky toy, it now takes the leap to integrate into analysts’ reports and with that, edge its way onto managerial desks.

This Python module is a quasi-API to make it easier to authenticate into Google Trends for those who want to squeeze the extra level of functionality out of their data. The advantage of programmatic access is that the data can be automatically trended and merged. It can be snuck into a 9:00 AM daily email to the VP of Marketing so that she knows to ramp up Google AdWords campaigns for some specific keyword. Also, by programmatically pulling multiple reports, it is possible to create a wealth of data not visible in a single report. Using one keyword as a benchmark to merge multiple reports, we can do a meaningful comparison on tens or hundreds of relevant keywords.
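The benchmark trick works because each report is scaled relative to its own keywords. If the same keyword appears in every report, its value tells you how to rescale each report onto a common scale. Here is a minimal sketch of that idea; merge_on_benchmark is a hypothetical helper, not part of pyGTrends, and the numbers are made up.

```python
def merge_on_benchmark(reports, benchmark):
    """Rescale reports that share a benchmark keyword onto one common scale.

    reports is a list of dicts mapping keyword -> value, where each report
    uses its own relative scale but all of them include the benchmark.
    The first report's scale is kept as the reference.
    """
    reference = reports[0][benchmark]
    merged = {}
    for report in reports:
        if not report.get(benchmark):
            raise ValueError("benchmark missing or zero in a report")
        factor = reference / report[benchmark]
        for keyword, value in report.items():
            merged[keyword] = value * factor
    return merged


# Made-up numbers for illustration only.
report_a = {"bread": 1.0, "banana": 0.5, "bakery": 0.25}
report_b = {"bread": 2.0, "muffin": 1.0, "scone": 0.5}
merged = merge_on_benchmark([report_a, report_b], benchmark="bread")
print(merged)
```

After merging, “muffin” and “banana” can be compared directly even though they never appeared in the same report.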

To use pyGTrends, the quasi-Google-Trends-API, you can download the latest version from github.

Here is an example of the most basic report that you can pull down from Google Trends. The connector function needs authentication info, and download_report needs to be passed a list of keywords.

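from pyGTrends import pyGTrends

# Log in with your Google account credentials.
connector = pyGTrends('google username', 'google password')
# Request a report for the given keywords, then print the raw CSV text.
connector.download_report(('keyword1', 'keyword2'))
print(connector.csv())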

You can, however, use pyGTrends to get any slice of data that you can pull down from Google Trends. To see the exact parameters that you should use, go to Google Trends, and navigate to the specific sufficiently-narrow report that you are interested in. Then, right-click on the CSV download, and save the link location. The different parameters should be discernible from the link. The following code downloads a report for banana, bread, and bakery keywords from April 2008, originating from the magnificent nation of Austria, and scaled using fixed scaling (aka the second download link).

connector.download_report(('banana', 'bread', 'bakery'),
                          date='2008-4',
                          geo='AT',
                          scale=1)

By default, the csv() function downloads the main part of the report, but there are a few additional parts stuck to the bottom of the CSV file. If you are interested in those, pass the section parameter to the csv() function. The following will return the Language section.

print(connector.csv(section='Language'))

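Recommended usage is to parse the result with either csv.reader or csv.DictReader from the csv module.

from csv import DictReader

# Each row becomes a dict keyed by the first line of the CSV text.
for row in DictReader(connector.csv().split('\n')):
    print(row)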

Here is a snapshot from the new Google Trends to add some eye-candy to the post:
Google Trends Eye-Candy




Franken-measures

Sometimes a simple metric isn’t enough. It can’t fully describe a behavior or performance of a system. That’s when you need a Franken-measure: a made-up metric monster that creates a comprehensive composite to capture complex concepts.

Franken-measures go by many names—indexes, scales, ratings, composite or compound measures—and show up in all sorts of places:

Web analytics has an ongoing discussion about a measure of visitor engagement; the famous Google PageRank measures the “importance” of sites using a complex and mysterious algorithm.

Sports have embraced Franken-measures to evaluate player and team performance, e.g. passer ratings, Rating Percentage Index for college basketball, and judging of Olympic events like gymnastics, ski jumping, and ice dancing.

Economists love indexes, e.g. Consumer Price Index, Consumer Confidence Index, Gross Happiness Index.

Marketers use “scores” to simplify their lives, e.g. Q scores measure the familiarity and appeal of popular culture entities and credit scores judge your value as a human being.


Why would I want a Franken-measure?

You are probably already up to here with measures, so why would you want another one—much less one that is going to need extra effort and explanation? Here are a few things Franken-measures can offer:

A short-hand way to communicate about a complex concept. For example, a concept like customer loyalty may encompass everything from share-of-wallet to frequency of interactions to average sales amount.

A mechanism to operationalize a complex concept. Systems can take action on a single number more easily than an array of variables.

A definitive weighting of factors. Rather than constantly bickering about the relative importance of various measures, a Franken-measure can lock down the weighting, avoiding individual biases (in exchange for a systematic bias).

A balance of components. By combining multiple measures, variation in one measure doesn’t unduly bias the results.
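To make the weighting and balancing ideas concrete, here is a minimal sketch of a composite score. The component names, bounds, and weights are all hypothetical and chosen only for illustration; a real Franken-measure would need its own definitions.

```python
def composite_score(values, ranges, weights):
    """Combine several raw measures into one 0-100 score.

    values  -- raw measurement per component
    ranges  -- (low, high) expected bounds per component, used to rescale to 0..1
    weights -- relative importance per component (need not sum to 1)
    """
    total_weight = float(sum(weights.values()))
    score = 0.0
    for name, weight in weights.items():
        low, high = ranges[name]
        # Rescale to 0..1 and clamp so a single outlier cannot dominate the composite.
        scaled = (values[name] - low) / float(high - low)
        scaled = min(1.0, max(0.0, scaled))
        score += weight * scaled
    return 100.0 * score / total_weight


# Hypothetical customer-loyalty components (illustrative numbers only).
ranges = {"share_of_wallet": (0.0, 1.0),
          "visits_per_month": (0, 10),
          "avg_sale": (0, 200)}
weights = {"share_of_wallet": 0.5, "visits_per_month": 0.3, "avg_sale": 0.2}
customer = {"share_of_wallet": 0.4, "visits_per_month": 5, "avg_sale": 100}

print(composite_score(customer, ranges, weights))  # roughly 45 on the 0-100 scale
```

Locking the weights into one dictionary is the point: the argument about relative importance happens once, when the weights are set, rather than every time the numbers are reviewed.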


What does it take to design a useful Franken-measure?

Not all Franken-measures are effective at achieving these benefits. There are at least four elements that contribute to a good design: completeness, concision, measurability, and independence. These factors can be combined into the Franken-measure Effectiveness Index (FEI) using Juice’s proprietary weighting model.

Completeness. Modeling all relevant performance factors to provide a holistic measurement of the concept.

Concision. A calculation that is as simple and straightforward as possible, making it understandable and logical to users.

Measurability. Using direct performance data rather than relying too heavily on proxies or subjective measures. And from a practical perspective, if you can’t reliably gather valid data, the exercise is futile.

Independence. The components of the measure need to be independent so that variation in one component doesn’t directly drive another.
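One rough way to check the independence criterion is to look at how strongly the candidate components move together. The sketch below uses Pearson correlation as a stand-in for independence, which is an assumption on my part (correlation only catches linear relationships), and the component names, data, and 0.8 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(components, threshold=0.8):
    """Return (name_a, name_b, r) for every pair whose |r| meets the threshold."""
    flagged = []
    for a, b in combinations(sorted(components), 2):
        r = pearson(components[a], components[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical observations of three candidate components.
components = {
    "visits": [1, 2, 3, 4, 5],
    "page_views": [2, 4, 6, 8, 11],
    "satisfaction": [3, 1, 4, 1, 5],
}

for a, b, r in correlated_pairs(components):
    print("%s and %s move together (r = %.2f)" % (a, b, r))
```

With this made-up data only visits and page views are flagged, which suggests keeping one of them rather than counting the same behavior twice.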


What can go wrong?

Finally, here are a few of the pitfalls to avoid when setting out to create your perfect Franken-measure:

Complexity. A complex calculation can confuse and infuriate your audience because it is hard to understand what is driving performance and why the measure is moving. Leigh Steinberg, famous NFL agent, said of the NFL passer rating: “Other than one attorney in our office, I am unaware of a single human being who has the capacity to figure a quarterback rating.” The formula isn’t quite as impenetrable as that, but it isn’t for the faint of heart:

passer rating

Changing the baseline. There will be inevitable pressure to change the Franken-measure formula, which automatically invalidates historical performance.

In search of comprehensiveness. A desire to be comprehensive can hamstring the effort. Take Eric T. Peterson’s Engagement Model. He is clearly striving for completeness but at the risk of feasibility, in my opinion.

Eric T. Peterson’s engagement metric

Black box and credibility. For the people impacted by a Franken-measure, it is important to understand what is going on under the covers. And if it is impossible to share the algorithm or approach, credibility of the creator is all that remains. PageRank succeeds to the extent that people trust that Google has an objective, well-intentioned algorithm. A whiff of agenda or bias would undermine it in the eyes of the audience. Take the National Review’s “Liberal Rankings” which have managed to label the last two Democratic Presidential nominees as the “Most Liberal Senators.” Coincidences like that can undermine credibility.





Stephen Colbert has mentioned that he’s having trouble getting guests during the writers’ strike. We find this puzzling, given the supposed benefits of the Colbert Bump. Does being on the Colbert Report really provide a bump, a critical leap that vaults a writer or a politician to superstardom?

We know that Colbert isn’t a big fan of “facts,” and only needs his gut to tell him the Colbert Bump is real. At Juice, we let the data decide what’s real or not, so our apologies to Stephen for not taking his word for it. Intrigued, Juice Analytics set out to find out the truth. We gathered data about Amazon sales rank for 20 authors that appeared on his show in recent months. How did those ranks change in the days immediately before and after the authors’ appearance on the show?

Amazon Sales Rank of Colbert Guests

Hmmm, there might be something there but those sales ranks don’t tell us much. Fortunately for Stephen, some “eggheads” have worked out roughly how Amazon sales rank corresponds to actual book sales. We calculated the sales, and normalized the data so that the week prior to appearing on the Colbert Report was equal to 1.0. Here’s a picture.

Projected Sales of Colbert Guests
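For the curious, the normalization step can be sketched in a few lines. This assumes you already have a daily sales estimate for each author (the rank-to-sales conversion is not shown here) and that "the week prior" means the average of the seven days before the appearance; the function name and the sample numbers are hypothetical.

```python
def normalize_to_prior_week(daily_sales, appearance_index, window=7):
    """Scale a daily sales series so the average of the `window` days
    before the appearance equals 1.0."""
    if appearance_index < window:
        raise ValueError("need at least %d days before the appearance" % window)
    prior = daily_sales[appearance_index - window:appearance_index]
    baseline = sum(prior) / float(len(prior))
    return [sales / baseline for sales in daily_sales]


# Hypothetical series: a flat week, then the appearance on day index 7.
estimated_sales = [10, 10, 10, 10, 10, 10, 10, 50, 30, 20]
print(normalize_to_prior_week(estimated_sales, appearance_index=7))
```

Putting every author on the same 1.0 baseline is what lets a niche title and a bestseller share one chart.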

That looks like a bump, Conan. In fact, being on the Colbert Report increases sales by 10 times on average. That bump doesn’t last forever, but, let’s face it, what does?

We also wanted to know, what kinds of books is Colbert’s audience going crazy for? After all, Colbert is well known as a rock-solid conservative. He’s tight with the Bush Administration. Even though he debates a few liberal (“pinko”) authors now and then, most of his guests are writers of pop-intellectual studies of the Gladwellian persuasion.

Here are the authors and how we categorized them:

Pinkos: Jessica Valenti, Full Frontal Feminism: A Young Woman’s Guide to Why Feminism Matters; Wesley K. Clark, A Time to Lead: For Duty, Honor and Country; Robert Shrum, No Excuses: Concessions of a Serial Campaigner

‘Publicans: Tom DeLay, No Retreat, No Surrender: One American’s Fight

Pop Essayists: Daniel Gilbert, Stumbling on Happiness; Daniel B. Smith, Muses, Madmen, and Prophets: Rethinking the History, Science, and Meaning of Auditory Hallucination; Michael Gershon, The Second Brain: A Groundbreaking New Understanding of Nervous Disorders of the Stomach and Intestine; John J. Mearsheimer, The Israel Lobby and U.S. Foreign Policy; Thomas L. Friedman, The World Is Flat: A Brief History of the Twenty-first Century; Frank J. Sulloway, Born to Rebel: Birth Order, Family Dynamics, and Creative Lives; Jared Diamond, Guns, Germs, and Steel: The Fates of Human Societies; Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable; Richard Preston, The Wild Trees: A Story of Passion and Daring; Malcolm Gladwell, Blink: The Power of Thinking Without Thinking; Bjorn Lomborg, Cool It: The Skeptical Environmentalist’s Guide to Global Warming; Andrew Keen, The Cult of the Amateur: How Today’s Internet is Killing Our Culture; Michael Wallis, The Lincoln Highway: Coast to Coast from Times Square to the Golden Gate

Popular: Stephen Colbert, I Am America (And So Can You!); John Grisham, Playing For Pizza: A Novel; Tina Brown, The Diana Chronicles

How much of a bump did each of these groups receive?

Colbert Bump by Category of Guests

It’s a shock! Liberals and high-minded eggheads do better than popular or conservative books. I’m not sure if Colbert knows this, but his audience isn’t who he thinks they are.

Here are all the authors and their normalized sales around the time of their appearance on the Colbert Report.

Valenti
Clark
Shrum
DeLay
Gilbert
Smith
Gershon
Mearsheimer
Friedman
Sulloway
Diamond
Taleb
Preston
Gladwell
Lomborg
Keen
Wallis
Colbert
Grisham
Brown

This post was a collaborative effort of the entire Juice team. Pete Skomoroch concocted the idea, wrote copy, and found the study linking Amazon Sales Rank to actual sales. Zach data mined. David May whipped up elegant, instant visualizations. Sal Uryasev munged data.




The TV ratings system is broken. Everyone knows it, but nobody wants to admit it. Nielsen ratings struggle to accurately measure audience quantity (limited tracking of DVR usage and online viewers) and quality (are viewers engaged? are they skipping the ads?). However, admitting so would undermine the delicate balance TV networks share with their advertisers.

I caught an interesting segment on KCRW’s “The Business” podcast about TV series that find themselves on the “bubble,” i.e. at risk of getting canceled. The producer of CBS’s Jericho, “a post-apocalyptic drama starring Skeet Ulrich” (shouldn’t that description alone put it on the chopping block?), explained how they received a temporary stay of execution when their small but loyal audience protested network plans to cancel the show. The interview raised questions about the validity of Nielsen ratings and how a fervent online audience can bring additional perspective to the performance of a show.

All this talk of measurement gave me an itch to look at some real data. I tracked down the Nielsen audience size (Subscription required) for TV series over the 2006-2007 TV season. Then I pulled from comScore (a Juice client and leading source for data about Internet traffic and usage behaviors) the unique visitors and time spent on websites of TV shows over the same September to May time period.

I had a few questions I was curious about:

  1. Which shows have disproportionately larger internet audiences (an indicator of a loyal and rabid fan base)? Are there other shows like Jericho that struggle to build a large TV audience, but have a strong online following?
  2. Which TV show sites have the most engaged audiences?
  3. What TV networks have been most successful at building online traffic to their sites? Which types of shows spawn online audiences?

The table below shows the top 20 TV series by ratio of monthly unique website visitors to average TV viewership. This metric suggests an ability to get viewers to look for more content, whether it is additional video, information about the actors, or discussion boards. If Jericho’s 9.5 million TV viewers (tied for 48th overall) represents the proverbial bubble, there are eight other shows with bubble-level ratings that can also claim strong online support (highlighted in this list).

Ratings Table 1

I also wanted to get a sense as to the engagement of the online audience. Were people simply stopping by the website to check the TV schedule, or were they digging deep for more content? One measure that gets at this question is minutes per unique visitor. The top 20 websites are listed below. Interestingly, 12 of these sites are also found in the previous table. Jericho is one of four of the bad-Nielsen-ratings/strong-online-audience group that overlap with the table above. (NBC, if you are grousing about ratings for The Office, hopefully these numbers will make you feel a little better.)

Ratings Table 2
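Both rankings boil down to a ratio and a sort, followed by a check for which shows appear on both lists. Here is a minimal sketch of that arithmetic. The show names, field names, and numbers are placeholders, not the Nielsen or comScore figures behind the tables.

```python
def visitors_per_viewer(show):
    """Monthly unique website visitors per average TV viewer."""
    return show["monthly_unique_visitors"] / float(show["avg_tv_viewers"])


def minutes_per_visitor(show):
    """Total minutes spent on the site per unique visitor."""
    return show["total_minutes"] / float(show["monthly_unique_visitors"])


def top_by(shows, metric, top_n=20):
    """Names of the top_n shows ranked by metric(show), highest first."""
    ranked = sorted(shows, key=metric, reverse=True)
    return [show["name"] for show in ranked[:top_n]]


# Placeholder data for illustration only.
shows = [
    {"name": "Show A", "avg_tv_viewers": 20000000,
     "monthly_unique_visitors": 500000, "total_minutes": 1000000},
    {"name": "Show B", "avg_tv_viewers": 9000000,
     "monthly_unique_visitors": 900000, "total_minutes": 9000000},
    {"name": "Show C", "avg_tv_viewers": 12000000,
     "monthly_unique_visitors": 600000, "total_minutes": 600000},
]

by_reach = top_by(shows, visitors_per_viewer, top_n=2)
by_engagement = top_by(shows, minutes_per_visitor, top_n=2)
# Shows that make both lists have an online audience that is both large and engaged.
on_both = [name for name in by_reach if name in by_engagement]
print(on_both)
```

In this toy data, Show B has the smallest TV audience but tops both lists, which is the Jericho pattern in miniature.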

The final table addresses my third question about the TV networks and types of shows that are best at building an online audience. ABC has done more than twice as well as CBS in getting viewers online, which may be a reflection of the traditionally older CBS audience. Note: I pulled the top-end outliers (American Idol, So You Think You Can Dance, and Deal or No Deal) from the Network comparison.

The second half of the table brings those TV series back into the mix in the reality/contest category, and you can see the impact. I was surprised at the dearth of sitcoms on this list. It may be that a website for a sitcom doesn’t typically make sense.

Ratings Table 3

With all the money spent on TV advertising, I can only hope the networks go beyond the top-line Nielsen ratings to try to get a complete picture of their audiences.




Misaligned goals, distorted behaviors, and a misguided sense of success… no, I’m not referring to college graduates. I’m talking about the problems caused by using the wrong metrics in your organization. You’ve probably seen examples like tracking average customer profitability and losing perspective on the variance in profitability or evaluating customer service reps on calls handled without regard for the quality of the experience. I’d like to offer up a quick-bake recipe for choosing the right metric.

Step 1: Set the context

Metrics generally serve one of two purposes. Start by understanding what you are trying to achieve.

1. Identifying problems. Defining the right metrics in this case requires you to do a little detective work: What is the data residue of a problem? What evidence can be found and how exactly does it show up?

2. Measuring performance. The right success metrics need to focus on measures that can be controlled and where improvement in the number is unambiguously a good thing.

Step 2: Balance the four dimensions of a good metric

Metrics Framework

Lots of metrics fail in at least one of these dimensions. A few examples:

  • Common interpretation: We had a client who made a distinction between “leads” and “prospects” in their marketing organization. Prospects had theoretically expressed more interest in the service through their actions. Unfortunately the line between leads and prospects was always hard to decipher and the definitions were hard to communicate. On a related note, we got a kick out of Tom Davenport’s (author of “Competing on Analytics”) assertion that a company competing on analytics needs to “invent proprietary metrics for use in key business processes.” There is nothing inherently wrong with “invented proprietary metrics” but it sounds like something that is designed to confuse anyone outside of the inner sanctum.
  • Actionable: Metrics are frequently too broad for the impact that a particular group can have. Customer satisfaction is a popular dashboard staple, but it is hard for most managers to see how they can have a significant impact on the number.
  • Accessible, credible data: Sometimes the most valuable and obvious metrics are frustratingly hard to track. In the web analytics world, unique visitors is important to know, but user deletion of cookies has thrown a wrench into the works.
  • Transparent, simple calculation: Top NFL agent Leigh Steinberg says of the famous quarterback ratings metric: “Other than one attorney in our office, I am unaware of a single human being who has the capacity to figure a quarterback rating.” I don’t know what kind of art majors he hires, but all they need to do is use the simplified formula: (83.33 * Comp %) + (4.16667 * Yds per att) + (333.333 * TD pct) – (416.667 * INT pct) + 25/12.
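The simplified formula translates almost directly into code. This sketch reads the three percentages as fractions of attempts (0.60 rather than 60), which is my assumption and is the reading under which the constants work out. It also skips the per-component caps of the official rating, so it can drift from the official number for extreme stat lines. The function name and the sample stat line are made up.

```python
def simplified_passer_rating(completions, attempts, yards, touchdowns, interceptions):
    """Simplified passer rating from raw game totals.

    The percentages are computed as fractions of attempts. The caps that the
    official rating applies to each component are not applied here.
    """
    att = float(attempts)
    return (83.33 * (completions / att)
            + 4.16667 * (yards / att)
            + 333.333 * (touchdowns / att)
            - 416.667 * (interceptions / att)
            + 25.0 / 12)


# Hypothetical stat line: 20 of 30 for 250 yards, 2 touchdowns, 1 interception.
print(round(simplified_passer_rating(20, 30, 250, 2, 1), 1))
```

Five inputs and one line of arithmetic: not trivial to do in your head, but hardly a job for the firm's lone attorney.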

(Want a little validation of this framework? Avinash, respected web analytics guru, just published a post with “Four Attributes of Great Metrics” and he landed on a strikingly similar set of four: 1) instantly useful (i.e. actionable); 2) relevant (i.e. common interpretation); 3) timely (i.e. accessible); 4) uncomplex (i.e. transparent and simple).)

Step 3: Avoid the metrics bugaboos

Finally, here are a few traps that I’ve seen in deciding on appropriate metrics:

  • Trending and distributions: Don’t always try to compress a metric into a single number. Often it is more revealing to show the metric across time or as a distribution to uncover variance.
  • Edge cases: There will always be edge cases where a metric may not mean what you think it means. These situations are worth understanding, but you shouldn’t allow the perfect to be the enemy of the good.
  • Setting goals: Could you hold someone accountable for this metric without them throwing out a half-dozen reasons why it doesn’t make sense? It’s a decent test of the value of the metric.
  • Self-serving: Be careful that you don’t select metrics simply because you know they’ll make you look good.
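The first trap is easy to demonstrate. In the sketch below, two hypothetical customer segments have identical average profitability, yet only one of them contains money-losing customers. The numbers, the function name, and the bucket size are invented for illustration.

```python
from statistics import mean, pstdev


def summarize(values, bucket_size):
    """Return the mean, the spread, and a coarse histogram of the values."""
    counts = {}
    for value in values:
        # Floor each value to the lower edge of its bucket.
        bucket = (value // bucket_size) * bucket_size
        counts[bucket] = counts.get(bucket, 0) + 1
    return {
        "mean": mean(values),
        "stdev": pstdev(values),
        "histogram": dict(sorted(counts.items())),
    }


# Hypothetical profit per customer in two segments.
steady = [48, 50, 52, 50, 50]
mixed = [-50, 150, 0, 100, 50]

for label, segment in (("steady", steady), ("mixed", mixed)):
    print(label, summarize(segment, bucket_size=50))
```

Both segments report the same mean, so a single-number dashboard would treat them as equivalent. The spread and the histogram are what reveal the unprofitable customers hiding in the second one.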




