Twitter Analytics for “Analytics”

Twitter’s wild popularity hasn’t obscured the fact that the service needs to eventually make money. The concept of “Twitter analytics” as a revenue stream has come up often enough to make my ears itch and my nose burn.

Twitter’s new business development lead explains that the company is “developing a range of analytics and metrics products and services built around the information contained in tweets”…and “trying to figure out what are the appropriate metrics around engagement and how to convey those.”

Web Strategist Jeremiah Owyang raises the concept of a Twitter CRM solution, in which Twitter would offer brands its own analytics system to help them track and manage conversations.

The Twitter ecosystem has responded with a wide range of tools for analysis of Twitter data. Web analytics behemoth Omniture recently announced the integration of Twitter data into their platform. At the same time, web analytics consultant Eric T. Peterson has been vigorously marketing Twitalyzer, a tool to evaluate individuals’ use of Twitter and metrics of influence. Google’s Chrome Experiments released a cool visualization tool called Social Collider that reveals cross-connections between conversations on Twitter. Here are a few more Twitter analytics tools that I’ve run across:

Despite all the activity, I haven’t yet seen a solution that offers the kind of valuable analytics that a company could use to understand the Twitter conversation relevant to their business. The applications above either focus on measuring individual Twitter users or offer high-level tracking of words and phrases in the general conversation. They treat tweets as transactions: How many? How valuable? Who’s listening? Who’s responding?

To me, the great and more rewarding challenge in Twitter analytics is to synthesize the substance of those conversations. Imagine if you went to a party and could overhear everything that everyone else was saying. Who talked the most and who had the greatest audience is less interesting than what topics people were discussing and what was said.

I wanted to take a shot at this type of Twitter analytics.

Analysis Approach

First I had to define a particular domain or topic area. For expediency, I focused on all the tweets that included the word “analytics.” Using the Twitter search API, I pulled the first 500 tweets for each day in March and parsed the results to pull out users, urls, and other characteristics of the tweets.
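As a rough illustration of the parsing step, here is a minimal Python sketch that pulls users, hashtags, and URLs out of a tweet's text. It is not the code used for this analysis; the function name `parse_tweet` and the regular expressions are assumptions made for the example, and it works on text you have already retrieved rather than calling any API.

```python
import re

MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
URL_RE = re.compile(r"https?://\S+")

def parse_tweet(text):
    """Return the mentions, hashtags, and URLs found in one tweet's text."""
    urls = URL_RE.findall(text)
    # Remove URLs first so a '#fragment' or '@' inside a link is not miscounted.
    stripped = URL_RE.sub(" ", text)
    return {
        "mentions": [m.lower() for m in MENTION_RE.findall(stripped)],
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(stripped)],
        "urls": urls,
    }

if __name__ == "__main__":
    sample = "RT @Avinash: Great analytics post http://example.com/a#top #measure"
    print(parse_tweet(sample))
```

Lowercasing the names means “@Avinash” and “@avinash” are tallied as the same user later on.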

To analyze the words and phrases being used, I uploaded the resulting 11,300 tweets into Concentrate, our search analytics tool. Concentrate is optimized for search query text (i.e. short phrases without a lot of punctuation). Nevertheless, it has a number of features that make text analysis easier, including breaking out the most common words, phrases and patterns. It also allows for filtering by words to create frequency statistics.
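For readers without Concentrate, the same kind of breakdown can be approximated with a few lines of Python. This is only a stand-in sketch, not how Concentrate works internally; the names `tokenize` and `top_terms`, the tiny stop list, and the choice of two-word phrases are all assumptions for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "for", "in", "on", "is", "rt"}

def tokenize(text):
    """Lowercase a tweet and split it into words, dropping URLs first."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z0-9']+", text)

def top_terms(tweets, n=10, contains=None):
    """Count words and two-word phrases, optionally only in tweets containing a given word."""
    words, phrases = Counter(), Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        if contains and contains not in tokens:
            continue  # filter: skip tweets that do not mention the word
        words.update(t for t in tokens if t not in STOP_WORDS)
        phrases.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return words.most_common(n), phrases.most_common(n)

if __name__ == "__main__":
    sample = [
        "Google Analytics is great",
        "New Google Analytics feature http://x.co/1",
        "Omniture adds Twitter analytics",
    ]
    print(top_terms(sample, n=3))
```

Passing `contains="google"` restricts the counts to tweets that mention Google, which is one simple way to get the filtered frequency statistics described above.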

There were two main questions I wanted to address:

  1. What topics are people discussing?
  2. What is the structure of the conversation?

Topics of Conversation

The content of the Twitter conversation can be analyzed as words, sites/links, people/groups, and companies/products.


I used Concentrate to find the most common words, then I dumped those words into Many Eyes to create this “Wordle-brand” word cloud. Many Eyes has a nice feature that takes out the “common English words.” Clearly Google dominates the conversation, and I even had to artificially reduce its value to make the other words legible.

Word cloud

Below are the top 10 (non-common) words that show up in the analytics conversation:

Top words

Sites and Links

Twitter has become a mechanism for sharing interesting links (I’ll get to data on that in a bit). Looking at the most popular sites and specific links gives a sense for what people in this community are reading and talking about.

Top sites and links

People and Groups

Twitter users have a few conventions for connecting tweets to people or groups:

  • “#” (i.e. a hashtag) associates the message with a group, topic or event.
  • “RT” (or “via”) repeats or “retweets” something someone else has said.
  • “@” associates a tweet with another user, whether retweeting their message or directing a comment to them.

Here are the most common groups and people referenced in the Twitter data.

Top people and groups
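A tally like this one can be sketched in Python by counting every “#” and “@” reference across the tweets. The function name `top_references` is hypothetical, and counting each reference at most once per tweet is an assumption about how one might want to score it, not a description of how the table above was built.

```python
import re
from collections import Counter

# '#' or '@' followed by word characters, but not in the middle of a word
# (so an email address such as joe@example.com is not treated as a mention).
REFERENCE_RE = re.compile(r"(?<!\w)([#@]\w+)")

def top_references(tweets, n=10):
    """Count each #hashtag and @user at most once per tweet, case-insensitively."""
    counts = Counter()
    for tweet in tweets:
        text = re.sub(r"https?://\S+", " ", tweet)  # ignore '#' fragments in links
        counts.update({ref.lower() for ref in REFERENCE_RE.findall(text)})
    return counts.most_common(n)

if __name__ == "__main__":
    sample = [
        "RT @avinash: new post #measure",
        "@Avinash thanks! #measure #wa",
        "email me at joe@example.com",
    ]
    print(top_references(sample))
```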

And the people with the most tweets using the word “analytics”:

Top talkers

Companies and Products

I was also interested in which companies or products were referred to most frequently. It is no surprise that Google dominates the conversation. Microsoft gets on the board with the recent closing of their adCenter product. I think we can safely assume they won’t be showing up that often in the future.

Top companies

Conversation Structure

Beyond the specific content of the conversation, I was also curious about how people who are talking about analytics tend to use Twitter.

Types of Tweets

Eric T. Peterson has four things he considers “signal” (versus “noise”) in the Twitter conversation:

  • References to other people (defined by the use of “@” followed by text)
  • Links to URLs you can visit (defined by the use of “http://” followed by text)
  • Hashtags you can explore and participate with (defined by the use of “#” followed by text)
  • Retweets of other people, passing along information (defined by the use of “rt”, “r/t/”, “retweet” or “via”)

While I’m not fond of this definition, examining these different types of tweets (along with question-based tweets) provides a good lens into the nature of the conversation. The following chart shows the percentage of tweets that fall into each of those categories.

Tweet Types
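The category percentages can be approximated with simple pattern matching. The sketch below paraphrases the four definitions quoted above and adds a naive question check (any “?” outside a link). The names `tweet_types` and `type_percentages` and the exact patterns are assumptions for illustration, and a word like “via” will sometimes match ordinary prose.

```python
import re

LINK_RE = re.compile(r"https?://\S+")
TEXT_PATTERNS = {
    "reference": re.compile(r"(?<!\w)@\w+"),
    "hashtag": re.compile(r"(?<!\w)#\w+"),
    "retweet": re.compile(r"\b(?:rt|r/t|retweet|via)\b", re.IGNORECASE),
    "question": re.compile(r"\?"),
}
CATEGORIES = ["link", *TEXT_PATTERNS]

def tweet_types(text):
    """Return the set of categories a single tweet falls into."""
    # Check text patterns with links removed so '?' or '#' inside a URL is ignored.
    without_links = LINK_RE.sub(" ", text)
    found = {name for name, pat in TEXT_PATTERNS.items() if pat.search(without_links)}
    if LINK_RE.search(text):
        found.add("link")
    return found

def type_percentages(tweets):
    """Percentage of tweets in each category; categories can overlap."""
    counts = dict.fromkeys(CATEGORIES, 0)
    for tweet in tweets:
        for name in tweet_types(tweet):
            counts[name] += 1
    total = len(tweets) or 1  # avoid dividing by zero on an empty sample
    return {name: 100.0 * count / total for name, count in counts.items()}
```

Because one tweet can be a retweet, contain a link, and carry a hashtag all at once, the percentages are not expected to add up to 100.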

It would be all the more interesting if you could follow the types of tweets across time and compare against other topic areas. I suspect that the URL linking within Twitter is on the rise and is turning Twitter into a Delicious-style bookmark sharing service — without the functionality to save, tag, annotate, and view the bookmarks at your leisure.

Link Evolution

Given all the sharing of links, I wanted to get a clearer picture of what happens when a link becomes popular. The graphic below shows some of the top links during the month and the number of tweets that included each one by day. The red bars represent days where ten or more tweets included the link. A couple of links stayed popular for a week or so, but the rest sizzled and then disappeared within a day or two.

Link Evolution
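The data behind a chart like this is just a count of tweets per link per day, with the “red bar” days being those that reach the threshold of ten. Here is a minimal sketch assuming each tweet has already been reduced to a (day, list of URLs) pair; the function name `link_timeline` and the day labels are hypothetical.

```python
from collections import Counter, defaultdict

def link_timeline(tweets, hot_threshold=10):
    """Map each URL to its daily tweet counts and the days at or above the threshold.

    `tweets` is an iterable of (day, urls) pairs, where `day` is any sortable
    label such as "03-14" and `urls` lists the links found in one tweet.
    """
    per_link = defaultdict(Counter)
    for day, urls in tweets:
        for url in set(urls):  # count a link once per tweet
            per_link[url][day] += 1
    return {
        url: {
            "daily": dict(sorted(days.items())),
            "hot_days": sorted(d for d, n in days.items() if n >= hot_threshold),
        }
        for url, days in per_link.items()
    }
```

One caveat: the same page shared through different URL shorteners shows up as different strings, so a sketch like this will undercount unless the links are expanded first.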

Activity Distribution

Finally, I took a look at the distribution of users by the number of tweets including the word “analytics.” It was no surprise that the vast majority of the 7,700 twitterers only used the word once in March (of course this doesn’t tell us about their other twittering activity). Obviously there is a small population of people at the core of the discussion.

Activity Distribution
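The distribution itself is a count of counts: first tweets per user, then users per tweet total. A minimal sketch, assuming you have one author name per tweet (the function names are hypothetical):

```python
from collections import Counter

def activity_distribution(authors):
    """Given one author name per tweet, return {tweets_per_user: number_of_users}."""
    tweets_per_user = Counter(name.lower() for name in authors)
    return dict(sorted(Counter(tweets_per_user.values()).items()))

def share_of_single_tweeters(authors):
    """Fraction of users who appear exactly once in the sample."""
    distribution = activity_distribution(authors)
    users = sum(distribution.values())
    return distribution.get(1, 0) / users if users else 0.0

if __name__ == "__main__":
    sample = ["ann", "bob", "Ann", "cy", "dee", "ann"]
    print(activity_distribution(sample))
    print(share_of_single_tweeters(sample))
```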

While you’d have to go into more depth to answer detailed questions, there are a number of interesting take-aways for me, including:

  • “Analytics” means “web analytics”, not business intelligence or general reporting about sales, operations, or marketing.
  • Google Analytics is the star of the party. Of course, the fact that the brand name includes "analytics" is an advantage, but I didn’t see a giant "Juice" in the word cloud.
  • Twitter is an echo-chamber. The content clusters around particular subjects, with people retweeting and sharing links about the big news of the day. There are a dozen or so stories that dominated the conversation over this time period.

What’s next?

There are a lot more views of this data that could be enlightening for a company interested in having a real-time understanding of their marketplace. For example, it would be interesting to provide more insight into:

  • Who is at the center of these conversations?
  • What is the positive or negative tone of the discussion (Twitter actually offers this information as part of their API)?
  • How is the conversation changing over time?
  • What is the best way to define the boundaries of a domain-specific conversation?

These are the types of questions that I’d like to see addressed in a more complete Twitter analytics tool.