Twitter Analytics for "Analytics"


Twitter’s wild popularity hasn’t obscured the fact that the service needs to eventually make money. The concept of “Twitter analytics” as a revenue stream has come up often enough to make my ears itch and my nose burn.

Twitter’s new business development lead explains that the company is “developing a range of analytics and metrics products and services built around the information contained in tweets”…and “trying to figure out what are the appropriate metrics around engagement and how to convey those.”

Web Strategist Jeremiah Owyang raises the concept of a Twitter CRM solution, in which Twitter would offer its own analytics system to help brands track and manage conversations.

The Twitter ecosystem has responded with a wide range of tools for analysis of Twitter data. Web analytics behemoth Omniture recently announced the integration of Twitter data into their platform. At the same time, web analytics consultant Eric T. Peterson has been vigorously marketing Twitalyzer, a tool to evaluate individuals’ use of Twitter and metrics of influence. Google’s Chrome Experiments released a cool visualization tool called Social Collider that reveals cross-connections between conversations on Twitter. Here are a few more Twitter analytics tools that I’ve run across:

Despite all the activity, I haven’t yet seen a solution that offers the kind of valuable analytics that a company could use to understand the Twitter conversation relevant to their business. The applications above are either focused on the measurement of individual Twitter users or offer a high-level tracking of words and phrases in the general conversation. They treat tweets as transactions: How many? How valuable? Who’s listening? Who’s responding?

To me, the great and more rewarding challenge in Twitter analytics is to synthesize the substance of those conversations. Imagine if you went to a party and could overhear everything that everyone else was saying. Who talked the most and who had the greatest audience is less interesting than what topics people were discussing and what was said.

I wanted to take a shot at this type of Twitter analytics.


Analysis Approach

First I had to define a particular domain or topic area. For expediency, I focused on all the tweets that included the word “analytics.” Using the Twitter search API, I pulled the first 500 tweets for each day in March and parsed the results to pull out users, URLs, and other characteristics of the tweets.
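As a rough illustration of the parsing step, here is a minimal Python sketch that pulls mentioned users, hashtags, and URLs out of a tweet's text. The regular expressions and the `parse_tweet` name are assumptions for illustration; the post does not show the parsing rules that were actually used.

```python
import re

# Hypothetical patterns; the original parsing rules are not shown in the post.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def parse_tweet(text):
    """Pull mentioned users, hashtags, and URLs out of one tweet's text."""
    urls = URL_RE.findall(text)
    # Blank out URLs first so a '#fragment' or '@' inside a link is not
    # miscounted as a hashtag or a mention.
    stripped = URL_RE.sub(" ", text)
    return {
        "mentions": [m.lower() for m in MENTION_RE.findall(stripped)],
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(stripped)],
        "urls": urls,
    }
```

Lowercasing mentions and hashtags is a design choice so that “#Analytics” and “#analytics” are counted together later on.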

To analyze the words and phrases being used, I uploaded the resulting 11,300 tweets into Concentrate, our search analytics tool. Concentrate is optimized for search query text (i.e. short phrases without a lot of punctuation). Nevertheless, it has a number of features that make text analysis easier, including breaking out the most common words, phrases and patterns. It also allows for filtering by words to create frequency statistics.
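Concentrate's internals are not described here, but the basic idea of word and phrase frequency statistics can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the tokenizer, the tiny stop-word list, and the use of two-word phrases are all illustrative choices, not how Concentrate works.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "is", "on", "with", "rt"}


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def top_terms(tweets, n=10):
    """Return the n most common words (minus stop words) and two-word phrases."""
    words = Counter()
    phrases = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        words.update(t for t in tokens if t not in STOP_WORDS)
        # Adjacent token pairs stand in for "phrases" in this sketch.
        phrases.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return words.most_common(n), phrases.most_common(n)
```

Filtering by a word, as mentioned above, would amount to running `top_terms` on only the tweets whose tokens contain that word.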

There were two main questions I wanted to address:

  1. What topics are people discussing?
  2. What is the structure of the conversation?

Topics of Conversation

The content of the Twitter conversation can be analyzed as words, sites/links, people/groups, and company/products.

Words

I used Concentrate to find the most common words, then I dumped those words into Many Eyes to create this “Wordle-brand” word cloud. Many Eyes has a nice feature that takes out the “common English words.” Clearly Google dominates the conversation, and I even had to artificially reduce the value to make the other words legible.

Word cloud

Below are the top 10 (non-common) words that show up in the analytics conversation:

Top words

Twitter has become a mechanism for sharing interesting links (I’ll get to data on that in a bit). Looking at the most popular sites and specific links gives a sense for what people in this community are reading and talking about.

Top sites and links
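A ranking of popular sites can be approximated by grouping links by host name. The sketch below assumes the URLs have already been extracted from the tweets; `top_sites` is a hypothetical helper, not part of any tool mentioned here.

```python
from collections import Counter
from urllib.parse import urlparse


def top_sites(urls, n=10):
    """Count links by host, treating 'www.example.com' and 'example.com' as one site."""
    sites = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:  # skip strings that did not parse as a URL with a host
            sites[host] += 1
    return sites.most_common(n)
```

One caveat: links shared through a URL shortener would all be counted under the shortener's host unless they are expanded first.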

People and Groups

Twitter users have a few conventions for connecting tweets to people or groups:

  • “#” (i.e. hashtag) associates the message with a group, topic or event.
  • “RT” (or “via”) is to repeat or “retweet” something someone else has said.
  • “@” associates a tweet with another user, whether retweeting their message or directing a comment to them.

Here are the most common groups and people referenced in the Twitter data.

Top people and groups

And the people with the most tweets using the word “analytics”:

Top talkers

Companies and Products

I was also interested in what companies or products were referred to most frequently. It is no surprise that Google dominates the conversation. Microsoft gets on the board with the recent closing of their adCenter product. I think we can safely assume they won’t be showing up that often in the future.

Top companies


Conversation Structure

Beyond the specific content of the conversation, I was also curious about how people who are talking about analytics tend to use Twitter.

Types of Tweets

Eric T. Peterson has four things he considers “signal” (versus “noise”) in the Twitter conversation:

  • References to other people (defined by the use of “@” followed by text)
  • Links to URLs you can visit (defined by the use of “http://” followed by text)
  • Hashtags you can explore and participate with (defined by the use of “#” followed by text)
  • Retweets of other people, passing along information (defined by the use of “rt”, “r/t/”, “retweet” or “via”)

While I’m not fond of this definition, examining these different types of tweets (along with question-based tweets) provides a good lens into the nature of the conversation. The following chart shows the percentage of tweets that fall into each of those categories.

Tweet Types
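The four “signal” definitions quoted above, plus a simple question check, translate fairly directly into pattern matching. The Python below is a sketch under assumptions: the exact regular expressions are mine, a tweet may fall into several categories at once, and the link pattern also accepts “https://”.

```python
import re
from collections import Counter

LINK_RE = re.compile(r"https?://\S+")
# Assumed patterns, applied to the text with URLs removed.
OTHER_PATTERNS = {
    "reference": re.compile(r"@\w"),
    "hashtag": re.compile(r"#\w"),
    "retweet": re.compile(r"\b(?:rt|retweet|via)\b|r/t", re.IGNORECASE),
    "question": re.compile(r"\?"),
}


def tweet_types(text):
    """Return the set of categories a tweet falls into (possibly several)."""
    types = set()
    if LINK_RE.search(text):
        types.add("link")
    # Remove URLs so '?', '#', or '@' inside a link does not trigger a match.
    rest = LINK_RE.sub(" ", text)
    for name, pattern in OTHER_PATTERNS.items():
        if pattern.search(rest):
            types.add(name)
    return types


def type_percentages(tweets):
    """Percentage of tweets in each category; categories can overlap."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet_types(tweet))
    total = len(tweets)
    return {name: 100.0 * c / total for name, c in counts.items()} if total else {}
```

Because categories overlap, the percentages will generally add up to more than 100.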

It would be all the more interesting if you could follow the types of tweets across time and compare against other topic areas. I suspect that the URL linking within Twitter is on the rise and is turning Twitter into a Delicious-style bookmark sharing service — without the functionality to save, tag, annotate, and view the bookmarks at your leisure.

Given all the sharing of links, I wanted to get a clearer picture of what happens when a link becomes popular. The graphic below shows some of the top links during the month and how many tweets included each of them per day. The red bars represent days where ten or more tweets included the link. A couple of links demonstrated popularity over a week or so, but the rest sizzled then disappeared in a day or two.

Link Evolution
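The data behind a chart like this is a count of tweets per link per day, with the ten-tweet threshold marking the highlighted days. Here is a minimal Python sketch, assuming the input is a list of hypothetical `(day, url)` pairs with one pair per link occurrence in a tweet.

```python
from collections import Counter, defaultdict


def link_counts_by_day(records):
    """Build {url: Counter({day: tweets_including_url})} from (day, url) pairs."""
    counts = defaultdict(Counter)
    for day, url in records:
        counts[url][day] += 1
    return counts


def hot_days(counts, threshold=10):
    """For each link, list the days on which it appeared in `threshold` or more tweets."""
    return {
        url: sorted(day for day, c in days.items() if c >= threshold)
        for url, days in counts.items()
    }
```

Days are assumed to be ISO-style date strings such as "2009-03-02", so sorting them as text also sorts them chronologically.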

Activity Distribution

Finally, I took a look at the distribution of users by the number of tweets including the word “analytics.” It was no surprise that the vast majority of the 7,700 twitterers only used the word once in March (of course this doesn’t tell us about their other twittering activity). Obviously there is a small population of people at the core of the discussion.

Activity Distribution
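A distribution like this comes from counting twice: first tweets per user, then users per tweet count. The Python sketch below assumes a plain list of author names, one entry per tweet; `activity_distribution` is a hypothetical helper name.

```python
from collections import Counter


def activity_distribution(authors):
    """Given one author name per tweet, return {tweets_per_user: number_of_users}."""
    tweets_per_user = Counter(authors)
    users_per_count = Counter(tweets_per_user.values())
    return dict(sorted(users_per_count.items()))
```

In a result shaped like this, the entry for key 1 is the count of people who used the word only once.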


While you'd have to go into more depth to answer detailed questions, there are a number of interesting take-aways for me, including:

  • “Analytics” means “web analytics”, not business intelligence or general reporting about sales, operations, or marketing.
  • Google Analytics is the star of the party. Of course, the fact that the brand name includes "analytics" is an advantage, but I didn't see a giant "Juice" in the word cloud.
  • Twitter is an echo-chamber. The content clusters around particular subjects, with people retweeting and sharing links about the big news of the day. There are a dozen or so stories that dominated the conversation over this time period.

What’s next?

There are a lot more views of this data that could be enlightening for a company interested in having a real-time understanding of their marketplace. For example, it would be interesting to provide more insight into:

  • Who is at the center of these conversations?
  • What is the positive or negative tone of the discussion (Twitter actually offers this information as part of their API)?
  • How is the conversation changing over time?
  • What is the best way to define the boundaries of a domain-specific conversation?

These are the types of questions that I’d like to see addressed in a more complete Twitter analytics tool.


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. All source code is released under a BSD License unless otherwise specified.

16 comments (only the last 5 are shown)


April 30, 2009
Achbar Jones said:

Have you seen tmitter.com? It's great, simple and free!


July 31, 2009
jeffrey Greenberg said:

you might look at the alpha http://www.tweettronics.com which integrates brand monitoring with influence measures for Twitter.


August 4, 2009
James Beamish-White said:

You might want to take a look at http://www.twilitics.com, for tracking actual interest in links posted.


October 26, 2009
Satya Prakash said:

Interesting data


September 2, 2010
Chris Henry said:

These are all great sites, that are doing really cool stuff with data. One important gap seems to be simple measurement of clickthrough on Tweets. I built http://140ctr.com, hoping to fill that gap. Twitter users just fill in their username, and their CTR will be calculated.






Visitors Guide to the Juice Blog and Website

With almost 300 blog posts and dozens of free tools and demos, we thought it would be useful to offer some of the highlights from the Juice blog and website.

Our Views on Analytics and Communicating Data


Information Experiences™, Dashboards and Metrics


Demos


Analytics Tools (Free stuff!)

Visualization

Web analytics

Excel and charting

Mapping


Excel Tricks


Just for Fun







US Economic Census Treemap

Now that I’ve got treemaps on the brain, I keep noticing how many things could be better understood using this visualization technique. A few examples:

treemap ideas

We thought it would be a nice demonstration to use data from the 1997 and 2002 US Economic Census (unfortunately 2007 isn't out yet) to see what kind of stories bubble forth. The demonstration was built using a component from JuiceKit™, our recently open-sourced Software Development Kit (SDK) for building Information Experience™ applications. The SDK can be used by web designers and developers to build graphically rich and interactive information displays. JuiceKit currently integrates with Adobe Flex to create components that are easy to implement and aesthetically pleasing.

Check out the treemap here.

US Economic Census Treemap

Here are a few of the macro-trends that I found:

  • The rise of Costco, Amazon, and Home Depot: This time period saw strong growth in warehouse clubs and superstores, online retailers (“electronic shopping”), and home centers.
  • From manufacturing to services economy: Most of the growth was in service sectors (financial services, healthcare, professional services) while manufacturing was shrinking.
  • Productivity gains, even in adversity: For struggling sectors, the employee declines almost always outpaced the sales declines — squeezing more sales per employee.
  • Demographic shifts: Homes and services for the elderly were among the strongest areas of growth in the category of “healthcare and social assistance.”
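The productivity point in the list above is simple arithmetic: if employment falls faster than sales, sales per employee rises. A small Python sketch makes the relationship concrete. The figures in the usage note are hypothetical, not taken from the census data.

```python
def sales_per_employee_change(sales_change, employee_change):
    """Fractional change in sales per employee, given fractional changes in
    sales and in employees (for example, -0.10 for a 10% decline)."""
    return (1 + sales_change) / (1 + employee_change) - 1
```

For a hypothetical sector where sales fall 10% and employment falls 20%, `sales_per_employee_change(-0.10, -0.20)` comes out to about 0.125, a 12.5% rise in sales per employee.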

And there were lots of little insights as well:

  • No wonder hospital TV shows are so popular: Hospitals are the largest single employer as a business-type.
  • Starbucks and Krispy Kreme steal the unhealthy food dollar: Cookies and frozen yogurt retail saw a rapid decline while coffee and donut shops flourished.
  • Goodbye stand-alone pump: Gas stations with convenience stores overtook the just-plain gas station.
  • It can’t last, can it?: Mortgage broker payroll up 177%.

Once you understand how to read treemaps, they are great for exploring data like this: hierarchical with both quantity and quality-type measures. In a true testament to their power, my wife admitted this visualization was “kinda interesting.”


1 comment


March 25, 2009
Travis said:

A small question about the presentation, or maybe the data: regardless of the metric chosen (establishments, sales, employees or payroll), the data points are shown in dollars. I would have thought establishments and employees were just numbers of each. Or has the census monetized them in some way?

Thanks. (And your wife is right: this is kinda interesting.)
