Twitter Analytics for "Analytics"
By Zach Gemignani
March 27, 2009
Find more about: twitter, analytics, twitter analytics
Twitter’s wild popularity hasn’t obscured the fact that the service needs to eventually make money. The concept of “Twitter analytics” as a revenue stream has come up often enough to make my ears itch and my nose burn.
Twitter’s new business development lead explains that the company is “developing a range of analytics and metrics products and services built around the information contained in tweets”…and “trying to figure out what are the appropriate metrics around engagement and how to convey those.”
Web Strategist Jeremiah Owyang raises the concept of a Twitter CRM solution, in which Twitter would offer brands its own analytics system to help them track and manage conversations.
The Twitter ecosystem has responded with a wide range of tools for analysis of Twitter data. Web analytics behemoth Omniture recently announced the integration of Twitter data into their platform. At the same time, web analytics consultant Eric T. Peterson has been vigorously marketing Twitalyzer, a tool to evaluate individuals’ use of Twitter and metrics of influence. Google’s Chrome Experiments released a cool visualization tool called Social Collider that reveals cross-connections between conversations on Twitter. Here are a few more Twitter analytics tools that I’ve run across:
Despite all the activity, I haven’t yet seen a solution that offers the kind of valuable analytics that a company could use to understand the Twitter conversation relevant to their business. The applications above are either focused on the measurement of individual Twitter users or offer a high-level tracking of words and phrases in the general conversation. They treat tweets as transactions: How many? How valuable? Who’s listening? Who’s responding?
To me, the great and more rewarding challenge in Twitter analytics is to synthesize the substance of those conversations. Imagine if you went to a party and could overhear everything that everyone else was saying. Who talked the most and who had the greatest audience is less interesting than what topics people were discussing and what was said.
I wanted to take a shot at this type of Twitter analytics.
Analysis Approach
First I had to define a particular domain or topic area. For expediency, I focused on all the tweets that included the word “analytics.” Using the Twitter search API, I pulled the first 500 tweets for each day in March and parsed the results to pull out users, urls, and other characteristics of the tweets.
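For readers who want to try the parsing step themselves, here is a minimal sketch of pulling users, hashtags, and URLs out of tweet text. The regular expressions, the `parse_tweet` helper, and the sample tweet are my own illustrative assumptions, not the code used for this analysis, and the step of fetching tweets from the search API is left out entirely.

```python
import re

# Illustrative patterns (assumptions): real tweets have more edge cases.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def parse_tweet(text):
    """Return the mentions, hashtags, and URLs found in one tweet's text."""
    # Trim punctuation that commonly trails a link at the end of a sentence.
    urls = [u.rstrip(".,;:!?)") for u in URL_RE.findall(text)]
    # Remove links before looking for "@" and "#" so URL fragments are ignored.
    rest = URL_RE.sub(" ", text)
    return {
        "mentions": MENTION_RE.findall(rest),
        "hashtags": HASHTAG_RE.findall(rest),
        "urls": urls,
    }


# Hypothetical sample tweet, made up for this example.
sample = "RT @example_user: New Google Analytics features http://example.com/ga. #measure"
parsed = parse_tweet(sample)
print(parsed)
```

Stripping the links before searching for mentions and hashtags is a small design choice that keeps something like a "#section" anchor inside a URL from being counted as a hashtag.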
To analyze the words and phrases being used, I uploaded the resulting 11,300 tweets into Concentrate, our search analytics tool. Concentrate is optimized for search query text (i.e. short phrases without a lot of punctuation). Nevertheless, it has a number of features that make text analysis easier, including breaking out the most common words, phrases and patterns. It also allows for filtering by words to create frequency statistics.
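If you don't have Concentrate handy, the basic word-frequency idea can be approximated in a few lines. This is only a rough stand-in under my own assumptions: the tiny stop-word list, the `top_words` helper, and the sample tweets are hypothetical, and it does none of the phrase or pattern detection that Concentrate offers.

```python
import re
from collections import Counter

# A tiny stand-in stop list (assumption); a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "is", "on", "with"}


def top_words(tweets, n=10):
    """Count non-common words across tweets and return the n most frequent."""
    counts = Counter()
    for text in tweets:
        # Lowercase, then treat runs of letters, digits, and apostrophes as words.
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Hypothetical sample tweets, made up for this example.
tweets = [
    "Google Analytics is the star of the party",
    "New features in Google Analytics",
    "Web analytics for the win",
]
top = top_words(tweets, 3)
print(top)
```

Note that this simple tokenizer would also chop URLs into pieces, so in practice you would want to strip links before counting.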
There were two main questions I wanted to address:
- What topics are people discussing?
- What is the structure of the conversation?
Topics of Conversation
The content of the Twitter conversation can be analyzed as words, sites/links, people/groups, and company/products.
Words
I used Concentrate to find the most common words, then I dumped those words into Many Eyes to create this “Wordle-brand” word cloud. Many Eyes has a nice feature that takes out the “common English words.” Clearly Google dominates the conversation, and I even had to artificially reduce its value to make the other words legible.

Below are the top 10 (non-common) words that show up in the analytics conversation

Sites and Links
Twitter has become a mechanism for sharing interesting links (I’ll get to data on that in a bit). Looking at the most popular sites and specific links gives a sense for what people in this community are reading and talking about.

People and Groups
Twitter users have a few conventions for connecting tweets to people or groups:
- “#” (i.e. hashtag) associates the message with a group, topic or event.
- “RT” (or “via”) is to repeat or “retweet” something someone else has said.
- “@” associates a tweet with another user, whether retweeting their message or directing a comment to them.
Here are the most common groups and people referenced in the Twitter data.

And the people with the most tweets using the word “analytics”

Companies and Products
I was also interested in what companies or products were referred to most frequently. It is no surprise that Google dominates the conversation. Microsoft gets on the board with the recent closing of their adCenter product. I think we can safely assume they won’t be showing up that often in the future.

Conversation Structure
Beyond the specific content of the conversation, I was also curious about how people who are talking about analytics tend to use Twitter.
Types of Tweets
Eric T. Peterson has four things he considers “signal” (versus “noise”) in the Twitter conversation:
- References to other people (defined by the use of “@” followed by text)
- Links to URLs you can visit (defined by the use of “http://” followed by text)
- Hashtags you can explore and participate with (defined by the use of “#” followed by text)
- Retweets of other people, passing along information (defined by the use of “rt”, “r/t/”, “retweet” or “via”)
While I’m not fond of this definition, examining these different types of tweets (along with question-based tweets) provides a good lens into the nature of the conversation. The following chart shows the percentage of tweets that fall into each of those categories.
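To make the category counts concrete, here is a rough sketch of how the shares could be computed. The patterns loosely follow the four definitions listed above plus a simple question-mark test; the exact rules, the helper names, and the sample tweets are my assumptions rather than the logic behind the chart.

```python
import re
from collections import Counter

LINK_RE = re.compile(r"https?://\S+")

# Patterns applied to the tweet text once links have been removed (assumptions).
TEXT_PATTERNS = {
    "reference": re.compile(r"@\w+"),
    "hashtag": re.compile(r"#\w+"),
    "retweet": re.compile(r"\b(?:rt|r/t|retweet|via)\b", re.IGNORECASE),
    "question": re.compile(r"\?"),
}
CATEGORY_NAMES = ["reference", "link", "hashtag", "retweet", "question"]


def categorize(text):
    """Return the set of categories one tweet falls into (possibly several)."""
    found = set()
    if LINK_RE.search(text):
        found.add("link")
    # Strip links first so "#", "?", or "via" inside a URL is not miscounted.
    rest = LINK_RE.sub(" ", text)
    for name, pattern in TEXT_PATTERNS.items():
        if pattern.search(rest):
            found.add(name)
    return found


def category_shares(tweets):
    """Percentage of tweets in each category; a tweet can count in several."""
    totals = Counter()
    for text in tweets:
        totals.update(categorize(text))
    n = len(tweets)
    return {name: (100.0 * totals[name] / n if n else 0.0) for name in CATEGORY_NAMES}


# Hypothetical sample tweets, made up for this example.
tweets = [
    "RT @someone: great post on analytics http://example.com/a",
    "Anyone tried the new analytics dashboard?",
    "Analytics talk today #measure",
    "Reading about web analytics",
]
shares = category_shares(tweets)
print(shares)
```

Because one tweet can be a retweet, contain a link, and mention a user all at once, the percentages are not expected to add up to 100.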

It would be all the more interesting if you could follow the types of tweets across time and compare against other topic areas. I suspect that the URL linking within Twitter is on the rise and is turning Twitter into a Delicious-style bookmark sharing service — without the functionality to save, tag, annotate, and view the bookmarks at your leisure.
Link Evolution
Given all the sharing of links, I wanted to get a clearer picture of what happens when a link becomes popular. The graphic below shows some of the top links during the month and the amount they showed up in tweets by day. The red bars represent days where ten or more tweets included the link. A couple links demonstrated popularity over a week or so, but the rest sizzled then disappeared in a day or two.
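The tally behind a chart like this is simple to sketch: count how often each link appears per day, then flag the days that reach the ten-tweet threshold. The function names and the sample records below are hypothetical, and the sketch assumes shortened URLs have already been expanded to their final form.

```python
from collections import Counter, defaultdict

HOT_THRESHOLD = 10  # days with ten or more tweets for a link get flagged


def link_counts_by_day(records):
    """records: iterable of (day, url) pairs, one per link occurrence in a tweet."""
    counts = defaultdict(Counter)
    for day, url in records:
        counts[url][day] += 1
    return counts


def hot_days(counts, threshold=HOT_THRESHOLD):
    """For each link, list the days on which it reached the threshold."""
    return {
        url: sorted(day for day, n in per_day.items() if n >= threshold)
        for url, per_day in counts.items()
    }


# Hypothetical records, made up for this example.
records = (
    [("2009-03-10", "http://example.com/a")] * 12
    + [("2009-03-11", "http://example.com/a")] * 3
    + [("2009-03-10", "http://example.com/b")] * 2
)
counts = link_counts_by_day(records)
hot = hot_days(counts)
print(hot)
```

A link that "sizzles then disappears" would show up here as one or two flagged days, while a link with staying power would have a longer list.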

Activity Distribution
Finally, I took a look at the distribution of users by the number of tweets including the word “analytics.” It was no surprise that the vast majority of the 7,700 twitterers only used the word once in March (of course this doesn’t tell us about their other twittering activity). Obviously there is a small population of people at the core of the discussion.
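For anyone repeating this on their own data, the distribution boils down to two rounds of counting: tweets per user, then users per tweet count. The `activity_distribution` helper and the usernames below are hypothetical, a sketch rather than the code behind the chart.

```python
from collections import Counter


def activity_distribution(authors):
    """authors: one username per tweet. Returns {tweet_count: number_of_users}."""
    tweets_per_user = Counter(authors)
    users_per_count = Counter(tweets_per_user.values())
    return dict(sorted(users_per_count.items()))


# Hypothetical authors, one entry per tweet, made up for this example.
authors = ["ann", "bob", "ann", "cat", "dan", "ann"]
dist = activity_distribution(authors)
single_share = dist.get(1, 0) / len(set(authors))
print(dist, single_share)
```

In this toy data three of the four users tweeted only once, which is the same long-tail shape described above on a much smaller scale.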

While you'd have to go into more depth to answer detailed questions, there are a number of interesting take-aways for me, including:
- “Analytics” means “web analytics”, not business intelligence or general reporting about sales, operations, or marketing.
- Google Analytics is the star of the party. Of course, the fact that the brand name includes "analytics" is an advantage, but I didn't see a giant "Juice" in the word cloud.
- Twitter is an echo-chamber. The content clusters around particular subjects, with people retweeting and sharing links about the big news of the day. There are a dozen or so stories that dominated the conversation over this time period.
What’s next?
There are a lot more views of this data that could be enlightening for a company interested in having a real-time understanding of their marketplace. For example, it would be interesting to provide more insight into:
- Who is at the center of these conversations?
- What is the positive or negative tone of the discussion (Twitter actually offers this information as part of their API)?
- How is the conversation changing over time?
- What is the best way to define the boundaries of a domain-specific conversation?
These are the types of questions that I’d like to see addressed in a more complete Twitter analytics tool.
Visitors Guide to the Juice Blog and Website
By Zach Gemignani
March 23, 2009
With almost 300 blog posts and dozens of free tools and demos, we thought it would be useful to offer some of the highlights from the Juice blog and website.
Our Views on Analytics and Communicating Data
- A perspective on business intelligence
- The last mile problem
- “Business intelligence isn’t a technical problem”
- Why analytical applications fail
Information Experiences™, Dashboards and Metrics
- Designing Information Experiences™
- Features of successful real-time dashboards
- Choosing the right metric
Demos
- Stimulus Bill Explorer (introductory post)
- JuiceKit™ Economic Census Treemap (introductory post)
- Examples of JuiceKit™ visualizations
- Contact us to see examples of our client work
Analytics Tools (Free stuff!)
Visualization
Web analytics
- Concentrate™, a long-tail search analytics solution (and how it works)
- Enhanced Google Analytics Firefox add-on (introductory post)
- Google Trends API
- Google Analytics API
Excel and charting
- Chart Chooser™ (a fan favorite)
- Dynamic Excel reporting framework
- Chart Cleaner
Mapping
Excel Tricks
- Lightweight data exploration in Excel and More on in-cell Excel graphing
- Exploring data in Excel with conditional formatting
- Essential Excel skills and Excel training worksheet
- Why make 100 charts when one will do?
Just for Fun
US Economic Census Treemap
By Zach Gemignani
March 20, 2009
Find more about: treemap, visualization
Now that I’ve got treemaps on the brain, I keep noticing how many things could be better understood using this visualization technique. A few examples:

We thought it would be a nice demonstration to use data from the 1997 and 2002 US Economic Census (unfortunately 2007 isn't out yet) to see what kind of stories bubble forth. The demonstration was built using a component from JuiceKit™, our recently open sourced Software Development Kit (SDK) for building Information Experience™ applications. The SDK can be used by web designers and developers to build graphically rich and interactive information displays. JuiceKit currently integrates with Adobe Flex to create components that are easy to implement and aesthetically pleasing.
Check out the treemap here.
Here are a few of the macro-trends that I found:
- The rise of Costco, Amazon, and Home Depot: This time period saw strong growth in warehouse clubs and superstores, online retailers (“electronic shopping”), and home centers.
- From manufacturing to services economy: Most of the growth was in service sectors (financial services, healthcare, professional services) while manufacturing was shrinking.
- Productivity gains, even in adversity: For struggling sectors, the employee declines almost always outpaced the sales declines — squeezing more sales per employee.
- Demographic shifts: Homes and services for the elderly were among the strongest areas of growth in the category of “healthcare and social assistance.”
And there were lots of little insights as well:
- No wonder hospital TV shows are so popular: Hospitals are the largest single employer as a business-type.
- Starbucks and Krispy Kreme steal the unhealthy food dollar: Cookies and frozen yogurt retail saw a rapid decline while coffee and donut shops flourished.
- Goodbye stand-alone pump: Gas stations with convenience stores overtook the just-plain gas station.
- It can’t last, can it?: Mortgage broker payroll up 177%.
Once you understand how to read treemaps, they are great for exploring data like this: hierarchical with both quantity and quality-type measures. In a true testament to their power, my wife admitted this visualization was “kinda interesting.”
1 comment
Travis said:
A small question about the presentation, or maybe the data: regardless of the metric chosen (establishments, sales, employees or payroll), the data points are shown in dollars. I would have thought establishments and employees were just numbers of each. Or has the census monetized them in some way?
Thanks. (And your wife is right: this is kinda interesting.)






16 comments
Jamie said:
Excellent post thanks! For your link analysis how do you deal with short urls like tinyurl.com?
--
Jamie
Zach said:
We used the API from http://www.longurlplease.com/ . A surprising number of tinyurls couldn't be converted back. Public service: please don't use a tinyurl if you don't have to.
WhizGidget said:
Excellent, and timely, article on Twitter analytics. Tone would be a very interesting slice of the data to look at, as well as how many use Twitter as a substitute instant messaging platform - that is to say, how many back and forth conversations occur between two users, or who has the highest response mechanism.
If a business is going to use Twitter to communicate to the masses, especially using it for customer service then a good question to ask of the data is this: How many tweets are sent to this company as compared to how many responses are made - and are responses made to individuals from the company or are they simply general messages to the public about the state of the business. Yes, that's more on the transactional side than the quality of content side of analysis, but it's important when you're ranking how well a customer communicates using Twitter.
Liviu Taloi said:
Hi, useful post. How did you make those graphics? Did you do the analysis by hand or with some sort of tool? I mean for pictures 2 - 10. Thanks.
Liviu Taloi said:
I just saw it at the beginning of your article. Nice product.
Zach said:
Liviu, The charts were made in Apple Keynote -- except for the "link evolution" chart which was done in Excel.
Jacob said:
TwitterAnalyzer is the best, one of the most advanced analytic systems ever, all the data you see here can be found in http://Twitteranalyzer.com and much more. words analysis, links, friends location on Google maps, RT analyzer, best friends, disregarded friends, all published pictures,how many twitter users have been exposed to your messages(through RT), and one of the most powerful yet (for my taste) is a real time estimation of the number of your followers currently on twitter.com. very powerful, enjoy the Google Analytics for Twitter!!!
ian farmer said:
Invaluable information. I have commented on other sites that the people making the money in the gold rush were the people selling the picks and shovels, not the gold miners. A survey of "business outcomes" from Twitter adoption in your marketing mix would be great. So many people just "do not get it", insisting Twitter is a fad. We are generating leads, which are turning into business. Thank you for sharing this information.
Zach said:
Jacob, I think you may have missed my point (or may not care): TwitterAnalyzer, Twitalyzer and other such products focus on analytics for the individual, like GA offers for a site. I'm more interested in synthesizing the larger conversation.
Eric T. Peterson said:
Guys,
Regarding synthesizing the larger conversation ... have you had a look at Twitalyzer BRAND (http://brand.twitalyzer.com) yet? It is a new app from Twitalyzer that does exactly what you are talking about ... getting "bigger picture" in Twitter.
Anyway, love your feedback and would love to pick up the conversation about Twitalyzer + Juice now that the APIs are ready. Let me know when you have bandwidth.
Thanks!
E.
Peter Isaksson said:
I think you're on to something here. My interest is in how to analyze the value of Twitter and other social media. Or should we treat Twitter as just another traffic source and analyze it as one? Analyzing the outcome of a social media strategy could involve things like engagement and revenue. But what I think many tools are trying to do is analyze the "buzz activity" and not the outcome of the strategy. Because that's what social media is all about when it comes to marketing a company: a new way to reach out to the target audience, a marketing strategy. Or am I totally out of the blue here?
Achbar Jones said:
Have you seen tmitter.com? It's great, simple and free!
jeffrey Greenberg said:
you might look at the alpha http://www.tweettronics.com which integrates brand monitoring with influence measures for Twitter.
James Beamish-White said:
You might want to take a look at http://www.twilitics.com, for tracking actual interest in links posted.
Satya Prakash said:
Interesting data
Chris Henry said:
These are all great sites, that are doing really cool stuff with data. One important gap seems to be simple measurement of clickthrough on Tweets. I built http://140ctr.com, hoping to fill that gap. Twitter users just fill in their username, and their CTR will be calculated.