Looking at Trees to Understand the Forest
By Zach Gemignani
March 16, 2010
Find more about:
analytics,
visualization
David Simon (of The Wire fame) has sucked me into another brilliant television series with Generation Kill. It is the story of a Marine recon unit at the beginning of the Iraq war. At the heart of all the action, the seven-part miniseries offers intimate and honest profiles of individual Marines.
The characters don't so much displace stereotypes as reveal texture and insight about the unique qualities of individual Marines.
The series got me thinking once again about different ways to analyze data. Almost four years ago, I posted a couple blog posts (Part 1 and Part 2) making a case for analyzing and visualizing data at a granular level to uncover patterns and behaviors. Generation Kill is a case study in looking closely at the individual trees to understand the forest.
Analytics is a journey of exploration--a continuous series of iterations with the goal of deeper understanding based on better questions and more targeted analyses. Einstein said:
"To raise new questions, new possibilities, to regard old problems from a new angle, requires creative imagination and marks real advance in science."
How to arrive at new questions?
In the previous blog post, I described examples from online learning, credit card usage, and football film study to show how granular analysis can spur new questions. I've stumbled across a series of new examples recently:
Surveys. Survey analysis is hard work--just ask Ken who recently presented results from Juice's survey on the practice of information visualization in organizations. If a survey is mostly about understanding your audience, rolling up responses by questions can't be the only approach (though it is the most common). Cross tabs ("displays the joint distribution of two or more variables") are one direction to go. Another approach is to look for people who share common characteristics or patterns in their responses.
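As a rough illustration of the two approaches just mentioned, here is a minimal Python sketch. The survey fields (`role`, `uses_dashboards`) and the responses are invented for the example; they are not taken from the Juice survey.

```python
from collections import Counter

# Hypothetical survey responses: one dict per respondent.
responses = [
    {"role": "analyst", "uses_dashboards": "yes"},
    {"role": "analyst", "uses_dashboards": "no"},
    {"role": "manager", "uses_dashboards": "yes"},
    {"role": "analyst", "uses_dashboards": "yes"},
    {"role": "manager", "uses_dashboards": "yes"},
]

def cross_tab(rows, row_key, col_key):
    """Count respondents for each (row value, column value) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)

def response_patterns(rows):
    """Group respondent indexes by their complete answer pattern."""
    groups = {}
    for i, r in enumerate(rows):
        pattern = tuple(sorted(r.items()))
        groups.setdefault(pattern, []).append(i)
    return groups

table = cross_tab(responses, "role", "uses_dashboards")
patterns = response_patterns(responses)
```

The cross tab rolls answers up by pairs of questions, while the pattern grouping keeps each respondent visible, which is closer to the granular view argued for here.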
Macrofocus' SurveyVisualizer is the most innovative survey analysis tool I've seen and it emphasizes data at a granular level.

"All the analysis elements are always shown as grey lines in the background. This provides an overview of the ranges and spreads of the individual values for each node, and facilitates the detection of outliers." (from Visualization of Large-Scale Customer Satisfaction Surveys Using a Parallel Coordinate Tree)
Medical research. Research studies are conducted against carefully defined target and control populations with aggregate statistics across these populations required for conclusions. However, the ability to review the patterns of diagnoses and procedures at the individual patient-level can help test assumptions about the target population and refine the parameters of a study. Better model inputs; better results.
Speech analytics. Michel Guillet at Nexidia recently told me about their approach to speech data:
Nexidia’s speech analytics can mine thousands of hours of audio to categorize, correlate or spot trends. However, it is quite often in identifying and listening to a lone outlier that the application provides its most valuable insights. Some examples of outliers can be the very long call of a particular call type, the extremely abrupt one, the one with the most languages spoken or the one where no one is speaking at all. An outlier can change your hypotheses and put you in a different direction…perhaps a better one. Nexidia’s reporting and analysis tools offer many different methodologies including histograms, analysis of means charts and flexible filtration by meta-data to identify outliers in large amounts of data. In addition, Nexidia’s ad-hoc search functionality allows users to search an entire body of audio content at any time, which is often helpful to find the “smoking gun” or a single recording which can make or break an argument.
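To make the outlier idea concrete, here is a small Python sketch that pulls out the longest and shortest call for each call type. The record layout and the sample values are assumptions made for illustration; this is not Nexidia's method or API.

```python
from collections import defaultdict

# Hypothetical call records: (call_id, call_type, duration_seconds).
calls = [
    ("c1", "billing", 240),
    ("c2", "billing", 260),
    ("c3", "billing", 1900),
    ("c4", "support", 600),
    ("c5", "support", 12),
    ("c6", "support", 640),
]

def extremes_by_type(records):
    """Return the shortest and longest call id for each call type."""
    by_type = defaultdict(list)
    for call_id, call_type, duration in records:
        by_type[call_type].append((duration, call_id))
    return {
        call_type: {"shortest": min(items)[1], "longest": max(items)[1]}
        for call_type, items in by_type.items()
    }

extremes = extremes_by_type(calls)
```

The point is not the statistics but the shortlist: a handful of individual recordings worth listening to.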
Of course you can't be assured of a full or accurate picture when looking at granular data, but somewhere between standard aggregation-based analysis and granular views lies the truth.
The Best of Business Intelligence: Innovation at the Fringe
By Zach Gemignani
June 28, 2009
Find more about:
business intelligence,
dashboard,
analytics,
visualization
Enough complaining about the broken bits of Business Intelligence; it's time to highlight the things that are good and right in the industry. Like most industries, the renewal and innovation occurs at the fringe, beyond the comfort zone of established vendors.
I've created five categories and a catch-all to capture the solutions and companies (not so much technologies) that are leading the next generation of Business Intelligence. The categories are:
- Analyst tools
- Dashboards
- Targeted solutions
- Open-source and free
- Advanced visualizations
- Other stuff
Naturally I've focused on areas of Juice expertise and focus -- not coincidentally, the places where we feel BI has neglected end-users. According to a study by the Business Application Research Center, BI end-user adoption sits at a lowly 8%.
I'm happy to take your suggestions (and update the post) for things I've missed in these categories or for entirely new categories.
Analyst tools
Tools that make it easy for analysts to pull data from multiple sources, analyze, visualize and share it.
Winner: Tableau, the reigning king of visual analytics tools, has added more web-based functionality to allow for online sharing and collaboration.

Runner-up: Good Data has arrived on the market with a web-first platform designed to democratize analytics. I had a chance to get a demo from the management team and was impressed with the ease of use and high-quality data presentation.

Dashboards
"A frequently updated analytical display that is clear and concise" (via a recent post)...and not likely to draw the rage of Stephen Few.
Winner: BonaVista Systems wants to make Excel a "first choice dashboard tool." From the humble position of sparkline plug-in vendor, BonaVista has taken a leadership role in encouraging more effective dashboard design.

Runner-up (tie): Two BI companies, Qlikview and Microstrategy, seem to be following BonaVista's lead. Unfortunately, they may only be dipping in a toe as I found just a couple examples that break from the traditional over-glossy, gauge-riddled dashboard interface.
Targeted solutions
Companies that serve a narrow slice of the BI world extremely well. The desire to be all things to all people has been an Achilles Heel of the BI industry. The general purpose BI platforms often prove too broad and too generic to serve the unique problems of specific industries or functional areas.
Winner: Wall Street on Demand is a brilliant, below-the-radar provider of information solutions to the financial sector. Their sparse, articulate marketing text and few screenshots hint at a company that knows exactly what it does and delivers high-quality BI solutions. I wish I knew more.

Runner-up (multiple): The following are just a few companies that have focused on an industry or functional segment to deliver targeted BI solutions:
- Quantivo for customer behavior analytics
- Visual I|O for pharmaceuticals
- LucidEra for sales pipeline reporting and analytics
Open-source and free
(I know there is a difference.)
Winner: Pentaho offers an open-source end-to-end BI suite that is a competitive alternative to the big guys. Of course, the implementation isn't necessarily cheap or easy.

Runner-up: If anything should scare the BI industry, it is the possibility of a Google Analytics model extended into more general data analysis and visualization tools. Google Fusion Tables may just be the tip of the iceberg.

Advanced visualizations
Bringing leading-edge visualization techniques out of academia and into the business world.
Winner: Many Eyes continues to impress with high-quality visualizations. They are easy to create and clean in design and usability. Impress your boss with a slick visualization in your next presentation.

Runner-up (tie): Openviz / Advanced Visual Systems and Panopticon appear to be the two BI vendors battling it out for leadership in advanced visualization solutions. Unlike Many Eyes, these guys lack Tufte-esque sophistication in infoviz design. That said, there is a big difference between creating a one-off New York Times-quality visualization and delivering a toolset that is re-usable in many different situations.
Other stuff to be admired
Free charts with good default design. InetSoft's Style Chart and Google Charts offer free, embeddable charts.
Jargon-free BI marketing. With few exceptions, BI web sites are densely populated with those awful stock-photography people sitting around conference tables (or worse, the ethnically-diverse V-formation marching at you) and meaningless business jargon and techno-babble. I really appreciate Blink Logic's web site with its straight talk and clean, readable design.
Beyond the desktop. RoamBI has a great-looking iPhone application that is designed to "transform your data into insightful, interactive visualizations delivered to the iPhone." It makes the Oracle and Qlikview iPhone apps look old-school.

16 comments
Charles said:
Interesting choices, no mention of Xcelsius though, any reason why?
Howard said:
For open source... Actuate and BIRT... or Jaspersoft?
Zach said:
Charles, under what category would you consider Xcelsius a contender? From what I've seen, it doesn't encourage effective information design for dashboards (check out the shiny pies and speedometers in this screenshot: http://bit.ly/P5u9v). It is more presentation tool than analysis tool. And it doesn't do much to push the boundary of advanced visualization.
Clarence said:
Zach: You can check out Zoho Reports (http://reports.zoho.com) next time around. It's an On-demand Reporting and Business Intelligence Service from Zoho, a leading provider of Online Office and Productivity suites.
I would also be happy to provide you a demo, if you require the same
Thanks,
Clarence
http://reports.zoho.com
Mail: clarence at zohocorp dot com
bts said:
This is a great summary. It gives many good pointers. Some comments have mentioned Jasper, BIRT among others. But these are remake of tools from 20 years ago. You picked a good set.
Keyur said:
Thanks for the great list. I use Tableau as well, and it's amazing. Wish they had a little more dashboard features, like the microcharts.
For microcharting, I use the open source and free:
http://sparklines-excel.blogspot.com/
T J Bate said:
Surprised you did not cover Visokio Omniscope..it has a lot to offer in ALL these categories, and is often preferred to Tableau, QlikView, Excelsius and others. Should be on everyone's shortlist. Free to try:
http://www.visokio.com
Zach said:
TJ: Thanks for the suggestion on Visokio. It looks great and appears to be comparable to Tableau. I think you are overplaying your hand by saying it qualifies for ALL categories. It isn't open source/free, especially good at dashboard info presentation, or industry-focused (being used in particular industries is different).
Mike S said:
I believe you have miscategorized QlikView as a dashboarding rather than analytical tool. Yes, you can create dashboards with it, but that would be underselling its ability to combine millions of rows of data - and leave it at that granularity - from different types of data sources and perform analysis on it using their proprietary associative technology (see here for an explanation on that last point: http://demo.qlikview.com/AJAX/films/). The example you linked is lacking in dimensionality and may not be a great representation.
To suggest that they are following the lead of an Excel add-in company is funny; they actually consulted Stephen Few for their data visualization, which can be as muted or candy-like as you desire, in their latest release. And no, I don't work for QlikView.
Zach said:
Mike,
Thanks for the link. I appreciate that QlikView is a comprehensive BI platform. I wasn't trying to sell their product as much as point out ways that BI companies are pushing-the-envelope. From the looks of that demo, there may be other interesting ways that QlikView is innovating.
Qlikview may have consulted with Stephen Few, but it is hard for me to see the impact beyond that one dashboard. "Muted" design isn't the issue as much as poor choice of charts and distracting chart design.
Bjoern said:
Thanks for the great overview. I listed it in the Web Analytics TWINE: http://www.twine.com/twine/12v6ghwcm-1t1/web-analytics
Andrew said:
I have also been using Tableau and it's just incredible how quick it is to create very rich visualisations and share these with peers. It really does push the edge of the envelope and the mapping functionality puts the cherry on the top as I can now do a heap of spatial reports. Not one line of code!!!
Mario said:
Has anyone tried Jasper Soft? I am researching it and it would be great to hear opinions from the Juice Analytics community.
Pentaho_evaluator said:
What is the difficulty level for the Pentaho BI suite? How much time does it take an average ETL developer to use Pentaho ETL and churn out reports and dashboards? I want to know how easy/difficult it is to get accustomed to and start producing results with Pentaho.
Breaking Free of the One-Page Dashboard Rule
By Zach Gemignani
May 4, 2009
Find more about:
dashboard
interface
analytics
design
Conventional wisdom says that an executive dashboard must fit on a single page or screen. The argument hinges on a pair of assertions about this constraint: it provides necessary discipline to focus on only the most critical information; and it enables the audience to see results "at a glance."
The "discipline" argument is made forcefully by Avinash Kaushik (among others).
"if your dashboard does not fit on one page, you have a report, not a dashboard...This rule is important because it encourages rigorous thought to be applied in selecting the golden dashboard metric."
I buy wholeheartedly into the value of constraints. However, defining a useful constraint as a "rule" assumes there is only one viable means to achieve the desired ends. Confining visual real estate is but one way to focus your thinking. There are others: How about limiting yourself to five key measures? How about demanding that a dashboard can be understood in 3 minutes by a new user? How about only presenting exceptions?
The argument that a one-page dashboard necessarily provides a view of your business "at a glance" is more self-deceiving. Well-known information-ista Stephen Few uses this rationale in his definition of a dashboard:
A visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.
I check my speedometer "at a glance". I "glance" at a Heads-up Display (HUD) on a video game showing how much energy my character has remaining. These displays communicate but a single number that is already hovering on the corner of my consciousness. If we follow this advice literally, we'd show:

Assuming one page gives you quick, easy comprehension is like assuming all red cars are fast. That's simply not true. It must be duly noted, however, that all red cars are cool.
More often, people follow the one-page dashboard rule off a cliff like these folks.
There are real problems with this definition:

- In reality, the one-page rule leads to jamming information into the available space.
- When everything must fit on a page, there isn't room to describe the connections between information or fashion a story from the data.
- A good dashboard raises more questions than it can answer. Sticking to a static piece of paper limits any ability to find or present explanations.
Don't get me wrong: A one-page dashboard is often an effective way to create "a visual display of the most important information needed to achieve one or more objectives." But with streaming video, interactive visualizations, podcasts, Kindles, smart phones, video projectors...is it really necessary to limit ourselves to an 8.5" x 11" piece of paper? Or might we open ourselves up to some more creative solutions to sharing the numbers: a short movie, a few slides, a short text narrative, or 140 characters?
I'd like to use this definition instead and will be back soon with some ideas on how to make your dashboards clear and concise.

10 comments
Clint said:
Actually, the historical definition of dashboard is
"[an] instrument panel on an automobile or airplane containing dials and controls" (via answers.com). Don't know about you but some of those big airplanes have huge dashboards - no way to take it all in a single glance. So it sounds like you could take dashboard back to its semantic roots, so to speak, without offering a new definition and still get to your desired outcome.
An improvement on this (which you reference but from the wrong frame) is the Heads-Up Display which shows critical info without the pilot/driver having to take their eyes off what's in front of them - which is much more like the single-page data dashboard than a car dashboard is like a data dashboard.
Back in the days of paper, single page data dashboards made a lot of sense - and to a certain extent still do. I haven't seen any usability studies comparing the two, but we know from web usability that folks generally don't like to scroll a page - how would that be any different in a dashboard? You don't want a user to abandon your dashboard too early in the same way you don't want them to abandon a web page too early and scrolling is a friction point that can cause users to abandon.
Michael Pierce said:
I like the definition...clear and concise itself.
This concept reminds me of debate that goes on with web designers. In this day and age, with newspapers dying off and crazy new devices coming out all of the time, what exactly is "one page" anymore? I can tell you, an 8.5x11 dashboard would be dismal to use on my BlackBerry. With an exceedingly mobile workforce, how should dashboards adapt to the world of BlackBerries? And that would be if you were lucky enough to have your company standards on the BB platform...maybe there are iPhones and others in the mix.
How do you make information clear, concise and *available* to your audience?
Igor said:
Sorry to say, but I don't like your definition as it is now. Compared to the original, it's too vague on "analytical display", "clear" and "concise". Can you improve on that a bit?
Jorge Camoes said:
I would argue that a dashboard should be compliant with Ben Shneiderman's visual information seeking mantra ("Overview first, zoom and filter, then details-on-demand."). "Overview first" would mean "at a glance". Then the users should be able to zoom and filter and get the details if needed. A sheet of paper is not the best medium to accomplish that...
The question is, what level of details is needed to optimize insights "at a glance" and how can you design the dashboard to create a story with those details.
The one page rule is like Miller's magical number 7+/- 2. It may guide your first steps but soon you discover that what matters is the spirit of the rule, not the rule itself.
Zach said:
@Clint: Here's how I consider the scrolling issue different for a dashboard: 1) when I think of larger than one page, I think of logical mechanisms for drilling in or navigating to more detailed views of information -- not just adding more charts down the page; 2) casual perusing of a website and studying a dashboard have very different mindsets. That is, if the information on the dashboard is critical to running your business, you'll be willing to click around a bit.
@Michael: We've done a "dashboard" for a Blackberry. Basically it was a short text file that offered the most critical daily numbers in a layout that was readable for the device. I think you are right on to ask who is the audience?, how do they want to receive their information?, and therefore what is the right form factor for the dashboard?
@Jorge: I agree that spirit of the rule is in the right place, but it isn't something that you need to slavishly follow.
Jose said:
I suggest that "dashboards" as defined by Few is only one of many views necessary to tell a story: it answers the 1) "What" - as in "Status, Trend/Change or Contribution ". we also need 2) support evidence "Who, When and Where" (Details) and Analysis 3) "Why and How".
In general, even with just a few metrics, I find that explaining just "what" is challenging enough in one page. Think of the financial page of a newspaper: it demands a combination of graphs, tabular info and narrative text. So when creating an information display, I try to organize different views according to their purpose - and make them display/print in one page.
The opportunity that computers give us over paper is the ability to link all different views with common filters - so that the user is able to iterate formulating and answering questions in the display (cycling through Shneiderman's information seeking mantra as many times as needed).
Chris Curran said:
Good points Zach, especially the one you make in the comments regarding understanding audience for a dashboard. In my experience, senior business leaders don't have the time or attention span for a desktop-based UI dashboard. So paper and/or blackberry/mobile must be considered, at least for the "overview first" level of information.
More on my blog at http://www.ciodashboard.com/
Stephen Few said:
Zach,
Your definition differs from mine because we seem to be talking about different things entirely. I define a dashboard as an information display that is used to "monitor" what's going on. You are referring to a display that is used for data analysis or telling a story (two very different forms of data presentation which can't be displayed in the same manner).
A display that's used for regularly monitoring what's going on in an effort to maintain situation awareness requires a much different design than one that's used for data analysis or storytelling. When you're monitoring information for situation awareness, you must see the pieces on a single screen or page to make all the necessary connections and comparisons that are needed to build the big picture in your head of what's going on.
If we want to cut through the confusion that exists regarding proper information display, we must be careful to define our terms carefully and declare our purposes clearly. Multiple pages or screens often work well for telling a story, which must be delivered one piece at a time in the proper sequence. Multiple screens can also work well for analysis as your focus changes from the pursuit of one question to another. Multiple screens do not work well when you need to make comparisons and connections, however, because if the things that must be connected aren't in front of your eyes at the same time, you're forced to rely on working memory, which is extremely limited. In other words, the restriction to a single screen in this case is not arbitrary, but based on scientific evidence of what's actually required to do the job.
jeffrey weir said:
I see there's a good thread that covers some of this at http://www.perceptualedge.com/discussion.htm under New Topic Proposals/Nomenclature for visualization, dashboards, analytic tools, etc.
Zach said:
Stephen,
I can appreciate the distinction between a monitoring tool and an analysis tool. However, I don't think that fully explains our difference of perspective. Even in the case of a monitoring application, a bunch of factors need to be simultaneously optimized to ensure it communicates effectively (e.g. readability, layout and structure, connections and comparison, information design). The one-page constraint elevates the importance of comparison above other factors that have significant impact on the overall success of the dashboard. The constraint has real impacts:
* Tiny fonts and graphics to squeeze in all the information
* Inability to lay out the information to reflect the structure of the business (i.e. show connections)
* Inability to position graphics in ways that support comparison
* All the relevant information has to be shown at once, rather than gradually revealing detail as the user expresses interest.
It is as if you told me that the goal of a new car model is to achieve 40 miles per gallon of gas. It is a fine goal, but it entails sacrifices to comfort, fun, and innovation. You'll never end up with an electric car.
Twitter Analytics for "Analytics"
By Zach Gemignani
March 27, 2009
Find more about:
twitter,
analytics,
twitter analytics
Twitter’s wild popularity hasn’t obscured the fact that the service needs to eventually make money. The concept of “Twitter analytics” as a revenue stream has come up often enough to make my ears itch and my nose burn.
Twitter’s new business development lead explains that the company is “developing a range of analytics and metrics products and services built around the information contained in tweets”…and “trying to figure out what are the appropriate metrics around engagement and how to convey those.”
Web Strategist Jeremiah Owyang raises the concept of a Twitter CRM solution, in which Twitter would offer their own analytics system to brands, that will help them to track and manage the conversations.
The Twitter ecosystem has responded with a wide range of tools for analysis of Twitter data. Web analytics behemoth Omniture recently announced the integration of Twitter data into their platform. At the same time, web analytics consultant Eric T. Peterson has been vigorously marketing Twitalyzer, a tool to evaluate individuals’ use of Twitter and metrics of influence. Google’s Chrome Experiments released a cool visualization tool called Social Collider that reveals cross-connections between conversations on Twitter. Here are a few more Twitter analytics tools that I’ve run across:
Despite all the activity, I haven’t yet seen a solution that offers the kind of valuable analytics that a company could use to understand the Twitter conversation relevant to their business. The applications above are either focused on the measurement of individual Twitter users or offer a high-level tracking of words and phrases in the general conversation. They treat tweets as transactions: How many? How valuable? Who’s listening? Who’s responding?
To me, the great and more rewarding challenge in Twitter analytics is to synthesize the substance of those conversations. Imagine if you went to a party and could overhear everything that everyone else was saying. Who talked the most and who had the greatest audience is less interesting than what topics people were discussing and what was said.
I wanted to take a shot at this type of Twitter analytics.
Analysis Approach
First I had to define a particular domain or topic area. For expediency, I focused on all the tweets that included the word “analytics.” Using the Twitter search API, I pulled the first 500 tweets for each day in March and parsed the results to pull out users, urls, and other characteristics of the tweets.
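The parsing step might look something like the sketch below. It assumes each tweet has already been fetched and reduced to a user name and a text string; the regular expressions, the `parse_tweet` helper and the sample tweet are illustrative guesses, not the code actually used for this analysis, and no Twitter API calls are shown.

```python
import re

# Simple patterns for the Twitter conventions of the time.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")

def parse_tweet(user, text):
    """Pull users, hashtags and URLs out of one tweet's text."""
    return {
        "user": user,
        "mentions": MENTION_RE.findall(text),
        "hashtags": HASHTAG_RE.findall(text),
        "urls": URL_RE.findall(text),
    }

# Hypothetical tweet, made up for the example.
sample = parse_tweet(
    "example_user",
    "RT @another_user: new analytics tips http://example.com/post #wa",
)
```

Each parsed record can then feed the counts by site, person and group described later in the post.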
To analyze the words and phrases being used, I uploaded the resulting 11,300 tweets into Concentrate, our search analytics tool. Concentrate is optimized for search query text (i.e. short phrases without a lot of punctuation). Nevertheless, it has a number of features that make text analysis easier, including breaking out the most common words, phrases and patterns. It also allows for filtering by words to create frequency statistics.
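For readers without a tool like Concentrate, a bare-bones word count gets at the same idea. This is only a sketch: the stop list is a tiny stand-in for a real list of common English words, and the sample tweets are invented.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real one would be much longer.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "with"}

def top_words(texts, n=10):
    """Return the n most frequent non-common words across all texts."""
    counts = Counter()
    for text in texts:
        # Drop links first so URL fragments are not counted as words.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        for word in re.findall(r"[a-z']+", cleaned):
            if word not in COMMON_WORDS:
                counts[word] += 1
    return counts.most_common(n)

tweets = [
    "Google Analytics adds a new report",
    "New post on web analytics and Google",
    "Tips for analytics in the enterprise http://example.com/x",
]
top = top_words(tweets, n=3)
```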
There were two main questions I wanted to address:
- What topics are people discussing?
- What is the structure of the conversation?
Topics of Conversation
The content of the Twitter conversation can be analyzed as words, sites/links, people/groups, and company/products.
Words
I used Concentrate to find the most common words, then I dumped those words into Many Eyes to create this “Wordle-brand” word cloud. Many Eyes has a nice feature that takes out the “common English words.” Clearly Google dominates the conversation, and I even had to artificially reduce the value to make the other words legible.

Below are the top 10 (non-common) words that show up in the analytics conversation:

Sites and Links
Twitter has become a mechanism for sharing interesting links (I’ll get to data on that in a bit). Looking at the most popular sites and specific links gives a sense for what people in this community are reading and talking about.

People and Groups
Twitter users have a few conventions for connecting tweets to people or groups:
- “#” (i.e. hashtag) associates the message with a group, topic or event.
- “RT” (or “via”) is to repeat or “retweet” something someone else has said.
- “@” associates a tweet with another user, whether retweeting their message or directing a comment to them.
Here are the most common groups and people referenced in the Twitter data.

And the people with the most tweets using the word “analytics”:

Companies and Products
I was also interested in what companies or products were referred to most frequently. It is no surprise that Google dominates the conversation. Microsoft gets on the board with the recent closing of their adCenter product. I think we can safely assume they won’t be showing up that often in the future.

Conversation Structure
Beyond the specific content of the conversation, I was also curious about how people who are talking about analytics tend to use Twitter.
Types of Tweets
Eric T. Peterson has four things he considers “signal” (versus “noise”) in the Twitter conversation:
- References to other people (defined by the use of “@” followed by text)
- Links to URLs you can visit (defined by the use of “http://” followed by text)
- Hashtags you can explore and participate with (defined by the use of “#” followed by text)
- Retweets of other people, passing along information (defined by the use of “rt”, “r/t/”, “retweet” or “via”)
While I’m not fond of this definition, examining these different types of tweets (along with question-based tweets) provides a good lens into the nature of the conversation. The following chart shows the percentage of tweets that fall into each of those categories.

It would be all the more interesting if you could follow the types of tweets across time and compare against other topic areas. I suspect that the URL linking within Twitter is on the rise and is turning Twitter into a Delicious-style bookmark sharing service — without the functionality to save, tag, annotate, and view the bookmarks at your leisure.
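The tweet-type breakdown described in this section could be computed along these lines. The patterns are my reading of the four "signal" definitions plus a simple question test, so treat them as assumptions; in particular the retweet pattern uses "r/t" and the link pattern also accepts "https://". The sample tweets are invented.

```python
import re

URL_RE = re.compile(r"https?://\S+")
REFERENCE_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
RETWEET_RE = re.compile(r"\b(?:rt|r/t|retweet|via)\b", re.IGNORECASE)

def classify(text):
    """Return the set of tweet types that apply to one tweet."""
    # Strip links before the other checks so a "?" or "#" inside a
    # URL does not count as a question or a hashtag.
    without_urls = URL_RE.sub(" ", text)
    types = set()
    if URL_RE.search(text):
        types.add("link")
    if REFERENCE_RE.search(without_urls):
        types.add("reference")
    if HASHTAG_RE.search(without_urls):
        types.add("hashtag")
    if RETWEET_RE.search(without_urls):
        types.add("retweet")
    if "?" in without_urls:
        types.add("question")
    return types

def type_percentages(texts):
    """Percentage of tweets falling into each type (types can overlap)."""
    texts = list(texts)
    totals = {}
    for text in texts:
        for t in classify(text):
            totals[t] = totals.get(t, 0) + 1
    return {t: 100.0 * n / len(texts) for t, n in totals.items()}

sample_tweets = [
    "RT @user_a: great analytics post http://example.com/a",
    "Anyone using analytics for email campaigns?",
    "Reading about #measure and analytics",
    "analytics is fun",
]
pct = type_percentages(sample_tweets)
```

Because a tweet can be a retweet, carry a link and reference a person all at once, the percentages are not expected to sum to 100.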
Link Evolution
Given all the sharing of links, I wanted to get a clearer picture of what happens when a link becomes popular. The graphic below shows some of the top links during the month and the amount they showed up in tweets by day. The red bars represent days where ten or more tweets included the link. A couple links demonstrated popularity over a week or so, but the rest sizzled then disappeared in a day or two.
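A sketch of the underlying tally: count how many tweets mention each link per day, then flag the days at or above the ten-tweet threshold used for the red bars. The input layout (a day string plus a list of URLs per tweet) and the sample data are assumptions for illustration.

```python
from collections import Counter, defaultdict

def link_counts_by_day(tweets):
    """tweets: iterable of (day, [urls]). Returns {url: Counter({day: n})}."""
    by_link = defaultdict(Counter)
    for day, urls in tweets:
        for url in set(urls):  # count a link at most once per tweet
            by_link[url][day] += 1
    return by_link

def hot_days(day_counts, threshold=10):
    """Days on which a link appeared in at least `threshold` tweets."""
    return sorted(day for day, n in day_counts.items() if n >= threshold)

# Hypothetical data: a link that spikes for one day and then fades.
tweets = (
    [("2009-03-10", ["http://example.com/a"])] * 12
    + [("2009-03-11", ["http://example.com/a"])] * 3
)
counts = link_counts_by_day(tweets)
popular = hot_days(counts["http://example.com/a"])
```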

Activity Distribution
Finally, I took a look at the distribution of users by the number of tweets including the word “analytics.” It was no surprise that the vast majority of the 7,700 twitterers only used the word once in March (of course this doesn’t tell us about their other twittering activity). Obviously there is a small population of people at the core of the discussion.
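The distribution itself is a count of counts: first tweets per user, then users per tweet count. A minimal sketch with made-up author names:

```python
from collections import Counter

def activity_distribution(authors):
    """authors: one entry per tweet, naming who wrote it.

    Returns a Counter mapping tweet count -> number of users with that count.
    """
    tweets_per_user = Counter(authors)
    return Counter(tweets_per_user.values())

# Hypothetical authors of eight tweets.
authors = ["a", "b", "c", "a", "d", "a", "e", "b"]
dist = activity_distribution(authors)
single_tweet_share = dist[1] / len(set(authors))
```

In the sample, three of the five users tweeted only once; in real data the long tail of one-time users is what makes the small core of frequent posters stand out.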

While you'd have to go into more depth to answer detailed questions, there are a number of interesting take-aways for me, including:
- “Analytics” means “web analytics”, not business intelligence or general reporting about sales, operations, or marketing.
- Google Analytics is the star of the party. Of course, the fact that the brand name includes "analytics" is an advantage, but I didn't see a giant "Juice" in the word cloud.
- Twitter is an echo-chamber. The content clusters around particular subjects, with people retweeting and sharing links about the big news of the day. There are a dozen or so stories that dominated the conversation over this time period.
What’s next?
There are a lot more views of this data that could be enlightening for a company interested in having a real-time understanding of its marketplace. For example, it would be interesting to provide more insight into:
- Who is at the center of these conversations?
- What is the positive or negative tone of the discussion (Twitter actually offers this information as part of their API)?
- How is the conversation changing over time?
- What is the best way to define the boundaries of a domain-specific conversation?
These are the types of questions that I’d like to see addressed in a more complete Twitter analytics tool.
16 comments
Jamie said:
Excellent post thanks! For your link analysis how do you deal with short urls like tinyurl.com?
--
Jamie
Zach said:
We used the API from http://www.longurlplease.com/ . A surprising number of tinyurls couldn't be converted back. Public service: please don't use a tinyurl if you don't have to.
WhizGidget said:
Excellent, and timely, article on Twitter analytics. Tone would be a very interesting slice of the data to look at, as well as how many use Twitter as a substitute instant messaging platform - that is to say, how many back and forth conversations occur between two users, or who has the highest response mechanism.
If a business is going to use Twitter to communicate to the masses, especially using it for customer service then a good question to ask of the data is this: How many tweets are sent to this company as compared to how many responses are made - and are responses made to individuals from the company or are they simply general messages to the public about the state of the business. Yes, that's more on the transactional side than the quality of content side of analysis, but it's important when you're ranking how well a customer communicates using Twitter.
Liviu Taloi said:
Hi, useful post. How did you make those graphics? Did you do the analysis by hand or with some sort of tool? I mean for pictures 2 - 10. Thanks.
Liviu Taloi said:
I just saw it at the beginning of your article. Nice product.
Zach said:
Liviu, The charts were made in Apple Keynote -- except for the "link evolution" chart which was done in Excel.
Jacob said:
TwitterAnalyzer is the best, one of the most advanced analytic systems ever, all the data you see here can be found in http://Twitteranalyzer.com and much more. words analysis, links, friends location on Google maps, RT analyzer, best friends, disregarded friends, all published pictures,how many twitter users have been exposed to your messages(through RT), and one of the most powerful yet (for my taste) is a real time estimation of the number of your followers currently on twitter.com. very powerful, enjoy the Google Analytics for Twitter!!!
ian farmer said:
Invaluable information. I have commented on other sites that the people making the money in the gold rush were the people selling the picks and shovels, not the gold miners - a survey of "business outcomes" from Twitter adoption in your marketing mix would be great. So many people just "do not get it", insisting Twitter is a fad. We are generating leads which are turning into business. Thank you for sharing this information.
Zach said:
Jacob, I think you may have missed my point (or may not care): TwitterAnalyzer, Twitalyzer and other such products focus on analytics for the individual, like GA offers for a site. I'm more interested in synthesizing the larger conversation.
Eric T. Peterson said:
Guys,
Regarding synthesizing the larger conversation ... have you had a look at Twitalyzer BRAND (http://brand.twitalyzer.com) yet? It is a new app from Twitalyzer that does exactly what you are talking about ... getting "bigger picture" in Twitter.
Anyway, love your feedback and would love to pick up the conversation about Twitalyzer + Juice now that the APIs are ready. Let me know when you have bandwidth.
Thanks!
E.
Peter Isaksson said:
I think you are on to something here. My interest is in how to analyze the value of Twitter and other social media. Or should we treat Twitter like any other traffic source and analyze it as one? Analyzing the outcome of a social media strategy could involve things like engagement and revenue. But what I think many tools are trying to do is analyze the "buzz activity" and not the outcome of the strategy. Because that's what social media is all about when it comes to marketing a company: a new way to reach the target audience, a marketing strategy. Or am I totally out of the blue here?
Achbar Jones said:
Have you seen tmitter.com? It's great, simple and free!
jeffrey Greenberg said:
you might look at the alpha http://www.tweettronics.com which integrates brand monitoring with influence measures for Twitter.
James Beamish-White said:
You might want to take a look at http://www.twilitics.com, for tracking actual interest in links posted.
Satya Prakash said:
Interesting data
Chris Henry said:
These are all great sites, that are doing really cool stuff with data. One important gap seems to be simple measurement of clickthrough on Tweets. I built http://140ctr.com, hoping to fill that gap. Twitter users just fill in their username, and their CTR will be calculated.
Search Competition Among Travel Sites
By Pete Skomoroch
January 14, 2009
Find more about:
web
analytics
clustering
travel
visualization
patterns
search
sem
longtail
This is a follow up to "Target Long Tail Searches with Keyword Patterns"
To get a sense of the scale of the long tail in search, Dustin Woodard recently put together an analysis of U.S. search data collected by Hitwise over a 3 month period, during which they measured 14 million different search terms. How did these break down?
- Top 100 terms: 5.7% of all search traffic
- Top 500 terms: 8.9% of all search traffic
- Top 1,000 terms: 10.6% of all search traffic
- Top 10,000 terms: 18.5% of all search traffic
This means if you had a monopoly over the top 1,000 search terms across all search engines (which is impossible), you’d still be missing out on 89.4% of all search traffic. There’s so much traffic in the tail it is hard to even comprehend. To illustrate, if search were represented by a tiny lizard with a one-inch head, the tail of that lizard would stretch for 221 miles.
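The 89.4% figure is the complement of the head share: 100% minus the 10.6% captured by the top 1,000 terms. A small sketch of the same arithmetic for each cutoff, using only the percentages listed above (the dictionary and function names are illustrative).

```python
# Share of all search traffic captured by the top N terms, from the figures above.
HEAD_SHARE = {100: 5.7, 500: 8.9, 1000: 10.6, 10000: 18.5}


def tail_share(top_n):
    """Percentage of search traffic that falls outside the top N terms."""
    return round(100.0 - HEAD_SHARE[top_n], 1)


for n in sorted(HEAD_SHARE):
    print(f"Outside the top {n:,} terms: {tail_share(n)}% of search traffic")
```

Even at the top 10,000 terms, more than four fifths of the traffic is still in the tail.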
Yesterday, we described the concept of search patterns and how you can use them to summarize this type of long tail text data. Today, we will walk through a case study we put together to explain how Concentrate's pattern discovery feature will help you find new competitive insights.
You can replicate this study yourself by signing up for the Plus version of Concentrate and loading competitive search data from providers like Hitwise, Compete, Keyword Discovery, or comScore. The input search data used in our analysis consisted of a sample of unique queries leading to clicks on top travel domains during Spring 2006, along with their frequency of occurrence (the chart is truncated after the 20th query):
Raw search data: most frequent queries by site
We loaded the full dataset of queries into Concentrate to generate summary patterns for each of 5 top travel sites. After each file of unique queries and associated metrics is loaded, the application generates reports which include summary statistics based on the head (top 50) and tail queries for each site. This is a good way to start looking at the data if we want to get a sense of each site's long tail search strategy:
Head vs. tail queries for top travel sites
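For readers who want to reproduce this kind of split on their own data, here is a minimal sketch. It assumes the head is the 50 most frequent queries and that shares are measured by frequency of occurrence. The function name and input shape are hypothetical, and this is not Concentrate's own report code.

```python
def head_tail_split(query_counts, head_size=50):
    """query_counts: {query: frequency}. Returns (head_pct, tail_pct) of total frequency."""
    ranked = sorted(query_counts.values(), reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0, 0.0
    head = sum(ranked[:head_size])
    return 100.0 * head / total, 100.0 * (total - head) / total
```

A site dominated by branded and navigational queries will show a large head percentage, while a site with a lot of long tail content will show the reverse.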
It appears that the long tail makes up the overwhelming majority of traffic for the travel planning and review sites, but is a much smaller percentage for transaction focused sites like Expedia and Travelocity. Measuring the size of the head and tail gives us a rough idea what is going on, but we need to dig deeper if we want to benchmark where we stand in various categories and produce actionable insights. Inspired by a recent New York Times infographic "Words They Used", our data visualization guru, Chris Gemignani, downloaded the Pattern CSV file that Concentrate generated for each of these sites and created the following view of competition in the travel search sector:
Comparing travel searches by pattern
This chart compares the proportion of searches that go to each travel site for the top 25 patterns in the travel sector. The site getting the most traffic for each pattern is highlighted. Only searches that wound up at one of these five travel sites are considered.
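The calculation behind a view like this is a per-pattern share: each site's searches for a pattern divided by the total across the sites being compared, with the largest share marked as the leader. A minimal sketch under that assumption; the input shape and names are hypothetical and do not reflect the layout of the Pattern CSV file.

```python
def pattern_shares(counts):
    """counts: {pattern: {site: searches}}. Returns {pattern: (shares_by_site, leader)}."""
    result = {}
    for pattern, by_site in counts.items():
        total = sum(by_site.values())
        if total == 0:
            continue  # no searches for this pattern reached any compared site
        shares = {site: n / total for site, n in by_site.items()}
        leader = max(shares, key=shares.get)
        result[pattern] = (shares, leader)
    return result
```

Because the denominator only includes the compared sites, the shares describe competition among those sites rather than each site's share of all searches for the pattern.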
The difference in search pattern profiles for these sites is striking. TripAdvisor leads the pack in the long tail, which makes sense given the huge amount of long tail user-generated content on the site. TripAdvisor owns most of the pattern categories, but Yahoo Travel and Hotel-Guides take the lead in niche areas like maps and hotels. Traffic to Expedia and Travelocity is largely composed of navigational and branded queries (not shown). The only long tail patterns for which they have significant share are "[x] ticket" and "cheap [x]".
The input data we used reflects referrals to these sites from a sample population of users who clicked on search engine result pages. Factors that affect the number and type of search referrals a site receives in this data include how representative the sample is of the population of U.S. searchers as a whole, how much relevant content a site has for a given query pattern, and how well that content ranks in Google and other search engines.
If a travel website repeated this study with Concentrate using current competitive data, then uploaded additional search data for their own site including other metrics beyond search frequency (see our demo using Google Analytics), the results might reveal that "things to do in [x]" queries lead to high quality visits and their site has a chance at winning more searches for that pattern. Based on this information they might decide to make a move on TripAdvisor in that content category. Mark Jackson describes some strategies to apply within the travel sector in an article at Search Engine Watch: Should Your SEO Strategy Target the Head or the Long Tail?. Using Concentrate, a travel website could streamline the process by downloading thousands of real queries for this pattern sent to their competitor:
Some queries in TripAdvisor pattern: "things to do in [x]"
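To make the pattern notation concrete, a pattern such as "things to do in [x]" can be read as a template in which [x] stands for any non-empty text. The sketch below filters a list of queries against one pattern using a regular expression. It is only an illustration of the notation with hypothetical function names, not Concentrate's pattern discovery algorithm.

```python
import re


def pattern_to_regex(pattern):
    """Compile a pattern like 'things to do in [x]' so that [x] matches any non-empty text."""
    literal_parts = pattern.split("[x]")
    body = "(.+)".join(re.escape(part) for part in literal_parts)
    return re.compile("^" + body + "$", re.IGNORECASE)


def matching_queries(pattern, queries):
    """Return the queries that fit the pattern, ignoring case and surrounding whitespace."""
    regex = pattern_to_regex(pattern)
    return [q for q in queries if regex.match(q.strip())]
```

The same helper works for patterns with the placeholder at the front, such as "[x] ticket".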
Take Action: Some ideas for next steps
- Use Concentrate patterns to segment your site's query referrals, then estimate revenue opportunities for underperforming long tail categories and target your efforts accordingly
- Expand into new content areas based on trends discovered in long tail search patterns over time and ensure your site's information supply (PDF) matches customer demand
- Get a subscription to Concentrate Plus and repeat this case study for your industry
- Use your industry search patterns as a guide to optimize internal website architecture
- Jumpstart your SEM efforts by using Concentrate to automatically create tight ad groups containing similar queries found within each long tail pattern
- Use your patterns as templates for dynamic filters within your website to show targeted ads, content, related links, or content recommendations based on query referral strings
- If you are a financial analyst or product planner, spot emerging trends in the long tail with Concentrate and use the information to guide decision making