
Enhanced Google Analytics: Firefox Plugin

There is new life in the tool that shows change in Google Analytics. A year after releasing our Greasemonkey script, we are pleased to release an updated version of the Enhanced Google Analytics script as a free Firefox Plugin. For those already using the older Greasemonkey script, you can skip ahead to the What’s new? and How do I get this plugin? sections of the page. For the rest, you may be wondering: Why does my Google Analytics need change?

Change, and why it is important

When I first started working at Juice Analytics, my boss Zach showed me a part of his daily Google Analytics routine. He would open up the Referring Sites page and glance at all 942 of our referrers. Using his superior intellect and capacity for remembering random URLs, Zach would discover interesting deviations in the traffic from sites linking to our blog.

Our top referrers looked more or less similar day to day. Even once you get past the more recognizable top sites such as Twitter and Google, the various somethingblog.com pages, without context, often look a lot like somethingelseblog.com. To top it off, most of the information is not even specifically interesting. Our chartchooser.juiceanalytics.com domain sends us consistent regular referrals, but so what? Day to day, I don’t even really care about Google or Twitter unless something changes. With change, I know whether someone has posted something new about me, sending valuable traffic. A good read on the topic is Avinash’s rant about "actionable analytics".
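To make the idea of "change" concrete, here is a minimal Python sketch that compares referrer visits across two periods and keeps only the sites that moved noticeably. It is an illustration of the concept, not the plugin's actual code: the function name, the 50% threshold, and the visit counts are all made up for the example.

```python
def referrer_changes(previous, current, threshold=0.5):
    """Return referrers whose visits changed by at least `threshold`.

    `previous` and `current` map referrer -> visits for two periods.
    The threshold is an assumed value for this illustration.
    """
    changes = {}
    for site in set(previous) | set(current):
        before = previous.get(site, 0)
        after = current.get(site, 0)
        if before == 0:
            # Brand-new referrer: relative change is undefined, so mark it
            # with infinity to make it stand out.
            if after > 0:
                changes[site] = float("inf")
            continue
        change = (after - before) / before
        if abs(change) >= threshold:
            changes[site] = change
    return changes


# Hypothetical visit counts for two consecutive days.
yesterday = {"google.com": 400, "twitter.com": 120, "somethingblog.com": 10}
today = {
    "google.com": 410,
    "twitter.com": 115,
    "somethingblog.com": 45,
    "newblog.example": 30,
}

flagged = referrer_changes(yesterday, today)
for site, change in sorted(flagged.items()):
    print(site, change)
```

In this made-up data, Google and Twitter barely move and drop out of the list, while the small blog that more than quadrupled and the brand-new referrer are the ones left to look at.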

Our Firefox plugin is designed to help analysts act on what changed in the Referring Sites and Keyword reports. Here are a couple of examples of the plugin in action from our Google Analytics account:

What’s new?

Our focus for this release has been to improve functionality, to reduce the barrier to entry for new users, and to allow automatic updates for the plugin. The new version of the script works nearly instantaneously, and the installation involves only two clicks (in contrast to the 7 clicks of the Greasemonkey version). As a Firefox plugin, updates are now automatic and require no reinstall. Keyword sensitivity has been raised to 50% for consistency. As a slight bonus, the design and layout of the form and buttons is now sleeker and the table stands out in a pretty Google blue.

Greasemonkey itself is no longer required for the plugin, but you may want to keep it around for any of the other cool scripts available from the community. If you ever find yourself wishing that something about the web looked different, acted different or had different functionality, there may be a Greasemonkey script to ease your pain.

How do I get this plugin?

First, you need Firefox 2.0+.

If you are a user of the equivalent older Greasemonkey version of this script, you may want to go ahead and uninstall it. Go to Tools=>Greasemonkey=>Manage User Scripts..., select Google Analytics Downloader, and uncheck the Enabled box.

If you never had the script installed, or once you have removed it, simply click here to go to the Mozilla add-on site, select the checkbox and click the button. Once installed, navigate to Google Analytics, go to either the Referring Sites or Keyword pages, and click the blue button.

Happy analyzing!

Search Competition Among Travel Sites

This is a follow up to "Target Long Tail Searches with Keyword Patterns"

To get a sense of the scale of the long tail in search, Dustin Woodard recently put together an analysis of U.S. search data collected by Hitwise over a 3 month period, during which they measured 14 million different search terms. How did these break down?

  • Top 100 terms: 5.7% of all search traffic
  • Top 500 terms: 8.9% of all search traffic
  • Top 1,000 terms: 10.6% of all search traffic
  • Top 10,000 terms: 18.5% of all search traffic

This means if you had a monopoly over the top 1,000 search terms across all search engines (which is impossible), you’d still be missing out on 89.4% of all search traffic. There’s so much traffic in the tail it is hard to even comprehend. To illustrate, if search were represented by a tiny lizard with a one-inch head, the tail of that lizard would stretch for 221 miles.
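The arithmetic behind that 89.4% is just the complement of the head share. As a quick sketch in Python, using only the four figures quoted above (the variable and function names are ours):

```python
# Cumulative share (%) of all search traffic captured by the top N terms,
# as quoted above from the Hitwise analysis.
head_share = {100: 5.7, 500: 8.9, 1_000: 10.6, 10_000: 18.5}


def tail_share(top_n):
    """Share (%) of search traffic that falls outside the top N terms."""
    return round(100.0 - head_share[top_n], 1)


for n in sorted(head_share):
    print(f"top {n:>6,} terms: head {head_share[n]:.1f}%, tail {tail_share(n):.1f}%")
```

Even at 10,000 terms, the same subtraction leaves 81.5% of traffic in the tail.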

Yesterday, we described the concept of search patterns and how you can use them to summarize this type of long tail text data. Today, we will walk through a case study we put together to explain how Concentrate’s pattern discovery feature will help you find new competitive insights.

You can replicate this study yourself by signing up for the Plus version of Concentrate and loading competitive search data from providers like Hitwise, Compete, Keyword Discovery, or comScore. The input search data used in our analysis consisted of a sample of unique queries leading to clicks on top travel domains during Spring 2006, along with their frequency of occurrence (the chart is truncated after the 20th query):

Raw search data: most frequent queries by site
unique search queries for travel sites

We loaded the full dataset of queries into Concentrate to generate summary patterns for each of 5 top travel sites. After each file of unique queries and associated metrics is loaded, the application generates reports which include summary statistics based on the head (top 50) and tail queries for each site. This is a good way to start looking at the data if we want to get a sense of each site’s long tail search strategy:

Head vs. tail queries for top travel sites

head vs tail for travel searches
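If you want to reproduce the head vs. tail split on your own query export, the calculation is simple. The sketch below is our own illustration, not Concentrate's code: it assumes a mapping of query to frequency, and the sample queries and counts are invented. The head size defaults to 50, matching the report described above.

```python
def head_tail_split(query_counts, head_size=50):
    """Return (head_share, tail_share) of total search frequency.

    `query_counts` maps each unique query to its frequency. The head is
    the `head_size` most frequent queries; everything else is the tail.
    """
    ranked = sorted(query_counts.values(), reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0, 0.0
    head = sum(ranked[:head_size])
    return head / total, (total - head) / total


# Invented sample data; a real export would have thousands of rows.
sample = {
    "expedia": 900,
    "cheap flights": 60,
    "flights to boston": 37,
    "things to do in austin": 2,
    "cheap hotels in liverpool ny": 1,
}

# Use a tiny head so the five-row sample still has a tail.
head, tail = head_tail_split(sample, head_size=2)
print(f"head: {head:.1%}, tail: {tail:.1%}")
```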

It appears that the long tail makes up the overwhelming majority of traffic for the travel planning and review sites, but is a much smaller percentage for transaction focused sites like Expedia and Travelocity. Measuring the size of the head and tail gives us a rough idea what is going on, but we need to dig deeper if we want to benchmark where we stand in various categories and produce actionable insights. Inspired by a recent New York Times infographic "Words They Used", our data visualization guru, Chris Gemignani, downloaded the Pattern CSV file that Concentrate generated for each of these sites and created the following view of competition in the travel search sector:

Comparing travel searches by pattern
long tail query patterns from Concentrate

This chart compares the proportion of searches that go to each travel site for the top 25 patterns in the travel sector. The site getting the most traffic for each pattern is highlighted. Only searches that wound up at one of these five travel sites are considered.
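The proportions in a chart like this boil down to one division per site and pattern: searches for the pattern that landed on the site, over searches for the pattern across all sites considered. Here is a rough Python sketch of that calculation. The row layout, function names, and counts are assumptions for illustration; they are not the format of Concentrate's Pattern CSV and not the study's data.

```python
from collections import defaultdict


def pattern_share_by_site(rows):
    """Compute each site's share of searches for every pattern.

    `rows` is an iterable of (site, pattern, searches) tuples.
    Returns {pattern: {site: share}} with shares summing to 1 per pattern.
    """
    totals = defaultdict(int)
    by_site = defaultdict(lambda: defaultdict(int))
    for site, pattern, searches in rows:
        totals[pattern] += searches
        by_site[pattern][site] += searches
    return {
        pattern: {site: n / totals[pattern] for site, n in sites.items()}
        for pattern, sites in by_site.items()
    }


def leader(shares, pattern):
    """Site with the largest share for a pattern (the highlighted cell)."""
    return max(shares[pattern], key=shares[pattern].get)


# Invented counts, only to show the shape of the calculation.
rows = [
    ("TripAdvisor", "things to do in [x]", 80),
    ("Yahoo Travel", "things to do in [x]", 20),
    ("Expedia", "cheap [x]", 50),
    ("TripAdvisor", "cheap [x]", 30),
    ("Travelocity", "cheap [x]", 20),
]

shares = pattern_share_by_site(rows)
for pattern in shares:
    print(pattern, "->", leader(shares, pattern), shares[pattern])
```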

The difference in search pattern profiles for these sites is striking. TripAdvisor leads the pack in the long tail, which makes sense given the huge amount of long tail user-generated content on the site. It owns most of the pattern categories, but Yahoo Travel and Hotel-Guides take the lead in niche areas like maps and hotels. Traffic to Expedia and Travelocity is largely composed of navigational and branded queries (not shown). The only long tail patterns for which they have significant share are "[x] ticket" and "cheap [x]".

The input data we used reflects referrals to these sites from a sample population of users who clicked on search engine result pages. Factors that affect the number and type of search referrals a site receives in this data include: how representative the sample is of the population of U.S. searchers as a whole, how much relevant content a site has for a given query pattern, and how well that content ranks in Google and other search engines.

If a travel website repeated this study with Concentrate using current competitive data, then uploaded additional search data for their own site including other metrics beyond search frequency (see our demo using Google Analytics), the results might reveal that "things to do in [x]" queries lead to high quality visits and their site has a chance at winning more searches for that pattern. Based on this information they might decide to make a move on TripAdvisor in that content category. Mark Jackson describes some strategies to apply within the travel sector in an article at Search Engine Watch: Should Your SEO Strategy Target the Head or the Long Tail?. Using Concentrate, a travel website could streamline the process by downloading thousands of real queries for this pattern sent to their competitor:

Some queries in TripAdvisor pattern: "things to do in [x]"
long tail travel search pattern

Take Action: Some ideas for next steps

Target Long Tail Searches with Keyword Patterns

On Friday, we launched our new keyword tool: Concentrate. One of its key features is a scalable algorithm that automatically discovers patterns in large amounts of search data and clusters long tail queries into manageable groups. This post will explain how using Concentrate’s pattern discovery feature can simplify search data analysis and give you an edge on the competition. To explain how valuable Concentrate’s pattern discovery can be, we put together a case study of the travel sector using the Plus version of Concentrate and the type of competitive search data available from commercial providers like Hitwise or Compete. We will go into the details tomorrow, but here is a sneak peek at the results. This chart shows the share of travel searches by site in Spring 2006 and was generated using reports downloaded from Concentrate pattern discovery:

Travel Sector Searches: Comparing sites by pattern share

long tail query patterns from Concentrate

The Long Tail of Search

Search analytics starts by looking at the most frequent search queries driving traffic to your site or that of your competitors (these are often called the "head queries"). For most sites, these queries account for only a small fraction of total search traffic and are just the tip of the iceberg in terms of insight about your audience. Queries like "cheap hotels in liverpool ny" may only occur once or twice in a given month, but when aggregated with other rare phrases can make up the bulk of your traffic.

The concept of the long tail in business intelligence has been a topic of debate over the last few years. One area where the long tail is alive and well is in search. The landscape of user search queries is dominated by the long tail, and most studies indicate that referrals from these long tail phrases are more likely to lead to purchases on your site. Natural search isn’t the only area where the long tail turns out to be critical. Paid search efforts which ignore the long tail are potentially missing out on a large chunk of revenue. The challenge of the long tail is that dealing with massive amounts of query data quickly becomes unmanageable.

Traditional Search Reports: head queries for some top travel sites
traditional search keyword reports

If you have hundreds of pages of unique queries to sort through manually, forming an actionable view of that data is a painful process. This is why most people only look at the first few pages of queries.

Categorizing Queries using Patterns

Finding frequent search patterns is the key to making search data understandable. Patterns let you treat groups of long tail searches like popular individual queries.

Our concept of patterns is similar to an example described by Brian Brown in a recent SEOMoz post. Patterns are templates for searches that have a similar structure. For instance, the pattern “jobs in [x]” represents searches for jobs in some location. The “[x]” is a wildcard that can stand for one or more words. These “masked terms” are often variants of similar concepts, like locations or celebrity names. Depending on the nature of your site, up to 80% of your long tail search traffic could be summarized using just the top 20 query patterns.
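To show what a pattern does once you have one, here is a small Python sketch that turns a template like “jobs in [x]” into a regular expression and rolls matching queries up into a single count. To be clear, this only matches queries against patterns you already know; it does not discover patterns, which is the hard part Concentrate's algorithm handles, and it is not how Concentrate is implemented. The helper names and sample queries are made up for the example.

```python
import re
from collections import Counter


def pattern_to_regex(pattern):
    """Compile a template such as 'jobs in [x]' into a regex.

    Literal text is escaped; each '[x]' matches one or more
    space-separated words.
    """
    literal_parts = [re.escape(part) for part in pattern.split("[x]")]
    wildcard = r"(\S+(?: \S+)*)"
    return re.compile(wildcard.join(literal_parts))


def count_by_pattern(queries, patterns):
    """Sum query frequencies under the first pattern each query matches.

    `queries` maps query text -> frequency. Unmatched queries are skipped.
    """
    compiled = [(p, pattern_to_regex(p)) for p in patterns]
    counts = Counter()
    for query, frequency in queries.items():
        for pattern, regex in compiled:
            if regex.fullmatch(query):
                counts[pattern] += frequency
                break
    return counts


# Invented sample queries and frequencies.
queries = {
    "jobs in austin": 3,
    "jobs in new york": 2,
    "hotels in paris": 4,
    "juice analytics": 10,
}

counts = count_by_pattern(queries, ["jobs in [x]", "hotels in [x]"])
print(counts.most_common())
```

Two rare “jobs in ...” queries become one row with a combined count, which is exactly what lets a group of long tail searches be read like a single popular query.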

Concentrate Pattern Summary View for TripAdvisor.com
Example of Concentrate search pattern view

The next iteration of Concentrate’s learning algorithms will replace many of these wildcards with named entity labels. For example: “hotels in [x]” will become “hotels in [City]”. See our FAQ for more details on special pattern categories like navigational queries. Tomorrow, we’ll cover the travel case study in detail.

Introducing Concentrate for Long Tail Search Analytics

We are thrilled to introduce Concentrate™, an innovative long-tail keyword tool. Concentrate is for SEO and paid search professionals who want to make sense of search keyword data and make the most of search investments.

Check out the demo here. Or try out the free version here (you’ll need admin access to a Google Analytics account).

We built Concentrate because we saw a fundamental conflict in the world of search analysis: On the one hand, search keyword data is terrifically interesting and valuable. It can tell you what your visitors and customers want and how they think about you and your products.

Juice Analytics keywords

Unfortunately, search query data is also big, messy, and hard to get your hands around. In a typical month, the Juice site gets over 10,000 visits from over 7,000 unique keywords.

Even if I could somehow wrap my head around our top 100 keywords, I’d only understand 25% of the visits. For people spending money on search engine optimization or paid search campaigns, that’s a big blind-spot to accept.

We want you to understand and act on all your search data. Concentrate ingests data from sources that most sites already have available (e.g., Google Analytics, Omniture, Coremetrics, Hitwise, Compete), enhances this data by finding common patterns and query types, and visualizes search phrases for exploration and analysis.

Over the next couple of weeks, we will share examples of some of the interesting things you can do with Concentrate, including:

Pattern identification to condense the long tail into keyword phrases with similar structures. For example, here are some common search patterns from a cooking web site (the “[x]” represents a wildcard).

Patterns

Keyword visualization to show the connections between keywords and the relative performance of phrases. This wordtree shows the frequency of words within phrases (size) and average time spent on site (color).

Wordtree

Congratulations to Chris, Pete, and Sal for all their hard work, diligence, and creative problem solving to launch this solution.