
Thrown to the Ad-Wolves… or, Learn from My AdWords Mistakes

Here at top-secret Juice headquarters, some major new products are in the works, and we want to promote them with Google’s revenue powerhouse (also known as Google AdWords). Thus, after three weeks of self-imposed AdWords boot camp, I have emerged with a few scrapes and burns, along with some tips that I wish I had been armed with since the beginning.

The natural place to start learning about Google AdWords is the official Help Center, an expansive and neatly categorized resource. But what happens if your inhuman schedule or dwindling coffee supplies don’t allow you the luxury of navigating through the help center hierarchy or sifting through its search results? While you might be able to maintain a semblance of a campaign without answering those lingering questions, you run a high risk of letting potential viewers slip away, never seeing your ad, and wasting money on high CPCs (cost-per-click).

You are hereby invited to learn from my mistakes. I am forgoing the usual basic topics in favor of questions whose answers are more time-consuming and tedious to find. It took me a few weeks to get comfortable with AdWords and figure out these answers myself, but it will only take you a few minutes!

Read on to learn the answers to:

  1. How creative should I be with my ad text?
  2. How do I find out what keywords my competitors are using?
  3. Why has Google’s heartless algorithm condemned my keyword as inactive?
  4. How do I get bolded words in my ad?
  5. What is dynamic keyword insertion, and how do I use it?
  6. What is the difference between a campaign and an ad group?
  7. What is the difference between keywords and placements?

1. How creative should I be with my ad text?

When I was but an AdWords newbie, I held the misconception that creative ads were all that I needed to pull in clicks. Pop psychologists might credit my right brain, starved for attention in the left brain’s home turf (programming! algorithms! programming these algorithms!), for seizing upon the opportunity to design some artistic and imaginative ad copy:

The “Viva la Revolucion” ad was my baby. But it turned out to have a face only a mother could love, as evidenced by the zero people who clicked on it. To the stunned disappointment of my right brain, Google AdWords is just as algorithm-fueled as any of Google’s other products. In fact, Google AdWords runs much like the ubiquitous search engine does, treating your keywords, ads, and landing page much as it treats the 1 trillion pages it crawls while looking for content.

2. How do I find out what keywords my competitors are using?

Google won’t tell you (it’s in their privacy policy). But services such as KeywordSpy will. KeywordSpy not only gives you lists of your competitors’ (and your potential) keywords, but also provides data for each keyword on other metrics, including ROI, price per click, and number of competitors.

3. Why has Google’s heartless algorithm condemned my keyword as inactive?

Sometimes, Google will refuse to show ads for certain keywords unless you pay an absurdly large CPC. The large CPC is meant to discourage you from following any of these bad habits:

  • You dumped a lot of unrelated (or weakly related) keywords into one gigantic ad group. Try making many smaller ad groups, each with its own tightly connected set of keywords. Ideally, every keyword in a given ad group is a synonym for all the other keywords in the group. This also helps tremendously with writing ads that use dynamic keyword insertion (see question #5), since forcing ads to accommodate keywords covering a wide range of topics and parts of speech makes the ads vague. To find keywords that deserve synonym status, use Google Sets. It’s like a thesaurus on steroids.
  • Your keywords, ads, and landing page aren’t “relevant” enough to each other. All members of the Holy Trinity of content (keywords, ads, and landing page) need to draw from the same words to be considered related. Make sure they line up.
  • The cost per click you set for that keyword falls below the minimum. This is the nicer way of saying that you have to spend more money.

4. How do I get bolded words in my ad?

You can’t designate specific words to be bolded (or formatted in any way, for that matter). You can, however, make sure to include keywords (words the user types in that you have selected for your ads) in your ad title and/or body. Just as it bolds keywords in search results, Google bolds keywords in ads. Your keywords do not have to be exact matches with the words in your ad. In the example below, a search for the keyword phrase “report automation” produces an ad that not only bolds “report” and “automation,” but also their variants “reports” and “automating.”

5. What is dynamic keyword insertion, and how do I use it?

This technique (sometimes known as “wildcards”) is how eBay and Target can pull off “Buy _____ now” for every conceivable adjective-noun combination. It allows you to make the same ad apply to multiple keywords. The format is:

The text immediately following the colon (no spaces) is what Google shows when the triggering keyword is too long to fit in the ad. Since I chose that text to be “executive dashboards,” the ad prompted by a too-long keyword would look like this:

Here is the same ad with other keywords swapped in, thanks to dynamic keyword insertion:

You can tweak the capitalization of the keyword with Google’s guidance, in the form of this handy table and more.
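As a mental model, the substitution behaves something like the sketch below. This is a toy model under my own assumptions (the 25-character headline limit and the helper name are mine for illustration); it is not Google's actual logic:

```python
# Toy model of dynamic keyword insertion. The matched keyword is
# title-cased and dropped into the ad template; if the result would
# exceed the headline length limit, the default text supplied after
# the colon is used instead.
HEADLINE_LIMIT = 25  # assumed limit, for illustration only

def insert_keyword(template, keyword, default):
    candidate = template.replace("{KeyWord}", keyword.title())
    if len(candidate) <= HEADLINE_LIMIT:
        return candidate
    return template.replace("{KeyWord}", default)

# A short keyword fits and gets inserted; a long one falls back
# to the default text.
print(insert_keyword("Buy {KeyWord} Now", "dashboards", "Dashboards"))
print(insert_keyword("Buy {KeyWord} Now",
                     "enterprise reporting dashboards", "Dashboards"))
```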

6. What is the difference between a campaign and an ad group?

A campaign is made up of one or more ad groups. Each campaign has one budget (i.e., $10/day) that is shared between all of its ad groups. Each ad group can be customized with different ad variations, keywords, placements, days and times the ad is shown, etc. Therefore, most modifying and experimenting happens on the ad group level.
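The hierarchy can be pictured as a nested structure. The names and numbers below are made up for illustration:

```python
# One campaign, one shared budget, several independently tuned ad groups.
campaign = {
    "name": "Product Launch",   # hypothetical campaign name
    "daily_budget": 10.00,      # one budget, shared by every ad group below
    "ad_groups": [
        {
            "name": "executive dashboards",
            "keywords": ["executive dashboard", "dashboard software"],
        },
        {
            "name": "report automation",
            "keywords": ["report automation", "automating reports"],
        },
    ],
}

# Modifying and experimenting happens per ad group; the budget does not.
print(len(campaign["ad_groups"]))
```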

7. What is the difference between keywords and placements?

Keywords produce what people usually think of when they think of Google AdWords. When a user performs a Google search for a keyword you have selected, your ad appears on the side (or top, if your budget is very generous) of the results page. Placements occur in the “content network,” which is made of individual sites that get paid to show Google ads. If you sign up for a lot of placements, you’ll get a lot of clicks—but only because of the sheer volume of people seeing your ad. In some ways, placements are less targeted than keywords because people who clicked on your ad in the content network aren’t actively searching, as they are when they find your ad through natural searches. There are two types of placements:

  1. Placements You Select. Google’s Placement Tool allows you to browse a gigantic list of sites organized by topic. Any of these sites could have your ad on it. The Placement Tool will also suggest sites and break down your potential audience by demographic.
  2. Placements Google Selects. Google will select sites in the content network based on information from your current campaign. These sites may make up the bulk of your impressions and clicks on the content network and in general (in other words, clicks from Google’s selected placements may outnumber both clicks from your selected placements and clicks from organic searches).

This list is by no means a comprehensive examination of AdWords, but at least now you can consider yourself three weeks wiser and three weeks closer to writing one that is.

Programmatic Google Trends API

Updated October 21, 2009

Yesterday, Google released an update to their popular Google Trends tool. There are improvements over the previous version, but the biggest new feature is a shiny new button that lets you download all your data as a CSV. This is a very cool enhancement. Where Google Trends was once a geeky toy, it can now integrate into analysts’ reports and, with that, edge its way onto managerial desks.

This python module is a quasi-API that makes it easier to authenticate into Google Trends for those who want to squeeze an extra level of functionality out of their data. The advantage of programmatic access is that the data can be automatically trended and merged. It can be snuck into a 9:00 AM daily email to the VP of Marketing so that she knows to ramp up Google AdWords campaigns for some specific keyword. Also, by programmatically pulling multiple reports, it is possible to create a wealth of data not visible in a single report. Using one keyword as a benchmark to merge multiple reports, we can do a meaningful comparison of tens or hundreds of relevant keywords.
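The benchmark-keyword merge boils down to a rescaling step, roughly as sketched below. This is my own illustration (the function name and the toy numbers are made up), not part of pyGTrends:

```python
def merge_reports(reports, benchmark):
    """Merge several Trends reports (dicts of keyword -> index value)
    by rescaling each one so the shared benchmark keyword lines up."""
    base = reports[0][benchmark]
    merged = {}
    for report in reports:
        # Scale every value so this report's benchmark equals the base.
        factor = base / report[benchmark]
        for keyword, value in report.items():
            merged[keyword] = value * factor
    return merged

# Two hypothetical reports that share the benchmark keyword "juice":
merged = merge_reports(
    [{"excel": 100.0, "juice": 50.0}, {"juice": 25.0, "word": 30.0}],
    "juice",
)
print(merged)
```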

To use the pyGTrends, the quasi-Google-Trends-API, you can download the latest version from github.

Here is an example of the most basic report that you can pull down from Google Trends. The connector function needs authentication info, and download_report needs to be passed a list of keywords.

[sourcecode language="python" light="true"]
from pyGTrends import pyGTrends

connector = pyGTrends('google username', 'google password')
connector.download_report(('keyword1', 'keyword2'))
print connector.csv()
[/sourcecode]

You can, however, use pyGTrends to get any slice of data that you can pull down from Google Trends. To see the exact parameters that you should use, go to Google Trends, and navigate to the specific sufficiently-narrow report that you are interested in. Then, right-click on the CSV download, and save the link location. The different parameters should be discernible from the link. The following code downloads a report for banana, bread, and bakery keywords from April 2008, originating from the magnificent nation of Austria, and scaled using fixed scaling (aka the second download link).

[sourcecode language="python" light="true"]
connector.download_report(('banana', 'bread', 'bakery'), date='2008-4', geo='AT', scale=1)
[/sourcecode]
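If you'd rather not eyeball the parameters off the link by hand, the standard library can split them out for you. The link below is hypothetical; substitute the one you actually copied from the CSV download:

```python
from urllib.parse import urlparse, parse_qs

# A made-up Trends CSV download link; use the one you saved from the site.
link = ("http://www.google.com/trends/viz?"
        "q=banana%2Cbread%2Cbakery&date=2008-4&geo=AT&graph=all_csv&scale=1")

# parse_qs returns a dict of lists; keep the first value of each parameter.
params = {name: values[0]
          for name, values in parse_qs(urlparse(link).query).items()}
print(params["date"], params["geo"], params["scale"])
```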

By default, the csv() function downloads the main part of the report, but there are a few additional parts stuck to the bottom of the CSV file. If you are interested in those, pass the section parameter to the csv() function. The following will return the Language section.

[sourcecode language="python" light="true"]
print connector.csv(section='Language')
[/sourcecode]

Full recommended usage includes either the csv.reader or csv.DictReader class from the standard library’s csv module.

[sourcecode language="python" light="true"]
from csv import DictReader
for row in DictReader(connector.csv().split('\n')):
    print row
[/sourcecode]

Google Trends Eye-Candy

Here is a snapshot from the new Google Trends to add some eye-candy to the post:

Mashing Google Analytics With External Data

A couple months ago, we put together a Greasemonkey tool that sucked data out of Google Analytics, and after mining it for trend information, integrated it back into the GA interface. This week’s tool combines and extends Google Analytics with data from an outside source.

Here is a quick alpha of our Greasemonkey integration of external data reporting into Google Analytics for Kampyle, a "feedback analytics service." Click on the images to zoom in.

Clicking on the 'Kampylize' tab queries the Kampyle site in real-time to populate the standard GA data table.

Our friends at Kampyle run a service that allows website owners to put a feedback button on individual pages of their website. All information submitted by the user is uploaded to a central Kampyle database that compiles the feedback with the web page URL and standard statistics such as browser name. Website owners can then access a reporting site complete with summary data tables, graphs, and charts.

Since both sites are web-based reporting suites segmented in a similar fashion (individual website, date, web browser, etc.), they integrate together naturally. There is a lot of value in placing related data side by side, allowing users to get a more holistic picture of web site performance. If you have other ideas of data sources that would fit neatly with Google Analytics, let us know and we’ll consider building the integration.

If you’re interested in technical details, continue to Open Juice to see how this is all accomplished...

How Did We Mash Data into Google Analytics?

This post is the code behind how we mashed external data into Google Analytics.

The first step is to yank reference data from the Google Analytics site to match against Kampyle’s data. We specifically want to gather the individual names of websites (index.html, /index2.html) and the currently selected date range. The cell references to the website names in the table can be found using a neat Javascript Shell popular among Greasemonkey and Javascript developers. I will not go into detail about the Javascript Shell, but by checking out the various child nodes of the table object, we can track down that document.getElementById('f_table_data').childNodes[3].rows[1].cells[1].textContent points at the text in the first cell of the first row. While the syntax looks long, it is just nested HTML accessed programmatically.

For the date, Google Analytics uses a slightly peculiar hybrid system: the date is drawn initially from the URL, but if the date is modified with the Javascript date widget in the upper right-hand corner, it uses that instead. From our end, document.getElementById('f_primaryBegin').value and document.getElementById('f_primaryEnd').value are the date widget values, which only exist once the widget has been used. Pull these two values if they exist, and simply parse the date from the URL otherwise.

The clickable tab we created is essentially the equivalent of a little Greasemonkey button with a few frills that can be created in the standard Greasemonkey fashion. Wherever possible, I use Google-defined layouts for consistency with the site.

Next, we want to send out our reference data to some external server. Greasemonkey has good functionality for pulling data from other sites and servers through the use of the GM_xmlhttpRequest command. A server-end PHP or Django service might be easiest to implement. In this specific example, Kampyle wanted to use the SOAP protocol. While there is an excellent overall SOAP client for javascript by Matteo Casati, this client does not work in a plug and play fashion with Greasemonkey, and needed some modification. For any devoted SOAPers who want to try Greasemonkey, the revised javascript-soap-client code can be found in the attached file. We use the SHA256 encryption function written by Angel Marin and Paul Johnston, but that is accomplished by just copying and pasting the function into our code.

The result comes back as an XML object describing each row in the table, which we parse using native Javascript/Greasemonkey methods and pop back into the table the same way we extracted the individual website names. A neat trick here is to request each row individually, without waiting for its data to come back before requesting the next row from the server. Separate listeners can wait and insert the data at their leisure. This lets the page load faster, and if one data element errors out, the rest of the rows can still load in peace.

You can play around with my code here. This code is released under the BSD License. You won’t be able to run the code verbatim without Kampyle’s compliance, since they have changed the API calls on their server. However, much of it should be very portable to other data sources.

Juiced Google Analytics Python API

Due to the release of an official Google Analytics Data Export API, this module is now deprecated. We have an alternative python module based upon the real analytics API here, and an exploring tool with an automatic code generation capability here.

It is not official. It is not from Google. It is, however, very functional and very here. I present to you pyGAPI, the Juiced Google Analytics Python API. This module allows you to pull information from your incarnation of Google Analytics and employ it programmatically in your reporting code.

Let us use iPython to peek through some code using pyGAPI.

[sourcecode language="python" light="true"]
In [3]: from datetime import date
In [4]: import pyGAPI
In [5]: connector = pyGAPI.pyGAPI(username, password, website_id="1234567")
[/sourcecode]

Here we create a pyGAPI object. Behind the scenes, pyGAPI logs into Google Analytics and downloads an identifier cookie. website_id is optional. If omitted, pyGAPI accesses the first website on the account’s list. To get a list of all the site IDs to which your account has access, run connector.list_sites().

[sourcecode language="python" light="true"]
In [6]: connector.download_report('KeywordsReport', (date(2008,3,10), date(2008,3,31)), limit=5)
[/sourcecode]

Download a report into your pyGAPI object. KeywordsReport is the name of the report. It is followed by a tuple containing the start and end dates in python date format. limit is an optional parameter that specifies the number of entries that pyGAPI should pull down. By default, it will pull in all the entries up to a maximum of 10000. Lowering this number will certainly improve performance. The entries returned are ranked by Visits, so you should get the most significant values of the bunch.

[sourcecode language="python" light="true"]
In [7]: print connector.csv()
Keyword,Visits,Pages/Visit,Avg. Time on Site,% New Visits,Bounce Rate,Visits,Subscribe,Solutions,Goal Conversion Rate,Per Visit Goal Value
juice analytics,356,5.935393258426966,314.061797752809,0.38764044642448425,0.29494380950927734,356,1.0,0.16292135417461395,1.1629213094711304,0.0
excel training,142,1.971830985915493,98.0774647887324,0.908450722694397,0.6901408433914185,142,1.0,0.0211267601698637,1.0211267471313477,0.0
excel charts,77,1.7922077922077921,95.0,0.9090909361839294,0.7792207598686218,77,1.0,0.03896103799343109,1.0389610528945923,0.0
excel skills,72,1.6527777777777777,75.29166666666667,0.9444444179534912,0.7083333134651184,72,1.0,0.0,1.0,0.0
colbert bump,70,1.3142857142857143,113.77142857142857,0.6428571343421936,0.8428571224212646,70,1.0,0.0,1.0,0.0
[/sourcecode]

This function displays your report in a nice Excel-ready CSV format.

[sourcecode language="python" light="true"]
In [8]: print connector.parse_csv_as_dicts(convert_numbers=True)
[{'Avg. Time on Site': 314.06179775280901, 'Per Visit Goal Value': 0.0, 'Bounce Rate': 0.29494380950927734, 'Keyword': 'juice analytics', 'Visits': 356.0, 'Pages/Visit': 5.9353932584269664, 'Subscribe': 1.0, 'Solutions': 0.16292135417461395, '% New Visits': 0.38764044642448425, 'Goal Conversion Rate': 1.1629213094711304},
 {'Avg. Time on Site': 98.077464788732399, 'Per Visit Goal Value': 0.0, 'Bounce Rate': 0.69014084339141846, 'Keyword': 'excel training', 'Visits': 142.0, 'Pages/Visit': 1.971830985915493, 'Subscribe': 1.0, 'Solutions': 0.021126760169863701, '% New Visits': 0.90845072269439697, 'Goal Conversion Rate': 1.0211267471313477},
 {'Avg. Time on Site': 95.0, 'Per Visit Goal Value': 0.0, 'Bounce Rate': 0.77922075986862183, 'Keyword': 'excel charts', 'Visits': 77.0, 'Pages/Visit': 1.7922077922077921, 'Subscribe': 1.0, 'Solutions': 0.038961037993431091, '% New Visits': 0.90909093618392944, 'Goal Conversion Rate': 1.0389610528945923},
 {'Avg. Time on Site': 75.291666666666671, 'Per Visit Goal Value': 0.0, 'Bounce Rate': 0.70833331346511841, 'Keyword': 'excel skills', 'Visits': 72.0, 'Pages/Visit': 1.6527777777777777, 'Subscribe': 1.0, 'Solutions': 0.0, '% New Visits': 0.94444441795349121, 'Goal Conversion Rate': 1.0},
 {'Avg. Time on Site': 113.77142857142857, 'Per Visit Goal Value': 0.0, 'Bounce Rate': 0.84285712242126465, 'Keyword': 'colbert bump', 'Visits': 70.0, 'Pages/Visit': 1.3142857142857143, 'Subscribe': 1.0, 'Solutions': 0.0, '% New Visits': 0.6428571343421936, 'Goal Conversion Rate': 1.0}]
[/sourcecode]

This function goes the extra step and converts the CSV into a dictionary for easier programmatic use. By default, all entries will be returned as python strings. Setting convert_numbers to True, as we did here, will additionally parse the dictionary to turn all numbers into float values.
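Under the hood, the conversion presumably boils down to something like this sketch (my assumption; pyGAPI's actual implementation may differ, and the helper name is mine):

```python
def convert_value(text):
    # Numeric strings become floats; anything else (e.g. the keyword
    # itself) is passed through unchanged.
    try:
        return float(text)
    except ValueError:
        return text

# A made-up row, shaped like one entry of the parsed CSV:
row = {"Keyword": "juice analytics", "Visits": "356", "Bounce Rate": "0.2949"}
converted = {name: convert_value(value) for name, value in row.items()}
print(converted)
```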

[sourcecode language="python" light="true"]
In [9]: print connector.list_reports()
('ReferringSourcesReport', 'SearchEnginesReport', 'AllSourcesReport', 'KeywordsReport', 'CampaignsReport', 'AdVersionsReport', 'TopContentReport', 'ContentByTitleReport', 'ContentDrilldownReport', 'EntrancesReport', 'ExitsReport', 'GeoMapReport', 'LanguagesReport', 'HostnamesReport', 'SpeedsReport')
[/sourcecode]

This gets a list of all the reports that I have successfully tested thus far. All site-specific reports should work. A couple site-section specific reports should be included in the next update of pyGAPI.

Google is great and will release a real API soon, but until then you can download pyGAPI.

Keyword Trends in Google Analytics With Greasemonkey

There is a new post that re-releases the script as a Firefox Plugin. Find it here.

After the warm reception for the first version of our Enhanced Google Analytics, we decided to add some new functionality. (Nothing like a few kind words to keep us in the giving mood.) The first script created a couple of new tables in the Google Analytics interface that highlight recent changes in referral visits. It uses Greasemonkey, an add-on for Firefox that allows a user to insert javascript directly into a webpage.

Our update gives you even more ability to understand the data in Google Analytics:

  • At the suggestion of Avinash Kaushik, the new script works for keyword data, helping you see how organic search traffic is changing. An increase in a keyword may indicate a general change in user interests and/or improved performance on search results.
  • My coworker Pete Skomoroch also suggested that I add the ability to see declines in referrals and new keyword searches.
  • With the help of Paul Irish, the script is now better able to interface with the date widget on the Google Analytics site.

(The original post embedded an interactive demo here: a “Who sent me unusual traffic?” button that slides open sample Keyword Growth and Keyword Decline tables.)

When you click the button, your browser will download some historical data behind the scenes, and display a nice summary of the best and worst performing keywords/referring domains.

Installation Instructions:

Firefox 2.0+

Greasemonkey

googleanalyticsdownloade.user.js

If you don’t already have Firefox, install it. Install Greasemonkey, and do the required Firefox restart. You should see a handsome monkey peeking at you from the bottom right-hand corner of your browser. Open the script file in Firefox, and Greasemonkey should give you an option to install the script.

Afterwards, log into Google Analytics, and navigate to your Referring Sources or Keywords Tab. Click the button.

Configuring the script:

We spent some time trying to find convenient default settings here at Juice Analytics, so the script should work straight out of the box. Some users, however, may find it convenient to alter some of these configurations. To do so, in Firefox, go to Tools=>Greasemonkey=>Manage User Scripts..., select Google Analytics Downloader, and then click Edit in the lower left corner of the window. This should open up the script file in a text editor. If your computer does not have a default text editor configured, you may have to choose one. 'c:\windows\notepad' is a good bet for Windows machines.

This is what you should see:

Code Blurb

The bracket labeled 'keywords?' controls defaults for the Keywords page, and correspondingly, 'referring_sources?' controls the Referring Sources page.

To change the settings, simply change the corresponding variable to your preferred default. Make sure to refresh your Google Analytics webpage, if you have it open, so the new settings are loaded.

Now for the nitty gritty configuration details:

  • display_limit: This controls the maximum entries that each table will contain. This may be useful for large, sprawling sites.
  • growth_tolerance: This is the percentage growth threshold. Changing it to .10, for example, will catch everything that has grown by 10%, as opposed to the defaults of 50% (referring sites) and 20% (keywords).
  • minimum_number_elements: This is a significance benchmark that can be used to limit what is displayed. By default, only keywords with at least 10 entries are shown. Referring Sites does not have a minimum by default, but one can be set if desired.
  • limit: Limit is more of an internal parameter that determines how many entries should be downloaded from Google in order to get the results that are visible here on the page. Lower the limit to increase speed. If the limit is set to a very high number, you will get the largest result set, but you will have to sit around for a while for the results to load. Since the results are downloaded ordered by volume, raising the limit from the default numbers will not actually give more significant results. You will simply get more of the smaller results, such as keywords with only 1 hit.
  • look_back: This is a very important parameter. The script uses the date displayed upon your Google Analytics page to determine the full range that you want to consider in your results, but ’look_back’ determines how many of those days are used for the significance test. So, say the range you have displayed in Google is March 23 - April 22 and your look_back is 7 days. The script will compare the average referrals for a given keyword from April 16-22 to the average from March 23-April 15, and will return the keyword only if the recent average is 20% higher than the rest of the time period. Thus, if you want to increase the total range of the data, change the dates on the actual webpage. Change ’look_back’ only if you want to change the period of significance.
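The date logic described in the look_back bullet can be sketched as follows. This is my own paraphrase of the script's behavior in Python, not its actual Javascript source:

```python
from datetime import date, timedelta

def split_ranges(start, end, look_back):
    # The last `look_back` days of the displayed range form the recent
    # window; everything before them is the baseline.
    recent_start = end - timedelta(days=look_back - 1)
    baseline = (start, recent_start - timedelta(days=1))
    recent = (recent_start, end)
    return baseline, recent

def is_significant(baseline_avg, recent_avg, growth_tolerance=0.20):
    # Flag a keyword when its recent average beats the baseline
    # by the configured tolerance.
    return recent_avg >= baseline_avg * (1 + growth_tolerance)

# The example from the text: March 23 - April 22 with a 7-day look_back.
baseline, recent = split_ranges(date(2008, 3, 23), date(2008, 4, 22), 7)
print(baseline, recent)
```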

Happy analyzing!

Enhancing Google Analytics Using Greasemonkey

There is a new post that re-releases the script as a Firefox Plugin. Find it here.

My boss Zach has a problem. Every four hours the craving strikes him. No matter where he is, he pulls out his shiny Macbook Pro and navigates to Juice’s Google Analytics site. He pulls up the list of referrers to our site and meticulously searches for new domains. He has a freakish ability to pick out IP addresses that have never linked to us before. Even so, there had to be a better way.

I wondered whether Greasemonkey might be able to help. Greasemonkey is an extension for Firefox that allows users to install custom javascript that runs when they visit a specific website. These scripts can add a delete button to Gmail, automatically display lyrics for your YouTube music video, or do pretty much anything else you might want to enhance the functionality of a website.

After poking around the subtleties of the Google Analytics interface, I came up with a little script that can identify the new referrals that Zach so desperately craves. When you navigate to the "Referring Sites" section of Google Analytics, the script adds the following button to the interface.

Google Analytics Button

Pushing the button downloads all the referrer data for the date range displayed in Google Analytics, as well as a similar set of data for the range up to, but not including, the last three days. The difference between the two data sets is used to calculate all of the results. The specific number of days can be changed by editing the first line of the script. Greasemonkey then displays the results in two tables above the original Referrer table. (Greasemonkey works entirely within your browser shell, so your data should be quite secure.)

Google Analytics Data

The first table shows any sites that have displayed more than a 50% increase in visits over the last 3 days as compared to the rest of the time range. The second shows all new recent sites that do not appear at all more than 3 days ago. This can be quite useful to anyone who, like Zach, absolutely needs to know about any new and exciting inbound links.
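The two tables amount to a simple diff of the two date ranges, along these lines (a paraphrase of the behavior with made-up site names; not the script's actual Javascript):

```python
def find_movers(current, baseline, threshold=0.50):
    # current / baseline map referring site -> visit count for the two
    # date ranges. Growers beat their baseline by `threshold`; new sites
    # have no baseline entry at all.
    growers = {site: visits for site, visits in current.items()
               if site in baseline
               and visits >= baseline[site] * (1 + threshold)}
    new_sites = {site: visits for site, visits in current.items()
                 if site not in baseline}
    return growers, new_sites

growers, new_sites = find_movers(
    {"a.example": 16, "b.example": 10, "c.example": 5},
    {"a.example": 10, "b.example": 10},
)
print(growers, new_sites)
```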

Installation Instructions:

Firefox 2.0+

Greasemonkey

googleanalyticsdownloade.user.js

If you don’t already have Firefox, install it. Install Greasemonkey, and do the required Firefox restart. You should see a handsome monkey peeking at you from the bottom right-hand corner of your browser. Open the script file in Firefox, and Greasemonkey should give you an option to install the script.

Afterwards, log into Google Analytics, and navigate to your Referring Sources Tab. Click the button.

Analytics Roundup: Tips for showing, sharing, communicating

Developer’s Guide - Google Chart API - Google Code
Beautiful stuff, particularly the Venn diagram.

Align Journal - BI Worst Practices
We often see articles on BI "Best Practices"; here is an article telling us what NOT to do.

flot - Google Code
Attractive Javascript plotting for jQuery.

ongoing · On Communication
Interesting blog post about how different forms of communication rank for immediacy, lifespan, and audience reached.

The Excel Magician: 70+ Excel Tips and Shortcuts to help you make Excel Magic : Codswallop

SlideShare
Source for presentation ideas.

Mapping Phone Data in Four Easy Steps

Have you run into this problem? You have a list of phone numbers and associated values that would be best shown geographically to reveal patterns, but there isn’t a clear way to put the data on a map. Maybe you’d like to see a map of customer service calls by call duration or inbound sales by average order size.

I wanted to share how to MacGyver a solution with a piece of twine, bubble gum, Excel, and a free online map tool. To me, this is a nice testament to the simple but powerful data visualizations that can be accomplished without programming skills or expensive applications.

1. Pull out area codes

First I pulled the area codes from my list of phone numbers using the formula below. This simply checks if the phone number starts with 1, then grabs the appropriate three digits for the area code.

=VALUE(IF(LEFT(E7,1)="1",MID(E7,2,3),MID(E7,1,3)))
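The same extraction in Python, for anyone automating this outside Excel. This is a slightly more forgiving version of the formula, since it also strips punctuation first; the helper name is mine:

```python
def area_code(phone):
    # Keep only the digits, drop an optional leading country code "1",
    # then take the first three digits as the area code.
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits[:3]

print(area_code("1-404-555-0123"))
```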


2. Convert area codes into states

For my purposes, mapping the phone numbers by state was sufficient. Ideally, we would map the phone numbers to precise latitude and longitude coordinates by doing a reverse lookup of addresses then using the Excel geocoding tool.

First I needed a lookup table that could link my list of area codes to states. I wasn’t able to track down a good data table, so I grabbed the data from All Area Codes and cleaned it up. Here is a lookup table of area codes by state.

An aside: I have a pet peeve with people who sell data that feels like it should be publicly available. You’ll run across these businesses when looking for basic information about ZIP codes, MSAs, or area codes. Here is an example of one of these parasitic businesses.

Zip code product

3. Create your summary data set

I used a pivot table to summarize metrics by state.

4. Create colorized map of the US

Our friend Ducky Sherwood has generously put together an online tool called Mapeteria that will generate a colorized overlay of US states. In Ducky’s words: "Want to make a choropleth thematic map (i.e. coloured based on your data) for Canadian provinces, U.S. states, or French départements?" This overlay can be viewed in either Google Maps or Google Earth.

Here’s where it gets a little tricky. You will need to provide Mapeteria with a URL to a properly structured CSV file. Posting a CSV file to a web server isn’t trivial if you aren’t running your own web site. I found one free service called FileDEN that did the job (other suggestions?). Beware all the advertising—and in all likelihood they immediately sold my e-mail address at registration. Nevertheless, you can upload a file here and it will give you a URL which can be used to create your map.

Here’s an example of the results:

State Map

Google Earth/Google Maps Mashups

Yesterday, Google rolled out new mapping features for Google Earth and Google Maps. Many of these features are behind the scenes in the APIs, but there are great new capabilities that you will start to see. One thing I’m excited about is that KML—Google Earth’s format for building sophisticated map overlays—has come to Google Maps.

Google demoed this at their Geo Developer day yesterday using one of our Google Earth overlays that shows US census bureau data by county mapped as a heatmap. It looks like this.

Counties are displayed in a list on the left. When you click on a county, you get a nice popup showing statistics for that county.

There are a few limitations. Large KML files don’t load in Google Maps, and medium-sized files load very slowly; it seems Google is parsing the KML using Javascript.

The mapping toolkits provided by Google Maps, Google Earth, and Yahoo Maps beta are well on their way to becoming important business tools once developers figure out how to wire in your enterprise data.

Without further ado, here are some US census data maps for you to explore in Google Maps.

Population Density

Lighter is higher population density (white is 800+ people per square mile); darker is lower population density (black is 2 or fewer people per square mile).

Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming

Median Age

Lighter is older median age (white is 46.0 years); darker is younger median age (black is 29.0 years).

Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming

Male/Female Ratio

Lighter means more men than women (white is 55% men); darker means more women than men (black is 45% men).

Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, District of Columbia, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming