
Dear Googlers,

I leave this message for you in case you encounter a similar obstacle to your productivity. Internet Explorer 6, 7, and 8 do not like uppercase letters (or hyphens) in the object ID when you embed a SWF into the page and then plan to call into it through ExternalInterface. They do not like it with SWFObject or with the simple old-fashioned object tag. The solution was elusive, but there is some help on the ghettocooler.net Treasure Trove.

The following code will work in Firefox and Safari, but will mysteriously have a problem in Internet Explorer:

<script language="javascript">
function submitLemon() {
   var swf = document.getElementById("myContent");
   swf.iCanHasParameter('lemon');
}

swfobject.embedSWF("movie.swf", "myContent", "800", 
                   "100%", "9.0.124", null, {}, {bgcolor: '#ffffff'});
</script>

It will however work if you replace the upper case letter with its lowercase equivalent:

<script language="javascript">
function submitLemon() {
   var swf = document.getElementById("mycontent");
   swf.iCanHasParameter('lemon');
}

swfobject.embedSWF("movie.swf", "mycontent", "800", 
                    "100%", "9.0.124", null, {}, {bgcolor: '#ffffff'});
</script>



To fly is to be frustrated. If you’ve been traveling for long, you no doubt have your opinions about which airlines and airports are the biggest sources of suffering. Whether it is weather delays, getting stuck on the tarmac due to air traffic, maintenance problems, or missing a connection, it all feels outside of your control.

But a little knowledge can help. The Bureau of Transportation Statistics has maintained a giant database of air traffic information covering decades of flights: point of origin, flight times, flight delays, type of delay, etc. It is 72 gigabytes of data, just the type of data that needs some visualization. JuiceKit™ to the rescue.

We’ve put together a pair of visualizations that can make this data accessible to your average non-data-monkey traveler:

  • Treemap uses size to represent the number of flights by airline and by point of origin. The color is used to show delay time — we’ve got all sorts of delay metrics, each of which tells an interesting story.

Airline Treemap

  • US map uses size to represent the number of flights and the color to display delay time. Filtering by airline yields additional details.

Airline US Map

There are some interesting insights that pop out when you build a visualization of this data.

  • The different airline strategies are quickly apparent in the treemap. Hub-and-spoke airlines (Delta, Continental) have one or two dominant boxes (origin location), surrounded by lots of small locations. A point-to-point airline like Southwest looks entirely different with lots of similarly sized boxes.

  • Flipping between delay types uncovers some unexpected results. For example, you might expect weather delays to be heavily correlated by airport. The data shows something a little different: Comair appears to be abnormally impacted by weather delays — as if a dark cloud chases around their airplanes. While Comair might be overstating weather delay data to prevent paying for meal vouchers, a more reasonable Wikipedia investigation suggests that Comair flies smaller weather-susceptible Bombardier airplanes.

A few details about this demo for our technical audience:

For those of you following JuiceKit™ development, this is a demo of some of the newer features available in our open source JuiceKit™ 1.2 distribution, and some of the features that will be coming in version 1.3. Treemap styling is now elegant and crisp, allows for white borders, and fixes a couple of rendering bugs. There is a new tree-level depth feature that can make it easier to navigate treemaps with lots of layers. The airports map demonstrates a geographic layout built using the JuiceKit™ GeoLayout and Flare components. A major improvement demonstrated by the airline-selector dropdown is the ability to keep nodes consistent between data reloads. This allows us to animate the nodes even though they are generated by our new LiveQuery component.




Update: Thanks for checking this out! However, since Google created and now maintains an updated version of a data feed explorer, we have taken the Vasco de GAPI application offline.

Are you ready to explore the Google Analytics API?

At Juice, we were very excited about the public release of the Google Analytics Data Export API. Our product Concentrate has been running on a hackish home-brew Google Analytics export tool since its release last November, and we were happy to be able to relaunch as a Customer Example of the Google Analytics Data Export API.

Today, we are releasing a new, free tool called Vasco de GAPI. Vasco is a web-based tool for exploring the API, for downloading complex slices of data through it, and even for automatically generating code that lets coders easily replicate the API calls in question. Instead of describing it in more detail, I am just going to demo it.

I am going to start with a relatively rare but curious piece of Google Analytics functionality. I keep track of who wrote each blog post using a Google Analytics user-defined setting that is set to the author’s name for each specific post. Slicing our blog by author can be cool for me as an employee, so that I can brag during my yearly review about how many visitors I bring in or what natural search visits we get for free as a result of my posting. For the demo, I’m going to discover the natural keywords that bring traffic to my blog posts on the website.

Let’s get started.

The first step is to authenticate using Google’s OAuth system.

I select ga:keyword as a dimension.

ga:pageviews is the metric I am interested in. The results will automatically get sorted by the first metric, so I do not need to explicitly specify a sort value.

I set ga:userDefinedValue as a filter, and filter it to saluryasev, and select this last week as a reference point.

Here is the list of parameters that Vasco de GAPI is passing to Google.
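For readers who prefer text to screenshots, here is a rough sketch of how a query like the one above might be assembled in Python. It assumes the Data Export API's query parameter names (ids, dimensions, metrics, filters, start-date, end-date) and its feed URL as I remember them; the profile ID and the dates are placeholders, not values taken from Vasco. No sort parameter is sent, since the results are sorted by the first metric anyway.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Data Export API data feed (not taken from Vasco).
FEED_URL = "https://www.google.com/analytics/feeds/data"

def build_params(profile_id, start_date, end_date, author):
    """Collect the query parameters for keyword pageviews filtered by author."""
    return {
        "ids": "ga:%s" % profile_id,          # placeholder profile ID
        "dimensions": "ga:keyword",
        "metrics": "ga:pageviews",
        "filters": "ga:userDefinedValue==%s" % author,
        "start-date": start_date,             # placeholder date
        "end-date": end_date,                 # placeholder date
    }

params = build_params("1234567", "2009-05-01", "2009-05-07", "saluryasev")
url = FEED_URL + "?" + urlencode(params)
print(url)
```

The request would still need to carry the authentication token from the first step, which this sketch leaves out.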

What are my results?

It turns out that of all my posts, the Google Trends API that I put out about a year ago drives the most natural traffic to our site. Hopefully, this will change with a few more blog posts, but this is still rather interesting data. I could target that specific audience with something Google-trendy. On an unrelated note, a slap to my face was that Zach’s name sent fifteen users to my blogposts. Go figure. Sixteen users searched on my last name, and were probably looking for my more popular father.

To get at the rest of the data, I can click the download link at the bottom of the page or, for developers, another link downloads working code that will replicate this exact pull.

Vasco runs using an open source Python gdata wrapper for the API that can be downloaded here. This wrapper is powerful, and I will write another blogpost about it next week. It is plugged into the Google gdata module, and as such allows all forms of authentication available to gdata users, including OAuth, AuthSub, and clientside.

Hopefully, Vasco de GAPI can help all other potential explorers sail smoothly through the API. When it comes to data, Google is just a great company. They have had powerful APIs for most of their major services for years, and while the Analytics API is a latecomer, it actually is more powerful than the analytics interface itself. This sort of openness is something to be envied by all other analytics and web companies in the market.

By the way, please let me know if the explorer theme works well. It was a lot of fun working on a project with a slightly esoteric approach.




There is new life in the tool that shows change in Google Analytics. A year after releasing our Greasemonkey script, we are pleased to release an updated version of the Enhanced Google Analytics script as a free Firefox Plugin. For those already using the older Greasemonkey script, you can skip ahead to the What’s new? and How do I get this plugin? sections of the page. For the rest, you may be wondering: Why does my Google Analytics need change?


Change, and why it is important

When I first started working at Juice Analytics, my boss Zach showed me a part of his daily Google Analytics routine. He would open up the Referring Sites page and glance at all of our 942 referrers. Using his superior intellect and capacity for remembering random URLs, Zach would discover interesting deviations in the traffic from sites linking to our blog.

Our top referrers looked more or less similar day to day. Even once you get past the more recognizable top sites such as Twitter and Google, the various somethingblog.com pages, without context, often look a lot like somethingelseblog.com. To top it off, most of the information is not even specifically interesting. Our chartchooser.juiceanalytics.com domain sends us consistent regular referrals, but so what? Day to day, I don’t even really care about Google or Twitter unless something changes. With change, I know whether someone has posted something new about me, sending valuable traffic. A good read on the topic is Avinash’s rant about “actionable analytics”.
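The idea of reporting only what changed can be sketched in a few lines of Python. This is an illustration rather than the plugin's actual logic: the function name, the made-up visit counts, and the 50% default threshold are all assumptions chosen for the example.

```python
def referrer_changes(previous, current, threshold=0.5):
    """Return referrers whose visits changed by at least `threshold`.

    `previous` and `current` map referrer -> visits for two periods.
    Referrers that appear only in `current` are reported with None,
    since there is no baseline to compare against.
    """
    changes = {}
    for site in set(previous) | set(current):
        before = previous.get(site, 0)
        after = current.get(site, 0)
        if before == 0:
            if after > 0:
                changes[site] = None
            continue
        change = (after - before) / before
        if abs(change) >= threshold:
            changes[site] = change
    return changes

# Made-up visit counts, for illustration only.
last_week = {"google.com": 400, "twitter.com": 120, "somethingblog.com": 10}
this_week = {"google.com": 410, "twitter.com": 125, "somethingblog.com": 45,
             "somethingelseblog.com": 30}

for site, change in sorted(referrer_changes(last_week, this_week).items()):
    label = "new" if change is None else "%+.0f%%" % (change * 100)
    print(site, label)
```

Google and Twitter barely move, so they drop out of the list; the two blogs are what is left to look at.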

Our Firefox plugin is designed to allow analysts to get more action out of what changed in the Referring Sites and Keyword Reports. Here are a couple examples of the plugin in action from our Google Analytics account:


What’s new?

Our focus for this release has been to improve functionality, to reduce the barrier to entry for new users, and to allow automatic updates for the plugin. The new version of the script works nearly instantaneously, and the installation involves only two clicks (in contrast to the 7 clicks of the Greasemonkey version). As a Firefox plugin, updates are now automatic and require no reinstall. Keyword sensitivity has been raised to 50% for consistency. As a slight bonus, the design and layout of the form and buttons is now sleeker and the table stands out in a pretty Google blue.

Greasemonkey itself is no longer required for the plugin, but you may want to keep it around for any of the other cool scripts available from the community. If you ever find yourself wishing that something about the web looked different, acted different or had different functionality, there may be a Greasemonkey script to ease your pain.


How do I get this plugin?

First, you need Firefox 2.0+.

If you are a user of the equivalent older Greasemonkey version of this script, you may want to go ahead and uninstall it. Go to Tools=>Greasemonkey=>Manage User Scripts…, select Google Analytics Downloader, and uncheck the Enabled box.

If you never had the script installed, or once you have removed it, simply click here to go to the Mozilla add-on site, select the checkbox, and click the button. Once installed, navigate to Google Analytics, go to either the Referring Sites or Keyword pages, and click the blue button.

Happy analyzing!




Over the past couple of months, I have delved into the world of jQuery, which, for the uninitiated, is the framework for developing JavaScript in today’s world of web development. jQuery shortens or eliminates many of the more tedious aspects of working with the HTML DOM, while allowing elegant access to the fancier effects of JavaScript. One of the other coolest parts of this framework is the growing number of plugins built by the community.

The plugin that I want to bring up is the history plugin by Klaus Hartl, the original of which can be found here.
Javascript usually has problems with the back button, but here it is integrated smoothly, and without forcing the page to do an ugly reload. As an added bonus, the plugin also allows the creation of links that are tied to an internal javascript state. Basically, javascript would now have the abilities of a full-blown 90s style html-only webpage, with the smoothness and efficiency of not requiring any page reloads. A good example of a similar javascript history module in action is Gmail, where the back button works smoothly, and you are able to copy and paste the url to a different window.

My slightly modified version of the script can be found here. This potentially interesting enhancement allows the use of a regular expression when defining the javascript internals for the history plugin. It is useful when the site has additional user-specific variables, such as an id or an object name or number, and the programmer has no desire to explicitly code every possibly imaginable scenario. Regular expressions are very powerful tools.

To use the modified version of the module, use a snippet like this one, substituting a regex for where the history module would normally require a string:

    $.ui.history('add', '(\\d+)/analyze/queries', function(url, pg) {
        set_phrasegroup(pg);
        //do something something
    });

Note the slight irregularity: because the pattern is passed as a JavaScript string, you need to escape each backslash with another backslash (\\d rather than \d).

The first parameter, url, is the entire string that the user entered. It would look something like 3/analyze/queries. pg, and any further parameters, receive the matched groups within the regex. In this case, we only have one, (\d+), a number.
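The same pattern-to-callback idea can be sketched in Python for anyone who wants to see the mechanics in isolation. This is not the plugin's code: add_route and dispatch are hypothetical names, and requiring the whole fragment to match is an assumption on my part. A raw string (r"...") sidesteps the double-backslash issue mentioned above.

```python
import re

routes = []

def add_route(pattern, handler):
    """Register a handler for fragments that fully match the regex."""
    routes.append((re.compile(pattern), handler))

def dispatch(fragment):
    """Call the first matching handler with the fragment and its groups."""
    for regex, handler in routes:
        match = regex.fullmatch(fragment)
        if match:
            # The whole fragment goes first, then one argument per group.
            return handler(fragment, *match.groups())
    return None

def analyze_queries(url, pg):
    return "phrase group %s from %s" % (pg, url)

add_route(r"(\d+)/analyze/queries", analyze_queries)
print(dispatch("3/analyze/queries"))
```

A pattern with two groups would simply call its handler with two extra arguments.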




Updated October 21, 2009

Yesterday, Google released an update to their popular Google Trends tool. There are improvements over the previous version, but the biggest new feature is a new shiny button that lets you download all your data in the format of a CSV. This is a very cool enhancement. Where Google Trends was a geeky toy, it now takes the leap to integrate into analysts’ reports and with that, edge its way onto managerial desks.

This Python module is a quasi-API to make it easier to authenticate into Google Trends for those who want to squeeze the extra level of functionality out of their data. The advantage of programmatic access is that the data can be automatically trended and merged. It can be snuck into a 9:00 AM daily email to the VP of Marketing so that she knows to ramp up Google AdWords campaigns for some specific keyword. Also, by programmatically pulling multiple reports, it is possible to create a wealth of data not visible in a single report. Using one keyword as a benchmark to merge multiple reports, we can do a meaningful comparison on tens or hundreds of relevant keywords.
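To make the benchmark idea concrete, here is a minimal sketch. It assumes that the values inside one report are only meaningful relative to each other, so each report is rescaled until the shared benchmark keyword equals 1.0. The function name and all the numbers are invented for illustration; this is not part of pyGTrends.

```python
def merge_reports(reports, benchmark):
    """Rescale each report so the benchmark keyword equals 1.0, then merge.

    Each report maps keyword -> value and must include the benchmark.
    """
    merged = {}
    for report in reports:
        base = report[benchmark]
        if base == 0:
            raise ValueError("benchmark has no volume in one of the reports")
        for keyword, value in report.items():
            merged[keyword] = value / base
    return merged

# Invented values standing in for two separately downloaded reports.
report_a = {"bread": 2.0, "banana": 1.0, "bakery": 0.5}
report_b = {"bread": 4.0, "muffin": 1.0, "scone": 0.5}

merged = merge_reports([report_a, report_b], benchmark="bread")
for keyword, value in sorted(merged.items()):
    print(keyword, value)
```

After rescaling, bakery and muffin can be compared directly even though they never appeared in the same report.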

To use pyGTrends, the quasi-Google-Trends-API, you can download the latest version from github.

Here is an example of the most basic report that you can pull down from Google Trends. The connector needs authentication info, and download_report needs to be passed a tuple of keywords.

from pyGTrends import pyGTrends

connector = pyGTrends('google username','google password')
connector.download_report(('keyword1', 'keyword2'))
print connector.csv()

You can, however, use pyGTrends to get any slice of data that you can pull down from Google Trends. To see the exact parameters that you should use, go to Google Trends, and navigate to the specific sufficiently-narrow report that you are interested in. Then, right-click on the CSV download, and save the link location. The different parameters should be discernible from the link. The following code downloads a report for banana, bread, and bakery keywords from April 2008, originating from the magnificent nation of Austria, and scaled using fixed scaling (aka the second download link).

connector.download_report(('banana', 'bread', 'bakery'),
                          date='2008-4',
                          geo='AT',
                          scale=1)

By default, the csv() function downloads the main part of the report, but there are a few additional parts stuck to the bottom of the CSV file. If you are interested in those, pass the section parameter to the csv() function. The following will return the Language section.

print connector.csv(section='Language')

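Full recommended usage includes parsing the result with either csv.reader or csv.DictReader from Python's csv module.

from csv import DictReader
# Iterate over the parsed rows; each row is a dict keyed by column name.
for row in DictReader(connector.csv().split('\n')):
    print row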
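On Python 3, where print is a function, the same idea might look like the sketch below. The sample string is invented and only stands in for whatever connector.csv() returns; the real report's columns may well differ.

```python
import csv

# Invented sample standing in for the string returned by connector.csv().
sample = "Date,banana,bread\n2008-04-06,1.2,3.4\n2008-04-13,1.1,3.6\n"

# DictReader accepts any iterable of lines, so splitlines() is enough.
rows = list(csv.DictReader(sample.splitlines()))

for row in rows:
    # Values arrive as strings; convert the ones you need.
    print(row["Date"], float(row["bread"]))
```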

Here is a snapshot from the new Google Trends to add some eye-candy to the post:
Google Trends Eye-Candy




A couple months ago, we put together a Greasemonkey tool that sucked data out of Google Analytics, and after mining it for trend information, integrated it back into the GA interface. This week’s tool combines and extends Google Analytics with data from an outside source.

Here is a quick alpha of our Greasemonkey integration of external data reporting into Google Analytics for Kampyle, a “feedback analytics service.” Click on the images to zoom in.

Clicking on the ’Kampylize’ tab queries the Kampyle site in real-time to populate the standard GA data table.


Our friends at Kampyle run a service that allows website owners to put a feedback button on individual pages of their website. All information submitted by the user is uploaded to a central Kampyle database that compiles the user feedback with web page url and standard internet statistics such as the name of the browser. Website owners can access a server-end service that consists of a reporting site complete with summary data tables, graphs, and charts.

Since both sites are web-based reporting suites segmented in a similar fashion (individual website, date, web browser, etc.), they integrate together naturally. There is a lot of value in placing related data side by side, allowing users to get a more holistic picture of web site performance. If you have other ideas of data sources that would fit neatly with Google Analytics, let us know and we’ll consider building the integration.

If you’re interested in technical details, continue to Open Juice to see how this is all accomplished…




This post is the code behind how we mashed external data into Google Analytics.

The first step is to yank reference data from the Google Analytics site to match against Kampyle’s data. We specifically want to gather individual names of websites (index.html, /index2.html), and the currently selected date range. The cell references to the website names in the table can be found using a neat Javascript Shell popular among Greasemonkey and Javascript developers. I will not go into detail about the Javascript Shell, but by checking out the various child nodes for the table object we can track down that document.getElementById('f_table_data').childNodes[3].rows[1].cells[1].textContent points at the text in the first cell of the first row. While the syntax looks long, it is just nested HTML expressed in a more elegant programmatic fashion.

For the date, Google Analytics uses a slightly peculiar hybrid system where the date is drawn initially from the URL, but if the date is modified with the JavaScript date tool in the upper right hand corner, it uses that instead. From our end, document.getElementById('f_primaryBegin').value and document.getElementById('f_primaryEnd').value are the date tool values, which only start existing once the date tool is used. Pull these two values if they exist, and simply parse the date from the URL otherwise.
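The fallback logic is easier to see outside the browser. Here is a small Python sketch of it, under two loud assumptions that are not confirmed anywhere in this post: that the URL carries the range in a parameter like pdr=20080310-20080331, and that the date tool fields use the same YYYYMMDD format. The function names are made up for the example.

```python
import re
from datetime import date

def parse_yyyymmdd(text):
    """Turn a string like '20080310' into a date (assumed format)."""
    return date(int(text[0:4]), int(text[4:6]), int(text[6:8]))

def selected_range(url, form_begin=None, form_end=None):
    """Prefer the date tool's values; otherwise parse the range from the URL."""
    if form_begin and form_end:
        return parse_yyyymmdd(form_begin), parse_yyyymmdd(form_end)
    # Hypothetical URL parameter name; adjust to whatever the page really uses.
    match = re.search(r"[?&]pdr=(\d{8})-(\d{8})", url)
    if match is None:
        return None
    return parse_yyyymmdd(match.group(1)), parse_yyyymmdd(match.group(2))

url = "https://www.google.com/analytics/reporting/?id=1234567&pdr=20080310-20080331"
print(selected_range(url))
print(selected_range(url, "20080401", "20080415"))
```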

The clickable tab we created is essentially the equivalent of a little Greasemonkey button with a few frills that can be created in the standard Greasemonkey fashion. Wherever possible, I use Google-defined layouts for consistency with the site.

Next, we want to send out our reference data to some external server. Greasemonkey has good functionality for pulling data from other sites and servers through the use of the GM_xmlhttpRequest command. A server-end PHP or Django service might be easiest to implement. In this specific example, Kampyle wanted to use the SOAP protocol. While there is an excellent overall SOAP client for JavaScript by Matteo Casati, this client does not work in a plug-and-play fashion with Greasemonkey, and needed some modification. For any devoted SOAPers who want to try Greasemonkey, the revised javascript-soap-client code can be found in the attached file. We use the SHA-256 hash function written by Angel Marin and Paul Johnston, but that is accomplished by just copying and pasting the function into our code.

The result comes back in the form of an xml object describing each row in the table, which we parse using native Javascript/Greasemonkey methods, and pop back into the table in the way that we extracted the individual website names. A neat trick here is to call each individual row individually, and not to wait for the data to come back before calling the next row from the server. Separate listeners can wait and insert the data at their leisure. This allows our page to load up faster, and in case there is an error with one data element, it could potentially allow the rest of the rows to load in peace.
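The request-every-row-at-once trick is not specific to Greasemonkey. As a sketch of the same pattern in Python, the example below submits one task per row to a thread pool and handles each result as it arrives, so a failing row does not hold up the others. fetch_row is a stand-in that simulates the remote call; nothing here talks to Kampyle or reflects their API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_row(page):
    """Stand-in for one remote call; a real version would query the server."""
    if page == "/broken.html":
        raise IOError("simulated failure for %s" % page)
    return {"page": page, "status": "ok"}

def fetch_all(pages):
    """Request every row at once and collect results as they arrive."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(fetch_row, page): page for page in pages}
        for future in as_completed(futures):
            page = futures[future]
            try:
                results[page] = future.result()
            except IOError as exc:
                # One bad row is recorded and the rest carry on loading.
                errors[page] = str(exc)
    return results, errors

results, errors = fetch_all(["/index.html", "/index2.html", "/broken.html"])
print(sorted(results), sorted(errors))
```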

You can play around with my code here. This code is released under the BSD License. You won’t be able to run the code verbatim without Kampyle’s compliance, since they have changed the API calls on their server. However, much of it should be very portable to other data sources.




Last week, we shared a rendition of a Tufte graphic using just a few lines of Nodebox code. As our commenters pointed out, Python is great, but it may not be every business analyst’s carnal desire to learn a programming language just to generate some nifty graphs. I spent some time to push Chris’s Nodebox rendition into a PIL-based Windows tool that can generate the same sort of comparison graph from an Excel file on the fly.

The result is The Comparison Chart Generator 1.0. The installation instructions are relatively simple. Unzip the zip file, and run comparisionchartgenerator.exe.

Alternatively, we have a new Excel chart that creates the same effect using only Excel functionality. Download the Excel Tufte Line Chart here.

If you are using the Chart Generator, start with some data in an Excel (xls) or Comma Delimited (csv) format. The data for this graph has to be contained within the first sheet starting with cell A1, as in the following picture.

Excel Dialog

Select an input file. There are a couple example files bundled with the download.

Open File Dialog

After selecting a file, you’ll be prompted to modify a few of the basic options available for the chart.

Options Dialog

Finally, save the result as a jpeg.

Save File Dialog

Here is the same image found in Tufte’s textbook processed using the Comparison Chart Generator. It is generated using the csv example file bundled with the download.

Tufte-esque Chart by Comparison Chart Generator

Those of us who have undergone lasik eye-improvement surgery may still prefer the sharp crisp Nodebox results, but for the rest of us, this image looks pretty good. Let us know if this tool is useful. If there is enough of a positive response, we may consider expanding functionality for other fancy Tufte-esque charts.

If you do prefer Nodebox, I have an updated script here. This pushes the script up to 20 lines of code or so, but the extra 9 lines allow the labels to push themselves apart on their own. If you want to look at the source code for the Windows program, you can get it here. I used py2exe to compile it into an executable. The code, however, has not been thoroughly commented or cleaned as of yet, so edit it at your own risk.




Due to the release of an official Google Analytics Data Export API, this module is now deprecated. We have an alternative python module based upon the real analytics API here, and an exploring tool with an automatic code generation capability here.

It is not official. It is not from Google. It is, however, very functional and very here. I present to you pyGAPI, the Juiced Google Analytics Python API. This module allows you to pull information from your incarnation of Google Analytics and use it programmatically in your reporting code.

Let us use iPython to peek through some code using pyGAPI.

In [3]: from datetime import date
In [4]: import pyGAPI
In [5]: connector = pyGAPI.pyGAPI(username, password, website_id="1234567")

Here we create a pyGAPI object. Behind the scenes, pyGAPI logs into Google Analytics, and downloads an identifier cookie. website_id is optional. If omitted, pyGAPI accesses the first website on the account’s list. To get a list of all the site IDs to which your account has access, run the function connector.list_sites().

In [6]: connector.download_report('KeywordsReport', 
          (date(2008,3,10), date(2008,3,31)), 
          limit=5)

Download a report into your pyGAPI object. KeywordsReport is the name of the report. It is followed by a tuple containing the start and end dates in python date format. limit is an optional parameter that specifies the number of entries that pyGAPI should pull down. By default, it will pull in all the entries up to a maximum of 10000. Lowering this number will certainly improve performance. The entries returned are ranked by Visits, so you should get the most significant values of the bunch.

In [7]: print connector.csv()
Keyword,Visits,Pages/Visit,Avg. Time on Site,% New Visits,Bounce Rate,Visits,Subscribe,Solutions,Goal Conversion Rate,Per Visit Goal Value
juice analytics,356,5.935393258426966,314.061797752809,0.38764044642448425,0.29494380950927734,356,1.0,0.16292135417461395,1.1629213094711304,0.0
excel training,142,1.971830985915493,98.0774647887324,0.908450722694397,0.6901408433914185,142,1.0,0.0211267601698637,1.0211267471313477,0.0
excel charts,77,1.7922077922077921,95.0,0.9090909361839294,0.7792207598686218,77,1.0,0.03896103799343109,1.0389610528945923,0.0
excel skills,72,1.6527777777777777,75.29166666666667,0.9444444179534912,0.7083333134651184,72,1.0,0.0,1.0,0.0
colbert bump,70,1.3142857142857143,113.77142857142857,0.6428571343421936,0.8428571224212646,70,1.0,0.0,1.0,0.0

This function displays your report in a nice excel-ready CSV format.

In [8]: print connector.parse_csv_as_dicts(convert_numbers=True)
[{'Avg. Time on Site': 314.06179775280901,
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.29494380950927734, 
  'Keyword': 'juice analytics', 'Visits': 356.0, 
  'Pages/Visit': 5.9353932584269664,
  'Subscribe': 1.0,
  'Solutions': 0.16292135417461395,
  '% New Visits': 0.38764044642448425, 
  'Goal Conversion Rate': 1.1629213094711304}, 
 {'Avg. Time on Site': 98.077464788732399,
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.69014084339141846, 
  'Keyword': 'excel training', 
  'Visits': 142.0, 
  'Pages/Visit': 1.971830985915493, 
  'Subscribe': 1.0,
  'Solutions': 0.021126760169863701, 
  '% New Visits': 0.90845072269439697, 
  'Goal Conversion Rate': 1.0211267471313477}, 
 {'Avg. Time on Site': 95.0, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.77922075986862183, 
  'Keyword': 'excel charts', 
  'Visits': 77.0, 
  'Pages/Visit': 1.7922077922077921, 
  'Subscribe': 1.0, 
  'Solutions': 0.038961037993431091, 
  '% New Visits': 0.90909093618392944, 
  'Goal Conversion Rate': 1.0389610528945923}, 
 {'Avg. Time on Site': 75.291666666666671, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.70833331346511841, 
  'Keyword': 'excel skills', 
  'Visits': 72.0, 
  'Pages/Visit': 1.6527777777777777, 
  'Subscribe': 1.0, 
  'Solutions': 0.0, 
  '% New Visits': 0.94444441795349121, 
  'Goal Conversion Rate': 1.0}, 
 {'Avg. Time on Site': 113.77142857142857, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.84285712242126465, 
  'Keyword': 'colbert bump', 
  'Visits': 70.0, 
  'Pages/Visit': 1.3142857142857143,
  'Subscribe': 1.0, 
  'Solutions': 0.0, 
  '% New Visits': 0.6428571343421936, 
  'Goal Conversion Rate': 1.0}]

This function goes the extra step and converts the CSV into a dictionary for easier programmatic use. By default, all entries will be returned as python strings. Setting convert_numbers to True, as we did here, will additionally parse the dictionary to turn all numbers into float values.
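For the curious, the conversion step can be approximated with the standard csv module. The sketch below is a guess at the general approach written for Python 3, not pyGAPI's actual implementation; it reuses a few columns from the output above as sample data.

```python
import csv

def to_number(value):
    """Return value as a float when possible, otherwise unchanged."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return value

def parse_csv_as_dicts(text, convert_numbers=False):
    """Parse CSV text into a list of dicts, one per data row."""
    rows = []
    for row in csv.DictReader(text.splitlines()):
        if convert_numbers:
            row = {key: to_number(value) for key, value in row.items()}
        rows.append(dict(row))
    return rows

# A few columns copied from the report shown earlier.
sample = (
    "Keyword,Visits,Pages/Visit\n"
    "juice analytics,356,5.935393258426966\n"
    "excel training,142,1.971830985915493\n"
)

for row in parse_csv_as_dicts(sample, convert_numbers=True):
    print(row)
```

One caveat: the full report header lists Visits twice, and a dict can only hold one value per key, so a dict-based parser keeps just one of the two columns.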

In [9]: print connector.list_reports()
('ReferringSourcesReport', 
 'SearchEnginesReport', 
 'AllSourcesReport', 
 'KeywordsReport', 
 'CampaignsReport', 
 'AdVersionsReport', 
 'TopContentReport', 
 'ContentByTitleReport',
 'ContentDrilldownReport', 
 'EntrancesReport', 
 'ExitsReport', 
 'GeoMapReport', 
 'LanguagesReport', 
 'HostnamesReport', 
 'SpeedsReport')

This gets a list of all the reports that I have successfully tested thus far. All site-specific reports should work. A couple site-section specific reports should be included in the next update of pyGAPI.

Google is great and will release a real API soon, but until then you can download pyGAPI.



