Due to the release of an official Google Analytics Data Export API, this module is now deprecated. We have an alternative python module based upon the real analytics API here, and an exploring tool with an automatic code generation capability here.

It is not official. It is not from Google. It is, however, very functional and very here. I present to you pyGAPI, the Juiced Google Analytics Python API. This module lets you pull information from your Google Analytics account and use it programmatically in your reporting code.

Let us use IPython to peek through some code using pyGAPI.

In [3]: from datetime import date
In [4]: import pyGAPI
In [5]: connector = pyGAPI.pyGAPI(username, password, website_id="1234567")

Here we create a pyGAPI object. Behind the scenes, pyGAPI logs into Google Analytics and downloads an identifier cookie. website_id is optional; if omitted, pyGAPI accesses the first website on the account’s list. To get a list of all the site IDs to which your account has access, call connector.list_sites().

In [6]: connector.download_report('KeywordsReport', 
          (date(2008,3,10), date(2008,3,31)), 
          limit=5)

Download a report into your pyGAPI object. KeywordsReport is the name of the report. It is followed by a tuple containing the start and end dates as Python date objects. limit is an optional parameter that specifies the number of entries that pyGAPI should pull down. By default, it pulls in all the entries up to a maximum of 10000. Lowering this number will improve performance. The entries returned are ranked by Visits, so even with a low limit you should get the most significant values of the bunch.
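If you would rather not hard-code the dates, the tuple can be built with the standard datetime module. This is only a sketch: last_n_days is a hypothetical helper, not part of pyGAPI, and it assumes both dates in the range are meant to be included. It is written for Python 3, unlike the session above.

```python
from datetime import date, timedelta


def last_n_days(n, today=None):
    """Return a (start, end) tuple of dates covering n days ending on `today`.

    Hypothetical helper, not part of pyGAPI. Both ends are treated as inclusive.
    """
    if today is None:
        today = date.today()
    return (today - timedelta(days=n - 1), today)


# Reproduce the range used above: 22 days ending on 31 March 2008.
date_range = last_n_days(22, today=date(2008, 3, 31))
print(date_range)
```

The resulting tuple has the same shape as the one passed to download_report above, so it could be dropped in as the second argument.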

In [7]: print connector.csv()
Keyword,Visits,Pages/Visit,Avg. Time on Site,% New Visits,Bounce Rate,Visits,Subscribe,Solutions,Goal Conversion Rate,Per Visit Goal Value
juice analytics,356,5.935393258426966,314.061797752809,0.38764044642448425,0.29494380950927734,356,1.0,0.16292135417461395,1.1629213094711304,0.0
excel training,142,1.971830985915493,98.0774647887324,0.908450722694397,0.6901408433914185,142,1.0,0.0211267601698637,1.0211267471313477,0.0
excel charts,77,1.7922077922077921,95.0,0.9090909361839294,0.7792207598686218,77,1.0,0.03896103799343109,1.0389610528945923,0.0
excel skills,72,1.6527777777777777,75.29166666666667,0.9444444179534912,0.7083333134651184,72,1.0,0.0,1.0,0.0
colbert bump,70,1.3142857142857143,113.77142857142857,0.6428571343421936,0.8428571224212646,70,1.0,0.0,1.0,0.0

This function gives you your report in a nice Excel-ready CSV format.

In [8]: print connector.parse_csv_as_dicts(convert_numbers=True)
[{'Avg. Time on Site': 314.06179775280901,
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.29494380950927734, 
  'Keyword': 'juice analytics', 'Visits': 356.0, 
  'Pages/Visit': 5.9353932584269664,
  'Subscribe': 1.0,
  'Solutions': 0.16292135417461395,
  '% New Visits': 0.38764044642448425, 
  'Goal Conversion Rate': 1.1629213094711304}, 
 {'Avg. Time on Site': 98.077464788732399,
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.69014084339141846, 
  'Keyword': 'excel training', 
  'Visits': 142.0, 
  'Pages/Visit': 1.971830985915493, 
  'Subscribe': 1.0,
  'Solutions': 0.021126760169863701, 
  '% New Visits': 0.90845072269439697, 
  'Goal Conversion Rate': 1.0211267471313477}, 
 {'Avg. Time on Site': 95.0, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.77922075986862183, 
  'Keyword': 'excel charts', 
  'Visits': 77.0, 
  'Pages/Visit': 1.7922077922077921, 
  'Subscribe': 1.0, 
  'Solutions': 0.038961037993431091, 
  '% New Visits': 0.90909093618392944, 
  'Goal Conversion Rate': 1.0389610528945923}, 
 {'Avg. Time on Site': 75.291666666666671, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.70833331346511841, 
  'Keyword': 'excel skills', 
  'Visits': 72.0, 
  'Pages/Visit': 1.6527777777777777, 
  'Subscribe': 1.0, 
  'Solutions': 0.0, 
  '% New Visits': 0.94444441795349121, 
  'Goal Conversion Rate': 1.0}, 
 {'Avg. Time on Site': 113.77142857142857, 
  'Per Visit Goal Value': 0.0, 
  'Bounce Rate': 0.84285712242126465, 
  'Keyword': 'colbert bump', 
  'Visits': 70.0, 
  'Pages/Visit': 1.3142857142857143,
  'Subscribe': 1.0, 
  'Solutions': 0.0, 
  '% New Visits': 0.6428571343421936, 
  'Goal Conversion Rate': 1.0}]

This function goes the extra step and converts the CSV into a list of dictionaries, one per row, for easier programmatic use. By default, all values are returned as Python strings. Setting convert_numbers to True, as we did here, additionally converts every numeric value to a float.
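To make the conversion concrete, here is a rough sketch of the same idea using only the standard library. It is not pyGAPI's actual implementation: csv_to_dicts is a hypothetical name, the sample text is a trimmed-down copy of the report above, and the code is written for Python 3.

```python
import csv
import io


def csv_to_dicts(csv_text, convert_numbers=False):
    """Turn CSV text into a list of dicts, one per data row.

    Hypothetical stand-in for illustration, not pyGAPI's own code.
    """
    rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
    if not convert_numbers:
        return rows  # every value stays a string

    converted = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            try:
                new_row[key] = float(value)
            except (TypeError, ValueError):
                new_row[key] = value  # leave non-numeric cells untouched
        converted.append(new_row)
    return converted


# A trimmed-down copy of the keyword report shown earlier.
sample = (
    "Keyword,Visits,Bounce Rate\n"
    "juice analytics,356,0.29494380950927734\n"
    "excel training,142,0.6901408433914185\n"
)

rows = csv_to_dicts(sample, convert_numbers=True)

# Once the values are floats, ordinary Python sorting and arithmetic work.
worst_bounce = max(rows, key=lambda r: r["Bounce Rate"])
print(worst_bounce["Keyword"])
```

With convert_numbers left at False, the same rows come back with 'Visits' as the string '356' rather than the float 356.0, which is why the flag matters as soon as you want to sort or sum.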

In [9]: print connector.list_reports()
('ReferringSourcesReport', 
 'SearchEnginesReport', 
 'AllSourcesReport', 
 'KeywordsReport', 
 'CampaignsReport', 
 'AdVersionsReport', 
 'TopContentReport', 
 'ContentByTitleReport',
 'ContentDrilldownReport', 
 'EntrancesReport', 
 'ExitsReport', 
 'GeoMapReport', 
 'LanguagesReport', 
 'HostnamesReport', 
 'SpeedsReport')

This returns a list of all the reports that I have successfully tested thus far. All site-specific reports should work. A couple of site-section-specific reports should be included in the next update of pyGAPI.

Google is great and will release a real API soon, but until then you can download pyGAPI.

  • sandro turriate

    The API looks super friendly yet powerful. I’m so glad someone finally made these reports available programmatically. Awesome stuff, man!

  • Son Nguyen

    I wonder if this violates Google Analytics’ TOS, and how long it will be before Google changes something and things break.

  • Sal

    There is certainly a risk that something could break. Google, however, is a company that takes the high road in terms of programming and in doing what is best for the web. The code behind the Google Analytics website is very elegant, and pyGAPI does not do screen scraping for any of the real work. The data is pulled through the data exporting system. I would say that it is unlikely that pyGAPI would break without a major overhaul of the entire GA system.

    I can’t speak specifically to the TOS, but pyGAPI is doing the equivalent work of an underpaid temp who simply logs in and downloads all the requested reports. The poor temp is just getting a break. Read the TOS and use pyGAPI at your own risk.

  • Chris Gemignani

    Son Nguyen,

    It is far more likely that Google will provide a supported API that would supersede this. That would be the Googley thing to do.

    Similar APIs have been produced around Gmail without interference. If things break, we, the community, will fix it.

  • Tom

    Using the example above, shouldn’t it be connector.list_sites() to get a list of all the site IDs? Also, for me this is only returning the first site.

    The report list, connector.list_reports(), seems not to be comprehensive. Here’s a better one:

    Google Analytics Reports

    -Visitors-
    VisitorsOverviewReport
    GeoMapReport
    VisitorTypesReport
    LanguagesReport

    -Visitor Trending-
    VisitsReport
    UniqueVisitorsReport
    PageviewsReport
    AveragePageviewsReport
    TimeOnSiteReport
    BounceRateReport

    -Visitor Loyalty-
    LoyaltyReport
    RecencyReport
    LengthOfVisitReport
    DepthOfVisitReport

    -Browser Capabilities-
    BrowsersReport
    PlatformsReport
    OsBrowsersReport
    ColorsReport
    ResolutionsReport
    FlashReport
    JavaReport

    -Network Properties-
    NetworksReport
    HostnamesReport
    SpeedsReport
    UserDefinedReport

    -Traffic Sources-
    TrafficSourcesReport
    DirectSourcesReport
    ReferringSourcesReport
    SearchEnginesReport
    AllSourcesReport
    KeywordsReport

    -Adwords-
    AdwordsReport
    KeywordPositionReport
    OfflineAudioReport

    CampaignsReport

    -Content-
    ContentReport
    TopContentReport
    ContentByTitleReport
    ContentDrilldownReport
    EntrancesReport
    ExitsReport

  • Sal

    Thanks for catching the errors, Tom!
    You are correct on all three counts.

    I fixed the upload so that list_sites() correctly displays every site if you have more than one in your list, and I fixed the typo here in the blog.

    I’ll peek through the list of reports to make that more exhaustive as well.

  • Matt Webb

    This is awesome work. Do you think this python script could work in conjunction with superkaramba on Linux?

  • Rodrigo

    This is great. I put this together with a Samurize desktop to display Analytics data on my desktop.
    Thanks!

  • Ludovic

    Very nice work. Very useful to, let’s say, get your most visited pages without having to maintain parallel accounting. May I ask you to release it under an OSS licence and put it on Google Code? That would be great.

  • Sebastian

    Hello,

    It works well! Great.
    How can I pull the “keyword” or “country” report for a specific URL?
    (using segmentation)

    Thanks

  • Thierry

    Awesome work!

  • Sal

    I wrote a Python API wrapper that I call ‘degapi’ for the new analytics API to replace this old code. I have yet to put up a post and link about it, but it can be found here: http://suryasev.github.com/python-degapi/

    There is an automatic Python code generator for this API at http://vascodegapi.juiceanalytics.com