Google Analytics is a web analytics service from Google, part of the Google Marketing Platform, that tracks and reports website traffic. The Google Analytics import integration lets you import your Google Analytics reports into Treasure Data.

Sample workflows are available for many integrations through Treasure Boxes on GitHub.


Prerequisites

  • Basic knowledge of Treasure Data

  • Basic knowledge of Google Analytics

  • A Google Analytics account with dimensions and metrics specified

Requirements

To fetch report data on Google Analytics, your service account requires “Read & Analyze” or a higher level of permission.

About Partition Key Seed

Typically, TD partitions data by time. Choose a long or timestamp column as the partitioning time. By default, the partition key seed uses the time column, specifically the upload_time column through the add_time filter. In Google Analytics, the partition key seed becomes “ga:date” or “date_hour”, as specified in the time_series section.
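
As a rough illustration of what a time-based partition key seed involves, the sketch below converts Google Analytics date strings into Unix timestamps. It assumes the YYYYMMDD format for a daily value and YYYYMMDDHH for an hourly value, and it treats the values as UTC. The function name to_partition_time is hypothetical, and the conversion that the connector actually performs may differ.

```python
from datetime import datetime, timezone


def to_partition_time(value: str) -> int:
    """Convert a Google Analytics date string to Unix seconds.

    Accepts "YYYYMMDD" (daily) or "YYYYMMDDHH" (hourly). Both formats and
    the UTC interpretation are assumptions made for this sketch.
    """
    if len(value) == 8:
        fmt = "%Y%m%d"
    elif len(value) == 10:
        fmt = "%Y%m%d%H"
    else:
        raise ValueError(f"unexpected date value: {value!r}")
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())
```

For example, to_partition_time("20210601") gives the timestamp for midnight on June 1, 2021, and to_partition_time("2021060113") gives the timestamp for 13:00 on the same day.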

Google API Set-Up Options

As part of creating your data connector, you register Treasure Data using the Google API Console.

The method you use to authenticate Treasure Data with Google Analytics affects the steps you take to enable the data connector to import from Google Analytics.

Authentication Method

Google User Account: OAuth

Using OAuth is the most common method. This method requires fewer setup steps. You can skip the rest of this section and go directly to TD Console.

Google Service Account: JSON

Using JSON might be required for your implementation. This method requires the setup steps using the Google API.

Set the Google API for JSON Authentication

The Google Analytics data connector uses an API connector to access Google Analytics data.

1. Open the Google API Console: https://console.developers.google.com/
2. Log in to the Google Analytics account that you want to access through the API.

You log into the Google Analytics account so that you have permission to access API services.

3. Navigate to the Library.

4. Select Create project.

5. Select the organization.

6. Select New Project.

7. Name your project.

8. Select Create.

Create a New Service Account

9. Access Service Accounts to create a new account. The service account lets the API access the Google Analytics account. It is created within a project and belongs to the Google account that you used to log in to the Google API services.
10. Select APIs & Services.

11. Select Service Accounts.
12. Select Create Service Account.

13. Complete the fields to create a service account named “treasure-data”. 

14. Skip the optional parameters.
15. Select Done.
16. Select the Actions menu.

17. Select Manage keys.

18. Select Add key.

19. Select Create new key.


20. Select Create.

The JSON file downloads to the machine you are on.

Keep the file, which contains your authentication credentials, in a safe place. You use the private key information from the JSON file in the data connector configuration file.

For example, the generated file might look similar to:

{
  "type": "service_account",
  "project_id": "central-stream-314923",
  "private_key_id": "94d03bf7dd9c05bc122c695d1aa13f2a8a28f88e",
  "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w**********************AoIBAQCNhAICLr/dozCQ\nTW9ZNMNJ6RF+fVqhd0FUbw0VBIwy6BWu/LuaocJrzl2DHChAl0PNvGCDUAObBTRz\nbUT/HOu47q**********7ENGK\nOir9VChG+Qubq25bAtOq/yTVEPJgnj*******AGVjojVnK4f\n2YtW6ti7xBPwFBF1RPY56yTDeVQVko+KK3x+LFS+lTj1+jBBjvedWHrpQQfRHqV/\nVtXyKyDybQlnfAOucMHzMxjQVLN4f9D7JVxCe52Wp7RaALCIdkKDqN/ffkNMF9QT\nCjffudeTAgMBAAECggEAFMQnS0yy6QI2cSZ7zXpZofHqmEYq04DdfFdjcw8cx6eY\n7vm1Seas0gcRX9j06y2HTJx1CS/np4rm/H0vX8RNrvCPYXrOJzUG2DOnW9pwi9Hl\nKb1Z0VErenzy/em78BI958fXIJ4vv5pjNUZ94njEBE4tbuWEJyTODMyuCfoXpye4\nkCDY6DJFxDKUA7tZOTcK3t0YiVV0O2MwcUhJdr107kw4F1HXY/mlh87ki5z3tMy0\nISBKjvau2aWf0SVLZHtlo88JZGUak7tkuxnWaXQN+dUo1rZWKj867pBT4KWXzAbJ\nUVQ7pBrDFri90fNQ5XFsQdS//dO2pFEn+1Aum86Q0QKBgQDB4RHjBWdJ3eMyvWWi\nipdCx4gC6G5Hqjt+icKv9yddyV/WvuMH82xDAHUJJBzaj9I45O5D+07O6TZO8CkZ\n6Tqq92N3HEkHZWiUTo91C4qbO4ai5SXxpnWn5gsYc+JYPqNp1b+T1gZjA4Pj8l+t\neJ7VDGxu0tjK17Vj13turImXCQKBgQC629KRpvq9FAIWuA8NAXBSeqNyzktPVdOZ\n5GJvwCevVzIapvwZPoZTaJ6xehta1hrR859ZReZx/j7ntoOjAjGw1rS//T5N98Hf\nt+JpCemAa5ApcoUBAXmlb80jIHysRBgMUTLcTKnZuFT3RwsD1xtXjRct0doIF8EC\nd8RLE7FkuwKBgQCJSuGIuwXWqBtAjiBPxwawUm29aWzWsPTqeZF1XHbzEiwc/RX2\nRmmu1L8MFxebqmb6xRr45xh6q2k64xSn9aIG+aLk8RHB/AzfoPYzs1WW8cM4zT5e\nbjs5B01qJn3tcYX051l/zfq92Ppny/X2+Mi5I9ARdpvwoGoh5rDQwbu5SQKBgHx8\naGtKsC75Pm7+TmCevcLlGzEoCHohNqiGw6GphYbF84ZYCwmSYxD8WQTp0YGRtCp9\nQILME7uL40KhkE8v7gTe9WoWf8SXs5ykt/y8cshwYImMVtmVrwItWp/1S7nEX7UM\n/3JOzLVUnZ5jwQ3c58VLJM8MyFGt6ZMIUUinJP5zAoGAICmOlDqPWR2RXPo+9SkN\nok82AjvsjeUMDsiCkEVsAQMBZkYbND0047BAg7STqVjIaJg0zYFvQ5oow5zgu1lk\n46nxtfQm3U58lILErGsmClxcOZR2nO7kvm0PJMUgENADGhP5pqE+8w+e4JC45Ojw\nX7X+hhL/a7pu2Un9O/rXZVM=\n-----END PRIVATE KEY-----\n",
  "client_email": "meg-td-service-account@central-stream-314923.iam.gserviceaccount.com",
  "client_id": "117460147437348814027",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/meg-td-service-account%40central-stream-314923.iam.gserviceaccount.com"
}
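
Before pasting the key into the connector, it can help to confirm that the file is intact. The following is a minimal sketch, assuming that the fields shown in the sample above (type, private_key, client_email) are the ones that matter. The function name check_service_account_key and the exact set of checks are illustrative, not part of any Treasure Data or Google tool.

```python
import json

# Fields this sketch expects, taken from the sample key file above.
REQUIRED_FIELDS = ("type", "private_key", "client_email")


def check_service_account_key(text: str) -> list:
    """Return a list of problems found in a JSON key; an empty list means OK."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]

    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)
    ]
    if key.get("type") not in (None, "service_account"):
        problems.append("type is not 'service_account'")
    private_key = key.get("private_key")
    if isinstance(private_key, str) and private_key:
        # A downloaded key normally carries a PEM-formatted private key.
        if "BEGIN PRIVATE KEY" not in private_key:
            problems.append("private_key does not look like a PEM block")
    return problems
```

Pass the full text of the downloaded file, including the outer braces. An empty result means that the basic structure looks right; it does not prove that the key is still active in Google Cloud.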


21. Navigate back to Service accounts and locate your key ID.

22. Select your service account.

23. Expand Show Domain Wide Delegation.
24. Select Enable Google Workspace Domain-wide Delegation.
25. Type Treasure Data as the product name.

26. Select Save.

Creating the service account also creates a service account ID. The service account ID is granted authenticated access to the API.



Set the Analytics API for JSON


You now have a project and account ID in Google API. Next, you enable specific APIs to be used in this project. In this step, you register the two APIs related to Google Analytics.

Locate and Enable the Analytics APIs

1. In Google Cloud Platform, navigate to APIs & Services > Library.

2. Use the search bar and find:

Analytics API Reporting

3. Select Google Analytics Reporting API.
4. Select Enable.

After you enable the API, monitoring of the API starts, and the Dashboard menu lists all of the APIs that have been registered.

Two APIs in your project in Google API are enabled to send data. The service account ID is allowed access to Google Analytics data. 


Associate the Google API and Service Account with Google Analytics for JSON

Add Permission for Service Account ID to Access Google APIs for JSON

Add permissions for your service account that you created in Google API.

To fetch report data on Google Analytics, your service account requires “Read & Analyze” or a higher level of permission.

1. Verify that you are still logged in to Google Analytics with your Google account at https://analytics.google.com/.
2. Select Admin.
3. Select Account User Management.
4. Grant “Read & Analyze” or a higher level of permission to your service account.


Your service account now has adequate access to use Google Analytics through the Google Analytics APIs.


Obtain the View ID from Google Analytics

You must have the View ID to create the authentication in Treasure Data.

Each unique view of data has an associated View ID. You must know the View ID of the data that you want to access.


1. Navigate to the Home page of Google Analytics. For example:

analytics.google.com/

2. Select Admin.
3. Select View Settings.

4. Locate the View ID field on the page.

5. Copy the View ID. You need it to create the Treasure Data authentication.


Creating the Data Connector from the TD Console

Create a New Connection

In Treasure Data, you must create and configure the data connection prior to running your query. As part of the data connection, you provide authentication to access the integration.

1. Open TD Console.
2. Navigate to Integrations Hub > Catalog.
3. Search for and select Google Analytics.

4. Select Create Authentication.

5. Choose one of the following authentication methods:

Service account (JSON key): Enter the View ID and JSON key information.

Make sure that you include the private key of the service account. Make sure that the entire JSON key information is in brackets {…}.

  1. Locate and open the JSON file that you downloaded from the Google Cloud Platform in a text editor.
  2. Type your View ID. For example:
    179999562
  3. Select Continue.


OAuth:

  1. Select OAuth.
  2. If you know the OAuth connection, type it. Otherwise, select the option to connect a new account.
  3. If you chose to connect a new account, a series of screens prompts you to select the account to link and to confirm that you grant Treasure Data access to that account.
    You are then returned to Authentications in TD Console.
  4. Search for and select Google Analytics.

  5. Select OAuth.
    Your account should appear in the Authentication connection field.

  6. Type your View ID. For example:
    179999562
  7. Select Continue.


6. Enter a name for your connection.
7. Select Done.



Transfer Your Data to Treasure Data


After creating the authenticated connection, you are automatically taken to Authentications.

You must enter Dimension and Metric information from Google Analytics.


1. Search for the connection you created. 
2. Select New Source.
3. Type a name for your Source in the Data Transfer field.
4. Select Next.

The Source Table dialog opens. Obtain the Dimension and Metric information from Google Analytics and enter it in the Treasure Data “Transfer data from Google Analytics” dialog.

5. Edit the following parameters:
Parameters and descriptions:

Time Series

For the Time Series field, indicate whether you want to track the hour with the date or just the date.

Dimensions

Dimensions are data categories. Dimension values (the data contained by the dimension) are names, descriptions, or other characteristics of a category.

For example:

  • ga:pagePath
  • ga:referralPath

Metrics

Metrics measure the things contained in dimensions and provide the numeric scale and data series for the chart.

For example:

  • ga:pageviews
  • ga:sessions
  • ga:users

Incremental

When the transfer runs repeatedly, it attempts to import only the data that is new since the last import.

6. Select Next.

The Data Settings page can be modified for your needs or you can skip the page.

7. Optionally, edit the parameters on the Data Settings page.
8. Select Next.
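
To show how the parameters in step 5 relate to a Google Analytics query, the sketch below assembles a request body in the shape accepted by the reports.batchGet method of the Analytics Reporting API v4. Whether the connector builds its requests exactly this way is an assumption. The function name build_report_request, the date range, and the choice to place the time-series dimension first are illustrative only.

```python
def build_report_request(view_id, time_series, dimensions, metrics,
                         start_date, end_date):
    """Build a request body shaped like an Analytics Reporting API v4
    reports.batchGet payload (a sketch, not the connector's own code)."""
    # Put the time-series dimension first so every row starts with its date.
    all_dimensions = [time_series] + [d for d in dimensions if d != time_series]
    return {
        "reportRequests": [
            {
                "viewId": str(view_id),
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "dimensions": [{"name": name} for name in all_dimensions],
                "metrics": [{"expression": name} for name in metrics],
            }
        ]
    }


# Example values: the View ID, dimension, and metrics used earlier on this page.
example = build_report_request(
    179999562, "ga:date", ["ga:pagePath"], ["ga:pageviews", "ga:sessions"],
    "2021-06-01", "2021-06-07",
)
```

Each dimension becomes a column of category values and each metric becomes a numeric column, which is why the Source Table dialog asks for both.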


Data Preview 


You can see a preview of your data before running the import by selecting Generate Preview.

Data shown in the data preview is approximated from your source. It is not the actual data that is imported.

  1. Select Next.
    Data preview is optional. You can safely skip to the next page of the dialog.

  2. To preview your data, select Generate Preview.

  3. Verify that the data looks approximately as you expect.

  4. Select Next.


Data Placement


For data placement, select the target database and table where you want your data placed and indicate how often the import should run.

  1. Select Next. Under Storage, create a new database or select an existing one, and create a new table or select an existing one, to hold the imported data.

  2. Select a Database > Select an existing or Create New Database.

  3. Optionally, type a database name.

  4. Select a Table > Select an existing or Create New Table.

  5. Optionally, type a table name.

  6. Choose the method for importing the data.

    • Append (default): Data import results are appended to the table.
      If the table does not exist, it is created.

    • Always Replace: Replaces the entire content of an existing table with the result output of the query. If the table does not exist, a new table is created.

    • Replace on New Data: Replaces the entire content of an existing table with the result output only when there is new data.

  7. Select the Timestamp-based Partition Key column.
    If you want to set a different partition key seed than the default key, you can specify the long or timestamp column as the partitioning time. As a default time column, it uses upload_time with the add_time filter.

  8. Select the Timezone for your data storage.

  9. Under Schedule, you can choose when and how often you want to run this query.

    • Run once:
      1. Select Off.

      2. Select Scheduling Timezone.

      3. Select Create & Run Now.

    • Repeat the query:

      1. Select On.

      2. Select the Schedule. The UI provides these four options: @hourly, @daily, @monthly, or custom cron.

      3. You can also select Delay Transfer and add a delay of execution time.

      4. Select Scheduling Timezone.

      5. Select Create & Run Now.
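
For readers who prefer to think of the schedule presets as cron expressions, the sketch below maps each preset to its conventional cron equivalent and passes custom expressions through after a basic field count. The mapping follows the common cron shorthand convention; the exact run times that TD Console uses for each preset are an assumption, and the names SCHEDULE_PRESETS and to_cron are hypothetical.

```python
# Conventional cron equivalents of the schedule presets. This is an
# assumption about how the presets behave, not a statement of TD internals.
SCHEDULE_PRESETS = {
    "@hourly": "0 * * * *",    # minute 0 of every hour
    "@daily": "0 0 * * *",     # midnight every day
    "@monthly": "0 0 1 * *",   # midnight on the first day of each month
}


def to_cron(schedule: str) -> str:
    """Return a five-field cron expression for a preset or a custom schedule."""
    if schedule in SCHEDULE_PRESETS:
        return SCHEDULE_PRESETS[schedule]
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError(
            f"expected 5 cron fields, got {len(fields)}: {schedule!r}"
        )
    return " ".join(fields)
```

A custom value such as "30 2 * * 1" would run at 02:30 every Monday in the selected scheduling timezone.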

 After your transfer has run, you can see the results of your transfer in Data Workbench > Databases.
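
What the table contains after a run depends on the import method chosen in step 6. The sketch below models the three methods on in-memory lists of rows. It is a simplified illustration of the behavior described above, not the platform's implementation, and the mode names and the apply_import function are hypothetical.

```python
def apply_import(existing, imported, mode="append"):
    """Return the table contents after one import run.

    existing: rows currently in the table, or None if the table does not exist.
    imported: rows produced by this run.
    """
    current = list(existing or [])
    new_rows = list(imported)
    if mode == "append":
        # New rows are added after whatever is already stored.
        return current + new_rows
    if mode == "always_replace":
        # Old rows are discarded even when the run returns nothing.
        return new_rows
    if mode == "replace_on_new_data":
        # Old rows are kept when the run produced no new data.
        return new_rows if new_rows else current
    raise ValueError(f"unknown mode: {mode!r}")
```

The difference between the two replace methods shows up only when a run returns no rows: Always Replace leaves an empty table, while Replace on New Data keeps the previous contents.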


Optionally Configure Workflow

Within Treasure Workflow, you can specify the use of this data connector as part of a workflow.

Learn more at Using Workflows to Export Data with the TD Toolbelt.


