This article describes how to write job results from Treasure Data directly to Google BigQuery. To learn how to import data from Google BigQuery instead, see the Google BigQuery Import Integration.
For sample workflows that write job results to Google BigQuery, see Treasure Boxes.
Prerequisites
- Basic knowledge of Treasure Data, including the TD Toolbelt.
- A Google Cloud Platform account
Requirements and Limitations
- Nested or repeated data types like ARRAY are not supported as destination columns.
Supported
Treasure Data supports the append, replace, replace backup, and delete modes.
Obtaining your Google Cloud Platform Credentials
To use this feature, you need:
- Project ID
- JSON Credential
Obtaining the JSON Credential
The integration with Google BigQuery is based on server-to-server API authentication.
Navigate to your Google Developer Console.
Select APIs & auth > Credentials.
Select Service account.
Select the JSON based key type, which is recommended by Google. The key is downloaded automatically by the browser.
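The downloaded key is a JSON file; its contents are what you later supply to Treasure Data as the JSON Credential. A redacted example of the standard Google service account key format (exact fields can vary slightly by key version):

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "ABCDE...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "your-service-account@your-project-id.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}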
Obtaining the Project ID
Navigate to your Google Developer Console.
Select Home.
Locate your Project ID.
Create a Dataset and Table on BigQuery
Create your Dataset and Table from your BigQuery console.
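If you prefer the command line, you can create the dataset and table with Google's bq tool instead of the console. A minimal sketch, assuming the placeholder project, dataset, table, and schema names used elsewhere in this article:

$ bq mk --dataset YOUR_PROJECT:YOUR_DB
$ bq mk --table YOUR_PROJECT:YOUR_DB.YOUR_TABLE id:INTEGER,name:STRING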
Use from TD Console
Navigate to TD Console.
Navigate to Integrations Hub > Catalog.
Select Google BigQuery.
Complete the required information for the export mode you want to use, as described in the following sections.
Write the query. The query results must match the pre-defined schema on BigQuery. After you execute the job, the Treasure Data query results are automatically imported into Google BigQuery.
Append Mode
When the table already exists in BigQuery:
Fill in the Project ID, Dataset Name, and Table Name obtained from BigQuery.
Write the query. The query results must match the pre-defined schema on BigQuery. After you execute the job, Treasure Data query results are automatically imported into Google BigQuery.
When a table does not exist in BigQuery:
Fill in the Project ID and Dataset Name obtained from BigQuery.
Provide a name for the table to be created in the BigQuery dataset, check Auto-create table, and provide a schema for the new table. The schema JSON must match the query results (number of fields and data types). After you execute the job, a new table with the name and schema you specified is created in Google BigQuery, and the Treasure Data query results are automatically imported into it.
Example Schema JSON:
[{"name":"id","type":"INTEGER","mode": "REQUIRED"},{"name":"name","type":"STRING"}]
Replace Mode
Fill in the Project ID, Dataset Name, and Table Name obtained from BigQuery.
Select Auto-create table and provide a schema for the table to be replaced. The value in the Schema JSON field must match the query results (number of fields and data types). Choose Replace from the drop-down menu. After you execute the job, if the table already exists in BigQuery, or if the value in Schema JSON differs from the existing table schema, the table's data is replaced. If the table does not exist in BigQuery, a new table is created with the name and schema you specified. The Treasure Data query results are automatically imported into this table.
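When running from the CLI, the console drop-down choice corresponds to a setting in the result output JSON. A sketch, assuming a mode key as in the underlying BigQuery output plugin (verify the exact key name against your Toolbelt version before relying on it); the "..." values stand in for the schema and key shown elsewhere in this article:

-r '{"type":"bigquery","mode":"replace","project":"YOUR_PROJECT","dataset":"YOUR_DB","table":"YOUR_TABLE","auto_create_table":true,"schema_file":"...","json_keyfile":"..."}'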
Replace Backup Mode
Replace backup mode is similar to replace mode, but in replace backup mode both the data and the schema of the existing table are backed up into a table that is assigned the old table name plus an "_old" suffix.
Delete Mode
Fill in the Project ID, Dataset Name, and Table Name obtained from BigQuery.
Select Auto-create table and provide a schema for the table. The value in the Schema JSON field must match the query results (number of fields and data types). Choose Delete from the drop-down menu. When you execute the job, if the table already exists in BigQuery, the table is deleted and a new table is created with the schema you provided. If the table does not exist in BigQuery, a new table is created with the name and schema you specified. The Treasure Data query results are automatically imported into this newly created table.
About Data Conversion
Treasure Data types are automatically converted to corresponding BigQuery types, as indicated in the following table. If you include unsupported types in the schema JSON, or if the query results do not match the data types declared in the schema JSON, you receive errors.
| Treasure Data | BigQuery |
| --- | --- |
| string | STRING |
| long | INTEGER |
| double | FLOAT |
| long (true is 1, false is 0) | BOOLEAN |
| string (yyyy-MM-dd HH:mm:ss.SSS) | TIMESTAMP |
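For example, to export a boolean flag and an event time, the query can emit a long (0 or 1) and a string in the format above, with the BigQuery types declared in the schema JSON. A sketch using standard Presto functions; the column names and condition are illustrative:

Schema JSON:
[{"name":"active","type":"BOOLEAN"},{"name":"event_time","type":"TIMESTAMP"}]

Query:
SELECT
  IF(purchase > 0, 1, 0) AS active,
  format_datetime(from_unixtime(time), 'yyyy-MM-dd HH:mm:ss.SSS') AS event_time
FROM data_connectors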
Scheduling from the CLI
The following command sets up a scheduled query whose results are output to BigQuery.
Specify your JSON key inline in the json_keyfile parameter and escape newlines with backslashes.
For example,
$ td sched:create scheduled_bigquery "10 6 14 12 *" \
  -d dataconnector_db "SELECT id,account,purchase,comment,time FROM data_connectors" \
  -r '{"type":"bigquery","project":"YOUR_PROJECT","dataset":"YOUR_DB","table":"YOUR_TABLE","auto_create_table":true,"max_bad_records":0,"ignore_unknown_values":true,"allow_quoted_newlines":true,"schema_file":"[{\"name\": \"id\", \"type\": \"INTEGER\"}, {\"name\": \"account\", \"type\": \"STRING\"},{\"name\": \"purchase\", \"type\": \"STRING\"}, {\"name\": \"comment\", \"type\": \"STRING\", \"mode\": \"REQUIRED\"}, {\"name\": \"time\", \"type\": \"TIMESTAMP\", \"mode\": \"REQUIRED\"}]", "json_keyfile":"{\"private_key_id\": \"ABDE\", \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nABCDE\\nABCDE\\nABCDE\\n-----END PRIVATE KEY-----\\n\", \"client_email\": \"ABCDE.gserviceaccount.com\", \"client_id\": \"ABCDE.apps.googleusercontent.com\", \"type\": \"service_account\"}"}'
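After creating the schedule, you can confirm that it was registered and monitor its runs with the Toolbelt:

$ td sched:list
$ td sched:history scheduled_bigquery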