If you prefer, you can use the connector via TD Toolbelt.
Set up the TD Toolbelt on the CLI.
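For example, assuming TD Toolbelt is already installed, you can confirm the CLI is available and set your account credentials (exact prompts may vary by version):

$ td --version
$ td account -f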
Create a configuration YAML file, referred to as config.yml here. For example:
in:
  type: bigquery
  project_id: my-project
  auth_method: json_key
  json_keyfile:
    content: |
      {
        "type": "service_account",
        "project_id": "xxxxxx",
        ...
      }
  import_type: table
  dataset: my_dataset
  table: my_table
  incremental: true
  incremental_columns: [id]
  export_to_gcs: true
  temp_dataset: temp
  temp_table: temp_table
  gcs_bucket: my-bucket
  gcs_path_prefix: data-connector/result-
out:
  type: td

Specify "auth_method: json_key" and put the JSON content of your service account key into "json_keyfile.content":
auth_method: json_key
json_keyfile:
  content: |
    {
      "type": "service_account",
      "project_id": "xxxxxx",
      ...
    }

If you want to use an account authorized by your OAuth 2 application, specify "auth_method: oauth2" along with "client_id", "client_secret", and "refresh_token":
auth_method: oauth2
client_id: 000000000000-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com
client_secret: yyyyyyyyyyyyyyyyyyyyyyyy
refresh_token: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz

For table loading, specify "import_type: table", "dataset", and "table", as shown below.
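A minimal sketch, reusing the dataset and table names from the sample configuration above:

import_type: table
dataset: my_dataset
table: my_table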
For query loading, specify "import_type: query" and "query":
import_type: query
query: |-
  SELECT
    id, first_name, last_name, created_at
  FROM
    my_dataset.my_table
  WHERE first_name = "Treasure"

You can optionally specify "query_option". "use_legacy_sql" is false by default and "use_query_cache" is true by default:
query: SELECT ...
query_option:
  use_legacy_sql: false
  use_query_cache: true

You can specify the BigQuery location with "location" if necessary:
location: asia-northeast1

To enable incremental loading, specify "incremental: true" and "incremental_columns":
incremental: true
incremental_columns: [id]

To export data through Google Cloud Storage, specify "export_to_gcs: true", then add "temp_dataset", "temp_table", "gcs_bucket", and "gcs_path_prefix":
export_to_gcs: true
temp_dataset: temp
temp_table: temp_table
gcs_bucket: my-bucket
gcs_path_prefix: data-connector/result-

Run the td connector:preview command to validate your configuration file:
$ td connector:preview config.yml

Run td connector:create. The following example creates a daily import session with the BigQuery connector:
$ td connector:create daily_bigquery_import \
    "10 0 * * *" td_sample_db td_sample_table config.yml

Connector sessions need at least one timestamp column in the result data to use as the data partition key; the first timestamp column is chosen by default. Use the "--time-column" option if you want to specify a column explicitly:
$ td connector:create --time-column created_at \
    daily_bigquery_import ...

If your result data doesn't have any timestamp column, add a "time" column with the following filter configuration:
in:
  type: bigquery
  ...
filters:
  - type: add_time
    from_value:
      mode: upload_time
    to_column:
      name: time
out:
  type: td
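Here, "mode: upload_time" fills the new "time" column with the time of the import run itself, so records are partitioned by when they were uploaded.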