Google BigQuery Import Integration CLI

If you prefer, you can use the connector through TD Toolbelt.

First, set up TD Toolbelt on your command line.

Create the Configuration File

Create a YAML configuration file. This page refers to it as "config.yml".

Example (config.yml)

in:
  type: bigquery
  project_id: my-project
  auth_method: json_key
  json_keyfile:
    content: |
      {
        "type": "service_account",
        "project_id": "xxxxxx",
        ...
      }
  import_type: table
  dataset: my_dataset
  table: my_table
  incremental: true
  incremental_columns: [id]
  export_to_gcs: true
  temp_dataset: temp
  temp_table: temp_table
  gcs_bucket: my-bucket
  gcs_path_prefix: data-connector/result-
out:
  type: td

GCP Authentication

JSON Key

Specify "auth_method: json_key" and put the JSON content of the service account key in "json_keyfile.content".

auth_method: json_key
json_keyfile:
  content: |
    {
      "type": "service_account",
      "project_id": "xxxxxx",
      ...
    }

OAuth

To use an account authenticated through an OAuth 2 application, specify "auth_method: oauth2", "client_id", "client_secret", and "refresh_token".

auth_method: oauth2
client_id: 000000000000-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com
client_secret: yyyyyyyyyyyyyyyyyyyyyyyy
refresh_token: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
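
To show where these keys sit, here is a minimal sketch of a complete config.yml that uses OAuth with a table import. It only recombines keys shown on this page. Placing the OAuth keys directly under "in:" is an assumption based on the JSON key example, and all values are placeholders.

```yaml
in:
  type: bigquery
  project_id: my-project
  # OAuth 2 credentials (placeholder values; assumed to sit directly under "in:")
  auth_method: oauth2
  client_id: 000000000000-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com
  client_secret: yyyyyyyyyyyyyyyyyyyyyyyy
  refresh_token: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
  # What to import
  import_type: table
  dataset: my_dataset
  table: my_table
out:
  type: td
```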

Import Types

Table Loading

For table loading, specify "import_type: table", "dataset", and "table".

import_type: table
dataset: my_dataset
table: my_table

Query Loading

For query loading, specify "import_type: query" and "query".

import_type: query
query: |-
  SELECT
    id, first_name, last_name, created_at
  FROM
    my_dataset.my_table
  WHERE first_name = "Treasure"

Optionally, you can specify "query_option". "use_legacy_sql" defaults to false, and "use_query_cache" defaults to true.

query: SELECT ...
query_option:
  use_legacy_sql: false
  use_query_cache: true
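
Put together, a query import might look like the following sketch. It reuses the JSON key authentication and the sample query from this page. The placement of "query" and "query_option" directly under "in:" is an assumption that mirrors the table import example, and the elided key content is left as-is.

```yaml
in:
  type: bigquery
  project_id: my-project
  auth_method: json_key
  json_keyfile:
    content: |
      {
        "type": "service_account",
        "project_id": "xxxxxx",
        ...
      }
  # Import the result of a query instead of a whole table
  import_type: query
  query: |-
    SELECT
      id, first_name, last_name, created_at
    FROM
      my_dataset.my_table
    WHERE first_name = "Treasure"
  # Optional; these are the documented defaults, written out explicitly
  query_option:
    use_legacy_sql: false
    use_query_cache: true
out:
  type: td
```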

Data Location

If necessary, you can specify the location with "location".

location: asia-northeast1

Incremental Loading

To enable incremental loading, specify "incremental: true" and "incremental_columns".

incremental: true
incremental_columns: [id]

Importing Large Datasets

To enable this mode, specify "export_to_gcs: true" and add "temp_dataset", "temp_table", "gcs_bucket", and "gcs_path_prefix".

export_to_gcs: true
temp_dataset: temp
temp_table: temp_table
gcs_bucket: my-bucket
gcs_path_prefix: data-connector/result-

(Optional) Preview

Run the td connector:preview command to validate the configuration file.

$ td connector:preview config.yml

Create a New Connector Session

Run td connector:create.

The following example creates a daily import session that uses the BigQuery connector.

$ td connector:create daily_bigquery_import \
    "10 0 * * *" td_sample_db td_sample_table config.yml

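Data Partition Key

A connector session requires at least one timestamp column in the result data to serve as the data partition key. By default, the first timestamp column is selected as the key. To specify a particular column explicitly, use the "--time-column" option.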

$ td connector:create --time-column created_at \
    daily_bigquery_import ...

If the result data has no timestamp column, add a "time" column by adding a filter configuration as follows.

in:
  type: bigquery
  ...
filters:
- type: add_time
  from_value:
    mode: upload_time
  to_column:
    name: time
out:
  type: td