
Writing Job Results into TD Table

You can use the Treasure Data Connector to write job results to a Treasure Data database. Usually, this connector is used to export data from one Treasure Data account to another Treasure Data account.

We strongly recommend that you not use this connector to write data from one table in your account to another table in your account.
Instead, use an INSERT INTO query or an INSERT OVERWRITE/CREATE TABLE AS SELECT query for the best performance (see the example after this list).
Presto: "CREATE TABLE AS SELECT" or "INSERT INTO"
Hive: "INSERT INTO/OVERWRITE"
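
As a minimal sketch of a same-account copy using the Presto engine (assuming the www_access table exists in testdb, as in the examples later on this page; table and column names are illustrative):

$ td query -T presto -w -d testdb \
  "CREATE TABLE output_table AS SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"
$ td query -T presto -w -d testdb \
  "INSERT INTO output_table SELECT code, COUNT(1) FROM www_access GROUP BY code"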


Prerequisites

  • Basic knowledge of Treasure Data, including the toolbelt.

Basic Usage

You can configure a connection from the Treasure Data Console or from the command line interface.

How to complete an account-to-account transfer using the Treasure Data Console

Begin from the target Treasure Data account. You must copy the API Key information from the account profile.

Retrieve the API Key information

From the console, click the Profile icon of the administrator user.



Go to the Credentials tab.

Copy the Write-only key.

If Master key and Write-only key are not listed, click Create to verify the account. When verification completes and you return to the profile, you see both a master key and a write-only key. Copy the write-only key.

Create the Connection

Go to Connections > Sources Catalog and click the Treasure Data tile.



Enter the write-only API Key from the target account.



In API Hostname, enter the server to use for export. You can leave the field blank to use the default server for your region, or enter one of the following values:

  • AWS US
  • AWS JP
  • IDCF

Click Continue.

Give your connection a descriptive name and select Create Connection.

For On-demand Jobs from the CLI

For on-demand jobs, just add the --result option to the td query command. When the job is finished, the results are written into output_table within the testdb database.

$ td query --result 'td://@/testdb/output_table' \
  -w -d testdb \
  "SELECT code, COUNT(1) FROM www_access GROUP BY code"      

For Scheduled Jobs

For scheduled jobs, just add the --result option when scheduling a job. Every time the job runs, the results are written into output_table.

$ td result:create mydb td://@/testdb
$ td sched:create hourly_count_example "0 * * * *" \
  -d testdb "select count(*) from www_access" \
  --result mydb:output_table
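
To confirm that the schedule was registered and to review past runs, you can use the toolbelt's schedule commands (a sketch; hourly_count_example is the schedule created above):

$ td sched:list
$ td sched:history hourly_count_example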

Two Ways to Modify Data (Append/Replace)

You can write the query results in one of two modes: append or replace.

td://@/testdb/output_table              # append
td://@/testdb/output_table?mode=append  # append
td://@/testdb/output_table?mode=replace # replace

APPEND

This is the default mode. The query results are appended to the table. If the table does not exist yet, a new table is created. This operation is not idempotent; running the same job repeatedly does not produce the same result.

REPLACE

If the table already exists, the rows of the existing table are replaced with the query results. If the table does not exist yet, a new table will be created.
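
For example, reusing the on-demand query shown earlier, adding mode=replace to the result URL makes repeated runs leave the table with the same contents:

$ td query --result 'td://@/testdb/output_table?mode=replace' \
  -w -d testdb \
  "SELECT code, COUNT(1) FROM www_access GROUP BY code"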

We achieve atomicity (so that a consumer of the table always has consistent data) by performing the following three steps in a single transaction.

  1. Create a temporary table.
  2. Write to the temporary table.
  3. Replace the existing table with the temporary table atomically.
