
TD Toolbelt Reference

You can run Treasure Data from the command line using these commands.

Command group           Example
Basic Commands          td
Database Commands       td db:create <db>
Table Commands          td table:list [db]
Query Commands          td query [sql]
Import Commands         td import:list
Bulk Import Commands    td bulk_import:list
Result Commands         td result:list
Schedule Commands       td sched:list
Schema Commands         td schema:show <db> <table>
Connector Commands      td connector:guess [config]
User Commands           td user:list
Workflow Commands       td workflow init
Job Commands            td job:show <job_id>
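
For orientation, a typical first session chains several of these command groups; a minimal sketch (database, table, and file names are placeholders):

```shell
# create a database and a table, import an Apache access log, then count the rows
td db:create example_db
td table:create example_db table1
td table:import example_db table1 --apache access.log
td query -d example_db -w "select count(*) from table1"
```

Each command is documented in its own section below.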

Basic Commands

You can use the following commands to enable basic functions in Treasure Data.

td

Show list of options in Treasure Data.

Usage

td
Option                       Description
-c, --config PATH            Path to the configuration file (default: ~/.td/td.conf)
-k, --apikey KEY             Use this API key instead of reading the config file
-e, --endpoint API_SERVER    Specify the URL of the API server to use (default: https://api.treasuredata.com). The URL must contain a scheme (http:// or https:// prefix) to be valid.
--insecure                   Insecure access: disable SSL/TLS verification. Insecure mode is disabled by default.
-v, --verbose                Verbose mode
-r, --retry-post-requests    Retry failed POST requests. Warning: can cause resource duplication, such as duplicated job submissions.
--version                    Show version

Additional Commands

Usage

td <command>
Command       Description
db            create/delete/list databases
table         create/delete/list/import/export/tail tables
query         issue a query
job           show/kill/list jobs
import        manage bulk import sessions (Java-based fast processing)
bulk_import   manage bulk import sessions (old Ruby-based implementation)
result        create/delete/list result URLs
sched         create/delete/list schedules that run a query periodically
schema        create/delete/modify schemas of tables
connector     manage connectors
workflow      manage workflows
status        show scheds, jobs, tables and results
apikey        show/set API key
server        show status of the Treasure Data server
sample        create a sample log file
help          show help messages

Database Commands

You can create, delete, and view lists of databases from the command line.

td db create

Create a database.

Usage

td db:create <db>

Example

td db:create example_db

td db delete

Delete a database.

Usage

td db:delete <db>
Option         Description
-f, --force    clear tables and delete the database

Example

td db:delete example_db

td db list

Show the list of databases.

Usage

td db:list
Option                 Description
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td db:list
td dbs

Table Commands

You can create, list, show, and organize table structure using the command line.

td table list

Show list of tables.

Usage

td table:list [db]
Option                   Description
-n, --num_threads VAL    number of threads to get the list in parallel
--show-bytes             show estimated table size in bytes
-f, --format FORMAT      format of the result rendering (tsv, csv, json, or table; default: table)

Example

td table:list
td table:list example_db
td tables

td table show

Describe information in a table.

Usage

td table:show <db> <table>
Option    Description
-v        show more attributes

Example

td table:show example_db table1

td table create

Create a table.

Usage

td table:create <db> <table>
Option                     Description
-T, --type TYPE            set table type (log)
--expire-days DAYS         set table expire days
--include-v BOOLEAN        set the include_v flag
--detect-schema BOOLEAN    set the detect-schema flag

Example

td table:create example_db table1

td table delete

Delete a table.

Usage

td table:delete <db> <table>
Option         Description
-f, --force    never prompt

Example

td table:delete example_db table1

td table import

Parse and import files to a table

Usage

td table:import <db> <table> <files...>
Option                     Description
--format FORMAT            file format (default: apache)
--apache                   same as --format apache; Apache common log format
--syslog                   same as --format syslog; syslog format
--msgpack                  same as --format msgpack; msgpack stream format
--json                     same as --format json; LF-separated JSON format
-t, --time-key COL_NAME    time key name for the json and msgpack formats (e.g. 'created_at')
--auto-create-table        create the table and database if they do not exist

Example

td table:import example_db table1 --apache access.log
td table:import example_db table1 --json -t time - < test.json

How is the import command's time format set in a Windows batch file?

In batch files, % introduces an environment variable, so you must escape it as '%%':

td import:prepare --format csv --column-header \
--time-column 'date' --time-format '%%Y-%%m-%%d' test.csv

td table export

Dump logs in a table to the specified storage

Usage

td table:export <db> <table>
Option                               Description
-w, --wait                           wait until the job is completed
-f, --from TIME                      export data that is newer than or the same as TIME
-t, --to TIME                        export data that is older than TIME
-b, --s3-bucket NAME                 name of the destination S3 bucket (required)
-p, --prefix PATH                    path prefix of the file on S3
-k, --aws-key-id KEY_ID              AWS access key ID used to export data (required)
-s, --aws-secret-key SECRET_KEY      AWS secret access key used to export data (required)
-F, --file-format FILE_FORMAT        file format for exported data. Available formats are tsv.gz (tab-separated values per line) and jsonl.gz (JSON record per line). The json.gz and line-json.gz formats are the default and still available, but only for backward compatibility; their use is discouraged because they have far lower performance.
-O, --pool-name NAME                 specify resource pool by name
-e, --encryption ENCRYPT_METHOD      export with server-side encryption using ENCRYPT_METHOD
-a, --assume-role ASSUME_ROLE_ARN    export using an assumed role, with ASSUME_ROLE_ARN as the role ARN

Example

td table:export example_db table1 \
--s3-bucket mybucket -k KEY_ID -s SECRET_KEY
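
Because the legacy json.gz default is discouraged (see the -F option above), exports usually pass an explicit file format; a sketch with placeholder credentials:

```shell
# export as gzipped JSONL and wait for the job to complete
td table:export example_db table1 \
  --s3-bucket mybucket -k KEY_ID -s SECRET_KEY \
  -F jsonl.gz -w
```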

td table swap

Swap the names of two tables.

Usage

td table:swap <db> <table1> <table2>

Example

td table:swap example_db table1 table2

td table rename

Rename the existing table.

Usage

td table:rename <db> <from_table> <dest_table>
Option         Description
--overwrite    replace the existing destination table

Example

td table:rename example_db table1 table2

td table tail

Get recently imported logs.

Usage

td table:tail <db> <table>
Option           Description
-n, --count N    number of logs to get
-P, --pretty     pretty print

Example

td table:tail example_db table1
td table:tail example_db table1 -n 30

td table expire

Expire data in a table after the specified number of days. Set to 0 to disable expiration.

Usage

td table:expire <db> <table> <expire_days>

Example

td table:expire example_db table1 30

Query Commands

You can issue queries from the command line.

td query

Issue a query

Usage

td query [sql]
Option                             Description
-d, --database DB_NAME             use the database (required)
-w, --wait[=SECONDS]               wait for the job to finish (for SECONDS)
-G, --vertical                     use a vertical table to show results
-o, --output PATH                  write the result to the file
-f, --format FORMAT                format of the result to write to the file (tsv, csv, json, msgpack, and msgpack.gz)
-r, --result RESULT_URL            write the result to the URL (see also the result:create subcommand). It is suggested that this option be used with the -x / --exclude option to suppress printing of the query result to stdout, or with -o / --output to dump the query result into a file.
-u, --user NAME                    set the user name for the result URL
-p, --password                     ask for the password for the result URL
-P, --priority PRIORITY            set priority
-R, --retry COUNT                  automatic retry count
-q, --query PATH                   use a file instead of an inline query
-T, --type TYPE                    set query type (hive, trino (presto))
--sampling DENOMINATOR             OBSOLETE: enable random sampling to reduce records to 1/DENOMINATOR
-l, --limit ROWS                   limit the number of result rows shown when not outputting to a file
-c, --column-header                output the columns' header when the schema is available for the table (only applies to json, tsv, and csv formats)
-x, --exclude                      do not automatically retrieve the job result
-O, --pool-name NAME               specify resource pool by name
--domain-key DOMAIN_KEY            optional user-provided unique ID. You can include this ID with your create request to ensure idempotence.
--engine-version ENGINE_VERSION    specify query engine version by name

Example

td query -d example_db -w -r rset1 "select count(*) from table1"
td query -d example_db -w -r rset1 -q query.txt
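
As the -r note above recommends, pair -r with -x (or -o) so the result goes only to the result URL and is not also printed to stdout; a sketch:

```shell
# send the result to result URL rset1 without echoing it to the terminal
td query -d example_db -w -r rset1 -x "select count(*) from table1"
```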

Import Commands

You can import and organize data from the command line using these commands.
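The commands below form a session lifecycle; a minimal end-to-end sequence (session, database, table, and path names are placeholders) looks like:

```shell
td import:create logs_201201 example_db event_logs   # open a session
td import:prepare logs/*.csv --format csv --column-header -o parts/
td import:upload logs_201201 parts/*                 # upload prepared parts
td import:perform logs_201201 -w                     # validate and convert
td import:error_records logs_201201                  # inspect rejected records, if any
td import:commit logs_201201 -w                      # make the data visible
td import:delete logs_201201                         # clean up the session
```

The import:auto command collapses the perform/commit/delete steps into one invocation.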

td import list

List bulk import sessions

Usage

td import:list
Option                 Description
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td import:list

td import show

Show list of uploaded parts.

Usage

td import:show <name>

Example

td import:show logs_201201

td import create

Create a new bulk import session to the table

Usage

td import:create <name> <db> <table>

Example

td import:create logs_201201 example_db event_logs

td import jar version

Show import jar version

Usage

td import:jar_version

Example

td import:jar_version

td import jar update

Update import jar to the latest version

Usage

td import:jar_update

Example

td import:jar_update

td import prepare

Convert files into part file format

Usage

td import:prepare <files...>
Option                             Description
-f, --format FORMAT                source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv
-C, --compress TYPE                compression type [gzip, none, auto]; default=auto detect
-T, --time-format FORMAT           specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported.
-e, --encoding TYPE                encoding type [UTF-8, etc.]
-o, --output DIR                   output directory (default: 'out')
-s, --split-size SIZE_IN_KB        size of each part (default: 16384)
-t, --time-column NAME             name of the time column
--time-value TIME,HOURS            the time column's value. If the data doesn't have a time column, users can auto-generate the time column's value in two ways. Fixed time value with --time-value TIME: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME. E.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported. Incremental time value with --time-value TIME,HOURS: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode assigns incremental timestamps to subsequent records; timestamps are incremented by 1 second per record. If the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, 1394409602 to the third, and so on up to the 36000th record, which gets timestamp 1394445600 (1394409600 + 10 * 3600). The 36001st record is assigned 1394409600 again, and the timestamps restart from there.
--primary-key NAME:TYPE            pair of name and type of the primary key declared in your item table
--prepare-parallel NUM             prepare in parallel (default: 2; max 96)
--only-columns NAME,NAME,...       only these columns
--exclude-columns NAME,NAME,...    exclude these columns
--error-records-handling MODE      error records handling mode [skip, abort]; default=skip
--invalid-columns-handling MODE    invalid columns handling mode [autofix, warn]; default=warn
--error-records-output DIR         write error records (default directory: 'error-records')
--columns NAME,NAME,...            column names (use --column-header instead if the first line has column names)
--column-types TYPE,TYPE,...       column types [string, int, long, double]
--column-type NAME:TYPE            column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int'
-S, --all-string                   disable automatic type conversion
--empty-as-null-if-numeric         empty string values are interpreted as null values if the columns are numeric types
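
The incremental --time-value arithmetic above reduces to a modulo: a record's timestamp is TIME plus its 0-based index modulo HOURS * 3600. A shell sketch using the values from the example:

```shell
TIME=1394409600   # 2014-03-10T00:00:00 UTC
HOURS=10

# timestamp assigned to the i-th record (0-based); wraps every HOURS*3600 records
ts() { echo $(( TIME + $1 % (HOURS * 3600) )); }

ts 0       # -> 1394409600 (first record)
ts 1       # -> 1394409601
ts 36000   # -> 1394409600 (range overflowed; restarts at TIME)
```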

CSV/TSV Specific Options

Option              Description
--column-header     first line includes column names
--delimiter CHAR    delimiter CHAR; default="," for csv, "\t" for tsv
--escape CHAR       escape CHAR; default=\
--newline TYPE      newline [CRLF, LF, CR]; default=CRLF
--quote CHAR        quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv

MySQL Specific Options

Option                    Description
--db-url URL              JDBC connection URL
--db-user NAME            user name for the MySQL account
--db-password PASSWORD    password for the MySQL account

REGEX Specific Options

Option                     Description
--regex-pattern PATTERN    pattern to parse lines. When 'regex' is used as the source file format, this option is required.

Example

td import:prepare logs/*.csv --format csv \
--columns date_code,uid,price,count --time-value 1394409600,10 -o parts/

td import:prepare mytable --format mysql \
--db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass

td import:prepare "s3://<s3_access_key>:<s3_secret_key>@/my_bucket/path/to/*.csv" \
--format csv --column-header --time-column date_time -o parts/

td import upload

Upload or re-upload files into a bulk import session

Usage

td import:upload <session name> <files...>
Option                             Description
--retry-count NUM                  number of automatic retries of the upload process; default: 10
--auto-create DATABASE.TABLE       automatically create a bulk import session for the specified database and table names. If you use the 'auto-create' option, you must not specify a session name as the first argument.
--auto-perform                     perform the bulk import job automatically
--auto-commit                      commit the bulk import job automatically
--auto-delete                      delete the bulk import session automatically
--parallel NUM                     upload in parallel (default: 2; max 8)
-f, --format FORMAT                source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv
-C, --compress TYPE                compression type [gzip, none, auto]; default=auto detect
-T, --time-format FORMAT           specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported.
-e, --encoding TYPE                encoding type [UTF-8, etc.]
-o, --output DIR                   output directory (default: 'out')
-s, --split-size SIZE_IN_KB        size of each part (default: 16384)
-t, --time-column NAME             name of the time column
--time-value TIME,HOURS            the time column's value, auto-generated in one of two ways as described under import:prepare: a fixed value with --time-value TIME (a Unix time in seconds since the Epoch assigned to all records), or incremental values with --time-value TIME,HOURS (timestamps incremented by 1 second per record, restarting at TIME once timestamp >= TIME + HOURS * 3600).
--primary-key NAME:TYPE            pair of name and type of the primary key declared in your item table
--prepare-parallel NUM             prepare in parallel (default: 2; max 96)
--only-columns NAME,NAME,...       only these columns
--exclude-columns NAME,NAME,...    exclude these columns
--error-records-handling MODE      error records handling mode [skip, abort]; default=skip
--invalid-columns-handling MODE    invalid columns handling mode [autofix, warn]; default=warn
--error-records-output DIR         write error records (default directory: 'error-records')
--columns NAME,NAME,...            column names (use --column-header instead if the first line has column names)
--column-types TYPE,TYPE,...       column types [string, int, long, double]
--column-type NAME:TYPE            column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int'
-S, --all-string                   disable automatic type conversion
--empty-as-null-if-numeric         empty string values are interpreted as null values if the columns are numeric types

CSV/TSV Specific Options

Option              Description
--column-header     first line includes column names
--delimiter CHAR    delimiter CHAR; default="," for csv, "\t" for tsv
--escape CHAR       escape CHAR; default=\
--newline TYPE      newline [CRLF, LF, CR]; default=CRLF
--quote CHAR        quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv

MySQL Specific Options

Option                    Description
--db-url URL              JDBC connection URL
--db-user NAME            user name for the MySQL account
--db-password PASSWORD    password for the MySQL account

REGEX Specific Options

Option                     Description
--regex-pattern PATTERN    pattern to parse lines. When 'regex' is used as the source file format, this option is required.

Example

td import:upload mysess parts/* --parallel 4

td import:upload mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/

td import:upload parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/

td import:upload mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass

td import:upload "s3://<s3_access_key>:<s3_secret_key>@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/

td import auto

Automatically upload or re-upload files into a bulk import session. It is the functional equivalent of the 'upload' command with the 'auto-perform', 'auto-commit', and 'auto-delete' options. Unlike 'upload', it does not provide 'auto-create' by default; if you want 'auto-create', you must declare it explicitly as a command option.

Usage

td import:auto <session name> <files...>
Option                             Description
--retry-count NUM                  number of automatic retries of the upload process; default: 10
--auto-create DATABASE.TABLE       automatically create a bulk import session for the specified database and table names. If you use the 'auto-create' option, you must not specify a session name as the first argument.
--parallel NUM                     upload in parallel (default: 2; max 8)
-f, --format FORMAT                source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv
-C, --compress TYPE                compression type [gzip, none, auto]; default=auto detect
-T, --time-format FORMAT           specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported.
-e, --encoding TYPE                encoding type [UTF-8, etc.]
-o, --output DIR                   output directory (default: 'out')
-s, --split-size SIZE_IN_KB        size of each part (default: 16384)
-t, --time-column NAME             name of the time column
--time-value TIME,HOURS            the time column's value, auto-generated in one of two ways as described under import:prepare: a fixed value with --time-value TIME (a Unix time in seconds since the Epoch assigned to all records), or incremental values with --time-value TIME,HOURS (timestamps incremented by 1 second per record, restarting at TIME once timestamp >= TIME + HOURS * 3600).
--primary-key NAME:TYPE            pair of name and type of the primary key declared in your item table
--prepare-parallel NUM             prepare in parallel (default: 2; max 96)
--only-columns NAME,NAME,...       only these columns
--exclude-columns NAME,NAME,...    exclude these columns
--error-records-handling MODE      error records handling mode [skip, abort]; default=skip
--invalid-columns-handling MODE    invalid columns handling mode [autofix, warn]; default=warn
--error-records-output DIR         write error records (default directory: 'error-records')
--columns NAME,NAME,...            column names (use --column-header instead if the first line has column names)
--column-types TYPE,TYPE,...       column types [string, int, long, double]
--column-type NAME:TYPE            column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int'
-S, --all-string                   disable automatic type conversion
--empty-as-null-if-numeric         empty string values are interpreted as null values if the columns are numeric types

CSV/TSV Specific Options

Option              Description
--column-header     first line includes column names
--delimiter CHAR    delimiter CHAR; default="," for csv, "\t" for tsv
--escape CHAR       escape CHAR; default=\
--newline TYPE      newline [CRLF, LF, CR]; default=CRLF
--quote CHAR        quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv

MySQL Specific Options

Option                    Description
--db-url URL              JDBC connection URL
--db-user NAME            user name for the MySQL account
--db-password PASSWORD    password for the MySQL account

REGEX Specific Options

Option                     Description
--regex-pattern PATTERN    pattern to parse lines. When 'regex' is used as the source file format, this option is required.

Example

td import:auto mysess parts/* --parallel 4

td import:auto mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/

td import:auto parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/

td import:auto mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass

td import:auto "s3://<s3_access_key>:<s3_secret_key>@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/

td import perform

Start to validate and convert uploaded files

Usage

td import:perform <name>
Option                  Description
-w, --wait              wait for the job to finish
-f, --force             force start performing
-O, --pool-name NAME    specify resource pool by name

Example

td import:perform logs_201201

td import error records

Show records which did not pass validations

Usage

td import:error_records <name>

Example

td import:error_records logs_201201

td import commit

Start to commit a performed bulk import session

Usage

td import:commit <name>
Option        Description
-w, --wait    wait for the commit to finish

Example

td import:commit logs_201201

td import delete

Delete a bulk import session

Usage

td import:delete <name>

Example

td import:delete logs_201201

td import freeze

Pause further data uploads to a bulk import session; subsequent uploads to the session are rejected.

Usage

td import:freeze <name>

Example

td import:freeze logs_201201

td import unfreeze

Unfreeze a bulk import session

Usage

td import:unfreeze <name>

Example

td import:unfreeze logs_201201

td import config

Create a guess config from arguments.

Usage

td import:config <files...>
Option                           Description
-o, --out FILE_NAME              output file name for connector:guess
-f, --format FORMAT              source file format [csv, tsv, mysql]; default=csv
--db-url URL                     database connection URL
--db-user NAME                   user name for the database
--db-password PASSWORD           password for the database
--columns COLUMNS                not supported
--column-header COLUMN-HEADER    not supported
--time-column TIME-COLUMN        not supported
--time-format TIME-FORMAT        not supported

Example

td import:config "s3://<s3_access_key>:<s3_secret_key>@/my_bucket/path/to/*.csv" -o seed.

Bulk Import Commands

You can create and organize bulk imports from the command line.

For instructions on how to use the bulk import commands, refer to the Bulk Import API Tutorial.
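
As a sketch of the lifecycle these commands implement (session, database, table, and path names are placeholders):

```shell
td bulk_import:create logs_201201 example_db event_logs
td bulk_import:prepare_parts logs/*.csv --format csv \
  --columns time,uid,price,count --time-column "time" -o parts/
td bulk_import:upload_parts logs_201201 parts/*
td bulk_import:freeze logs_201201        # reject further uploads
td bulk_import:perform logs_201201 -w    # validate and convert
td bulk_import:commit logs_201201 -w     # make the data visible
```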

td bulk import list

List bulk import sessions

Usage

td bulk_import:list
Option                 Description
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td bulk_import:list

td bulk import show

Shows a list of uploaded parts

Usage

td bulk_import:show <name>

Example

td bulk_import:show logs_201201

td bulk import create

Creates a new bulk import session to the table

Usage

td bulk_import:create <name> <db> <table>

Example

td bulk_import:create logs_201201 example_db event_logs

td bulk import prepare parts

Converts files into part file format

Usage

td bulk_import:prepare_parts <files...>
Option                         Description
-f, --format NAME              source file format [csv, tsv, msgpack, json]
-h, --columns NAME,NAME,...    column names (use --column-header instead if the first line has column names)
-H, --column-header            first line includes column names
-d, --delimiter REGEX          delimiter between columns (default: (?-mix:\t|,))
--null REGEX                   null expression for the automatic type conversion (default: (?i-mx:\A(?:null||-|\N)\z))
--true REGEX                   true expression for the automatic type conversion (default: (?i-mx:\A(?:true)\z))
--false REGEX                  false expression for the automatic type conversion (default: (?i-mx:\A(?:false)\z))
-S, --all-string               disable automatic type conversion
-t, --time-column NAME         name of the time column
-T, --time-format FORMAT       strftime(3) format of the time column
--time-value TIME              value of the time column
-e, --encoding NAME            text encoding
-C, --compress NAME            compression format name [plain, gzip] (default: auto detect)
-s, --split-size SIZE_IN_KB    size of each part (default: 16384)
-o, --output DIR               output directory

Example

td bulk_import:prepare_parts logs/*.csv --format csv \
--columns time,uid,price,count --time-column "time" -o parts/

td bulk import upload parts

Uploads or re-uploads files into a bulk import session

Usage

td bulk_import:upload_parts <name> <files...>
Option                    Description
-P, --prefix NAME         add a prefix to the parts name
-s, --use-suffix COUNT    use COUNT number of dots (.) in the source file name for the parts name
--auto-perform            perform the bulk import job automatically
--parallel NUM            perform uploading in parallel (default: 2; max 8)
-O, --pool-name NAME      specify resource pool by name

Example

td bulk_import:upload_parts logs_201201 parts/*

td bulk import delete parts

Delete uploaded files from a bulk import session

Usage

td bulk_import:delete_parts <name> <ids...>
Option               Description
-P, --prefix NAME    add a prefix to the parts name

Example

td bulk_import:delete_parts logs_201201 01h 02h 03h

td bulk import perform

Start to validate and convert uploaded files

Usage

td bulk_import:perform <name>
Option                  Description
-w, --wait              wait for the job to finish
-f, --force             force start performing
-O, --pool-name NAME    specify resource pool by name

Example

td bulk_import:perform logs_201201

td bulk import error records

Show records which did not pass validations

Usage

td bulk_import:error_records <name>

Example

td bulk_import:error_records logs_201201

td bulk import commit

Start to commit a performed bulk import session

Usage

td bulk_import:commit <name>
Option        Description
-w, --wait    wait for the commit to finish

Example

td bulk_import:commit logs_201201

td bulk import delete

Delete a bulk import session

Usage

td bulk_import:delete <name>

Example

td bulk_import:delete logs_201201

td bulk import freeze

Block the upload to a bulk import session

Usage

td bulk_import:freeze <name>

Example

td bulk_import:freeze logs_201201

td bulk import unfreeze

Unfreeze a frozen bulk import session

Usage

td bulk_import:unfreeze <name>

Example

td bulk_import:unfreeze logs_201201

Result Commands

You can use the command line to list, create, show, and delete results.

td result list

Show list of result URLs

Usage

td result:list
Option                 Description
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td result:list
td results

td result show

Describe information of a result URL.

Usage

td result:show <name>

Example

td result:show name

td result create

Create a result URL

Usage

td result:create <name> <URL>
Option             Description
-u, --user NAME    set the user name for authentication
-p, --password     ask for the password for authentication

Example

td result:create name mysql://my-server/mydb

td result delete

Delete a result URL.

Usage

td result:delete <name>

Example

td result:delete name

Schedule Commands

You can use the command line to schedule, update, delete, and list queries.

td sched list

Show list of schedules

Usage

td sched:list
Option                 Description
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td sched:list
td scheds

td sched create

Create a schedule

Usage

td sched:create <name> <cron> [sql]
Option                     Description
-d, --database DB_NAME     use the database (required)
-t, --timezone TZ          name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with the -t / --timezone 'America/Los_Angeles' option.
-D, --delay SECONDS        delay time of the schedule
-r, --result RESULT_URL    write the result to the URL (see also the result:create subcommand)
-u, --user NAME            set the user name for the result URL
-p, --password             ask for the password for the result URL
-P, --priority PRIORITY    set priority
-q, --query PATH           use a file instead of an inline query
-R, --retry COUNT          automatic retry count
-T, --type TYPE            set query type (hive)

Example

td sched:create sched1 "0 * * * *" -d example_db \
"select count(*) from table1" -r rset1

td sched:create sched1 "0 * * * *" \
-d example_db -q query.txt -r rset2
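
Since the cron expression is interpreted in UTC unless -t is given, a schedule meant for local noon should pin the timezone explicitly; a sketch (schedule and result names are placeholders):

```shell
# run daily at 12 PM Pacific time rather than 12 PM UTC
td sched:create daily_count "0 12 * * *" -d example_db \
  -t "America/Los_Angeles" -r rset1 "select count(*) from table1"
```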

td sched delete

Delete a schedule

Usage

td sched:delete <name>

Example

td sched:delete sched1

td sched update

Modify a schedule

Usage

td sched:update <name>
Option                             Description
-n, --newname NAME                 change the schedule's name
-s, --schedule CRON                change the schedule
-q, --query SQL                    change the query
-d, --database DB_NAME             change the database
-r, --result RESULT_URL            change the result target (see also the result:create subcommand)
-t, --timezone TZ                  name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with the -t / --timezone 'America/Los_Angeles' option.
-D, --delay SECONDS                change the delay time of the schedule
-P, --priority PRIORITY            set priority
-R, --retry COUNT                  automatic retry count
-T, --type TYPE                    set query type (hive)
--engine-version ENGINE_VERSION    specify query engine version by name

Example

td sched:update sched1 -s "0 */2 * * *" -d my_db -t "Asia/Tokyo" -D 3600

td sched history

Show history of scheduled queries

Usage

td sched:history <name> [max]
Option                 Description
-p, --page PAGE        skip N pages
-s, --skip N           skip N schedules
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td sched:history sched1 --page 1

td sched run

Run scheduled queries for the specified time

Usage

td sched:run <name> <time>
Option                 Description
-n, --num N            number of jobs to run
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default: table)

Example

td sched:run sched1 "2013-01-01 00:00:00" -n 6

td sched result

Show the status and result of the last job that ran. --last [N] shows the result N jobs before the last. The other options are identical to those of the job:show command.

Usage

td sched:result <name>
Options

-v, --verbose          show logs
-w, --wait             wait for the job to finish
-G, --vertical         use a vertical table to show results
-o, --output PATH      write the result to the file
-l, --limit ROWS       limit the number of result rows shown when not outputting to a file
-c, --column-header    output the column header when the schema is available for the table (only applies to tsv and csv formats)
-x, --exclude          do not automatically retrieve the job result
--null STRING          null expression in csv or tsv
-f, --format FORMAT    format of the result to write to the file (tsv, csv, json, msgpack, and msgpack.gz)
--last [Number]        show the result of the Nth job before the last (default: 1)

Example

td sched:result sched1
td sched:result sched1 --last
td sched:result sched1 --last 3
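
A result exported with -f csv -c -o can be post-processed with ordinary tooling; a minimal sketch using Python's csv module (the file name, column names, and rows are hypothetical):

```python
import csv
import io

# Stand-in for a file exported with:
#   td sched:result sched1 -f csv -c -o result.csv
# The columns and rows below are hypothetical sample data.
exported = io.StringIO("user,size\nalice,120\nbob,48\n")

rows = list(csv.DictReader(exported))       # -c ensures the header row exists
total = sum(int(r["size"]) for r in rows)   # aggregate a numeric column
print(total)  # 168
```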

Schema Commands

Use the command line to work with schema in a table.

td schema show

Show schema of a table

Usage

td schema:show <db> <table>

Example

td schema:show example_db table1

td schema set

Set new schema on a table

Usage

td schema:set <db> <table> [columns...]

Example

td schema:set example_db table1 user:string size:int

td schema add

Add new columns to a table.

Usage

td schema:add <db> <table> <columns...>

Example

td schema:add example_db table1 user:string size:int

td schema remove

Remove columns from a table

Usage

td schema:remove <db> <table> <columns...>

Example

td schema:remove example_db table1 user size

Connector Commands

You can use the command line to control several elements related to connectors.

td connector guess

Run guess to generate a connector configuration file. Using the connector's credentials, this command examines the data and attempts to determine the file type, delimiter character, and column names. This "guess" is then written to the configuration file for the connector. This command is useful for file-based connectors.

Usage

td connector:guess [config]
Options

--type[=TYPE]                (obsolete)
--access-id ID               (obsolete)
--access-secret SECRET       (obsolete)
--source SOURCE              (obsolete)
-o, --out FILE_NAME          output file name for connector:preview
-g, --guess NAME,NAME,...    specify the list of guess plugins to use

Example

td connector:guess seed.yml -o config.yml

Example seed.yml

in:
  type: s3
  bucket: my-s3-bucket
  endpoint: s3-us-west-1.amazonaws.com
  path_prefix: path/prefix/to/import/
  access_key_id: ABCXYZ123ABCXYZ123
  secret_access_key: AbCxYz123aBcXyZ123
out:
  mode: append
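
After connector:guess runs, the output file typically contains the original settings plus an inferred parser section. A hypothetical config.yml for a CSV source might look like the following (the parser details and column names are illustrative, not actual guess output):

in:
  type: s3
  bucket: my-s3-bucket
  endpoint: s3-us-west-1.amazonaws.com
  path_prefix: path/prefix/to/import/
  access_key_id: ABCXYZ123ABCXYZ123
  secret_access_key: AbCxYz123aBcXyZ123
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    skip_header_lines: 1
    columns:
    - {name: user, type: string}
    - {name: size, type: long}
out:
  mode: append

Review the guessed parser block with connector:preview before loading any data.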

td connector preview

Show a subset of possible data that the data connector fetches

Usage

td connector:preview <config>
Options

-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td connector:preview td-load.yml

td connector issue

Run a connector execution once

Usage

td connector:issue <config>
Options

--database DB_NAME           destination database
--table TABLE_NAME           destination table
--time-column COLUMN_NAME    data partitioning key
-w, --wait                   wait for the job to finish
--auto-create-table          create the table and database if they do not exist

Example

td connector:issue td-load.yml

td connector list

Shows a list of connector sessions

Usage

td connector:list
Options

-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td connector:list

td connector create

Creates a new connector session

Usage

td connector:create <name> <cron> <database> <table> <config>
Options

--time-column COLUMN_NAME    data partitioning key
-t, --timezone TZ            name of the timezone. Only extended timezone names such as 'Asia/Tokyo' or 'America/Los_Angeles' are supported (not abbreviations like 'PST' or 'PDT'). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. For example, the cron schedule '0 12 * * *' executes daily at 12:00 UTC without the timezone option, and at 12:00 local time with --timezone 'America/Los_Angeles'
-D, --delay SECONDS          delay time of the schedule

Example

td connector:create connector1 "0 * * * *" \
connector_database connector_table td-load.yml

td connector show

Shows the execution settings for a connector such as name, timezone, delay, database, table

Usage

td connector:show <name>

Example

td connector:show connector1

td connector update

Modify a connector session

Usage

td connector:update <name> [config]
Options

-n, --newname NAME                change the schedule's name
-d, --database DB_NAME            change the database
-t, --table TABLE_NAME            change the table
-s, --schedule [CRON]             change the schedule, or leave blank to remove the schedule
-z, --timezone TZ                 name of the timezone. Only extended timezone names such as 'Asia/Tokyo' or 'America/Los_Angeles' are supported (not abbreviations like 'PST' or 'PDT'). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. For example, the cron schedule '0 12 * * *' executes daily at 12:00 UTC without the timezone option, and at 12:00 local time with --timezone 'America/Los_Angeles'
-D, --delay SECONDS               change the delay time of the schedule
-T, --time-column COLUMN_NAME     change the name of the time column
-c, --config CONFIG_FILE          update the connector configuration
--config-diff CONFIG_DIFF_FILE    update the connector config_diff

Example

td connector:update connector1 -c td-bulkload.yml -s '@daily' ...

td connector delete

Delete a connector session

Usage

td connector:delete <name>

Example

td connector:delete connector1

td connector history

Show the job history of a connector session

Usage

td connector:history <name>
Options

-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td connector:history connector1

td connector run

Run a connector session for the specified time.

Usage

td connector:run <name> [time]
Options

-w, --wait    wait for the job to finish

Example

td connector:run connector1 "2016-01-01 00:00:00"

User Commands

You can use the command line to control several elements related to users.

td user list

Show a list of users.

Usage

td user:list
Options

-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td user:list

td user:list -f csv

td user show

Show a user.

Usage

td user:show <name>

Example

td user:show "Roberta Smith"

td user create

Create a user. As part of the user creation process, you will be prompted to provide a password for the user.

Usage

td user:create <first_name> --email <email_address>

Example

td user:create "Roberta" --email "roberta.smith@acme.com"

td user delete

Delete a user.

Usage

td user:delete <email_address>

Example

td user:delete roberta.smith@acme.com

td user apikey list

Show API keys for a user.

Usage

td user:apikey:list <email_address>

Options

-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td user:apikey:list roberta.smith@acme.com

td user:apikey:list roberta.smith@acme.com -f csv

td user apikey add

Add an API key to a user.

Usage

td user:apikey:add <email_address>

Example

td user:apikey:add roberta.smith@acme.com

td user apikey remove

Remove an API key from a user.

Usage

td user:apikey:remove <email_address> <apikey>

Example

td user:apikey:remove roberta.smith@acme.com 1234565/abcdefg

Workflow Commands

You can create or modify workflows from the CLI using the following commands. The command wf can be used interchangeably with workflow.

Basic Workflow Commands

td workflow reset

Reset the workflow module

Usage

td workflow:reset

td workflow update

Update the workflow module

Usage

td workflow:update [version]

td workflow version

Show workflow module version

Usage

td workflow:version

Local-mode commands

You can use the following commands to locally initiate changes to workflows.

Usage

td workflow <command> [options...]
Options

init <dir>              create a new workflow project
r[un] <workflow.dig>    run a workflow
c[heck]                 show workflow definitions
sched[uler]             run a scheduler server
migrate (run|check)     migrate the database
selfupdate              update the CLI to the latest version
Info

To manage secrets in local mode, use the following command:

td workflow secrets --local

Server-mode commands

You can use the following commands to initiate changes to workflows from the server.

Usage

td workflow <command> [options...]
Options

server    start the server

Client-mode commands

You can use the following commands to initiate changes to workflows from the client.

Usage

td workflow <command> [options...]
Options

push <project-name>                 create and upload a new revision
download <project-name>             pull an uploaded revision
start <project-name> <name>         start a new session attempt of a workflow
retry <attempt-id>                  retry a session
kill <attempt-id>                   kill a running session attempt
backfill <schedule-id>              start sessions of a schedule for past times
backfill <project-name> <name>      start sessions of a schedule for past times
reschedule <schedule-id>            skip sessions of a schedule to a future time
reschedule <project-name> <name>    skip sessions of a schedule to a future time
projects [name]                     show projects
workflows [project-name] [name]     show registered workflow definitions
schedules                           show registered schedules
disable <schedule-id>               disable a workflow schedule
disable <project-name>              disable all workflow schedules in a project
disable <project-name> <name>       disable a workflow schedule
enable <schedule-id>                enable a workflow schedule
enable <project-name>               enable all workflow schedules in a project
enable <project-name> <name>        enable a workflow schedule
sessions                            show sessions for all workflows
sessions <project-name>             show sessions for all workflows in a project
sessions <project-name> <name>      show sessions for a workflow
session <session-id>                show a single session
attempts                            show attempts for all sessions
attempts <session-id>               show attempts for a session
attempt <attempt-id>                show a single attempt
tasks <attempt-id>                  show tasks of a session attempt
delete <project-name>               delete a project
secrets --project <project-name>    manage secrets
version                             show client and server version

Parameters

-L, --log PATH                  output log messages to a file (default: -)
-l, --log-level LEVEL           log level (error, warn, info, debug, or trace)
-X KEY=VALUE                    add a performance system config
-c, --config PATH.properties    configuration file (default: /Users/<user_name>/.config/digdag/config)
--version                       show client version

Client options

-e, --endpoint URL           server endpoint
-H, --header KEY=VALUE       add additional headers
--disable-version-check      disable the server version check
--disable-cert-validation    disable certificate verification
--basic-auth <user:pass>     add an Authorization header with the provided username and password

Job Commands

You can view status and results of jobs, view lists of jobs and delete jobs using the CLI.

td job show

Show status and results of a job.

Usage

td job:show <job_id>

Example

td job:show 1461
Options

-v, --verbose          show logs
-w, --wait             wait for the job to finish
-G, --vertical         use a vertical table to show results
-o, --output PATH      write results to the file
-l, --limit ROWS       limit the number of result rows shown when not outputting to a file
-c, --column-header    output the column header when the schema is available for the table (only applies to tsv and csv formats)
-x, --exclude          do not automatically retrieve the job result
--null STRING          null expression in csv or tsv
-f, --format FORMAT    format of the result to write to the file (tsv, csv, json, msgpack, and msgpack.gz)

td job status

Show status progress of a job.

Usage

td job:status <job_id>

Example

td job:status 1461

td job list

Show a list of jobs.

Usage

td job:list [max]

[max] is the maximum number of jobs to show.

Options

-p, --page PAGE        skip N pages
-s, --skip N           skip N jobs
-R, --running          show only running jobs
-S, --success          show only succeeded jobs
-E, --error            show only failed jobs
--slow [SECONDS]       show slow queries (default threshold: 3600 seconds)
-f, --format FORMAT    format of the result rendering (tsv, csv, json, or table; default is table)

Example

td job:list 10

td job kill

Kill or cancel a job.

Usage

td job:kill <job_id>

Example

td job:kill 1461