# TD Toolbelt Reference

You can run Treasure Data from the command line using these commands.

| Command | Example |
| --- | --- |
| [Basic Commands](#basic-commands) | `td` |
| [Database Commands](#database-commands) | `td db:create <db_name>` |
| [Table Commands](#table-commands) | `td table:list [db]` |
| [Query Commands](#query-commands) | `td query [sql]` |
| [Import Commands](#import-commands) | `td import:list` |
| [Bulk Import Commands](#bulk-import-commands) | `td bulk_import:list` |
| [Result Commands](#result-commands) | `td result:list` |
| [Schedule Commands](#schedule-commands) | `td sched:list` |
| [Schema Commands](#schema-commands) | `td schema:show <db> <table>` |
| [Connector Commands](#connector-commands) | `td connector:guess [config]` |
| [User Commands](#user-commands) | `td user:list` |
| [Workflow Commands](#workflow-commands) | `td workflow init` |
| [Job Commands](#job-commands) | `td job:show <job_id>` |

## Basic Commands

You can use the following commands to enable basic functions in Treasure Data.

+ [td](#td)
+ [Additional commands](#additional-commands)

### td

Show the list of options in Treasure Data.

**Usage**

```bash
td
```

| Options | Description |
| --- | --- |
| `-c, --config PATH` | path to the configuration file (default: `~/.td/td.conf`) |
| `-k, --apikey KEY` | use this API key instead of reading the config file |
| `-e, --endpoint API_SERVER` | specify the URL of the API server to use (default: https://api.treasuredata.com). The URL must contain a scheme (an `http://` or `https://` prefix) to be valid. |
| `--insecure` | insecure access: disable SSL/TLS verification (disabled by default) |
| `-v, --verbose` | verbose mode |
| `-r, --retry-post-requests` | retry failed POST requests. Warning: retrying can cause resource duplication, such as duplicated job submissions. |
| `--version` | show version |

### Additional Commands

**Usage**

```bash
td <command>
```

| Options | Description |
| --- | --- |
| `db` | create/delete/list databases |
| `table` | create/delete/list/import/export/tail tables |
| `query` | issue a query |
| `job` | show/kill/list jobs |
| `import` | manage bulk import sessions (Java-based fast processing) |
| `bulk_import` | manage bulk import sessions (old Ruby-based implementation) |
| `result` | create/delete/list result URLs |
| `sched` | create/delete/list schedules that run a query periodically |
| `schema` | create/delete/modify schemas of tables |
| `connector` | manage connectors |
| `workflow` | manage workflows |
| `status` | show scheds, jobs, tables, and results |
| `apikey` | show/set API key |
| `server` | show status of the Treasure Data server |
| `sample` | create a sample log file |
| `help` | show help messages |

## Database Commands

You can create, delete, and view lists of databases from the command line.

+ [td db:create](#td-db-create)
+ [td db:delete](#td-db-delete)
+ [td db:list](#td-db-list)

### td db create

Create a database.

**Usage**

```bash
td db:create <db_name>
```

**Example**

```bash
td db:create example_db
```

### td db delete

Delete a database.

**Usage**

```bash
td db:delete <db_name>
```

| Options | Description |
| --- | --- |
| `-f, --force` | clear tables and delete the database |

**Example**

```bash
td db:delete example_db
```

### td db list

Show the list of databases.

**Usage**

```bash
td db:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```bash
td db:list
td dbs
```

## Table Commands

You can create, list, show, and organize tables from the command line.
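The `-c, --config PATH` option listed under Basic Commands reads an INI-style credentials file. As a sketch, a minimal `~/.td/td.conf` typically looks like the following (the account values below are placeholders, and the exact fields may vary by toolbelt version):

```
[account]
  user = me@example.com
  apikey = 1234/abcdef0123456789...
```
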
+ [td table:list](#td-table-list)
+ [td table:show](#td-table-show)
+ [td table:create](#td-table-create)
+ [td table:delete](#td-table-delete)
+ [td table:import](#td-table-import)
+ [td table:export](#td-table-export)
+ [td table:swap](#td-table-swap)
+ [td table:rename](#td-table-rename)
+ [td table:tail](#td-table-tail)
+ [td table:expire](#td-table-expire)

### td table list

Show the list of tables.

**Usage**

```bash
td table:list [db]
```

| Options | Description |
| --- | --- |
| `-n, --num_threads VAL` | number of threads used to get the list in parallel |
| `--show-bytes` | show estimated table size in bytes |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```bash
td table:list
td table:list example_db
td tables
```

### td table show

Describe information in a table.

**Usage**

```bash
td table:show <db> <table>
```

| Options | Description |
| --- | --- |
| `-v` | show more attributes |

**Example**

```bash
td table:show example_db table1
```

### td table create

Create a table.

**Usage**

```bash
td table:create <db> <table>
```

| Options | Description |
| --- | --- |
| `-T, --type TYPE` | set table type (log) |
| `--expire-days DAYS` | set table expire days |
| `--include-v BOOLEAN` | set include_v flag |
| `--detect-schema BOOLEAN` | set detect schema flag |

**Example**

```bash
td table:create example_db table1
```

### td table delete

Delete a table.

**Usage**

```bash
td table:delete <db> <table>
```

| Options | Description |
| --- | --- |
| `-f, --force` | never prompt |

**Example**

```bash
td table:delete example_db table1
```

### td table import

Parse and import files to a table.

**Usage**

```bash
td table:import <db> <table> <files...>
```

| Options | Description |
| --- | --- |
| `--format FORMAT` | file format (default: apache) |
| `--apache` | same as --format apache; Apache common log format |
| `--syslog` | same as --format syslog; syslog |
| `--msgpack` | same as --format msgpack; msgpack stream format |
| `--json` | same as --format json; LF-separated JSON format |
| `-t, --time-key COL_NAME` | time key name for the json and msgpack formats (e.g. 'created_at') |
| `--auto-create-table` | create the table and database if they don't exist |

**Example**

```bash
td table:import example_db table1 --apache access.log
td table:import example_db table1 --json -t time - < test.json
```

#### How is the import command's time format set in a Windows batch file?

In batch files, `%` marks an environment variable, so you must escape it as `%%`.

```bash
td import:prepare --format csv --column-header \
  --time-column 'date' --time-format '%%Y-%%m-%%d' test.csv
```

### td table export

Dump logs in a table to the specified storage.

**Usage**

```bash
td table:export <db> <table>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait until the job is completed |
| `-f, --from TIME` | export data that is newer than or equal to TIME |
| `-t, --to TIME` | export data that is older than TIME |
| `-b, --s3-bucket NAME` | name of the destination S3 bucket (required) |
| `-p, --prefix PATH` | path prefix of the file on S3 |
| `-k, --aws-key-id KEY_ID` | AWS access key ID used to export data (required) |
| `-s, --aws-secret-key SECRET_KEY` | AWS secret access key used to export data (required) |
| `-F, --file-format FILE_FORMAT` | file format for exported data. Available formats are tsv.gz (tab-separated values per line) and jsonl.gz (JSON record per line). The json.gz and line-json.gz formats are the default and still available, but only for backward compatibility; their use is discouraged because they have far lower performance. |
| `-O, --pool-name NAME` | specify resource pool by name |
| `-e, --encryption ENCRYPT_METHOD` | export with server-side encryption with the ENCRYPT_METHOD |
| `-a, --assume-role ASSUME_ROLE_ARN` | export with assume role, with ASSUME_ROLE_ARN as the role ARN |

**Example**

```bash
td table:export example_db table1 \
  --s3-bucket mybucket -k KEY_ID -s SECRET_KEY
```

### td table swap

Swap the names of two tables.

**Usage**

```bash
td table:swap <db> <table1> <table2>
```

**Example**

```bash
td table:swap example_db table1 table2
```

### td table rename

Rename an existing table.

**Usage**

```bash
td table:rename <db> <from_table> <dest_table>
```

| Options | Description |
| --- | --- |
| `--overwrite` | replace the existing destination table |

**Example**

```bash
td table:rename example_db table1 table2
```

### td table tail

Get recently imported logs.

**Usage**

```bash
td table:tail <db> <table>
```

| Options | Description |
| --- | --- |
| `-n, --count N` | number of logs to get |
| `-P, --pretty` | pretty print |

**Example**

```bash
td table:tail example_db table1
td table:tail example_db table1 -n 30
```

### td table expire

Expire data in a table after the specified number of days. Set to `0` to disable expiration.

**Usage**

```bash
td table:expire <db> <table> <days>
```

**Example**

```bash
td table:expire example_db table1 30
```

## Query Commands

You can issue queries from the command line.

+ [td query](#td-query)

### td query

Issue a query.

**Usage**

```bash
td query [sql]
```

| Options | Description |
| --- | --- |
| `-d, --database DB_NAME` | use the database (required) |
| `-w, --wait[=SECONDS]` | wait for the job to finish (for SECONDS seconds) |
| `-G, --vertical` | use a vertical table to show results |
| `-o, --output PATH` | write the result to the file |
| `-f, --format FORMAT` | format of the result written to the file (tsv, csv, json, msgpack, and msgpack.gz) |
| `-r, --result RESULT_URL` | write the result to the URL (see also the result:create subcommand). It is suggested to use this option with `-x` / `--exclude` to suppress printing of the query result to stdout, or with `-o` / `--output` to dump the query result into a file. |
| `-u, --user NAME` | set user name for the result URL |
| `-p, --password` | ask password for the result URL |
| `-P, --priority PRIORITY` | set priority |
| `-R, --retry COUNT` | automatic retry count |
| `-q, --query PATH` | use a file instead of an inline query |
| `-T, --type TYPE` | set query type (hive, trino(presto)) |
| `--sampling DENOMINATOR` | OBSOLETE - enable random sampling to reduce records to 1/DENOMINATOR |
| `-l, --limit ROWS` | limit the number of result rows shown when not outputting to a file |
| `-c, --column-header` | output the column headers when the schema is available for the table (only applies to the json, tsv, and csv formats) |
| `-x, --exclude` | do not automatically retrieve the job result |
| `-O, --pool-name NAME` | specify resource pool by name |
| `--domain-key DOMAIN_KEY` | optional user-provided unique ID. You can include this ID with your `create` request to ensure idempotence. |
| `--engine-version ENGINE_VERSION` | specify query engine version by name |

**Example**

```bash
td query -d example_db -w -r rset1 "select count(*) from table1"
td query -d example_db -w -r rset1 -q query.txt
```

## Import Commands

You can import and organize data from the command line using these commands.

+ [td import:list](#td-import-list)
+ [td import:show](#td-import-show)
+ [td import:create](#td-import-create)
+ [td import:jar_version](#td-import-jar-version)
+ [td import:jar_update](#td-import-jar-update)
+ [td import:prepare](#td-import-prepare)
+ [td import:upload](#td-import-upload)
+ [td import:auto](#td-import-auto)
+ [td import:perform](#td-import-perform)
+ [td import:error_records](#td-import-error-records)
+ [td import:commit](#td-import-commit)
+ [td import:delete](#td-import-delete)
+ [td import:freeze](#td-import-freeze)
+ [td import:unfreeze](#td-import-unfreeze)
+ [td import:config](#td-import-config)

### td import list

List bulk import sessions.

**Usage**

```bash
td import:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```bash
td import:list
```

### td import show

Show the list of uploaded parts.

**Usage**

```bash
td import:show <name>
```

**Example**

```bash
td import:show logs_201201
```

### td import create

Create a new bulk import session for the table.

**Usage**

```bash
td import:create <name> <db> <table>
```

**Example**

```bash
td import:create logs_201201 example_db event_logs
```

### td import jar version

Show the import jar version.

**Usage**

```bash
td import:jar_version
```

**Example**

```bash
td import:jar_version
```

### td import jar update

Update the import jar to the latest version.

**Usage**

```bash
td import:jar_update
```

**Example**

```bash
td import:jar_update
```

### td import prepare

Convert files into the part file format.

**Usage**

```bash
td import:prepare <files...>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory (default: 'out') |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data doesn't have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME. E.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported. Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode can be used to assign incremental timestamps to subsequent records: timestamps are incremented by 1 second per record. If the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, 1394409602 to the third, and so on until the 36000th record, which gets timestamp 1394445600 (1394409600 + 10 * 3600). The timestamp assigned to the 36001st record is 1394409600 again, and the sequence restarts from there. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | only columns |
| `--exclude-columns NAME,NAME,...` | exclude columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | empty string values are interpreted as null values if the columns are numeric types |

**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for the csv format, NONE for the tsv format |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse a line. When 'regex' is used as the source file format, this option is required |

**Example**

```bash
td import:prepare logs/*.csv --format csv \
  --columns date_code,uid,price,count --time-value 1394409600,10 -o parts/
td import:prepare mytable --format mysql \
  --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:prepare "s3://:@/my_bucket/path/to/*.csv" \
  --format csv --column-header --time-column date_time -o parts/
```

### td import upload

Upload or re-upload files into a bulk import session.

**Usage**

```bash
td import:upload <name> <files...>
```

| Options | Description |
| --- | --- |
| `--retry-count NUM` | the upload process automatically retries up to NUM times; default: 10 |
| `--auto-create DATABASE.TABLE` | automatically create a bulk import session using the specified database and table names. If you use the 'auto-create' option, you must NOT specify a session name as the first argument. |
| `--auto-perform` | perform the bulk import job automatically |
| `--auto-commit` | commit the bulk import job automatically |
| `--auto-delete` | delete the bulk import session automatically |
| `--parallel NUM` | upload in parallel (default: 2; max 8) |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory (default: 'out') |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data doesn't have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME. E.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported. Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode can be used to assign incremental timestamps to subsequent records: timestamps are incremented by 1 second per record. If the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, 1394409602 to the third, and so on until the 36000th record, which gets timestamp 1394445600 (1394409600 + 10 * 3600). The timestamp assigned to the 36001st record is 1394409600 again, and the sequence restarts from there. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | only columns |
| `--exclude-columns NAME,NAME,...` | exclude columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | empty string values are interpreted as null values if the columns are numeric types |

**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for the csv format, NONE for the tsv format |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse a line. When 'regex' is used as the source file format, this option is required |

**Example**

```bash
td import:upload mysess parts/* --parallel 4
td import:upload mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/
td import:upload parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/
td import:upload mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:upload "s3://:@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/
```

### td import auto

Automatically upload or re-upload files into a bulk import session.
It is the functional equivalent of the `upload` command with the 'auto-perform', 'auto-commit', and 'auto-delete' options. By default, however, it does not enable the 'auto-create' option; if you want 'auto-create', you must declare it explicitly as a command option.

**Usage**

```bash
td import:auto <name> <files...>
```

| Options | Description |
| --- | --- |
| `--retry-count NUM` | the upload process automatically retries up to NUM times; default: 10 |
| `--auto-create DATABASE.TABLE` | automatically create a bulk import session using the specified database and table names. If you use the 'auto-create' option, you must NOT specify a session name as the first argument. |
| `--parallel NUM` | upload in parallel (default: 2; max 8) |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory (default: 'out') |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data doesn't have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME. E.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported. Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode can be used to assign incremental timestamps to subsequent records: timestamps are incremented by 1 second per record. If the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, 1394409602 to the third, and so on until the 36000th record, which gets timestamp 1394445600 (1394409600 + 10 * 3600). The timestamp assigned to the 36001st record is 1394409600 again, and the sequence restarts from there. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | only columns |
| `--exclude-columns NAME,NAME,...` | exclude columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | empty string values are interpreted as null values if the columns are numeric types |

**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for the csv format, NONE for the tsv format |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse a line. When 'regex' is used as the source file format, this option is required |

**Example**

```bash
td import:auto mysess parts/* --parallel 4
td import:auto mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/
td import:auto parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/
td import:auto mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:auto "s3://:@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/
```

### td import perform

Start validating and converting uploaded files.

**Usage**

```bash
td import:perform <name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the job to finish |
| `-f, --force` | force start performing |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```bash
td import:perform logs_201201
```

### td import error records

Show records that did not pass validation.

**Usage**

```bash
td import:error_records <name>
```

**Example**

```bash
td import:error_records logs_201201
```

### td import commit

Start committing a performed bulk import session.

**Usage**

```bash
td import:commit <name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the commit to finish |

**Example**

```bash
td import:commit logs_201201
```

### td import delete

Delete a bulk import session.

**Usage**

```bash
td import:delete <name>
```

**Example**

```bash
td import:delete logs_201201
```

### td import freeze

Pause any further data uploads to a bulk import session; subsequent uploads are rejected.

**Usage**

```bash
td import:freeze <name>
```

**Example**

```bash
td import:freeze logs_201201
```

### td import unfreeze

Unfreeze a bulk import session.

**Usage**

```bash
td import:unfreeze <name>
```

**Example**

```bash
td import:unfreeze logs_201201
```

### td import config

Create a guess config from arguments.

**Usage**

```bash
td import:config <files...>
```

| Options | Description |
| --- | --- |
| `-o, --out FILE_NAME` | output file name for connector:guess |
| `-f, --format FORMAT` | source file format [csv, tsv, mysql]; default=csv |
| `--db-url URL` | database connection URL |
| `--db-user NAME` | user name for the database |
| `--db-password PASSWORD` | password for the database |
| `--columns COLUMNS` | not supported |
| `--column-header COLUMN-HEADER` | not supported |
| `--time-column TIME-COLUMN` | not supported |
| `--time-format TIME-FORMAT` | not supported |

**Example**

```bash
td import:config "s3://:@/my_bucket/path/to/*.csv" -o seed.
```

## Bulk Import Commands

You can create and organize bulk imports from the command line.

+ [td bulk_import:list](#td-bulk-import-list)
+ [td bulk_import:show](#td-bulk-import-show)
+ [td bulk_import:create](#td-bulk-import-create)
+ [td bulk_import:prepare_parts](#td-bulk-import-prepare-parts)
+ [td bulk_import:upload_parts](#td-bulk-import-upload-parts)
+ [td bulk_import:delete_parts](#td-bulk-import-delete-parts)
+ [td bulk_import:perform](#td-bulk-import-perform)
+ [td bulk_import:error_records](#td-bulk-import-error-records)
+ [td bulk_import:commit](#td-bulk-import-commit)
+ [td bulk_import:delete](#td-bulk-import-delete)
+ [td bulk_import:freeze](#td-bulk-import-freeze)
+ [td bulk_import:unfreeze](#td-bulk-import-unfreeze)

For instructions on how to use the bulk import commands, refer to the [Bulk Import API Tutorial](https://api-docs.treasuredata.com/en/api/td-api/bulk-import-tutorial/#bulk-import-api-tutorial).

### td bulk import list

List bulk import sessions.

**Usage**

```bash
td bulk_import:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```bash
td bulk_import:list
```

### td bulk import show

Show the list of uploaded parts.

**Usage**

```bash
td bulk_import:show <name>
```

**Example**

```bash
td bulk_import:show logs_201201
```

### td bulk import create

Create a new bulk import session for the table.

**Usage**

```bash
td bulk_import:create <name> <db> <table>
```

**Example**

```bash
td bulk_import:create logs_201201 example_db event_logs
```

### td bulk import prepare parts

Convert files into the part file format.

**Usage**

```bash
td bulk_import:prepare_parts <files...>
```

| Options | Description |
| --- | --- |
| `-f, --format NAME` | source file format [csv, tsv, msgpack, json] |
| `-h, --columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `-H, --column-header` | first line includes column names |
| `-d, --delimiter REGEX` | delimiter between columns (default: (?-mix:\t\|,)) |
| `--null REGEX` | null expression for the automatic type conversion (default: (?i-mx:\A(?:null\|\|-\|\N)\z)) |
| `--true REGEX` | true expression for the automatic type conversion (default: (?i-mx:\A(?:true)\z)) |
| `--false REGEX` | false expression for the automatic type conversion (default: (?i-mx:\A(?:false)\z)) |
| `-S, --all-string` | disable automatic type conversion |
| `-t, --time-column NAME` | name of the time column |
| `-T, --time-format FORMAT` | strftime(3) format of the time column |
| `--time-value TIME` | value of the time column |
| `-e, --encoding NAME` | text encoding |
| `-C, --compress NAME` | compression format name [plain, gzip] (default: auto detect) |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-o, --output DIR` | output directory |

**Example**

```bash
td bulk_import:prepare_parts logs/*.csv --format csv \
  --columns time,uid,price,count --time-column "time" -o parts/
```

### td bulk import upload parts

Upload or re-upload files into a bulk import session.

**Usage**

```bash
td bulk_import:upload_parts <name> <files...>
```

| Options | Description |
| --- | --- |
| `-P, --prefix NAME` | add prefix to parts name |
| `-s, --use-suffix COUNT` | use COUNT number of . (dots) in the source file name in the parts name |
| `--auto-perform` | perform the bulk import job automatically |
| `--parallel NUM` | perform uploading in parallel (default: 2; max 8) |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```bash
td bulk_import:upload_parts logs_201201 parts/*
```

### td bulk import delete parts

Delete uploaded files from a bulk import session.

**Usage**

```bash
td bulk_import:delete_parts <name> <ids...>
```

| Options | Description |
| --- | --- |
| `-P, --prefix NAME` | add prefix to parts name |

**Example**

```bash
td bulk_import:delete_parts logs_201201 01h 02h 03h
```

### td bulk import perform

Start validating and converting uploaded files.

**Usage**

```bash
td bulk_import:perform <name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the job to finish |
| `-f, --force` | force start performing |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```bash
td bulk_import:perform logs_201201
```

### td bulk import error records

Show records that did not pass validation.

**Usage**

```bash
td bulk_import:error_records <name>
```

**Example**

```bash
td bulk_import:error_records logs_201201
```

### td bulk import commit

Start committing a performed bulk import session.

**Usage**

```bash
td bulk_import:commit <name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the commit to finish |

**Example**

```bash
td bulk_import:commit logs_201201
```

### td bulk import delete

Delete a bulk import session.

**Usage**

```bash
td bulk_import:delete <name>
```

**Example**

```bash
td bulk_import:delete logs_201201
```

### td bulk import freeze

Block uploads to a bulk import session.

**Usage**

```bash
td bulk_import:freeze <name>
```

**Example**

```bash
td bulk_import:freeze logs_201201
```

### td bulk import unfreeze

Unfreeze a frozen bulk import session.

**Usage**

```bash
td bulk_import:unfreeze <name>
```

**Example**

```bash
td bulk_import:unfreeze logs_201201
```

## Result Commands

You can use the command line to list, create, show, and delete results.
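The incremental `--time-value TIME,HOURS` mode described under the import commands above assigns one-second-increment timestamps that wrap around after HOURS hours. The arithmetic can be sketched in plain shell (an illustration only, not a `td` invocation):

```shell
# Reproduce the --time-value TIME,HOURS timestamp assignment with
# shell arithmetic, using the values from the example above.
TIME=1394409600            # Unix time assigned to the first record
HOURS=10                   # maximum range of the timestamps, in hours
RANGE=$(( HOURS * 3600 ))  # 36000 seconds

# Record i (1-based) gets TIME + (i - 1) % RANGE: timestamps increment
# by one second per record and wrap back to TIME after RANGE records.
for i in 1 2 3 36001; do
  echo "record $i -> $(( TIME + (i - 1) % RANGE ))"
done
# record 1     -> 1394409600
# record 2     -> 1394409601
# record 3     -> 1394409602
# record 36001 -> 1394409600  (wrapped back to TIME)
```
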
+ [td result:list](#td-result-list)
+ [td result:show](#td-result-show)
+ [td result:create](#td-result-create)
+ [td result:delete](#td-result-delete)

### td result list

Show list of result URLs.

**Usage**

```bash
td result:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td result:list
td results
```

### td result show

Describe information of a result URL.

**Usage**

```bash
td result:show <name>
```

**Example**

```bash
td result:show name
```

### td result create

Create a result URL.

**Usage**

```bash
td result:create <name> <URL>
```

| Options | Description |
| --- | --- |
| `-u, --user NAME` | set user name for authentication |
| `-p, --password` | ask password for authentication |

**Example**

```bash
td result:create name mysql://my-server/mydb
```

### td result delete

Delete a result URL.

**Usage**

```bash
td result:delete <name>
```

**Example**

```bash
td result:delete name
```

## Schedule Commands

You can use the command line to schedule, update, delete, and list queries.

+ [td sched:list](#td-sched-list)
+ [td sched:create](#td-sched-create)
+ [td sched:delete](#td-sched-delete)
+ [td sched:update](#td-sched-update)
+ [td sched:history](#td-sched-history)
+ [td sched:run](#td-sched-run)
+ [td sched:result](#td-sched-result)

### td sched list

Show list of schedules.

**Usage**

```bash
td sched:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td sched:list
td scheds
```

### td sched create

Create a schedule.

**Usage**

```bash
td sched:create <name> <cron> [sql]
```

| Options | Description |
| --- | --- |
| `-d, --database DB_NAME` | use the database (required) |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with -t / --timezone 'America/Los_Angeles' |
| `-D, --delay SECONDS` | delay time of the schedule |
| `-r, --result RESULT_URL` | write result to the URL (see also result:create subcommand) |
| `-u, --user NAME` | set user name for the result URL |
| `-p, --password` | ask password for the result URL |
| `-P, --priority PRIORITY` | set priority |
| `-q, --query PATH` | use file instead of inline query |
| `-R, --retry COUNT` | automatic retrying count |
| `-T, --type TYPE` | set query type (hive) |

**Example**

```bash
td sched:create sched1 "0 * * * *" -d example_db \
  "select count(*) from table1" -r rset1
td sched:create sched1 "0 * * * *" \
  -d example_db -q query.txt -r rset2
```

### td sched delete

Delete a schedule.

**Usage**

```bash
td sched:delete <name>
```

**Example**

```bash
td sched:delete sched1
```

### td sched update

Modify a schedule.

**Usage**

```bash
td sched:update <name>
```

| Options | Description |
| --- | --- |
| `-n, --newname NAME` | change the schedule's name |
| `-s, --schedule CRON` | change the schedule |
| `-q, --query SQL` | change the query |
| `-d, --database DB_NAME` | change the database |
| `-r, --result RESULT_URL` | change the result target (see also result:create subcommand) |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with -t / --timezone 'America/Los_Angeles' |
| `-D, --delay SECONDS` | change the delay time of the schedule |
| `-P, --priority PRIORITY` | set priority |
| `-R, --retry COUNT` | automatic retrying count |
| `-T, --type TYPE` | set query type (hive) |
| `--engine-version ENGINE_VERSION` | specify query engine version by name |

**Example**

```bash
td sched:update sched1 -s "0 */2 * * *" -d my_db -t "Asia/Tokyo" -D 3600
```

### td sched history

Show history of scheduled queries.

**Usage**

```bash
td sched:history <name> [max]
```

| Options | Description |
| --- | --- |
| `-p, --page PAGE` | skip N pages |
| `-s, --skip N` | skip N schedules |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td sched:history sched1 --page 1
```

### td sched run

Run scheduled queries for the specified time.

**Usage**

```bash
td sched:run <name> <time>
```

## Schema Commands

You can use the command line to view and modify table schemas.

+ [td schema:show](#td-schema-show)
+ [td schema:set](#td-schema-set)
+ [td schema:add](#td-schema-add)
+ [td schema:remove](#td-schema-remove)

### td schema show

Show the schema of a table.

**Usage**

```bash
td schema:show <db> <table>
```

**Example**

```bash
td schema example_db table1
```

### td schema set

Set a new schema on a table.

**Usage**

```bash
td schema:set <db> <table> [columns...]
```

**Example**

```bash
td schema:set example_db table1 user:string size:int
```

### td schema add

Add new columns to a table.

**Usage**

```bash
td schema:add <db> <table> <columns...>
```

**Example**

```bash
td schema:add example_db table1 user:string size:int
```

### td schema remove

Remove columns from a table.

**Usage**

```bash
td schema:remove <db> <table> <columns...>
```

**Example**

```bash
td schema:remove example_db table1 user size
```

## Connector Commands

You can use the command line to control several elements related to connectors.

+ [td connector:guess](#td-connector-guess)
+ [td connector:preview](#td-connector-preview)
+ [td connector:issue](#td-connector-issue)
+ [td connector:list](#td-connector-list)
+ [td connector:create](#td-connector-create)
+ [td connector:show](#td-connector-show)
+ [td connector:update](#td-connector-update)
+ [td connector:delete](#td-connector-delete)
+ [td connector:history](#td-connector-history)
+ [td connector:run](#td-connector-run)

### td connector guess

Run `guess` to generate a connector configuration file. Using the connector's credentials, this command examines the data and attempts to determine the file type, delimiter character, and column names. This "guess" is then written to the connector's configuration file. This command is useful for file-based connectors.

**Usage**

```bash
td connector:guess [config]
```

| Options | Description |
| --- | --- |
| `--type[=TYPE]` | (obsolete) |
| `--access-id ID` | (obsolete) |
| `--access-secret SECRET` | (obsolete) |
| `--source SOURCE` | (obsolete) |
| `-o, --out FILE_NAME` | output file name for connector:preview |
| `-g, --guess NAME,NAME,...` | specify list of guess plugins that users want to use |

**Example**

```bash
td connector:guess seed.yml -o config.yml
```

**Example seed.yml**

```yaml
in:
  type: s3
  bucket: my-s3-bucket
  endpoint: s3-us-west-1.amazonaws.com
  path_prefix: path/prefix/to/import/
  access_key_id: ABCXYZ123ABCXYZ123
  secret_access_key: AbCxYz123aBcXyZ123
out:
  mode: append
```

### td connector preview

Show a subset of the data that the data connector would fetch.

**Usage**

```bash
td connector:preview <config>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td connector:preview td-load.yml
```

### td connector issue

Run a connector execution one time only.

**Usage**

```bash
td connector:issue <config>
```

| Options | Description |
| --- | --- |
| `--database DB_NAME` | destination database |
| `--table TABLE_NAME` | destination table |
| `--time-column COLUMN_NAME` | data partitioning key |
| `-w, --wait` | wait for finishing the job |
| `--auto-create-table` | create the table and database if they do not exist |

**Example**

```bash
td connector:issue td-load.yml
```

### td connector list

Show a list of connector sessions.

**Usage**

```bash
td connector:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td connector:list
```

### td connector create

Create a new connector session.

**Usage**

```bash
td connector:create <name> <cron> <database> <table> <config>
```

| Options | Description |
| --- | --- |
| `--time-column COLUMN_NAME` | data partitioning key |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with -t / --timezone 'America/Los_Angeles' |
| `-D, --delay SECONDS` | delay time of the schedule |

**Example**

```bash
td connector:create connector1 "0 * * * *" \
  connector_database connector_table td-load.yml
```

### td connector show

Show the execution settings of a connector session, such as name, timezone, delay, database, and table.

**Usage**

```bash
td connector:show <name>
```

**Example**

```bash
td connector:show connector1
```

### td connector update

Modify a connector session.

**Usage**

```bash
td connector:update <name> [config]
```

| Options | Description |
| --- | --- |
| `-n, --newname NAME` | change the schedule's name |
| `-d, --database DB_NAME` | change the database |
| `-t, --table TABLE_NAME` | change the table |
| `-s, --schedule [CRON]` | change the schedule or leave blank to remove the schedule |
| `-z, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. the cron schedule '0 12 * * *' executes daily at 12 PM UTC (5 AM Pacific) without the timezone option, and at 12 PM Pacific with -z / --timezone 'America/Los_Angeles' |
| `-D, --delay SECONDS` | change the delay time of the schedule |
| `-T, --time-column COLUMN_NAME` | change the name of the time column |
| `-c, --config CONFIG_FILE` | update the connector configuration |
| `--config-diff CONFIG_DIFF_FILE` | update the connector config_diff |

**Example**

```bash
td connector:update connector1 -c td-bulkload.yml -s '@daily' ...
```

### td connector delete

Delete a connector session.

**Usage**

```bash
td connector:delete <name>
```

**Example**

```bash
td connector:delete connector1
```

### td connector history

Show the job history of a connector session.

**Usage**

```bash
td connector:history <name>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td connector:history connector1
```

### td connector run

Run a connector session for the specified time option.

**Usage**

```bash
td connector:run <name> [time]
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for finishing the job |

**Example**

```bash
td connector:run connector1 "2016-01-01 00:00:00"
```

## User Commands

You can use the command line to control several elements related to users.

+ [td user:list](#td-user-list)
+ [td user:show](#td-user-show)
+ [td user:create](#td-user-create)
+ [td user:delete](#td-user-delete)
+ [td user:apikey:list](#td-user-apikey-list)
+ [td user:apikey:add](#td-user-apikey-add)
+ [td user:apikey:remove](#td-user-apikey-remove)

### td user list

Show a list of users.

**Usage**

```bash
td user:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td user:list
td user:list -f csv
```

### td user show

Show a user.
**Usage**

```bash
td user:show <name>
```

**Example**

```bash
td user:show "Roberta Smith"
```

### td user create

Create a user. As part of the user creation process, you will be prompted to provide a password for the user.

**Usage**

```bash
td user:create <name> --email <email>
```

**Example**

```bash
td user:create "Roberta" --email "roberta.smith@acme.com"
```

### td user delete

Delete a user.

**Usage**

```bash
td user:delete <email>
```

**Example**

```bash
td user:delete roberta.smith@acme.com
```

### td user apikey list

Show API keys for a user.

**Usage**

```bash
td user:apikey:list <email>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td user:apikey:list roberta.smith@acme.com
td user:apikey:list roberta.smith@acme.com -f csv
```

### td user apikey add

Add an API key to a user.

**Usage**

```bash
td user:apikey:add <email>
```

**Example**

```bash
td user:apikey:add roberta.smith@acme.com
```

### td user apikey remove

Remove an API key from a user.

**Usage**

```bash
td user:apikey:remove <email> <apikey>
```

**Example**

```bash
td user:apikey:remove roberta.smith@acme.com 1234565/abcdefg
```

## Workflow Commands

You can create or modify workflows from the CLI using the following commands. The command `wf` can be used interchangeably with `workflow`.

+ [Basic Workflow Commands](#basic-workflow-commands)
+ [Local-mode commands](#local-mode-commands)
+ [Server-mode commands](#server-mode-commands)
+ [Client-mode commands](#client-mode-commands)

### Basic Workflow Commands

#### td workflow reset

Reset the workflow module.

**Usage**

```bash
td workflow:reset
```

#### td workflow update

Update the workflow module.

**Usage**

```bash
td workflow:update [version]
```

#### td workflow version

Show the workflow module version.

**Usage**

```bash
td workflow:version
```

### Local-mode commands

You can use the following commands to locally initiate changes to workflows.

**Usage**

```bash
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `init <dir>` | create a new workflow project |
| `r[un] <workflow.dig>` | run a workflow |
| `c[heck]` | show workflow definitions |
| `sched[uler]` | run a scheduler server |
| `migrate (run\|check)` | migrate database |
| `selfupdate` | update CLI to the latest version |

**Info:** To manage secrets in local mode, use the command `td workflow secrets --local`.

### Server-mode commands

You can use the following commands to initiate changes to workflows from the server.

**Usage**

```bash
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `server` | start server |

### Client-mode commands

You can use the following commands to initiate changes to workflows from the client.

**Usage**

```bash
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `push <project-name>` | create and upload a new revision |
| `download <project-name>` | pull an uploaded revision |
| `start <project-name> <name>` | start a new session attempt of a workflow |
| `retry <attempt-id>` | retry a session |
| `kill <attempt-id>` | kill a running session attempt |
| `backfill <schedule-id>` | start sessions of a schedule for past times |
| `backfill <project-name> <name>` | start sessions of a schedule for past times |
| `reschedule <schedule-id>` | skip sessions of a schedule to a future time |
| `reschedule <project-name> <name>` | skip sessions of a schedule to a future time |
| `projects [name]` | show projects |
| `workflows [project-name] [name]` | show registered workflow definitions |
| `schedules` | show registered schedules |
| `disable <schedule-id>` | disable a workflow schedule |
| `disable <project-name>` | disable all workflow schedules in a project |
| `disable <project-name> <name>` | disable a workflow schedule |
| `enable <schedule-id>` | enable a workflow schedule |
| `enable <project-name>` | enable all workflow schedules in a project |
| `enable <project-name> <name>` | enable a workflow schedule |
| `sessions` | show sessions for all workflows |
| `sessions <project-name>` | show sessions for all workflows in a project |
| `sessions <project-name> <name>` | show sessions for a workflow |
| `session <session-id>` | show a single session |
| `attempts` | show attempts for all sessions |
| `attempts <session-id>` | show attempts for a session |
| `attempt <attempt-id>` | show a single attempt |
| `tasks <attempt-id>` | show tasks of a session attempt |
| `delete <project-name>` | delete a project |
| `secrets --project <name>` | manage secrets |
| `version` | show client and server version |

| Parameter | Description |
| --- | --- |
| `-L, --log PATH` | output log messages to a file (default: -) |
| `-l, --log-level LEVEL` | log level (error, warn, info, debug or trace) |
| `-X KEY=VALUE` | add a performance system config |
| `-c, --config PATH.properties` | configuration file (default: ~/.config/digdag/config) |
| `--version` | show client version |

Client options:

| Parameter | Description |
| --- | --- |
| `-e, --endpoint URL` | server endpoint |
| `-H, --header KEY=VALUE` | additional headers |
| `--disable-version-check` | disable server version check |
| `--disable-cert-validation` | disable certificate verification |
| `--basic-auth <user:pass>` | add an Authorization header with the provided username and password |

## Job Commands

You can view status and results of jobs, view lists of jobs, and delete jobs using the CLI.

+ [td job:show](#td-job-show)
+ [td job:status](#td-job-status)
+ [td job:list](#td-job-list)
+ [td job:kill](#td-job-kill)

### td job show

Show status and results of a job.
**Usage**

```bash
td job:show <job_id>
```

| Options | Description |
| --- | --- |
| `-v, --verbose` | show logs |
| `-w, --wait` | wait for finishing the job |
| `-G, --vertical` | use vertical table to show results |
| `-o, --output PATH` | write results to the file |
| `-l, --limit ROWS` | limit the number of result rows shown when not outputting to a file |
| `-c, --column-header` | output the column header when the schema is available for the table (only applies to tsv and csv formats) |
| `-x, --exclude` | do not automatically retrieve the job result |
| `--null STRING` | null expression in csv or tsv |
| `-f, --format FORMAT` | format of the result to write to the file (tsv, csv, json, msgpack, and msgpack.gz) |

**Example**

```bash
td job:show 1461
```

### td job status

Show the status progress of a job.

**Usage**

```bash
td job:status <job_id>
```

**Example**

```bash
td job:status 1461
```

### td job list

Show a list of jobs. `[max]` is the number of jobs to show.

**Usage**

```bash
td job:list [max]
```

| Options | Description |
| --- | --- |
| `-p, --page PAGE` | skip N pages |
| `-s, --skip N` | skip N jobs |
| `-R, --running` | show only running jobs |
| `-S, --success` | show only succeeded jobs |
| `-E, --error` | show only failed jobs |
| `--slow [SECONDS]` | show slow queries (default threshold: 3600 seconds) |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json or table. default is table) |

**Example**

```bash
td job:list
td jobs
```

### td job kill

Kill or cancel a job.

**Usage**

```bash
td job:kill <job_id>
```

**Example**

```bash
td job:kill 1461
```
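As an illustrative sketch, the job commands above are typically combined with `td query`: submit a query, then inspect, download, or cancel the resulting job by its ID. The database `example_db`, table `table1`, and job ID 1461 below are placeholder values, and the commands assume an installed `td` CLI with a configured API key:

```shell
#!/bin/sh
# A sketch of a typical job lifecycle; names and the job ID are placeholders.
if ! command -v td >/dev/null 2>&1; then
  echo "td CLI not installed; skipping"
  exit 0
fi
td query -d example_db "SELECT COUNT(1) FROM table1"  # submit a job; prints the job ID
td job:status 1461                                    # poll that job's progress
td job:show 1461 -f csv -o out.csv                    # retrieve the results as CSV
td job:kill 1461                                      # or cancel it while it is still running
```

With `td query -w`, the CLI waits for the job and prints the results directly, so the separate `job:show` step is only needed for jobs submitted asynchronously.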