The Treasure Data CLI (‘Command Line Interface’ or ‘Toolbelt’) lets you create databases and tables, import data into and export data from tables, set and modify table schemas, issue queries, monitor job status, view and download job results, create scheduled queries, and much more.
Query the Sample Dataset
Treasure Data comes by default with a table called www_access in the database called sample_datasets.
Run the following query to calculate the distribution of HTTP status codes:
$ td query -w -d sample_datasets "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"
Issue Idempotent Queries by Domain Key for your Batch
Beginning with td command v0.14, the td query command supports a domain key. By means of a domain key, clients can ensure that the submission of queries is idempotent.
Pass the domain key with the --domain-key option when you run your query. For example:
$ td query -d sample_datasets --domain-key domainkey-test -T presto "select * from www_access"
Job 92034375 is queued. Use 'td job:show 92034375' to show the status.
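If the command does not respond with a job ID because of an API issue, you can safely issue the same query again with the same domain key. If the first submission was in fact accepted, the retry is rejected with an error that names the existing job, so the query does not run twice: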
$ td query -d sample_datasets --domain-key domainkey-test -T presto "select * from www_access"
Error: Query failed: ["Domain key has already been taken"]: conflicts_with job:92034375
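Because both the success message and the conflict error carry the job ID, a retrying client can recover the ID from either one. The following is a minimal sketch that assumes the two message formats shown above; extract_job_id is a hypothetical helper, not part of the td CLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of td): print the job ID found in td output.
# Handles the two message formats shown in this section:
#   "Job <id> is queued. ..."              (submission accepted)
#   "... conflicts_with job:<id>"          (domain key already used)
extract_job_id() {
  printf '%s\n' "$1" | sed -n \
    -e 's/^Job \([0-9][0-9]*\) is queued.*/\1/p' \
    -e 's/.*conflicts_with job:\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example usage (requires a configured td installation):
#   out=$(td query -d sample_datasets --domain-key domainkey-test \
#           -T presto "select * from www_access" 2>&1)
#   job_id=$(extract_job_id "$out")
#   [ -n "$job_id" ] && td job:show "$job_id"
```

If neither pattern matches, the helper prints nothing, which the caller can treat as a signal to retry with the same domain key.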
Import Data into a Table
You can import your real data to the cloud from application logs, middleware logs, and other sources.
This example shows how to use the TD Toolbelt to generate a sample Apache log in JSON format and import it into a new table in the ‘own_database’ database.
Generate the sample data:
$ td sample:apache sample_apache.json
Create own_database:
$ td database:create own_database
Import the sample data into the sample_tbl table in own_database:
$ td table:import own_database sample_tbl --auto-create-table -f json sample_apache.json
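The three steps above can be chained in a small script. The following is a sketch only: import_sample and the TD_BIN variable are hypothetical names introduced here for illustration, and the chain stops at the first command that fails (for example, if database creation reports an error).

```shell
#!/bin/sh
# Hypothetical wrapper around the three commands shown above.
# TD_BIN is an assumed override (default: td) so the steps can be
# dry-run, e.g. with TD_BIN=echo, without contacting Treasure Data.
import_sample() {
  db=$1
  table=$2
  file=$3
  td_bin=${TD_BIN:-td}

  # 1. Generate the sample Apache log in JSON format.
  "$td_bin" sample:apache "$file" &&
  # 2. Create the target database.
  "$td_bin" database:create "$db" &&
  # 3. Import the file, creating the table if it does not exist.
  "$td_bin" table:import "$db" "$table" --auto-create-table -f json "$file"
}

# Example usage:
#   import_sample own_database sample_tbl sample_apache.json
```

Running it with TD_BIN=echo prints the commands that would be executed, which is a convenient way to check the arguments before running them for real.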