Treasure Data Toolbelt: Command-line Interface
The Treasure Data CLI (‘Command Line Interface’ or ‘Toolbelt’) allows you to create databases and tables, import data into and export data from tables, set and modify table schemas, issue queries, monitor job status, view and download job results, create scheduled queries, and much more.
Step 1: Installation & Update
Install the Treasure Data Toolbelt to set up your local workstation with td, the Treasure Data command-line client. Please refer to the Installing the Treasure Data CLI documentation page for information on how to install the CLI in various environments.
The page also contains important information about Updating the
Step 2: Authorize
Once you’ve installed the toolbelt, you’ll have access to the td command from your command line. Authorize your account with the td account command. Please use the user name and password you used when signing up:
$ td -e https://api.treasuredata.com account -f
Email: firstname.lastname@example.org
Password (typing will be hidden):
Authenticated successfully.
The endpoint is saved in the ~/.td/td.conf file for future use. If the account command has already been run, the following command updates the endpoint saved in td.conf:
$ td server:endpoint https://api.treasuredata.com
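To confirm which endpoint is currently saved, you can read it back from the configuration file. The helper below is a minimal sketch, not part of the td client: the function name td_conf_endpoint is hypothetical, and it assumes that td.conf stores the value on a line of the form "endpoint = <url>".

```shell
#!/bin/sh
# Print the endpoint stored in a td.conf-style file.
# Assumption (not confirmed by this page): the file contains a line such as
#   endpoint = https://api.treasuredata.com
# td_conf_endpoint is a hypothetical helper name, not a td subcommand.
td_conf_endpoint() {
    conf_file="${1:-$HOME/.td/td.conf}"
    awk '
        /^[ \t]*endpoint[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the key and the "=" sign
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
            exit                       # only the first match is reported
        }
    ' "$conf_file"
}
```

Called without an argument, td_conf_endpoint reads ~/.td/td.conf; pass a path to inspect a different file.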
Step 3: Query the Sample Dataset
Let’s issue an SQL query. Out of the box, we have a table called www_access in the database called sample_datasets. The following query calculates the distribution of HTTP status codes.
$ td query -w -d sample_datasets \
  "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"
queued...
started at 2012-04-10T23:44:41Z
2012-04-10 23:43:12,692 Stage-1 map = 0%, reduce = 0%
2012-04-10 23:43:18,766 Stage-1 map = 100%, reduce = 0%
2012-04-10 23:43:29,925 Stage-1 map = 100%, reduce = 33%
2012-04-10 23:43:32,973 Stage-1 map = 100%, reduce = 100%
Status : success
Result :
+------+------+
| code | cnt  |
+------+------+
| 404  | 17   |
| 500  | 2    |
| 200  | 4981 |
+------+------+
The command above will take about 15-45 seconds, owing mainly to the overhead in setting up jobs within the cloud-based MapReduce engine.
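If you want to feed the result into other command-line tools, the ASCII table can be converted to tab-separated values. The filter below is a sketch that relies only on the table layout shown in the sample output above; td_table_to_tsv is a hypothetical name, not a td feature.

```shell
#!/bin/sh
# Convert an ASCII result table (rows delimited by "|") on stdin to TSV.
# Border lines ("+----+") and progress or status lines are ignored because
# they do not start with "|".
# td_table_to_tsv is a hypothetical helper name, not part of the td client.
td_table_to_tsv() {
    awk -F'|' '
        /^[ \t]*\|/ {
            out = ""
            # Fields 2..NF-1 hold the cells; the first and last are empty.
            for (i = 2; i < NF; i++) {
                cell = $i
                gsub(/^[ \t]+|[ \t]+$/, "", cell)   # trim cell padding
                out = out (i > 2 ? "\t" : "") cell
            }
            print out
        }
    '
}

# Usage (requires an authorized td client):
#   td query -w -d sample_datasets \
#     "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code" | td_table_to_tsv
```

The first line of the output is the header row (code and cnt in the example), followed by one line per result row.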
Step 4: Import Data Into A Table
You’re now ready to import your real data to the cloud! The following tutorials will explain how to import your data (e.g. Application Logs, Middleware Logs) from various sources. For a deeper understanding of the platform, please refer to the architecture overview article.
This example shows how to use the CLI to generate a sample Apache log in JSON format and import it into a brand new table in the ‘sample_datasets’ database.
$ td sample:apache sample_apache.json
$ td table:import sample_datasets sample_tbl \
  --auto-create-table -f json sample_apache.json
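When there are several JSON files to load, the same table:import invocation can be wrapped in a loop. This is a sketch under stated assumptions: import_json_files and the TD_BIN variable are hypothetical names introduced here, and TD_BIN exists only so the loop can be dry-run (for example with echo) before running the real client.

```shell
#!/bin/sh
# Import each given JSON file into <db>.<table>, using the same flags as the
# example above. Stops at the first failed import.
# import_json_files and TD_BIN are hypothetical names, not part of td.
import_json_files() {
    db="$1"
    table="$2"
    shift 2
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping missing file: $f" >&2
            continue
        fi
        # TD_BIN defaults to the real client; set TD_BIN=echo for a dry run.
        "${TD_BIN:-td}" table:import "$db" "$table" \
            --auto-create-table -f json "$f" || return 1
    done
}

# Usage:
#   TD_BIN=echo import_json_files sample_datasets sample_tbl sample_apache.json
#   import_json_files sample_datasets sample_tbl sample_apache.json
```

The dry run prints each command line instead of executing it, which is a cheap way to check the database, table, and file names first.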
Languages and Frameworks
Ruby or Rails | Java | Perl
td help:all shows the commands available in Treasure Data:
$ td help:all
  database:list                # Show list of tables in a database
  database:show <db>           # Describe a information of a database
  database:create <db>         # Create a database
  database:delete <db>         # Delete a database
  ....
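Because the full listing is long, it can help to narrow it to one command group. The filter below is a sketch that assumes only the "group:subcommand  # description" layout shown above; td_help_group is a hypothetical helper name.

```shell
#!/bin/sh
# Keep only the lines of a help:all listing (on stdin) whose command starts
# with "<group>:", for example "database:".
# td_help_group is a hypothetical helper name, not a td subcommand.
td_help_group() {
    awk -v g="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # ignore leading indentation
            if (index(line, g ":") == 1) {    # command begins with "<group>:"
                print line
            }
        }
    '
}

# Usage (requires the td client):
#   td help:all | td_help_group database
```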
If you want more information about individual commands, you can run td help <command>:<subcommand>, e.g.,
$ td help table:list
usage:
  $ td table:list [db]
example:
  $ td table:list
  $ td table:list example_db
  $ td tables
description:
  Show list of tables
options:
  -n, --num_threads VAL            number of threads to get list in parallel
      --show-bytes                 show estimated table size in bytes
See the td command line tool reference page for a complete list of commands and their helpers.