The Treasure Data CLI allows you to create databases and tables, import data into and export data from tables, set and modify table schemas, issue queries, monitor job status, view and download job results, create scheduled queries, and much more.
Step 1: Installation
Install the Treasure Data Toolbelt to set up your local workstation with td, the Treasure Data command-line client.
If you are familiar with Ruby, you can install td from the command line using:
$ gem install td
Alternative installation options, including a Mac OS X installer, a Windows installer, and installing td as part of td-agent on Linux, can be found on the Treasure Data Toolbelt page.
Step 2: Authorize
Once you have installed the toolbelt, you will have access to the td command from your command line. Authorize your account with the td account command. When prompted, enter the email address and password you used when signing up.
$ td account -f
Enter your Treasure Data credentials.
Email: email@example.com
Password (typing will be hidden):
Authenticated successfully.
Step 3: Query the Sample Dataset
Let’s issue a SQL query. Out of the box, there is a table called www_access in the database called sample_db. The following query calculates the distribution of HTTP status codes.
$ td query -w -d sample_db \
  "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"
queued...
started at 2012-04-10T23:44:41Z
2012-04-10 23:43:12,692 Stage-1 map = 0%, reduce = 0%
2012-04-10 23:43:18,766 Stage-1 map = 100%, reduce = 0%
2012-04-10 23:43:29,925 Stage-1 map = 100%, reduce = 33%
2012-04-10 23:43:32,973 Stage-1 map = 100%, reduce = 100%
Status : success
Result :
+------+------+
| code | cnt  |
+------+------+
| 404  | 17   |
| 500  | 2    |
| 200  | 4981 |
+------+------+
The command above will take about 15-45 seconds, owing mainly to the overhead in setting up jobs within the cloud-based MapReduce engine.
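The counts in the sample result can be turned into percentages with a small post-processing step. The following is a minimal sketch that does not call td at all: it assumes the three result rows have been copied out as whitespace-separated "code count" pairs, and the helper name status_share is hypothetical.

```shell
# status_share: read "code count" pairs on stdin and print each code's share
# of the total. Hypothetical helper, not part of the td CLI.
status_share() {
  awk '
    { code[NR] = $1; cnt[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %.2f%%\n", code[i], 100 * cnt[i] / total
    }
  '
}

# Rows copied from the sample result above (17 + 2 + 4981 = 5000 requests).
printf '404 17\n500 2\n200 4981\n' | status_share
```

With the sample counts, the 200 responses account for 4981 of 5000 requests, that is 99.62%.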
Step 4: Import Data Into A Table
You’re now ready to import your real data to the cloud! The following tutorials will explain how to import your data (e.g. Application Logs, Middleware Logs) from various sources. For a deeper understanding of the platform, please refer to the architecture overview article.
This example shows how to use the CLI to generate a sample Apache log in JSON format and import it into a brand-new table in the sample_db database.
$ td sample:apache sample_apache.json
$ td table:import sample_db sample_tbl \
  --auto-create-table -f json sample_apache.json
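If you have several JSON files to load, the import command above can be wrapped in a loop. This is only a sketch: the function name import_json_logs is hypothetical, the database and table names are the ones from the example, and it uses no td options beyond those already shown.

```shell
# import_json_logs: import each given JSON file into sample_db.sample_tbl,
# using the same td table:import invocation shown above.
# Hypothetical helper; missing files are skipped with a warning.
import_json_logs() {
  local f status=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "skipping $f: not a regular file" >&2
      status=1
      continue
    fi
    td table:import sample_db sample_tbl \
      --auto-create-table -f json "$f" || status=1
  done
  return "$status"
}

# Usage (requires an installed and authorized td CLI):
#   td sample:apache sample_apache.json
#   import_json_logs sample_apache.json
```

The function returns a non-zero status if any file was skipped or any import failed, so it can be used in scripts that need to stop on errors.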
Languages and Frameworks
| Ruby or Rails | Java | Perl |
Depending on how the CLI was originally installed on your machine, there are different ways to update it.
Installed as Toolbelt on Mac OS X or Windows
If you downloaded the CLI from the Treasure Data Toolbelt website, either as a Toolbelt installer package (.pkg file) for Mac OS X or as a Toolbelt installer executable for Windows (64-bit support only), the Toolbelt CLI is able to update itself.
Whenever a command is invoked from the CLI, the program checks whether a newer version exists; if so, it downloads the updated version and installs it in the background. The CLI checks for an updated version every hour. You can trigger an update at any time with the following command:
$ td update
The auto-update feature is available as of v0.10.77. If you are running an earlier version (check the version with the td --version command), please upgrade as soon as possible by installing a more recent package from the Treasure Data Toolbelt.
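To decide whether an installed version already has the auto-update feature, its version number can be compared against 0.10.77. The sketch below is an illustration only: version_ge and supports_auto_update are hypothetical helper names, and the exact output format of td --version is not assumed, so the version number is passed in as an argument.

```shell
# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Fields are compared numerically from left to right; missing fields count
# as 0. Hypothetical helper, not part of the td CLI.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = (i <= n) ? x[i] + 0 : 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# supports_auto_update VERSION: succeed if VERSION is at least 0.10.77,
# the first version with the auto-update feature.
supports_auto_update() {
  version_ge "$1" "0.10.77"
}

# Usage: pass the number reported by "td --version", for example
#   supports_auto_update 0.10.76 || echo "please reinstall the Toolbelt"
```

A field-by-field numeric comparison is used because a plain string comparison would rank 0.9.x above 0.10.x.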
Installed as a Gem
If you installed the CLI as a gem (whether on Linux, Windows, or Mac OS X) through:
$ gem install td
you will need to periodically check whether a newer version exists. Updating to the latest version is always recommended since we strive to maintain 100% backwards compatibility. To update using the gem command, run:
$ gem update td
If you are using a Ruby environment manager such as `rbenv` or `rvm`, different versions of the td CLI may be confined within each environment or Ruby version in use.
You will need to ensure each version and environment is updated independently.
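For rbenv, updating each environment independently can be sketched as a loop over the installed Ruby versions. This assumes rbenv is installed and on the PATH; the function name update_td_all_rbenv is hypothetical, and an equivalent loop for rvm is not shown.

```shell
# update_td_all_rbenv: run "gem update td" once per Ruby version managed by
# rbenv. Hypothetical helper; assumes rbenv is installed and that td was
# installed as a gem in the environments you care about.
update_td_all_rbenv() {
  local v
  for v in $(rbenv versions --bare); do
    echo "Updating td under Ruby $v"
    # RBENV_VERSION selects which Ruby the rbenv shim for gem will use.
    RBENV_VERSION="$v" gem update td
  done
}

# Usage:
#   update_td_all_rbenv
```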
Installed with td-agent
An easier way to install the td CLI in a Linux environment, besides using a gem (see above), is to install td as part of the td-agent distribution package. Several Linux environments are supported; see the Installing the Treasure Data CLI documentation page.
The Treasure Agent environment provisions its own gem environment, and the corresponding gem command is accessible at /usr/lib*/fluent/ruby/bin/fluent-gem. To update the td CLI to the latest version without having to wait until the next td-agent release, run:
$ /usr/lib*/fluent/ruby/bin/fluent-gem update td
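Because the path above contains a wildcard, it can match more than one directory (for example both lib and lib64), in which case the shell would pass the extra matches as arguments. A minimal sketch that resolves the wildcard to a single executable first is shown below; the function name find_fluent_gem and its optional ROOT parameter are hypothetical and exist only to make the helper easy to try out.

```shell
# find_fluent_gem [ROOT]: print the first executable matching
# ROOT/usr/lib*/fluent/ruby/bin/fluent-gem and succeed, or fail if none
# exists. ROOT defaults to empty, i.e. the real filesystem root.
# Hypothetical helper, not part of td-agent.
find_fluent_gem() {
  local root="${1:-}" candidate
  for candidate in "$root"/usr/lib*/fluent/ruby/bin/fluent-gem; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: update td through the td-agent gem environment if it is present.
#   fluent_gem=$(find_fluent_gem) && "$fluent_gem" update td
```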
td help:all shows the commands available in Treasure Data:
$ td help:all
  database:list          # Show list of tables in a database
  database:show <db>     # Describe a information of a database
  database:create <db>   # Create a database
  database:delete <db>   # Delete a database
  ....
If you want more information about individual commands, you can run td help <command>:<subcommand>, e.g.,
$ td help table:list
usage:
  $ td table:list [db]
example:
  $ td table:list
  $ td table:list example_db
  $ td tables
description:
  Show list of tables
options:
  -n, --num_threads VAL    number of threads to get list in parallel
      --show-bytes         show estimated table size in bytes
See the td command line tool reference page for a complete list of commands and their helpers.