Treasure Data’s bulk-export feature enables you to dump data into your Amazon S3 bucket.


Prerequisites

Limitations

If you do require partitioning, we recommend using this command to export 1-hour segments at a time and automating the process with a script.
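For example, a minimal wrapper script along these lines can walk through a day one hour at a time. It assumes that td table:export accepts --from and --to time-range options with "YYYY-MM-DD HH:MM:SS" timestamps, and the database, table, bucket, and prefix names are placeholders; confirm the supported options with the command's help output (see step 2 below) before relying on this sketch.

#!/bin/bash
# Sketch: export one day of data in 1-hour segments (hypothetical names and options).
# AWS_KEY_ID and AWS_SECRET_KEY are expected to be set in the environment.
DB=example_db
TABLE=table1
BUCKET=mybucket
DAY="2024-01-01"

for HOUR in $(seq 0 23); do
  # GNU date computes the hour boundaries, including the rollover into the next day.
  FROM=$(date -u -d "$DAY + $HOUR hours" "+%Y-%m-%d %H:%M:%S")
  TO=$(date -u -d "$DAY + $((HOUR + 1)) hours" "+%Y-%m-%d %H:%M:%S")

  td table:export "$DB" "$TABLE" \
    --s3-bucket "$BUCKET" \
    --prefix "export/$DAY/hour=$(printf '%02d' "$HOUR")" \
    --aws-key-id "$AWS_KEY_ID" \
    --aws-secret-key "$AWS_SECRET_KEY" \
    --file-format jsonl.gz \
    --from "$FROM" \
    --to "$TO"
done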

Exporting Your Data to an Amazon S3 Bucket

We highly recommend using the jsonl.gz or tsv.gz format, because these formats benefit from specific performance optimizations.

The dump is performed through MapReduce jobs. The location of the bucket is expressed as an S3 path with the AWS public and private access keys embedded in it.
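For illustration only, such an embedded-credentials path conventionally takes a Hadoop-style form like the one below; the command assembles the actual path from the options you pass, so you never need to build it by hand:

s3://<AWS_KEY_ID>:<AWS_SECRET_KEY>@<S3_BUCKET_NAME>/<S3_FILE_PREFIX>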

The td table:export command dumps all the data uploaded to Treasure Data into your Amazon S3 bucket.

  1. From a machine where your TD Toolbelt is installed, open a command line terminal.

  2. Optionally, use the following command to view the latest usage information for the td table:export command.

    td table:export -help


  3. Run the following command to start the bulk export, specifying the database and table from which to dump your data.

    td table:export <db> <table>


  4. Optionally, enter values for any options that you want to use; the Examples section below shows the most common ones.

    Where an option refers to an AWS resource, provide its Amazon Resource Name (ARN); ARNs uniquely identify AWS resources.

Examples

A simple bulk export command might look like this:

td table:export example_db table1 \
 --s3-bucket mybucket \
 -k KEY_ID \
 -s SECRET_KEY


A typical bulk export command includes the following options:


td table:export <database_name> <table_name> \
   --s3-bucket <S3_BUCKET_NAME> \
   --prefix <S3_FILE_PREFIX> \
   --aws-key-id <AWS_KEY> \
   --aws-secret-key <AWS_SECRET_KEY> \
   --file-format jsonl.gz
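
Once the export job finishes, one way to check the output is to list the objects under the chosen prefix with the AWS CLI, assuming it is installed and configured with credentials that can read the bucket:

aws s3 ls s3://<S3_BUCKET_NAME>/<S3_FILE_PREFIX>/ --recursive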

