Bulk Export

This article explains Treasure Data’s bulk-export feature, which lets you dump data into your Amazon S3 bucket.

At Treasure Data, we believe that your data belongs to you, even after it has been imported into our platform. We believe that vendor lock-in must be stopped.

(Export is currently limited to S3 buckets in the us-east region.) If you would like to lift this limitation, please contact our support.

Prerequisites

  • Basic knowledge of Treasure Data, including the Treasure Data Toolbelt.
  • An AWS account and an Amazon S3 bucket.

Table Dump

The td table:export command will dump all the data uploaded to TD into your Amazon S3 bucket. Please specify the database and table from which to dump your data.

$ td table:export database_name table_name \
   --s3-bucket <S3_BUCKET_NAME> \
   --prefix <S3_FILE_PREFIX> \
   --aws-key-id <AWS_KEY> \
   --aws-secret-key <AWS_SECRET_KEY> \
   --file-format line-json.gz
We highly recommend using the line-json.gz or tsv.gz format, because we have specific performance optimizations for these formats; other formats are considerably slower.
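
A line-json.gz file contains one JSON record per line. To spot-check an exported file, one option is to stream it through gunzip with the AWS CLI (assuming the AWS CLI is configured; the object name below is illustrative, since actual file names are assigned by the export job):

  # List the exported objects under your prefix, then stream one through gunzip
  $ aws s3 ls s3://<S3_BUCKET_NAME>/<S3_FILE_PREFIX>/
  $ aws s3 cp s3://<S3_BUCKET_NAME>/<S3_FILE_PREFIX>/part-00000.gz - | gunzip | head -n 3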

The dump is performed via MapReduce jobs, and the location of the bucket is expressed as an S3 path with the AWS public and private access keys embedded in it.

usage:
  $ td table:export <db> <table>

example:
  $ td table:export example_db table1 --s3-bucket mybucket -k KEY_ID -s SECRET_KEY

description:
  Dump logs in a table to the specified storage

options:
  -w, --wait                       wait until the job is completed
  -f, --from TIME                  export data which is newer than or the same as the TIME (unixtime e.g. 1446617523)
  -t, --to TIME                    export data which is older than the TIME (unixtime e.g. 1480383205)
  -b, --s3-bucket NAME             name of the destination S3 bucket (required)
  -p, --prefix PATH                path prefix of the file on S3 (exported datafiles are stored under this folder)
  -k, --aws-key-id KEY_ID          AWS access key id to export data (required)
  -s, --aws-secret-key SECRET_KEY  AWS secret access key to export data (required)
  -F, --file-format FILE_FORMAT    file format for exported data, either json.gz (default), line-json.gz or tar.gz
  -O, --pool-name NAME             specify resource pool by name
  -e, --encryption ENCRYPT_METHOD  export with server side encryption with the ENCRYPT_METHOD
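
For example, to export only records within a given time range and block until the job completes, combine the -f/-t and -w options (the prefix and unixtime values below are illustrative):

  $ td table:export example_db table1 \
     --s3-bucket mybucket \
     --prefix exports/table1 \
     -k KEY_ID -s SECRET_KEY \
     -f 1446617523 -t 1480383205 \
     -w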

Server-side Encryption Support

Server-side encryption protects data at rest. Our bulk export supports a subset of S3's server-side encryption methods.

The td table:export command with the --encryption ENCRYPT_METHOD option dumps all the data uploaded to TD into your encrypted storage. This option has been available in the td command since version 0.14.0.

The following command is an example of using x-amz-server-side-encryption: AES256 on S3:

  $ td table:export example_db table1 -F line-json.gz --s3-bucket mybucket -k KEY_ID -s SECRET_KEY --encryption s3
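
After the job completes, you can confirm that the exported objects were written with server-side encryption, for example with the AWS CLI's head-object call (the object key below is a placeholder):

  $ aws s3api head-object --bucket mybucket --key <EXPORTED_OBJECT_KEY>

The response should include "ServerSideEncryption": "AES256".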

