Learn more about Repro Import Integration.

You can use the Repro Export Integration connector to export files to your Repro Amazon S3 buckets, using customizable parameters to simplify configuration.

Prerequisites

  • Basic knowledge of Treasure Data, including the TD Toolbelt.

  • An S3 bucket and its region ID.

Using the TD Console to Create Your Connection

Create a New Connection

When you create a data connection, you must provide authentication to access the integration. In Treasure Data, you configure the authentication first and then specify the source information.

  1. Open TD Console.

  2. Navigate to Integrations Hub > Catalog.

  3. Search for and select Repro.

  4. The following dialog opens:



  5. Enter a name for your connection and click Done.



Configure Export Results in Your Data Connection

In this step, you create or reuse a query and configure the data connection in it. You may also need to define the column mapping in the query.

Configure the Connection by Specifying the Parameters

  1. Open TD Console.

  2. Navigate to Data Workbench > Queries.

  3. Select the query that you plan to use to export data.

  4. Click Export Results, located at the top of your query editor. The Choose Integration dialog opens. You can export the results through either an existing connection or a new one.

Use an existing connection

  1. Type the connection name in the search box to filter.

  2. Select your connection.

Create a new Repro connection

  1. Fill in the field values to create the new connection.

  2. Enter the required credentials for your new connection. Set the following parameters.

| Parameter | Description |
|---|---|
| Use AWS S3 Server-Side Encryption (optional) | Enables S3 server-side encryption. |
| Server-Side Encryption algorithm (optional) | The algorithm used for encryption. |
| Bucket (required) | The name of the S3 bucket. |
| File Path (required) | The full path of the file, including the file name and extension. For example: production/<app_id>/user-list/filename.csv.gz |
| Format (required) | The export file format (CSV). The file path must include the matching extension. |
| Header line (required) | The export file contains the column names as its first row. |
| Null String (optional) | The value that replaces null in the file. |
| End-of-line character (optional) | The character used to mark the end of a line in the file. |
| Quote Policy (required) | The quote policy for the file. |
| Compression (required) | Compresses the file with gzip. The file path must include the matching extension. |
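Because both Format and Compression require a matching extension in File Path, it can help to check a path before saving the configuration. The following is a minimal sketch, not part of the connector: the function name `check_file_path` and the app ID `my_app_id` are hypothetical, and the exact validation the connector performs is not documented here.

```python
def check_file_path(path, fmt="csv", compression="gz"):
    """Return a list of problems found in an export file path.

    Hypothetical helper: it only checks that the path ends with the
    extensions implied by the Format and Compression parameters.
    """
    problems = []
    # Build the suffix the parameters imply, for example ".csv.gz".
    expected_suffix = f".{fmt}" + (f".{compression}" if compression else "")
    if not path.lower().endswith(expected_suffix):
        problems.append(f"path should end with {expected_suffix!r}")
    # The last path segment must be more than just the extension.
    file_name = path.rsplit("/", 1)[-1]
    if file_name in ("", expected_suffix):
        problems.append("path is missing a file name")
    return problems


# "my_app_id" stands in for the <app_id> placeholder in the example path.
print(check_file_path("production/my_app_id/user-list/filename.csv.gz"))
print(check_file_path("production/my_app_id/user-list/filename.csv"))
```

An empty list means the path is consistent with the chosen format and compression.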

Here is a sample configuration:


Example of a Query to Populate Repro

From Treasure Data, run the following query with Export Results configured to write to a Repro connection:

Code Example

SELECT an_email_column AS EMAIL,
another_phone_column AS PHONE
FROM your_table;

Optional: Use of Scheduled Jobs for Export

You can use Scheduled Jobs with Result Export to periodically write the query results to a target destination that you specify.

Optional: Configure Export Results in Workflow

Within Treasure Workflow, you can specify the use of this data connector to export data.

timezone: UTC

_export:
  td:
    database: sample_datasets

+td-result-into-target:
  td>: queries/sample.sql
  result_connection: your_connections_name
  result_settings:
      type: repro
      bucket: bucket_name
      region: ap-northeast-2
      use_sse: true
      sse_algorithm: AES256
      auth_method: basic
      session_token: session_token
      path: /td-export-repro/file_output.csv.gz
      access_key_id: access_id
      secret_access_key: secret_key
      formatter: {type: csv, delimiter: "\t", newline: CRLF, newline_in_field: LF, charset: UTF-8,
      quote_policy: MINIMAL, quote: '"', escape: \, null_string: \N, default_timezone: UTC}
      encoders: {type: gzip}

Click here for more information on using data connectors in workflow to export data.
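To get a feel for what a file written with the formatter and encoder settings in the workflow example might look like, the following sketch approximates them locally using Python's standard csv and gzip modules. It is an illustration only, not the connector's implementation: the function name `render_preview` and the sample rows are made up, and the connector's exact quoting and escaping behavior may differ.

```python
import csv
import gzip
import io

NULL_STRING = r"\N"  # mirrors null_string in the formatter settings


def render_preview(header, rows):
    """Return gzip-compressed, tab-delimited bytes approximating the export."""
    text = io.StringIO(newline="")
    writer = csv.writer(
        text,
        delimiter="\t",             # delimiter: "\t"
        lineterminator="\r\n",      # newline: CRLF
        quoting=csv.QUOTE_MINIMAL,  # quote_policy: MINIMAL
        quotechar='"',              # quote: '"'
    )
    writer.writerow(header)
    for row in rows:
        # Replace missing values with the configured null string.
        writer.writerow([NULL_STRING if value is None else value for value in row])
    # encoders: {type: gzip}
    return gzip.compress(text.getvalue().encode("utf-8"))


# Made-up sample rows matching the EMAIL and PHONE columns of the example query.
preview = render_preview(
    ["EMAIL", "PHONE"],
    [["a@example.com", None], ["b@example.com", "555-0100"]],
)
print(gzip.decompress(preview).decode("utf-8"))
```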

