Skip to content
Last updated

Dynalyst Export Integration

Learn more about Dynalyst Import Integration.

You can export directly to Dynalyst. Dynalyst uses AWS S3 as a storage place, and the export process is similar to exporting job results to AWS S3.

Limitations

The query result limit for Result Output to S3 is 100GB. If the query result exceeds the limit, you see in the log:

The number of chunk for multipart upload is exceeded.

Try to split data by using a query or using the td table: export command.

Use the TD Console to Create your Connection

Completing Authentication Setup

  1. Review and complete the following in Dynalyst Import and Export Integration
    • Prerequisites
    • Create a New Connection
  2. Search for your Dynalyst Authentication.
  3. Select New Source.

Defining Export Results

Create or reuse a query. The export query defines the set of data that is exported to Dynalyst. You have the following options here:

  • Defining the Export Query Using TD Console
  • Defining the Export Query using TD Toolbelt

Sometimes you need to define the column mapping in the query.

Defining the Export Query Using TD Console

  1. Open TD Console.
  2. Navigate to Data Workbench > Queries.
  3. Access the query that you plan to use to export data
  4. Select Output Results. The Choose Saved Connection dialog opens.
  5. Type the connection name in the search box to filter and select your connection.
  6. Specify the parameters.

Defining the Export Query using TD Toolbelt

Using the TD Toolbelt to define your export query is optional. The export query defines the set of data that is exported to Dynalyst.

Considerations:

  • The access key and secret key must be URL encoded.
  • For security reasons, you might want to use AWS IAM to manage storage write access permissions.
  • You can specify the compression option (gz) in —result URL to compress the result.
  1. On the computer where you have TD Toolbelt installed and configured, open a terminal window.
  2. Use the TD Toolbelt query command to define a SQL query. For example, to compress the results of the query:
td query \
--result 'dynalyst://accesskey:secretkey@/bucketname/path/to/file.csv.gz?compression=gz' \
-w -d testdb \
"SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"

After the job is finished, the results are written to the S3 bucket with the given name and path.

Example of Uncompressed Data

Without the compression parameter, the job generates uncompressed data. For example:

$ td query \
  --result 'dynalyst://accesskey:secretkey@/bucketname/path/to/file.csv' \
  -w -d testdb \
  "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"

Encrypting Upload Data with AWS S3 Server-Side

You can encrypt upload data with AWS S3 Server-Side Encryption. You don’t need to prepare an encryption key. Data is encrypted at server side with 256-bit Advanced Encryption Standard (AES-256).

Use the server-side encryption bucket policy if you require server-side encryption for all objects that are stored in your bucket. When you have server-side encryption enabled, you don't have to turn on the use_sse option. However, Job Results might fail if your bucket policies to reject HTTP requests without encryption information.

td query \
--result 'dynalyst://accesskey:secretkey@/bucketname/path/to/file.csv?use_sse=true&sse_algorithm=AES256' \
-w -d testdb \
"SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"

Customizing the Final File Format

The default result output format is CSV RFC 4180. Output in TSV format is also supported. For both CSV and TSV formats, the following table lists options you can use to customize the final format of the files written into the destination:

NameDescriptionRestrictionsCSV defaultTSV default
formatMain setting to specify the file formatcsvcsv (Use ‘tsv’ to select the TSV format)
delimiterUse to specify the delimiter character, (comma)\t (tab)
quoteUse to specify the quote characternot available for TSV format“ (double quote)(no character)
escapeSpecifies the character used to escape other special charactersnot available for TSV format“ (double quote)(no character)
nullUse to specify how a ‘null’ value is displayed(empty string)\N (backslash capital n)
newlineUse to specify the EOL (End-Of-Line) representation\r\n (CRLF)\r\n (CRLF)
headerCan be used to suppress the column headercolumn header printed. Use ‘false’ to suppresscolumn header printed. Use ‘false’ to suppress

The following example shows a default sample output in CSV format when no customization is requested:

code,cnt
"200",4981
"302",
"404",17
"500",2

When the format=tsv, delimiter=“, and null=NULL options are specified:

td query \
--result 'dynalyst://accesskey:secretkey@/bucket_name/path/to/file.tsv?format=tsv&delimiter=%22&null=empty' \
-w -d testdb \
  "SELECT code, COUNT(1) AS cnt FROM www_access GROUP BY code"

The output changes to:

"code" "cnt"
"200" 4981
"302" NULL
"404" 17
"500" 2