Skip to content
Last updated

SFTP V2 Server Export Integration

This feature is in BETA version. For more information, contact your Customer Success Representative.

This integration enables uploading data to an SFTP server with support for

  • multiple file formats
  • data compression
  • server hand-shaking algorithms

Prerequisites and Limitations

Prerequisites

  • Basic knowledge of Treasure Data, including the toolbelt
  • An SFTP server

Limitation

  • Path prefix cannot contain any of the following characters: * ?

Static IP Address of Treasure Data Integration

If your security policy requires IP whitelisting, you must add Treasure Data's IP addresses to your allowlist to ensure a successful connection.

Please find the complete list of static IP addresses, organized by region, at the following link:
https://api-docs.treasuredata.com/en/overview/ip-addresses-integrations-result-workers/

Basic Usage

The following figure illustrates a typical usage scenario that involves exporting data from Treasure Data Agents into an SFTP or similar server.

Create a New Connection

You must create and configure the data connection on TD Console before running your query. As part of the data connection, you provide authentication to access the integration using the following steps:

  1. Open the TD Console.
  2. Navigate to Integrations Hub > Catalog.
  3. Search for SFTP_V2 and then select Create Authentication.
  4. Provide the connection credentials.

Authentication Fields

ParameterData TypeDescription
Host RequiredstringThe host information of the remote SFTP instance, for example, an IP address
Port RequiredstringThe connection port on the remote SFTP instance. The default port number is 22
User RequiredstringThe user name used to connect to the remote FTP instance
Authentication mode RequiredstringThis integration supports 2 types of authentication - Password:  Use a username and password for authentication - Public / Private key pair: use a public and private key pair for authentication
Password OptionalstringRequired if Password is selected as Authentication Mode
Secret key file OptionalstringRequired if public/private key pair is selected as Authentication Mode
Passphrase for secret key file OptionalstringThe passphrase of the provided secret file, for some algorithms that require
Retry limit RequiredintNumber of times to retry a failed connection (default 10)
Timeout RequiredintConnection timeout in seconds (default 600)
  1. Select Continue, enter a name for the Authentication.
  2. Select Done.

Configure a Query Result for Export

The TD Console supports multiple ways to export data. Complete the following steps to export data from the Data Workbench.

  1. Navigate to Data Workbench > Queries.
  2. Select New Query and define your query.
  3. Select Export Results and provide the data export configuration.
  4. Use an existing integration or create a new one.
  5. Configure the export parameters and select Done.

Export Configuration Parameters

Parameter Data Type Description
Is user directory Root? RequiredbooleanIf selected, the user directory is treated as the root directory (ex. '/home/treasure-data' as '/')
Path prefix RequiredstringThe file path where the file will be stored
Rename file after upload finish RequiredstringIf selected, the file is uploaded to the SFTP with a suffix .tmp, and the suffix will be removed after the data transfer is completed. Some data integration tools try to detect a file's existence to trigger their own task. This option is useful for such cases.
Format Requiredstring
  • csv - comma separated (default)
  • tsv - tab separated
Encoders Requiredstring
  • None - no encoder is applied (default)
  • GZ - the file is compressed using gzip before being uploaded
  • BZIP2 - the file is compressed using bzip2 before being uploaded
  • PGP Encryption - the file is encrypted using the public key before being uploaded
Public Key Required WHEN Encoder Is PGP encryptionstring
  • The public key is used to encrypt the file before being uploaded
Key Identifier Required WHEN Encoder Is PGP encryptionstring
  • Specifies the Key ID of the encryption subkey used to secure the file. The master key is excluded from the encryption process.
Amor optionalcheck box
  • Whether to use ASCII armor or not
Compression Type optionalstring
  • Defines the compression algorithm used to compress the file, which will be compressed before encryption for uploading to the SFTP server.
  • Note: Please ensure that you compress your file before encrypting and uploading. When you decrypt it, the file will return to a compressed format such as .gz or .bz2.
Header line? RequiredstringIf selected, the column names will be added as the first line.
Delimiter RequiredstringThe delimiter character:
  • Default - according to the selected Format
  • ,
  • Tab
  • |
Quote policy optionalstringThe policy for a quote:
  • ALL
  • MINIMAL: Add the quote character to only fields which contain delimiter, quote, or any of the end of line characters.
  • NONE
Null string RequiredstringHow null value of the result of the query displays:
  • Default - according to the selected Format
  • empty string
  • \N
  • NULL
  • null
End-of-line character RequiredstringThe EOL (end-of-line) character:
  • CRLF (default)
  • LF
  • CR
Temp filesize threshold RequiredlongIf the export data is bigger than this threshold, the upload is executed in batches with the batch size is the threshold itself. If the error channel is broken occurs, trying to reduce this figure might resolve the error.

Activate a Segment in Audience Studio

You can also send segment data to the target platform by creating an activation in the Audience Studio.

  1. Navigate to Audience Studio.
  2. Select a parent segment.
  3. Open the target segment, right-mouse click, and then select Create Activation.
  4. In the Details panel, enter an Activation name and configure the activation according to the previous section on Configuration Parameters.
  5. Customize the activation output in the Output Mapping panel.

  • Attribute Columns
    • Select Export All Columns to export all columns without making any changes.
    • Select + Add Columns to add specific columns for the export. The Output Column Name pre-populates with the same Source column name. You can update the Output Column Name. Continue to select + Add Columnsto add new columns for your activation output.
  • String Builder
    • + Add string to create strings for export. Select from the following values:
      • String: Choose any value; use text to create a custom value.
      • Timestamp: The date and time of the export.
      • Segment Id: The segment ID number.
      • Segment Name: The segment name.
      • Audience Id: The parent segment number.
  1. Set a Schedule.

  • Select the values to define your schedule and optionally include email notifications.
  1. Select Create.

If you need to create an activation for a batch journey, review Creating a Batch Journey Activation.

(Optional) Export Integration Using the CLI

The TD Toolbelt can trigger exporting the Query Result to SFTP from the CLI. You need to specify the parameters for the export job using the --result option of the td query command. For more information, refer to Querying and Importing Data to Treasure Data from the Command Line.

The format of the option is JSON and the general structure is as follows:

  • Authentication using user & password:
{
  "type": "sftp_v2",
  "host": "IP or host of SFTP server",
  "port": "The open port",
  "path_prefix": "/upload/2024Aug29/TC01.csv",
  "temp_file_threshold": "The size of tempfile",
  "timeout": 600,
  "user_directory_is_root": false,
  "rename_file_after_upload": true,
  "auth_method": "password",
  "username": "login user",
  "password": "password of user when auth_method is password"
}
  • Authentication using public/privater key-pair
{
  "type": "sftp_v2",
  "host": "IP or host of SFTP server",
  "port": "The open port",
  "path_prefix": "/upload/2024Aug29/TC01.csv",
  "temp_file_threshold": "The size of tempfile",
  "timeout": 600,
  "user_directory_is_root": false,
  "rename_file_after_upload": true,
  "auth_method": "key_pair",
  "secret_key_file": "Content of private key file",
  "secret_key_passphrase": "td@123"
}

CLI Parameters

Name Description Value Default Value Required
typeType must be sftp_v2StringNULLYES
hostThe address of the SFTP server specified as an IP address or domain name
  • 192.168.1.1
  • eu-west-1.sftpcloud.io
StringNULLYES
portThe port for the connection with the SFTP ServerNumber22NO
user_directory_is_rootOn some servers, users will have a home directory which is their root directory. For example, a user named john could have root directory name /home/john. If that user wanted to upload a file to a directory named /home/john/day1/my_data.csv, they would have two options:
  • set user_directory_is_root to true and use this path_prefix: /day1/my_data.csv
  • set user_directory_is_root to false and use this path_prefix: /home/john/day1/my_data.csv
StringFalseNO
path_prefixThe path where you want to upload the file on the SFTP server. The value of path_prefix depends on whether user_directory_is_root is true or false.StringNULLYES
rename_file_after_uploadDetermines whether the file is renamed after the upload completesBooleanFalseNO
temp_file_thresholdThe size of the temporary file to be used during export. When the temporary file size is reached, the integration reads and writes the data from that file to the remote file and the deletes the temporary file.Number5,368,709,120 (5 Gb)NO
auth_methodDetermines which authentication type will be used to authenticate with SFTP server:
  • Username/password
  • Public/Private key pair
Supported values:
  • password
  • key_pair
key_pairYES
usernameThe username that is used to authenticateStringNULLYES
passwordThe password that is use to authenticateStringNULLYES, when auth_method is password

secret_key_file

The private key file that is used to authenticate. The content of the file should be formatted as one line.

Example

td ... --result '{"type":"sftp_v2", "secret_key_file": "-----BEGIN OPENSSH PRIVATE KEY-----\nline1\nline2\nline3\n-----END OPENSSH PRIVATE KEY-----\n\n"}'

Content of private key file before formatting:

-----BEGIN OPENSSH PRIVATE KEY-----
line1
line2
line3
-----END OPENSSH PRIVATE KEY-----

After formatting

-----BEGIN OPENSSH PRIVATE KEY-----line1line2line3-----END OPENSSH PRIVATE KEY-----

String

NULL

YES, when auth_method is key_pair

secret_key_passphraseIf the private key was generated with a password, input that password here.StringNULLNO
sequence_formatThe suffix name of file being uploaded. For example, "%03d.%02d"StringBlankNO
formatThe output format of file being uploaded fileSupported values:
  • csv
  • tsv
csvNO
encodersSpecifies the encoder's type of the outputSupported values:
  • ""
  • gz
  • bzip2
  • encrypt_pgp
""NO
public_keyThe public key to use for encryptionString
key_identifierSpecifies the Key ID of the encryption subkey used to secure the file. The master key is excluded from the encryption process. (string, required)String
amorUse ASCII armor for the encrypted output (boolean)Boolean

compression_type

The compression type determines whether to compress the data before encrypting it.

Supported values

  • gzip
  • bzip2
  • none
  • zip_builtin
  • zlip_builtin
  • bzip2_builtin

Note: Please ensure that you compress your file before encrypting and uploading. When you decrypt it, the file will return to a compressed format such as .gz or .bz2.

delimiterColumn delimiterSupported values:
  • ","
  • "\t"
  • "tab"
  • "|"
","NO

quote_policy

The policy for handing quotes

Supported values:

  • ALL. If selected, all values are enclosed by double quotes ("").
  • MINIMAL. If selected, any value that contains an embedded quote (") is presented with a consecutive pair of quotes ("").
  • NONE. If selected, no escape for embedded quote is applied.

Example Code Policy ALL

Config:

quote_policy: ALL,
quote: "'",

Output

'id'|'column01'|'column02'|'column03'|'column04'|'column05'
'1'|'data01'|'data02'|'data03'|'data04'|'data05'

Example Code Policy MINIMAL

Config:

quote_policy: MINIMAL,
quote: "'",

Output

id|column01|column02|column03|column04|column05
1|data01|data02|data03|data04|data05

quote

Determines whether quoting will use single or double quotes.

Single Quote

Config

quote_policy: ALL,
quote: "'",

Output

'id'|'column01'|'column02'
'1'|'data01'|'data02'

Double Quote

Config

quote_policy: ALL,
quote: "\"",

Output

"id"|"column01"|"column02"
"1"|"data01"|"data02"
null_stringDetermines the default value for null column.Supported values:
  • ""
  • \N
  • NULL
""
newlineDetermines how new lines are started in CSV files.Supported values:
  • CRLF
  • LF
  • CR
CRLF

Example Usage

  1. Using password authentication

    td --database your_db --wait "SELECT email FROM (VALUES ('test01@test.com')) tbl(email)" \
    --type presto \
    --result '{"type":"sftp_v2","host":"your_ip","port": 22,"auth_method":"password","username":"user_name","password":"your_pass", "user_directory_is_root": true, "rename_file_after_upload": true,"path_prefix":"/sftp/2024aug/test.csv"}'
  2. Using password authentication and data compression

    td --database your_db --wait "SELECT email FROM (VALUES ('test01@test.com')) tbl(email)" \
    --type presto \
    --result '{"type":"sftp_v2","host":"your_ip","port": 22,"auth_method":"password","username":"user_name","password":"password","user_directory_is_root":true,"path_prefix":"/sftp/2024aug/test.csv","rename_file_after_upload":true,"format": "csv", "compression": "gz","header_line":true,"quote_policy":"ALL","delimiter":"\t","null_string":"","newline":"CRLF","temp_file_threshold":0,"sequence_format":".%01d%01d"}'
  3. Using Public/Private key authentication with password phrase

    Private key file content (password phrase is 123):

    -----BEGIN OPENSSH PRIVATE KEY-----
    line1
    line2
    line3
    -----END OPENSSH PRIVATE KEY-----

    Format for CLI:

    -----BEGIN OPENSSH PRIVATE KEY-----\nline1\nline2\nline3\n-----END OPENSSH PRIVATE KEY-----\n\n

    The CLI command:

    td --database your_db --wait "SELECT email FROM (VALUES ('test01@test.com')) tbl(email)" \
    --type presto \
    --result '{"type":"sftp_v2","host":"your_ip","port": 22,"auth_method":"key_pair","username":"user_name", "path_prefix": "/sftp/2024aug/test.csv","sequence_format":"","max_connection_retry":5,"secret_key_passphrase":"123","timeout":600,"secret_key_file":"-----BEGIN OPENSSH PRIVATE KEY-----\nline1\nline2\nline3\n-----END OPENSSH PRIVATE KEY-----\n\n"}'
  4. Create the new customer

    td --database luan_db --wait "SELECT customer_list_id,first_name,last_name,birthday,company_name,email,fax,gender,job_title,phone_business,phone_home,phone_mobile,second_name,credentials FROM (VALUES ('site_1','fname_201','lname_1','1990-05-12','Test Company',
  5. Using the encryption encoder

    td query -d luan_db -w "select * from luan_db.sfcc_500k" -T presto --result '{
      "type": "sftp_v2",
      "host": "your host server",
      "auth_method": "password",
      "username": "your user name",
      "password": "your password",
      "path_prefix": "cli/test_27_3_500K.csv.bz2.gpg",
      "file_ext": ".gpg",
      "rename_file_after_upload": false,
      "formatter": {
        "type": "csv",
        "quote_policy": "MINIMAL",
        "delimiter": ",",
        "null_string": "null",
        "newline": "\n"
      },
      "compression": "encrypt_pgp",
      "public_key": "your public key",
      "key_identifier": "your key identifier",
      "armor": true/false,
      "compression_type": "bzip2/gzip/none"
    }'

(Optional) Other considerations

Within Treasure Workflow, you can specify the use of this integration to export data. For sample workflows, view Treasure Boxes.

  • The Result Export can be scheduled to upload data to a target destination periodically
  • Most import and export integrations can be added to a TD Workflow to be involved in a more advanced data pipeline. Refer here for more detailed information.

References

FAQ for the SFTP Data Connector

Q: I cannot connect to my SFTP server. What can I do?

A: Here are some things to check:

  • Verify that your protocol is valid. If you intended to use SFTP, you can use this integration. If you intended to use FTP/FTPS, try to connect with the FTP Server Import Integration.
  • If you are using a firewall, check your accepted IP range and port. Server administrators sometimes change the default port number from TCP 22 for security reasons.
  • Be sure that your private key has an OpenSSH format. Treasure Data does not support other formats like “PuTTY”.
  • Treasure Data does not support the default format of the private key since OpenSSH 7.8. Re-generate the key using '-m PEM' option.

Q: I cannot find my uploaded file on the SFTP server. What happened?

A: If your SFTP server does not support Append mode, the integration may not be able to update the file with the error "SFTP Exception: no such file."  For stability reasons, the integration uploads the file content in batches. Consequently, you will need to enable the Append mode on your SFTP server. If Append mode cannot be enabled on your SFTP server, you will need to set temp_file_threshold to 0 to bypass this feature.

Note: Please ensure that you compress your file before encrypting and uploading.

  1. When you decrypt using non-built-in encryption, the file will return to a compressed format such as .gz or .bz2.

  2. When you decrypt using built-in encryption, the file will return to raw data.