Use this data connector to directly import data from your FTP server to Treasure Data.

For sample workflows on importing data from your FTP server, view Treasure Boxes.


Prerequisites

Requirements

About Incremental Data Loading


Use the TD Console to Create Your Connection

Create a New Connection

In Treasure Data, you must create and configure the data connection prior to running your query. As part of the data connection, you provide authentication to access the integration.

Open TD Console.
Navigate to Integrations Hub > Catalog.
Search for and select FTP.

Select Create Authentication.


Enter the required credentials for your remote FTP instance. Depending on your selections, the fields you see might vary:


Field Description
Host

The host information of the remote FTP instance, for example, an IP address.

Port

The connection port on the remote FTP instance. The default is 21.

User

The user name used to connect to the remote FTP instance.

Password

The password used to connect to the remote FTP instance.

Passive mode

Use passive mode (boolean, default: checked).

ASCII mode

Use ASCII mode instead of binary mode (boolean, default: unchecked)

Use FTPS/FTPES

Use FTPS (SSL encryption). (boolean, default: unchecked)

Verify cert

Verify the certification provided by the server. By default, the connection fails if the server certificate is not signed by one of the CAs in JVM's default trusted CA list.

Verify hostname

Verify that the server's hostname matches the provided certificate.

Enable FTPES

FTPES is a security extension to FTPS.
SSL CA Cert Content

Paste the contents of the certificate file.


Select Continue.
Enter a name for your connection.
Select Continue.
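
The console stores these credentials for you, but the same fields also map onto a td_load connector configuration. A minimal sketch, assuming placeholder host and credentials (option names follow the Embulk FTP input plugin and may differ in your environment):

```yaml
in:
  type: ftp
  host: ftp.example.com     # Host (placeholder)
  port: 21                  # Port (default 21)
  user: my_user             # User (placeholder)
  password: my_password     # Password (placeholder)
  passive_mode: true        # Passive mode (default: checked)
  ascii_mode: false         # ASCII mode (default: unchecked)
  ssl: false                # Use FTPS/FTPES (default: unchecked)
  ssl_verify: true          # Verify cert
```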



Transfer Your Data to Treasure Data


After creating the authenticated connection, you are automatically taken to Authentications.


Search for the connection you created. 
Select New Source.
Type a name for your Source in the Data Transfer field.
Select Next.

The Source Table dialog opens.

Edit the following parameters:
Parameter Description
Path prefix

The prefix of target files (string, required). 

For example, resultoutputtest.

Path regex

Type a regular expression to match file paths. If a file path doesn't match the specified pattern, the file is skipped. For example, if you specify the pattern .csv$, any file whose path doesn't end in .csv is skipped.

Incremental

Enables incremental loading (boolean, optional, default: true). If incremental loading is enabled, the config diff for the next execution includes the last_path parameter so that the next execution skips files before that path. Otherwise, last_path is not included.

Start after path

Only paths lexicographically greater than this value are imported.
Select Next.
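
Taken together, the source-table parameters above correspond to connector options like the following. A minimal sketch using the example values from the table (the .csv$ pattern is illustrative; option names follow the Embulk FTP input plugin and may differ):

```yaml
in:
  type: ftp
  path_prefix: resultoutputtest   # Path prefix: target files must start with this
  path_match_pattern: ".csv$"     # Path regex: non-matching paths are skipped
  incremental: true               # Incremental: record last_path in the config diff
```

After a successful incremental run, the config diff includes a last_path entry, and the next execution imports only paths lexicographically greater than it.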

You can modify the Data Settings page for your needs, or skip it.


Optionally, edit the parameters.


Select Next.


Filters 


Data Preview 



Data Placement


Validate Connection

Review the job log. Warnings and errors provide information about the success of your import. For example, you can identify the source file names associated with import errors.

Optionally Configure Export Results in Workflow

Within Treasure Workflow, you can specify the use of this data connector to export data.

Learn more at Using Workflows to Export Data with the TD Toolbelt.

Example Workflow for FTP


timezone: UTC

schedule:
  daily>: 02:00:00

sla:
  time: 08:00
  +notice:
    mail>: {data: Treasure Workflow Notification}
    subject: This workflow is taking a long time to finish
    to: [meg@example.com]

_export:
  td:
    dest_db: dest_db
    dest_table: dest_table
  ftp:
    ssl: true
    ssl_verify: false

+prepare_table:
  td_ddl>:
  database: ${td.dest_db}
  create_tables: ["${td.dest_table}"]

+load_step:
  td_load>: config/daily_load.yml
  database: ${td.dest_db}
  table: ${td.dest_table}
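
The +load_step task above reads its connector settings from config/daily_load.yml. A minimal sketch of what that file might contain, reusing the authentication and source fields described earlier (host, credentials, and path prefix are placeholders):

```yaml
in:
  type: ftp
  host: ftp.example.com
  port: 21
  user: my_user
  password: my_password
  ssl: true                 # matches ftp.ssl exported in the workflow
  ssl_verify: false         # matches ftp.ssl_verify exported in the workflow
  path_prefix: resultoutputtest
  incremental: true
out:
  mode: append
```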