
Workflows Introductory Tutorial

In this tutorial you will run a workflow that imports data into Treasure Data, runs a Presto query, and sends the query results out of Treasure Data.

Data Connector

You use data connectors to import data into Treasure Data. For this tutorial, you will use the SFTP Data Connector.
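The project's imports/sftp_load.yml is not reproduced in this tutorial. As a rough, non-operational sketch of what an SFTP Data Connector config looks like (every value below is a placeholder, and the exact fields available depend on the SFTP input plugin version):

```yaml
in:
  type: sftp
  host: sftp.example.com        # placeholder host
  port: 22
  user: wf_test                 # placeholder user
  password: "***"               # placeholder credential
  path_prefix: /data/tutorial   # placeholder remote path
  parser:
    type: csv
    columns:
      - {name: id, type: long}
      - {name: value, type: string}
```

It is a good idea to test a configuration like this as a standalone Data Connector job before wiring it into a workflow.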

Result Export

For exporting data out of Treasure Data, you will use our Result Output to run a query and submit the query results to another service. In this tutorial, you will use Result Output to SFTP.

TD Workflows Introductory Tutorial

If you haven’t already, complete the Introductory Tutorial to install TD Workflows and submit a sample workflow project to Treasure Data.


Download a Project and Review the Workflow

Download a Project

# Download the sample project
$ cd ~/Downloads
$ curl -o ftp_pull_push.tar.gz -L

# Extract the downloaded project
$ tar xzvf ftp_pull_push.tar.gz

# Enter the workflow project directory
$ cd ftp_pull_push

The workflow you downloaded is as follows:

_export:
  td:
    database: workflow_temp

+load:
  td_load>: imports/sftp_load.yml
  table: pull_push_tutorial

+count:
  td>: queries/count_rows.sql
  result_url: ftp://wf_test:***

The Data Connector yml configuration included in this project is not operational. You must replace the provided yml with your own configuration to run this project yourself.

Data Connector Operator

The first difference you may notice is the use of the td_load> operator. This operator runs a Data Connector job with a pre-created yml config.

The td_load> operator needs both the database: and table: defined somewhere in the workflow file.

  td_load>: imports/sftp_load.yml
  table: pull_push_tutorial
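In this project, the database: value is not set on the task itself; it is supplied once at the top of the workflow so that every td task inherits it. A minimal sketch of that fragment, using digdag's _export mechanism:

```yaml
_export:
  td:
    database: workflow_temp   # database used by td_load> and td> tasks below
```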

Create required tables

To run any Data Connector job in Treasure Data, you must first create the database and table that you're loading data into. Use the following command:

$ td table:create workflow_temp pull_push_tutorial

TD Query Operator with Result Output

For exporting data out of Treasure Data, you run a normal Presto query using the td> operator and include the result_url: parameter to define the third-party destination for the query results.

  td>: queries/count_rows.sql
  result_url: ftp://wf_test:***
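The query file itself is not reproduced in this tutorial; given its name, queries/count_rows.sql is presumably a simple row count over the imported table, something like:

```sql
-- Hypothetical contents of queries/count_rows.sql;
-- the table name matches the td_load step's table: parameter.
SELECT COUNT(1) AS row_count
FROM pull_push_tutorial
```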
You can also use a saved result output (a "favorite") that you created in the console.

If you use a saved output, the configuration looks like this:

  td>: queries/count_rows.sql
  result_url: my_saved_output_name

To create a saved output:

  1. Go to create a new query.
  2. Select "Add" in the result export section.
  3. Under "Export to:", choose the service that you want to send your data to.
  4. Enter your credentials and a name for the output.
  5. Click "Add to Favorites".

Now you can use the text string of your new favorite output within your workflow.

Submit Workflow to Treasure Data

$ td wf push ftp_pull_push

Execute the Workflow on Treasure Data

When learning to use TD Workflows, it can be helpful to also follow along on the Jobs page in the Treasure Data console while running your workflows.

Run the following command to execute the workflow you already submitted to Treasure Data:

$ td wf start ftp_pull_push pull_push --session now

And that’s it! Now try creating your own.


If you have any feedback, we'd welcome your thoughts on our TD Workflows ideas forum.

Also, if you have any ideas or feedback on the tutorial itself, we’d welcome them here!

Last modified: May 08 2017 04:52:45 UTC
