In this tutorial, you use the command line interface to run your first workflow: two Treasure Data Presto jobs, one running right after the other.
Update TD Toolbelt and Install TD Workflow
Use the TD Toolbelt to interact with Treasure Data’s many services. If it is not already installed and configured, complete the following steps in your terminal.
Complete the instructions in Installing and Updating.
Create the workflow_temp Database
Create the following database in your Treasure Data account.
Run this command using TD Toolbelt.
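The command for this step is not reproduced here. TD Toolbelt's subcommand for creating a database is td db:create, so the step is presumably along the lines of the sketch below. The guard around the call is an addition, assumed here so that the snippet fails gently when td is not installed.

```shell
# Database name taken from the tutorial text.
DB_NAME="workflow_temp"

# Assumption: TD Toolbelt is installed and already configured for your account.
if command -v td >/dev/null 2>&1; then
  td db:create "$DB_NAME"
else
  echo "td not found on PATH; install TD Toolbelt first" >&2
fi
```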
The workflow_temp database is created.
Download the Sample Workflow Project
Download your first workflow project directory. The download includes a sample workflow and Presto SQL commands.
Navigate to the download directory.
Download the sample workflow project.
Extract the project.
Navigate to the workflow project directory:
Review the Contents of Your Workflow File
Print the contents of the workflow file.
Verify that the printed workflow is made up of three sections: timezone, export, and tasks. For example:
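The sample file's exact contents are not reproduced here. As a hedged sketch, a workflow with the structure this tutorial describes might look like the file written below. The database name, table names, and 07:00 UTC daily schedule come from the tutorial text; the task names (task1, task2), the query file paths, and the scratch file name are assumptions made for illustration.

```shell
# Write an illustrative workflow definition to a scratch directory.
# This is a sketch, not the downloaded sample: task names and query paths are assumed.
SKETCH_DIR="$(mktemp -d)"
cat > "$SKETCH_DIR/sketch.dig" <<'EOF'
# Section 1: timezone and schedule
timezone: UTC

schedule:
  daily>: 07:00:00

# Section 2: the Treasure Data database the tasks run against
_export:
  td:
    database: workflow_temp

# Section 3: two tasks, run one after the other
+task1:
  td>: queries/daily_open.sql
  create_table: daily_open

+task2:
  td>: queries/monthly_open.sql
  create_table: monthly_open
EOF

# Print the sketch so it can be compared with the real workflow file.
cat "$SKETCH_DIR/sketch.dig"
```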
In section 1, you see the schedule: the interval on which the workflow runs.
In section 2, you see how to specify the Treasure Data database for which the workflow will run.
In section 3, you see that the workflow definition has two tasks.
+ signifies a new task. The text that follows, up to the :, is the name you give the task.
td> signifies that the query that follows runs against Treasure Data. By default, the query runs as a Presto query. The > marks where the “action” part of the task is defined: the specific processing to run.
The create_table:___ parameter drops the named table if it exists, then creates a new table from the output of the task’s query.
Run the Workflow
Typically, you develop a workflow by editing it on your local machine. You create the workflow definition and execution pattern locally, while the workflow steps themselves all run within the TD environment, so you can run and iterate on them from your machine.
This workflow creates two tables in the workflow_temp database: daily_open and monthly_open.
Optionally, before running your first workflow, open your TD Console Job Activities page so you can see the execution when it happens.
Run the sample workflow once from your local machine.
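A sketch of this step follows. It assumes that you are in the project directory and that the workflow file there is named nasdaq_analysis.dig, so the workflow name passed to the run subcommand is nasdaq_analysis. Neither name is confirmed by the text above. The guard is an addition for safety.

```shell
# Assumed workflow name (the .dig file name without its extension).
WORKFLOW="nasdaq_analysis"

# Run the workflow once from the local machine; the queries execute in Treasure Data.
if command -v td >/dev/null 2>&1; then
  td workflow run "$WORKFLOW"
else
  echo "td not found on PATH; install TD Toolbelt first" >&2
fi
```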
Review the TD Console Job Activities page for nasdaq_analysis.
Use the command line to verify that the daily_open table was created as expected:
Use the command line to verify that the monthly_open table was created as expected:
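The two verification steps above might look like the following sketch, which uses TD Toolbelt's table:show subcommand to describe each table. The choice of table:show is an assumption, since the original commands are not shown; any command that lists or describes the tables would serve the same purpose.

```shell
# Database and table names taken from the tutorial text.
DB_NAME="workflow_temp"
TABLES="daily_open monthly_open"

if command -v td >/dev/null 2>&1; then
  for TABLE in $TABLES; do
    # Describe each table the workflow is expected to have created.
    td table:show "$DB_NAME" "$TABLE"
  done
else
  echo "td not found on PATH; install TD Toolbelt first" >&2
fi
```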
Register and Schedule the Workflow
Scheduling workflows to run on a regular basis is a common task. Your workflow already contains the schedule definition.
Review the scheduling syntax in your workflow:
Register the workflow with Treasure Data.
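As a hedged sketch, registration is done with the push subcommand, run from the project directory. The project name nasdaq_analysis is an assumption based on the sample's name; you can register the project under a different name.

```shell
# Assumed project name to register the workflow under.
PROJECT="nasdaq_analysis"

# Upload the project in the current directory so its schedule takes effect.
if command -v td >/dev/null 2>&1; then
  td workflow push "$PROJECT"
else
  echo "td not found on PATH; install TD Toolbelt first" >&2
fi
```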
The workflow will run every day at 7 am UTC.
List the Workflows Registered with Treasure Data
From the command line, you can list all the workflows defined in your Treasure Data environment.
To retrieve a list of projects and workflows, type the following:
Use the following syntax to see the definition of your submitted workflow:
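The two listing commands are not reproduced above. Assuming the project was registered as nasdaq_analysis, they might look like the sketch below: the workflows subcommand with no argument lists every project and workflow, and with a project name it shows that project's workflow definition.

```shell
# Assumed project name used when the workflow was registered.
PROJECT="nasdaq_analysis"

if command -v td >/dev/null 2>&1; then
  # List all registered projects and their workflows.
  td workflow workflows
  # Show the definition of the workflows in one project.
  td workflow workflows "$PROJECT"
else
  echo "td not found on PATH; install TD Toolbelt first" >&2
fi
```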