Scheduling Workflows

Introduction

This tutorial shows you how to create a schedule for your workflow, and then submit the schedule to Treasure Data.

The onboarding tutorial also partially describes the workflow functionality. If you haven’t yet installed and started using TD Workflows, you might want to complete the onboarding tutorial before this one.

Tutorial

Add schedule information to your workflow

To add a schedule to your workflow, add the following at the top of your workflow file.

timezone: UTC

schedule:
  daily>: 07:00:00
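
As a minimal sketch of how the schedule settings sit at the top of a complete workflow file, the example below places them ahead of a single task. The task name +say_hello and its message are hypothetical placeholders, and echo> is used only to keep the example short.

```yaml
# Schedule settings come first, at the top of the workflow file.
timezone: UTC

schedule:
  daily>: 07:00:00

# Hypothetical task for illustration; replace it with your own tasks.
+say_hello:
  echo>: "Scheduled run started"
```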

Timezone

The default value is UTC. Specify the time zone using a tz database time zone name. Some examples of valid time zones are America/Los_Angeles, Europe/Berlin, and Asia/Tokyo.
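
To illustrate, the following sketch sets a non-default time zone. It assumes, purely for illustration, that the job should run at 7:00 a.m. Tokyo time rather than 7:00 a.m. UTC, with the schedule time read in the time zone given by timezone.

```yaml
# Assumption for illustration: the job should follow Tokyo local time.
timezone: Asia/Tokyo

schedule:
  # 07:00 in Asia/Tokyo (UTC+9), which is 22:00 UTC on the previous day.
  daily>: 07:00:00
```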

Schedule

You can choose one of the following options:

Syntax: minutes_interval>: M
Description: Run this job every M minutes.
Example:
minutes_interval>: 30

This example specifies that the job runs every 30 minutes. For example, if the job started at 6:10 a.m., then the job runs again at 6:40, 7:10, 7:40, and so on.

Syntax: hourly>: MM:SS
Description: Run this job every hour at MM:SS (Hourly, +MM mins SS secs).
Example:
hourly>: 25:00
OR
hourly>: 25

Hourly, +25 minutes

This example specifies that the job runs every hour, 25 minutes into the hour. For example, 8:25, 9:25, 10:25, and so on.

Syntax: daily>: HH:MM:SS
Description: Run this job every day at HH:MM:SS (Daily, @HH:MM:SS AM/PM).
Example:
daily>: 13:30:00
OR
daily>: 13:30

This example specifies that the job runs every day at 1:30 p.m.

Tip: If you want to run your job at midnight each day, specify 00:00. If you want to specify 30 minutes past midnight, enter 00:30. If you want to specify 30 minutes past noon, enter 12:30.

Syntax: weekly>: DDD,HH:MM:SS
Description: Run this job every week on DDD at HH:MM:SS (Every DDD, @HH:MM:SS AM/PM).
Example:
weekly>: Sun,09:00:00
OR
weekly>: Sun,09:00
OR
weekly>: Sun,09

This example specifies that the job runs every week on Sunday at 9:00 a.m.

Syntax: monthly>: D,HH:MM:SS
Description: Run this job every month on day D at HH:MM:SS (Every D of month, @HH:MM:SS AM/PM).
Example:
monthly>: 1,09:15:00
OR
monthly>: 1,09:15

This example specifies that the job runs on the first day of each month at 9:15 a.m. If you wanted to specify 9:15 p.m., you would type:
monthly>: 1,21:15

Syntax: cron>: CRON
Description: Use cron format for complex scheduling.
Example:
cron>: 42 4 1 * *

This example specifies minute 42, hour 4, and day 1 of the month, so the job runs at 4:42 a.m. on the first day of each month.
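
As a further sketch of the cron format, assuming the standard five-field layout (minute, hour, day of month, month, day of week), the following hypothetical schedule would run a job at 9:00 a.m. on weekdays. The expression is illustrative only; check it against your own requirements before relying on it.

```yaml
schedule:
  # minute 0, hour 9, any day of month, any month, Monday through Friday
  cron>: 0 9 * * 1-5
```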

Tip: You are not required to specify hours, minutes, or seconds (HH, MM, or SS). You might even save some processing time if you omit them. For example, if you specify daily, the job runs once per day: it runs, and then runs again 24 hours later. If you specify weekly, the job runs once per week: it runs, and then runs again 7 days later at the same time of day as the initial run.

Submit the workflow to Treasure Data to run on a scheduled basis

Now that you’ve created a workflow, you want it to run as you scheduled. Run this command to submit the workflow to Treasure Data:

$ td wf push <project_name>

That’s it! Now your workflow will run at the scheduling interval you set.

List the workflows registered on Treasure Data

$ td wf workflows

Find out what workflows are scheduled to run next on Treasure Data

$ td wf schedules

Incremental Processing Workflows

Refer to the following topics if you want to create workflows that process data incrementally:

Feedback

If you have any feedback, we welcome hearing your thoughts on our TD Workflows ideas forum.

Also, if you have any ideas or feedback on the tutorial itself, provide your comments here.


Last modified: Apr 19 2018 23:59:47 UTC
