# Scheduling Using the TD Toolbelt Treasure Data has a scheduler feature that supports periodic query execution. ## Prerequisites * Basic knowledge of Treasure Data, including [TD Toolbelt](/tools/cli-and-sdks/td-toolbelt). * A table with some data. ## Create a New Schedule Using TD Toolbelt A new schedule can be created using the td sched:create command. The name of the schedule, cron-style schedule, query, and the database name are required. ```bash td sched:create [sql] ``` ```bash options: -d, --database DB_NAME use the database (required) -t, --timezone TZ name of the timezone. Only extended timezones are supported. -D, --delay SECONDS delay time of the schedule -r, --result RESULT_URL write result to the URL (see also result:create subcommand) -u, --user NAME set user name for the result URL -p, --password ask password for the result URL -P, --priority PRIORITY set priority -q, --query PATH use file instead of inline query -R, --retry COUNT automatic retrying count -T, --type TYPE set query type (hive) ``` **For example:** ![](/assets/scheduling-using-the-td-toolbelt-2024-02-13.420ebe5b77f51fdf190b0e6fa85b4a3ab5d9d7f9881bea6d9c0badcaa944e5e4.b40d6bf5.png) ## Setting the Priority of Scheduled Jobs using TD Toolbelt To change the priority of a job, use: `td sched:update -P <#>` | **Priority** | **syntax option value** | | --- | --- | | very high | -P 2 | | normal | -P 0 | | very low | -P -2 | For example: `td sched:update 111_beta_query_01 -P 2` ## Example: Periodic Aggregation A common pattern is to periodically summarize data from a main table into another table. The example aggregates web request results from an access log on an hourly basis. It makes use of several common functions for Presto (or for Hive) to set the proper time range for the aggregation. ```sql SELECT USER, code, method, PATH, agent, HOST, AVG(SIZE) FROM www_access WHERE TD_TIME_RANGE(time, TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1h'), TD_SCHEDULED_TIME()) GROUP BY USER, code, method, PATH, agent, HOST ``` To finish setting up the query to run on an hourly basis, create a scheduled query : ```bash $ td sched:create \ hourly_agg \ "@hourly" \ -d testdb \ -D 1800 \ "SELECT USER, code, method, PATH, agent, HOST, AVG(SIZE) FROM www_access WHERE TD_TIME_RANGE(time, TD_TIME_ADD(TD_SCHEDULED_TIME(), '-1h'), TD_SCHEDULED_TIME()) GROUP BY USER, code, method, PATH, agent, HOST" ``` In the preceding example, @hourly is the same as the cron `0 * * * *`. The -D parameter allows you specify a delay time in seconds. In the example, a 1800 seconds (or 30 minutes delay) is set. The job will execute 30 minutes after each hour. ## Viewing Schedules using TD Toolbelt To see the list of existing scheduled entries, use td sched:list. ## Viewing the History of a Job td sched:history shows the job history of a schedule entry. To investigate the results of each individual job, use td job ``. To get a list of jobs that run according to a defined schedule, use: `td sched:list` To view the history of one of the jobs with a defined schedule, use: `td sched:history [max]` **Example:** ![](/assets/scheduling-using-the-td-toolbelt-2024-02-13-1.eb750a8970fbbd75914f0a00e2946e28c22fa96dcb824bba2ec1cc67dc63bea6.b40d6bf5.png) ## Update the Schedule Most of the settings for scheduled queries can be updated or reconfigured at any time. The ‘next scheduled run time’ (also referred to as ‘Next’ or ‘Next schedule’) is an inferred configuration that is auto updated based on the current time of your browser's timezone, cron schedule, and delay setting. When a scheduled query is updated, the updated settings will apply immediately to the next scheduled run, occurring at ‘next scheduled run time + delay’ (with delay >0, <0 or 0). `td sched:update ` Example: `td sched:update sched1 -s "0 */2 * * *" -d my_db -t "Asia/Tokyo" -D 3600` options: ``` -n, --newname NAME change the schedule's name -s, --schedule CRON change the schedule -q, --query SQL change the query -d, --database DB_NAME change the database -r, --result RESULT_URL change the result target (see also result:create subcommand) -t, --timezone TZ name of the timezone. Only extended timezones are supported. -D, --delay SECONDS change the delay time of the schedule -P, --priority PRIORITY set priority -R, --retry COUNT automatic retrying count -T, --type TYPE set query type (hive) ``` ## Unschedule the Job To unschedule a specific job, use: `"" instead of "*****"` For example: `$ td sched:update sched1 -s "" -d my_db -t "Asia/Tokyo" -D 3600` or `$ td sched:create non_scheduled_query "" "select count(*) from www_access" -d sample_datasets` ## Delete the Schedule To remove the schedule from a specific job, use: `td sched:delete ` For example: `$ td sched:delete hourly_count`