In TD Workflow, tasks run sequentially by default. To run tasks in parallel, set the parallel parameter to true (_parallel: True). It is also recommended that you use the +group syntax to group the tasks that you want to run in parallel.

You can define as many tasks as you want to run in parallel; however, TD can run only up to 10 separate processing threads at a given time. A task can be in one of four states, only one of which is Running. As long as no more than ten tasks are in the Running state at the same time, each of those tasks is executed in parallel.
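The effect of the cap can be sketched with a little arithmetic. The Python snippet below is only an illustration of the limit described above, not TD's scheduler: the function names are hypothetical, and the batch estimate assumes every task takes the same amount of time, which real workflows rarely do.

```python
import math

MAX_RUNNING = 10  # concurrency cap stated in the text above


def running_and_waiting(parallel_tasks, cap=MAX_RUNNING):
    """Split a count of parallel tasks into (running, not yet running)."""
    running = min(parallel_tasks, cap)
    return running, parallel_tasks - running


def batches_needed(parallel_tasks, cap=MAX_RUNNING):
    """Rounds needed if every task takes the same time (a simplification)."""
    return math.ceil(parallel_tasks / cap)


print(running_and_waiting(25))  # (10, 15)
print(batches_needed(25))       # 3
```

With 25 tasks defined in one parallel group, 10 can be Running at once and the remaining 15 must wait for a free slot.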


The workflow _parallel option also accepts a limit on the number of tasks to run in parallel. With _parallel: true alone, you cannot limit how many tasks run at once, so a task that has many subtasks can run them all in parallel and block the task queues. Setting a limit gives you control over parallel tasks and helps you manage resource utilization.

Parallel Task Order Example

The following example groups three tasks and runs them in parallel.

+group:
  _parallel: True
  +step1:
    echo>: "hello!"
  +step2:
    py>: tasks.printMessage
    docker:
      image: "digdag/digdagpython:3.6.8stretch"
  +step3: 
    td_run>: my_other_saved_query

To cap the number of tasks that run at the same time, set limit under _parallel. In the following example, at most two of the three tasks run at once.

+group:
  _parallel:
    limit: 2
  +step1:
    echo>: "hello!"
  +step2:
    py>: tasks.printMessage
    docker:
      image: "digdag/digdagpython:3.6.8stretch"
  +step3: 
    td_run>: my_other_saved_query
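
As a rough analogy for what a limit of 2 means, the Python sketch below runs three stand-in tasks through a pool of two workers and records the highest number that were active together. It does not show how TD Workflow is implemented; run_step, LIMIT, and the state dictionary are hypothetical names used only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LIMIT = 2  # mirrors `limit: 2` in the workflow above

lock = threading.Lock()
state = {"active": 0, "peak": 0}


def run_step(name):
    """Stand-in for a workflow task; tracks how many steps run at once."""
    with lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.05)  # pretend to do some work
    with lock:
        state["active"] -= 1
    return name


# The pool never runs more than LIMIT steps at the same time.
with ThreadPoolExecutor(max_workers=LIMIT) as pool:
    finished = list(pool.map(run_step, ["step1", "step2", "step3"]))

print(finished)       # pool.map returns results in input order
print(state["peak"])  # never greater than LIMIT
```

The third step starts only after one of the first two frees a worker, which is the behavior the limit setting is meant to provide.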

Default Task Order Example

By default, tasks run in the order in which they are defined. In the following workflow, task step1 is executed first. Because step2 includes a custom script, the Python program tasks.printMessage runs in a separate Docker container. When it completes, the subsequent task, step3, executes.

+step1:
  echo>: "hello!"
+step2:
  py>: tasks.printMessage
  docker:
    image: "digdag/digdagpython:3.6.8stretch"
+step3:
  td_run>: my_other_saved_query
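
The workflow refers to a Python function, tasks.printMessage, without showing it. The following is a minimal sketch of what such a script might contain, assuming it is saved as tasks.py alongside the workflow definition. The message text and the build_message helper are illustrative and do not come from the original.

```python
# tasks.py (hypothetical contents; the source only names tasks.printMessage)


def build_message(name="TD Workflow"):
    """Return the text to log; kept separate so it is easy to test."""
    return f"hello from {name}!"


def printMessage():
    """Entry point referenced by `py>: tasks.printMessage` in step2."""
    print(build_message())
```

Anything the function prints is written to the task log, so a short message like this is a simple way to confirm that the step ran.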

TD Workflow runs until all tasks complete. Long-running tasks that exceed the set TTL (Time To Live) are timed out, even if the task is executing a custom script. The task is polled periodically to check the status of the script running in the container, and you can view log messages in the TD Console or from the CLI using the "td wf log <attemptid>" command.
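
The general pattern of polling a task until it finishes or its TTL expires can be sketched in Python as follows. This is an illustration of the idea only, not TD's implementation: wait_for_task, check_done, and the interval values are hypothetical, and the source does not state the actual TTL or polling frequency.

```python
import time


def wait_for_task(check_done, ttl_seconds, poll_interval=0.01):
    """Poll check_done() until it returns True or ttl_seconds elapse.

    Returns "completed" or "timed_out" (illustrative labels).
    """
    deadline = time.monotonic() + ttl_seconds
    while time.monotonic() < deadline:
        if check_done():
            return "completed"
        time.sleep(poll_interval)  # wait before checking again
    return "timed_out"


# A fake script that reports completion on the third status check.
calls = {"count": 0}


def finishes_on_third_poll():
    calls["count"] += 1
    return calls["count"] >= 3


print(wait_for_task(finishes_on_third_poll, ttl_seconds=1.0))  # completed
print(wait_for_task(lambda: False, ttl_seconds=0.05))          # timed_out
```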

