In this article, we create an error in a workflow to guide you through the process of troubleshooting a workflow that you’ve submitted to Treasure Data.


Prerequisites

Introductory Tutorial

If you haven’t already, start by going through the TD Workflows Introductory Tutorial.

You will download and use the workflow project from that tutorial.

Create an error to debug

Navigate to the `nasdaq_analysis` directory from the introductory tutorial.

...

```
2016-05-11 16:40:24 +0900: Digdag v0.6.1
Session attempts:
  attempt id: 100
  uuid: ef704e1f-3eb5-4ba7-9be0-4ebfaeee4424
  project: nasdaq_analysis
  workflow: nasdaq_analysis
  session time: 2016-05-11 07:38:15 +0000
  retry attempt name:
  params: {"td":{"apikey":"..."},"last_session_time":"2016-05-11T00:00:00+00:00","next_session_time":"2016-05-12T00:00:00+00:00"}
  created at: 2016-05-11 16:38:17 +0900
  kill requested: false
  status: error
```

Troubleshooting

Determine what tasks failed

In the example above, the attempt id is 100.
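If you save the listing to a variable or file, the attempt id of a failed session can also be extracted with a short script. The following is a minimal sketch, assuming the output keeps the layout shown above. The variable names are illustrative, and the sample text stands in for real command output.

```shell
# Sample text copied from the session listing above (some fields omitted).
sessions_output=$(cat <<'EOF'
Session attempts:
  attempt id: 100
  uuid: ef704e1f-3eb5-4ba7-9be0-4ebfaeee4424
  project: nasdaq_analysis
  workflow: nasdaq_analysis
  session time: 2016-05-11 07:38:15 +0000
  kill requested: false
  status: error
EOF
)

# Remember the most recent "attempt id" value and print it
# whenever the status line that follows it reads "error".
failed_attempt_ids=$(printf '%s\n' "$sessions_output" | awk '
  $1 == "attempt" && $2 == "id:" { id = $3 }
  $1 == "status:" && $2 == "error" { print id }
')

echo "Failed attempt ids: $failed_attempt_ids"
```

With several attempts in the listing, the script prints one id per failed attempt, one per line.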

...

In the task list, the last task, `+nasdaq_analysis+task2`, shows `state: error`, which means it is the task that failed.

Review logs of the failed task

The command to get the logs for a particular task is as follows:

...

  • You can also use the job id to review error logs in TD Console.

Fix the query

Fix the query and rerun the workflow.

```shell
$ cat > queries/monthly_open.sql <<EOF
SELECT TD_DATE_TRUNC('month', time),
       AVG(daily_avg_open) AS monthly_avg_open,
       AVG(daily_avg_close) AS month_avg_close
FROM daily_open
GROUP BY 1
EOF
```
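Before pushing, it can help to confirm that the file contains the query you intended. The sketch below writes the same query into a scratch directory (a hypothetical location, used so that no real project is touched) and checks the result. It quotes the here-document delimiter (`'EOF'`), which stops the shell from expanding anything inside the query. That makes no difference for this query, but it is a safer default if a query ever contains `$` characters.

```shell
# Hypothetical scratch directory so the sketch does not modify a real project.
project_dir=$(mktemp -d)
mkdir -p "$project_dir/queries"
query_file="$project_dir/queries/monthly_open.sql"

# A quoted delimiter ('EOF') writes the body verbatim, with no shell expansion.
cat > "$query_file" <<'EOF'
SELECT TD_DATE_TRUNC('month', time),
       AVG(daily_avg_open) AS monthly_avg_open,
       AVG(daily_avg_close) AS month_avg_close
FROM daily_open
GROUP BY 1
EOF

# Count the lines written and confirm the expected clauses are present.
line_count=$(wc -l < "$query_file" | tr -d ' ')
if grep -q 'FROM daily_open' "$query_file" && grep -q 'GROUP BY 1' "$query_file"; then
  query_check="ok"
else
  query_check="unexpected contents"
fi

echo "Wrote $line_count lines to $query_file ($query_check)"
```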

Push the fix to Treasure Data

```shell
$ td wf push nasdaq_analysis
```

Retry the workflow session

Rerun the workflow.

```shell
$ td wf retry <attempt_id> --name fix-typo --latest-revision --all
```

...

The new attempt has the same session time as the failed attempt. This is the benefit of using `retry` rather than `start` here. It matters most for a daily scheduled workflow, where you want to rerun only the current day’s session and have any time-related parameters embedded in the workflow resolve to the same values.

Alternatively, you can use `--resume` to rerun only the failed task and all subsequent tasks.
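The two retry variants differ only in their final flag. The helper below is a hypothetical sketch, not part of the `td` toolbelt. It prints the retry command for a given attempt id without running it, so you can review the command first. It assumes that `--resume` takes the place of `--all` in the command shown earlier and that the other flags stay the same.

```shell
# Hypothetical helper: print (do not run) the retry command for an attempt.
#   $1: attempt id
#   $2: "all" to rerun every task, "resume" to rerun from the failed task onward
build_retry_command() {
  retry_attempt_id=$1
  retry_mode=$2
  case "$retry_mode" in
    all|resume) ;;
    *) echo "mode must be 'all' or 'resume'" >&2; return 1 ;;
  esac
  echo "td wf retry $retry_attempt_id --name fix-typo --latest-revision --$retry_mode"
}

build_retry_command 100 all
build_retry_command 100 resume
```

Printing the command instead of running it keeps the sketch safe to try. Copy the printed line into your terminal once it looks right.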