# Chapter 13: Workflows

## The Problem

You've built segments, journeys, and queries. But someone still needs to run the daily data refresh. Someone needs to execute the weekly report. Someone needs to coordinate the data pipeline that feeds your segments.

That someone shouldn't be you clicking buttons every morning.

## The Key Idea

**Core concept:** Workflows automate recurring tasks. AI writes workflow definitions that run on schedule without human intervention.

Think of workflows as the invisible machinery that keeps your marketing operations running—data flowing, tables updating, reports generating.

## What Workflows Do

Common marketing workflows:

| Task | Frequency | What Happens |
| --- | --- | --- |
| Update customer metrics | Daily | Run SQL to calculate latest values |
| Prepare segment data | Daily | Transform raw data into segment-ready tables |
| Generate reports | Weekly | Calculate metrics, create summary tables |
| Archive old data | Monthly | Move historical records |

## The Workflow Pattern

Just like segments and journeys, workflows follow pull → edit → push:

```mermaid
flowchart LR
    A[Pull] --> B[Edit]
    B --> C[Push]
    C --> D[Run]
```

```bash
# Pull existing workflow project
tdx wf pull myproject

# Edit locally, then push changes
tdx wf push

# Run the workflow manually
tdx wf run myproject
```

## Creating Your First Workflow

Tell AI what you need automated:

```
> "Create a workflow that updates customer metrics every morning at 6am"
```

AI generates a workflow file (`workflows/daily-metrics/daily-metrics.dig`):

```yaml
timezone: America/New_York

schedule:
  daily>: 06:00:00

_export:
  td:
    database: marketing
    engine: presto

+update_customer_metrics:
  td>: queries/update_customer_metrics.sql
  create_table: customer_metrics
```

## Understanding the Workflow

```yaml
timezone: America/New_York
```

All times are interpreted in this timezone.

```yaml
schedule:
  daily>: 06:00:00
```

Run every day at 6:00 AM.

```yaml
_export:
  td:
    database: marketing
    engine: presto
```

Default settings for all `td>` tasks in this workflow.

```yaml
+update_customer_metrics:
  td>: queries/update_customer_metrics.sql
  create_table: customer_metrics
```

A task that runs a SQL file and creates or replaces a table with the results.

## The td> Operator

The `td>` operator runs SQL queries. It's the core of TD workflows:

```yaml
# Run SQL from a file
+from_file:
  td>: queries/analysis.sql
  create_table: results

# Run inline SQL
+inline_query:
  td>:
    query: |
      SELECT
        customer_id,
        SUM(amount) as total_spent
      FROM orders
      WHERE TD_TIME_RANGE(time, '${session_date}', NULL)
      GROUP BY 1
  insert_into: daily_totals
```

**Output options:**

- `create_table:` — Replace a table with the results
- `insert_into:` — Append results to an existing table
- `store_last_results: true` — Store results for conditional logic

## Task Sequences

Tasks run in order. Later tasks wait for earlier ones:

```yaml
+step_1_extract:
  td>: queries/extract_orders.sql
  create_table: orders_staging

+step_2_transform:
  td>: queries/calculate_metrics.sql
  create_table: customer_metrics

+step_3_aggregate:
  td>: queries/build_summary.sql
  create_table: daily_summary
```

Step 2 waits for step 1. Step 3 waits for step 2.
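To see why that ordering matters, here is a minimal sketch of what `queries/calculate_metrics.sql` might contain (the column names and aggregations are illustrative assumptions, not taken from the actual project). Step 2 reads `orders_staging`, the table step 1 creates, so it cannot start until step 1 finishes:

```sql
-- queries/calculate_metrics.sql (illustrative sketch)
-- Reads orders_staging, which +step_1_extract created, and produces
-- the per-customer metrics that +step_2_transform writes to customer_metrics.
SELECT
  customer_id,
  COUNT(*) AS order_count,
  SUM(amount) AS total_spent,
  MAX(order_date) AS last_order_date
FROM orders_staging
GROUP BY 1
```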
## Parallel Tasks

Run independent tasks simultaneously:

```yaml
+calculate_metrics:
  _parallel: true

  +revenue_metrics:
    td>: queries/revenue_metrics.sql
    create_table: revenue_metrics

  +engagement_metrics:
    td>: queries/engagement_metrics.sql
    create_table: engagement_metrics

  +product_metrics:
    td>: queries/product_metrics.sql
    create_table: product_metrics

+combine_results:
  td>: queries/combine_metrics.sql
  create_table: all_metrics
```

All three metrics queries run at the same time, then `combine_results` runs after them.

## Scheduling Options

**Daily:**

```yaml
schedule:
  daily>: 06:00:00
```

**Hourly:**

```yaml
schedule:
  hourly>: 00:00
```

**Weekly:**

```yaml
schedule:
  weekly>: Mon,06:00:00
```

**Multiple times per day:**

```yaml
schedule:
  cron>: 0 6,12,18 * * *
```

**Specific days:**

```yaml
schedule:
  cron>: 0 9 1,15 * *   # 9 AM on the 1st and 15th of each month
```

## Session Variables

Built-in variables for dates:

```yaml
+daily_report:
  td>:
    query: |
      SELECT * FROM events
      WHERE TD_TIME_RANGE(time, '${session_date}',
                          TD_TIME_ADD('${session_date}', '1d'))
  create_table: daily_events_${session_date_compact}
```

| Variable | Example Value | Use Case |
| --- | --- | --- |
| `${session_date}` | 2024-12-15 | Date filtering |
| `${session_time}` | 2024-12-15T06:00:00+00:00 | Exact timestamp |
| `${session_date_compact}` | 20241215 | Table naming |
| `${last_session_date}` | 2024-12-14 | Previous run date |

## Variables and Parameters

Make workflows reusable:

```yaml
_export:
  td:
    database: marketing
  target_days: 30

+recent_customers:
  td>:
    query: |
      SELECT * FROM customers
      WHERE last_purchase_date >= DATE_ADD('day', -${target_days}, CURRENT_DATE)
  create_table: recent_customers
```

Run with different parameters:

```bash
tdx wf run daily-metrics --param target_days=90
```

## Error Handling

What happens when something fails?

```yaml
+critical_update:
  td>: queries/update_metrics.sql
  create_table: customer_metrics
  _retry: 3

_error:
  +log_failure:
    td>:
      query: |
        INSERT INTO workflow_errors
        VALUES ('${session_id}', 'critical_update failed', CURRENT_TIMESTAMP)
```

If `critical_update` fails after 3 retries, the `_error` block runs.

## Retry Logic

Automatically retry transient failures:

```yaml
+api_dependent_query:
  _retry:
    limit: 3
    interval: 60
  td>: queries/external_data.sql
  create_table: external_metrics
```

Try up to 3 times, waiting 60 seconds between attempts.
**Exponential backoff:**

```yaml
+flaky_task:
  _retry:
    limit: 5
    interval: 30
    interval_type: exponential
  td>: queries/large_query.sql
```

## Conditional Execution

Run tasks based on query results:

```yaml
+check_data:
  td>:
    query: |
      SELECT COUNT(*) as cnt FROM new_orders
      WHERE date = '${session_date}'
  store_last_results: true

+process_if_data:
  if>: ${td.last_results.cnt > 0}
  _do:
    +process_orders:
      td>: queries/process_orders.sql
      create_table: processed_orders
```

## A Complete Marketing Workflow

Here's a realistic daily workflow:

```yaml
timezone: America/New_York

schedule:
  daily>: 05:00:00

_export:
  td:
    database: marketing
    engine: presto

# Step 1: Update base metrics
+update_metrics:
  td>: queries/update_customer_metrics.sql
  create_table: customer_metrics

# Step 2: Prepare segment data (parallel)
+prepare_segments:
  _parallel: true

  +high_value_data:
    td>: queries/high_value_customers.sql
    create_table: segment_high_value

  +recent_buyers_data:
    td>: queries/recent_buyers.sql
    create_table: segment_recent_buyers

  +at_risk_data:
    td>: queries/at_risk_customers.sql
    create_table: segment_at_risk

# Step 3: Build summary report
+daily_summary:
  td>:
    query: |
      SELECT
        '${session_date}' as report_date,
        (SELECT COUNT(*) FROM segment_high_value) as high_value_count,
        (SELECT COUNT(*) FROM segment_recent_buyers) as recent_buyers_count,
        (SELECT COUNT(*) FROM segment_at_risk) as at_risk_count,
        CURRENT_TIMESTAMP as generated_at
  insert_into: daily_reports

# Global error handler
_error:
  +log_error:
    td>:
      query: |
        INSERT INTO workflow_errors
        VALUES ('daily-marketing', '${session_id}', '${task_name}', CURRENT_TIMESTAMP)
```

## Managing Workflows

**List workflow projects:**

```bash
tdx wf projects
```

**Pull a project to edit locally:**

```bash
tdx wf pull daily-metrics
# Creates: workflows/daily-metrics/
```

**Push changes back:**

```bash
tdx wf push daily-metrics
```

**View sessions and task status:**

```bash
tdx wf sessions daily-metrics
tdx wf attempt tasks
```

**View logs for a specific task:**

```bash
tdx wf attempt logs +update_metrics
```

**Run manually:**

```bash
tdx wf run daily-metrics
```

## Mental Model: Assembly Line

Think of workflows as an assembly line:

```
[Raw Data] → [Extract] → [Transform] → [Load] → [Ready for Segments]
```

Each station does one job. The line runs automatically on schedule. If a station breaks, the line stops and logs the error.

## Pitfalls

**"The workflow didn't run."** Check the schedule and timezone:

```
> "When is daily-metrics scheduled to run next?"
```

**"A task is taking too long."** Large queries might need optimization. Check execution time:

```bash
tdx wf attempt tasks
```

**"The query failed with a timeout."** Add retry logic and consider breaking the work into smaller queries:

```yaml
+large_query:
  _retry: 2
  td>: queries/optimized_query.sql
```

## What You've Learned

- Workflows automate recurring SQL tasks
- The `td>` operator runs queries and creates tables
- Tasks run sequentially by default, or in parallel with `_parallel: true`
- Schedules can be daily, hourly, weekly, or custom cron
- Session variables like `${session_date}` make queries dynamic
- Error handling with `_error` blocks prevents silent failures
- Retry logic handles transient issues

## Next Step

Your data pipelines run automatically. [Chapter 14](/treasure-code/book/14-agents) shows you how to build AI agents—giving your team members AI-powered assistants that can answer questions about customer data without needing Claude Code.

*Your workflows are automated. Next, you'll build AI agents for your team.*