# Agent Commands

Manage LLM agents with YAML/Markdown configuration files.

```bash
tdx agent   # Full command
tdx agents  # Shorthand alias for `tdx agent list`
```

## Overview

The `tdx agent` commands provide complete management of LLM agents:

| Command | Description |
| --- | --- |
| [`list`](#list) | List agents in current project |
| [`show`](#show) | Show agent details |
| [`pull`](#pull) | Export agents from an LLM project to local YAML/Markdown files |
| [`push`](#push) | Import local agent files to an LLM project |
| [`clone`](#clone) | Clone a project to a new project |
| [`test`](#test) | Run automated tests against an agent |
| [`create`](#create) | Create a new agent (prefer `pull`/`push` workflow) |
| [`update`](#update) | Update an existing agent (prefer `pull`/`push` workflow) |
| [`delete`](#delete) | Delete an agent |

## Typical Usage

```bash
# 1. Create a new LLM project (or use an existing one)
tdx llm project create "My LLM Project" --description "Data analysis agents"

# 2. Pull project to local files (auto-sets context)
tdx agent pull "My LLM Project"
# Creates: agents/my-llm-project/

# 3. Edit YAML/Markdown files locally
#    - Edit agents/my-llm-project/my-agent/prompt.md for system prompt
#    - Edit agents/my-llm-project/my-agent/agent.yml for configuration

# 4. Preview changes before pushing
tdx agent push --dry-run

# 5. Push changes to the project
tdx agent push

# 6. Add tests and run them
#    - Create agents/my-llm-project/my-agent/test.yml
tdx agent test ./agents/my-llm-project/my-agent/

# 7. Clone to another environment
tdx agent clone ./agents/my-llm-project/ --name "My Project (Staging)" --profile staging
```

**Recommended Workflow**

Use the **pull/push workflow** for agent development:

- Edit agents as YAML/Markdown files in your favorite editor
- Version control with Git for history and collaboration
- Use `--dry-run` to preview changes before pushing
- Add `test.yml` to validate agent behavior with automated tests

The `create`/`update` commands are available for quick one-off changes but the file-based workflow is recommended for maintainability.

## Folder Structure

When you pull agents from a project, the following folder structure is created:

```
agents/
└── {project-name}/              # Normalized project name (kebab-case)
    ├── tdx.json                 # Project configuration
    ├── {agent-name}/            # Normalized agent name
    │   ├── prompt.md            # System prompt (editable markdown)
    │   ├── agent.yml            # Agent configuration
    │   ├── starter_message.md   # Optional: multiline starter message
    │   └── test.yml             # Optional: automated test definitions
    ├── knowledge_bases/         # Knowledge base definitions
    │   ├── {kb-name}.yml        # Table-based KB (queries TD database)
    │   └── {kb-name}.md         # Text-based KB (plain text content)
    ├── prompts/                 # Prompt definitions
    │   └── {prompt-name}.yml
    └── integrations/            # Chat integration definitions
        └── {service-type}.yml   # e.g., chat_generic.yml
```

**Name Collisions**

If multiple agents have names that normalize to the same folder name (e.g., "My Agent" and "my-agent" both become `my-agent`), the pull operation appends numeric suffixes to avoid conflicts: `my-agent`, `my-agent-2`, `my-agent-3`, etc. The actual agent name is preserved in the `name:` field of `agent.yml`.

**Integration Support**

**Safe integrations** (Generic Chat, Agent Console, Parent Segment) are included in pull/push/clone operations. These are stored in `integrations/{service-type}.yml`.
**Sensitive integrations** (Slack, Webhook) are **not** synced because they contain secrets (Slack signing secrets, webhook credentials) that should not be version controlled. Configure these manually in the TD console for each environment.

## File Formats

### tdx.json

Project configuration file located in the project root folder.

```json
{
  "llm_project": "My LLM Project"
}
```

### agent.yml

Agent configuration file with model settings, tools, outputs, and variables.

```yaml
name: My Support Agent
description: Customer support assistant

# Model configuration
model: claude-4.5-sonnet
temperature: 0.7
max_tool_iterations: 5
reasoning_effort: medium  # none, minimal, low, medium, high

# Starter message (short inline, or use starter_message.md for long text)
starter_message: Hello! How can I help you today?

# Output definitions
outputs:
  - name: resolution_status
    function_name: get_resolution_status
    function_description: Returns the status of issue resolution
    json_schema: |
      {"type": "object", "properties": {"status": {"type": "string"}}}

# Tools - give the agent access to knowledge bases, other agents, etc.
tools:
  - type: knowledge_base
    target: '@ref(type: "knowledge_base", name: "support-kb")'
    target_function: SEARCH
    function_name: search_knowledge
    function_description: Search the support knowledge base

  - type: agent
    target: '@ref(type: "agent", name: "escalation-agent")'
    target_function: CHAT
    function_name: escalate_issue
    function_description: Escalate to senior support

# Variables (runtime inputs from knowledge base)
variables:
  - name: customer_context
    target_knowledge_base: '@ref(type: "knowledge_base", name: "customer-kb")'
    target_function: LOOKUP
    function_arguments: |
      {"query": "{{customer_id}}"}
```

#### Tools Configuration

Tools give agents access to external resources like knowledge bases, other agents, web search, image generation, and parent segment data.
Each tool is defined by these fields:

| Field | Description |
| --- | --- |
| `type` | Tool type: `knowledge_base`, `agent`, `web_search`, `image_gen`, `parent_segment_kb` |
| `target` | Reference to the resource using `@ref(...)` syntax. Not needed for `parent_segment_kb` (singleton). |
| `target_function` | Function to call: `SEARCH`, `LOOKUP`, `CHAT`, `TEXT_TO_IMAGE`, etc. |
| `function_name` | Name exposed to the LLM (what the agent calls) |
| `function_description` | Description shown to the LLM |
| `output_mode` | Optional for agent tools: `RETURN` (default, returns complete response) or `SHOW` (shows response to user) |

**Knowledge Base Tool** - Search or look up data from a knowledge base.

The `knowledge_base` type works with both table-based and text-based knowledge bases. The system automatically resolves the correct type based on the name:

```yaml
tools:
  # Table-based knowledge base (backed by TD table)
  - type: knowledge_base
    target: '@ref(type: "knowledge_base", name: "product-catalog")'
    target_function: SEARCH  # SEARCH, LOOKUP, LIST_COLUMNS
    function_name: search_products
    function_description: Search the product catalog for items

  # Text-based knowledge base (document/text content)
  - type: knowledge_base
    target: '@ref(type: "knowledge_base", name: "company-docs")'
    target_function: READ_TEXT  # READ_TEXT for text KBs
    function_name: read_docs
    function_description: Read company documentation
```

**Agent Tool** - Call another agent as a sub-agent:

```yaml
tools:
  - type: agent
    target: '@ref(type: "agent", name: "sql-expert")'
    target_function: CHAT
    function_name: run_sql_query
    function_description: Execute SQL queries using the SQL expert agent
    output_mode: RETURN  # RETURN (default) or SHOW
```

**Web Search Tool** - Search the web for real-time information:

```yaml
tools:
  - type: web_search
    target: '@ref(type: "web_search_tool", name: "web-search")'
    target_function: SEARCH
    function_name: search_web
    function_description: Search the web for current information
```

**Image Generation Tool** - Generate or modify images:

```yaml
tools:
  - type: image_gen
    target: '@ref(type: "image_generator", name: "image-gen")'
    target_function: TEXT_TO_IMAGE  # TEXT_TO_IMAGE, OUTPAINT, INPAINT, IMAGE_VARIATION, REMOVE_BACKGROUND
    function_name: generate_image
    function_description: Generate an image from a text description
```

**Parent Segment KB Tool** - Access CDP parent segment data (singleton per project, no `target` needed):

```yaml
tools:
  - type: parent_segment_kb
    target_function: LIST_SEGMENT_FOLDERS  # See below for all functions
    function_name: list_folders
    function_description: List segment folders in the project
```

Available `target_function` values for `parent_segment_kb`:

| Function | Description |
| --- | --- |
| `LIST_SEGMENT_FOLDERS` | List all segment folders |
| `LIST_BY_FOLDER` | List segments in a folder |
| `LIST_ATTRIBUTES` | List available attributes |
| `LIST_BEHAVIORS` | List available behaviors |
| `GET_SEGMENT` | Get segment details |
| `GET_JOURNEY` | Get journey details |
| `GET_AUDIENCE` | Get audience details |
| `GET_QUERY` | Get query details |
| `QUERY_DATA_DIRECT` | Query segment data directly |
| `QUERY_SEGMENT_ANALYTICS` | Query segment analytics |

**Singleton Tool**

Unlike other tools, `parent_segment_kb` is a singleton: each project has exactly one, so no `target` reference is needed. The tool automatically connects to the project's parent segment knowledge base.

#### Variables Configuration

Variables inject data from knowledge bases into the agent's context at runtime:

```yaml
variables:
  - name: user_profile  # Variable name used in prompt
    target_knowledge_base: '@ref(type: "knowledge_base", name: "users-kb")'
    target_function: LOOKUP  # LOOKUP for exact match, SEARCH for semantic
    # Template with runtime values
    function_arguments: |
      {"query": "{{user_id}}"}
```

Use variables in `prompt.md` with `{{variable_name}}` syntax.

### prompt.md

System prompt file containing the agent's instructions.
This is a plain markdown file that can be edited with any text editor.

```markdown
You are a helpful customer support agent for our e-commerce platform.

## Your Role

- Assist customers with order inquiries
- Provide product information
- Handle account issues

## Guidelines

Always be polite, professional, and empathetic.
If you cannot resolve an issue, escalate to a human agent.
```

### starter_message.md

Optional starter message file for multiline starter messages. If the starter message is short, you can include it directly in `agent.yml`.

### knowledge_bases/{name}.yml (Table-based)

Table-based knowledge base that queries a Treasure Data database.

```yaml
name: Support KB
type: database
database: customer_data
tables:
  - name: faq
    td_query: SELECT * FROM faq
    enable_data: true
    enable_data_index: true
```

### knowledge_bases/{name}.md (Text-based)

Text-based knowledge base containing plain text content. Uses YAML frontmatter for metadata.

```markdown
---
name: Product FAQ
---

# Frequently Asked Questions

## What is your return policy?

We offer a 30-day return policy for all unused items...

## How do I track my order?

You can track your order by logging into your account...
```

**Text KB Name**

If the `name` field is omitted from the frontmatter, the filename (without `.md`) is used as the knowledge base name.

Both table-based and text-based knowledge bases can be referenced using the same `@ref` syntax:

```yaml
tools:
  - type: knowledge_base
    target: '@ref(type: "knowledge_base", name: "Support KB")'  # Table-based
    # ...
  - type: knowledge_base
    target: '@ref(type: "knowledge_base", name: "Product FAQ")'  # Text-based
    # ...
```

### prompts/{name}.yml

Prompt template configuration.

```yaml
name: greeting-prompt
agent: '@ref(type: "agent", name: "support-agent")'
system_prompt: |
  Generate a personalized greeting...
template: |
  Customer Name: {{customer_name}}
json_schema_hint: |
  {"type": "object", "properties": {"customer_name": {"type": "string"}}}
```

### integrations/{service-type}.yml

Chat integration configuration for Generic Chat, Agent Console, or Parent Segment integrations.

```yaml
service_type: chat_generic
name: generic-chat-integration
chat_welcome_message: "Welcome! How can I help you today?"
chat_ignore_managed_actions: false
actions:
  - prompt: '@ref(type: "prompt", name: "support-prompt")'
    chat_widget_type: button
    chat_widget_label: Ask Support
    ui_tags:
      - "lang:en"
      - "lang:ja"
```

| Field | Description |
| --- | --- |
| `service_type` | Integration type: `chat_generic`, `chat_agent_console`, or `chat_parent_segment` |
| `name` | Display name for the integration |
| `chat_welcome_message` | Welcome message shown when chat starts |
| `chat_ignore_managed_actions` | Whether to ignore managed actions |
| `actions` | List of actions/buttons shown in the chat widget |

**Supported Integration Types**

Only **safe** integration types are synced:

- `chat_generic` - Generic chat widget
- `chat_agent_console` - Agent console integration
- `chat_parent_segment` - Parent segment integration

`webhook` and `slack` integrations contain secrets and must be configured manually in the TD console.

## Reference Syntax

Use `@ref(...)` to reference other resources by name:

```yaml
# Reference a knowledge base
target: '@ref(type: "knowledge_base", name: "my-kb")'

# Reference another agent
target: '@ref(type: "agent", name: "my-agent")'

# Reference a prompt
prompt: '@ref(type: "prompt", name: "my-prompt")'
```

This allows resources to be referenced by name instead of UUID, making configurations portable across environments.

## Commands

### list

List agents in the current project.
```bash
tdx agent list [pattern]
tdx agents [pattern]  # Shorthand alias
```

**Options:**

- `-w, --web`: Show console URLs for each agent

**Examples:**

```bash
# List all agents in current project (uses llm_project context)
tdx agent list
tdx agents  # Same as above

# Filter agents by pattern
tdx agent list "support-*"
tdx agents "support-*"  # Same as above

# List agents in a specific project
tdx agent list "my-project/support-*"

# Show with console URLs
tdx agent list -w
tdx agents -w  # Same as above
```

### show

Show detailed information about a specific agent.

```bash
tdx agent show
```

**Examples:**

```bash
# Show agent details
tdx agent show "Support Agent"

# Show agent in JSON format
tdx agent show "Support Agent" --format json
```

### create

Create a new agent in the current project.

**Prefer pull/push Workflow**

For complex agents with tools, knowledge bases, or long prompts, use the **pull/push workflow** instead:

1. Create a minimal agent with this command
2. Run `tdx agent pull` to export to YAML/Markdown
3. Edit files locally and `tdx agent push` to update

```bash
tdx agent create [options]
```

**Options:**

- `--system-prompt`: System prompt/instructions
- `--model`: Model type (default: `claude-4.5-sonnet`)
- `--starter-message`: Initial greeting message
- `--max-tool-iterations`: Maximum tool iterations (default: 4)
- `--temperature`: Temperature 0.0-2.0 (default: 0.7)

**Examples:**

```bash
# Create basic agent
tdx agent create "My Agent"

# Create with system prompt
tdx agent create "SQL Expert" \
  --system-prompt "You are an expert in SQL and data analysis."

# Create with all options
tdx agent create "Data Analyst" \
  --system-prompt "Help users analyze data." \
  --model "claude-4.5-sonnet" \
  --starter-message "Hello! I can help you analyze your data." \
  --max-tool-iterations 8 \
  --temperature 0.5

# Create agent in a specific project
tdx agent create "MyProject/My Agent"
```

### update

Update an existing agent.
**Prefer pull/push Workflow**

For significant changes, use the **pull/push workflow** instead:

```bash
tdx agent pull "My Project"  # Export to YAML/Markdown
# Edit files locally
tdx agent push               # Push changes
```

This provides better version control and allows editing complex prompts in your favorite editor.

```bash
tdx agent update [options]
```

**Options:**

- `--name`: New agent name
- `--prompt`: Agent prompt/instructions
- `--description`: Agent description
- `--starter-message`: Starter message

**Examples:**

```bash
# Update agent name
tdx agent update "Old Name" --name "New Name"

# Update system prompt
tdx agent update "My Agent" --prompt "Updated instructions..."

# Update multiple fields
tdx agent update "My Agent" \
  --description "Updated description" \
  --starter-message "New greeting!"
```

### delete

Delete an agent.

```bash
tdx agent delete
```

**Examples:**

```bash
# Delete an agent
tdx agent delete "My Agent"
```

### pull

Pull agents and resources from an LLM project to local files. Shows a YAML diff preview and asks for confirmation before writing files.
```bash
# Pull from current directory (uses tdx.json or context)
tdx agent pull

# Pull all resources from a project
tdx agent pull [options]

# Pull from an existing local directory
tdx agent pull [options]

# Pull a specific agent
tdx agent pull [options]
```

**Options:**

- `-o, --output`: Output directory (default: `agents/{project-name}/`)
- `--dry-run`: Preview changes without writing files
- `-f, --force`: Overwrite local changes without confirmation
- `-y, --yes`: Skip confirmation prompts

**Examples:**

```bash
# Pull from current directory (if inside a project folder with tdx.json)
tdx agent pull

# Pull entire project
tdx agent pull "My LLM Project"

# Pull to specific directory
tdx agent pull "My LLM Project" -o ./my-agents

# Pull from existing local directory
tdx agent pull ./agents/my-project/

# Pull specific agent
tdx agent pull "My LLM Project" "Support Agent"

# Preview what would be pulled (no files written)
tdx agent pull "My LLM Project" --dry-run

# Pull without confirmation prompt
tdx agent pull "My LLM Project" --yes
```

**Output:**

```
Pull summary for 'My LLM Project':
  + 4 new | ~ 3 changed | = 13 unchanged
  Agents: 1 new, 2 updated, 5 unchanged
  Knowledge Bases: 1 new, 0 updated, 2 unchanged
  Text Knowledge Bases: 1 new, 0 updated, 1 unchanged
  Prompts: 0 new, 1 updated, 3 unchanged
  Integrations: 1 new, 0 updated, 1 unchanged
  Target: agents/my-llm-project/

Changes to agent 'Support Agent':
────────────────────────────────────────────────────────────────
- temperature: 0.7
+ temperature: 0.5
────────────────────────────────────────────────────────────────

Write 7 files? [y/N]
```

### push

Push local agent files to an LLM project. Shows a YAML diff preview comparing local vs remote, asks for confirmation, and displays the console URL after push.
```bash
# Push all resources from current directory
tdx agent push [path] [options]

# Push from specific directory
tdx agent push ./agents/my-project/ [options]

# Push specific agent
tdx agent push ./agents/my-project/support-agent/ [options]
```

**Options:**

- `--dry-run`: Preview changes without pushing
- `-f, --force`: Push without confirmation
- `-y, --yes`: Skip confirmation prompts

**Examples:**

```bash
# Push from current directory
tdx agent push

# Push from specific directory
tdx agent push ./agents/my-project/

# Push specific agent
tdx agent push ./agents/my-project/support-agent/

# Preview push changes (no changes made)
tdx agent push --dry-run

# Push without confirmation prompt
tdx agent push --yes
```

**Output:**

```
Push summary for 'My LLM Project':
  + 2 new | ~ 3 changed | = 15 unchanged
  Agents: 0 new, 2 updated, 8 unchanged
  Knowledge Bases: 1 new, 0 updated, 2 unchanged
  Text Knowledge Bases: 1 new, 0 updated, 1 unchanged
  Prompts: 0 new, 0 updated, 2 unchanged
  Integrations: 0 new, 1 updated, 1 unchanged
  Source: agents/my-llm-project/

Changes to agent 'Support Agent':
────────────────────────────────────────────────────────────────
- temperature: 0.5
+ temperature: 0.7
────────────────────────────────────────────────────────────────

Push 5 resources? [y/N]

✔ Pushed 5 resources to 'My LLM Project'
  Project: https://console-next.us01.treasuredata.com/app/af/12345/ag
```

When pushing a single agent, the chat URL is shown:

```
✔ Agent updated successfully
  Agent: Support Agent
  Chat: https://console-next.us01.treasuredata.com/app/af/12345/ag/67890/tc
```

### clone

Clone an LLM project to create a new project with all agents, knowledge bases, and prompts.
```bash
# Clone from remote project
tdx agent clone --name

# Clone from local directory
tdx agent clone --name

# Clone from context (llm_project)
tdx agent clone --name
```

**Options:**

- `-n, --name`: Name for the new project (required)
- `--dry-run`: Preview what would be cloned without making changes
- `-y, --yes`: Skip confirmation prompts
- `--profile`: Target profile for the new project (for cross-profile cloning)

**Examples:**

```bash
# Clone a project to a new name
tdx agent clone "Production Agents" --name "Staging Agents"

# Clone from local files (useful for cross-profile deployment)
tdx agent clone ./agents/my-project/ --name "New Project" --profile production

# Preview what would be cloned
tdx agent clone "My Project" --name "My Project Copy" --dry-run

# Clone without confirmation
tdx agent clone "My Project" --name "My Project Copy" --yes
```

**Output:**

```
Clone "Production Agents" to new project "Staging Agents"? [y/N] y

✔ Project cloned successfully
  Source: Production Agents
  New project: Staging Agents
  New project ID: 12345

Summary:
  Agents: 5 created
  Knowledge Bases: 2 created
  Text Knowledge Bases: 1 created
  Prompts: 3 created
  Integrations: 1 created

Context set: llm_project = Staging Agents
Project: https://console-next.us01.treasuredata.com/app/af/12345/ag
```

**Cross-Profile Cloning**

To clone a project to a different profile (e.g., from default to production):

1. First pull the project locally: `tdx agent pull "MyProject"`
2. Then clone from local files: `tdx agent clone ./agents/my-project/ --name "MyProject-Prod" --profile production`

**Integration Support in Clone**

**Safe integrations** (Generic Chat, Agent Console, Parent Segment) are included in clone operations.

**Sensitive integrations** (Slack, Webhook) are **not** cloned because they contain secrets. Configure these manually in the TD console for each environment.

### test

Run automated tests against an agent using YAML test definitions.
Tests are evaluated by a judge agent (Claude Sonnet 4.5) for binary pass/fail results.

```bash
# Run tests from current agent directory
tdx agent test

# Run tests from a specific path
tdx agent test

# Run tests from test.yml file
tdx agent test ./agents/my-project/my-agent/test.yml
```

**Options:**

- `-n, --name`: Filter to specific test(s) by name (can be repeated)
- `--tags`: Filter to tests with specific tags (comma-separated)
- `--dry-run`: Parse and validate test definitions without running
- `--no-eval`: Run conversations without evaluation (useful for debugging)
- `--reeval`: Re-evaluate last test run with updated criteria (skip conversation generation)

**Examples:**

```bash
# Run all tests in current agent directory
tdx agent test

# Run tests from specific agent
tdx agent test ./agents/my-project/my-agent/

# Run only specific tests
tdx agent test --name "greeting_test" --name "context_test"

# Run tests with specific tags
tdx agent test --tags "smoke"
tdx agent test --tags "smoke,regression"

# Validate test file without running
tdx agent test --dry-run

# Run without evaluation (just execute conversations)
tdx agent test --no-eval

# Re-evaluate last test run with updated criteria
tdx agent test --reeval

# Re-evaluate specific tests only
tdx agent test --reeval --name "greeting_test"
```

**Output:**

```
Running tests for my-agent...
Setting up evaluator agent...

Test 1/3: greeting_test
  Round 1: Hello → ✓ PASS

Test 2/3: calculation_test
  Round 1: What is 2+2? → ✓ PASS

Test 3/3: context_test
  Round 1: My name is Alice → ✓ PASS
  Round 2: What's my name? → ✓ PASS

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Results: 3 passed, 0 failed
Duration: 12.3s
```

### Test File Format

Create a `test.yml` file in your agent directory alongside `agent.yml`:

```
agents/
└── my-project/
    └── my-agent/
        ├── agent.yml
        ├── prompt.md
        └── test.yml   # Test definitions
```

#### Flat Format (Single-Round Tests)

For simple tests with a single user input and criteria:

```yaml
tests:
  - name: greeting_test
    tags: [smoke, core]
    user_input: Hello
    criteria: Should respond with a friendly greeting

  - name: calculation_test
    tags: [regression]
    user_input: What is 2 + 2?
    criteria: Should respond with the correct answer (4)

  - name: help_request
    user_input: How do I reset my password?
    criteria: Should provide clear password reset instructions
```

The `tags` field is optional and can be used to categorize tests for filtering with `--tags`.

#### Rounds Format (Multi-Round Tests)

For tests requiring multiple conversation turns:

```yaml
tests:
  - name: context_memory_test
    tags: [memory, core]
    rounds:
      - user_input: My name is Alice
        criteria: Should acknowledge the name
      - user_input: What's my name?
        criteria: Should remember and respond with "Alice"

  - name: multi_step_task
    tags: [workflow, regression]
    rounds:
      - user_input: I want to analyze sales data
        criteria: Should ask clarifying questions about the data
      - user_input: It's in the sales_2024 table
        criteria: Should acknowledge and proceed with analysis
      - user_input: Show me top products
        criteria: Should provide a list of top-selling products
```

#### Mixed Format

You can combine both formats in the same file:

```yaml
tests:
  # Simple single-round tests
  - name: basic_greeting
    user_input: Hi there!
    criteria: Should greet back politely

  - name: simple_question
    user_input: What time is it?
    criteria: Should explain it cannot tell time or ask for context

  # Complex multi-round test
  - name: data_analysis_workflow
    rounds:
      - user_input: I need help with customer segmentation
        criteria: Should ask about the data source and segmentation goals
      - user_input: Use the customers table, segment by purchase frequency
        criteria: Should propose a segmentation approach
      - user_input: Looks good, proceed
        criteria: Should execute the segmentation and show results
```

### Writing Good Criteria

Criteria should be clear and specific about what constitutes a pass:

```yaml
# ✓ Good - specific and measurable
criteria: Should respond with the number 4

# ✓ Good - describes expected behavior
criteria: Should ask for the customer's email address before proceeding

# ✓ Good - includes negative constraints
criteria: Should provide help without mentioning competitor products

# ✗ Bad - too vague
criteria: Should give a good response

# ✗ Bad - subjective
criteria: Should be helpful and friendly
```

### How Evaluation Works

1. **Test Execution**: Each round sends the `user_input` to your agent and waits for a response
2. **History Retrieval**: After all rounds complete, the full conversation history is fetched from the API
3. **Evaluation**: For each round, the judge agent (Claude Sonnet 4.5) evaluates whether the agent's response meets the `criteria`
4. **Results**: Each round gets a PASS or FAIL result with a reason

The evaluator agent is automatically created in your default project (`tdx_default_`) the first time you run tests.

### Re-evaluating Tests

When iterating on evaluation criteria, you can skip conversation generation and re-evaluate the last test run:

1. **Initial Run**: Execute tests normally to generate conversations

   ```bash
   tdx agent test
   ```

2. **Edit Criteria**: Modify `test.yml` to refine your criteria

3. **Re-evaluate**: Run with `--reeval` to test new criteria against cached conversations

   ```bash
   tdx agent test --reeval
   ```

The cached conversations are stored in `.cache/tdx/last_agent_test_run.json` relative to your working directory. The cache includes:

- Chat IDs for conversation URLs
- Agent responses for each round
- Original test metadata

**Handling new tests:** If you add new tests to `test.yml` that weren't in the cached run, they will be executed normally (generating new conversations) while existing tests use the cache.

**Criteria Development Workflow**

Use `--reeval` to rapidly iterate on criteria without waiting for new conversations. Once criteria are stable, run a fresh test to verify end-to-end behavior.

## Workflow

### Initial Setup

1. Pull a project to create local files:

   ```bash
   tdx agent pull "My LLM Project"
   ```

2. This creates the folder structure with all agents and resources

### Making Changes

1. Edit the YAML/Markdown files with your favorite editor

2. Preview changes:

   ```bash
   tdx agent push --dry-run
   ```

3. Push changes:

   ```bash
   tdx agent push
   ```

### Testing Agents

1. Create a `test.yml` file in your agent directory:

   ```yaml
   tests:
     - name: basic_functionality
       user_input: Hello, can you help me?
       criteria: Should respond with a helpful greeting
   ```

2. Run tests:

   ```bash
   tdx agent test
   ```

3. Review results and iterate on your agent's prompt until tests pass

### Version Control

The YAML/Markdown format is designed for version control:

- Human-readable diffs
- Easy code reviews
- Branch and merge workflows
- Git history for audit trails
- Test definitions versioned alongside agent code

## Comparison with Legacy Commands

| Feature | `agent pull/push/clone` | `llm project backup/restore` (deprecated) |
| --- | --- | --- |
| Format | YAML/Markdown | JSON |
| Human-editable | Yes | No |
| Git-friendly | Yes | No |
| Selective sync | Yes (single agent) | No (full project) |
| Cross-profile clone | Yes | No |
| Safe integrations | Included (Generic Chat, etc.) | Included |
| Sensitive integrations | Not included (Slack, Webhook) | Included |
| Use case | Development workflow | Legacy disaster recovery |

**Deprecated Commands**

The `tdx llm project backup/restore` commands are deprecated. Use:

- `tdx agent pull` instead of `backup`
- `tdx agent push` instead of `restore`
- `tdx agent clone` for copying projects
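The preview, push, and test sequence recommended on this page can be collected into a small shell function. The sketch below is one possible wrapper and is not part of `tdx`: the function name `deploy_agent` and its two arguments are hypothetical, and it assumes each `tdx` command exits with a non-zero status on failure, which this page does not state. It uses only the commands and flags documented above.

```bash
# Hypothetical helper (not part of tdx): preview, push, then test one agent.
# Assumption: tdx exits non-zero when a step fails, so && stops the chain.
deploy_agent() {
  local project_dir="$1"  # e.g. ./agents/my-llm-project/
  local agent_dir="$2"    # e.g. ./agents/my-llm-project/my-agent/

  # 1. Preview the diff without changing the remote project.
  # 2. Push for real, skipping the interactive confirmation.
  # 3. Run the agent's test.yml against the pushed agent.
  tdx agent push "$project_dir" --dry-run &&
    tdx agent push "$project_dir" --yes &&
    tdx agent test "$agent_dir"
}
```

It could be called as `deploy_agent ./agents/my-llm-project/ ./agents/my-llm-project/my-agent/`. Testing runs after the push because tests exercise the agent in the remote project rather than the local files.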