You can embed dynamic values from agents, test cases, and simulations into your metric prompts and test scenarios using template variables. The system automatically replaces these placeholders with actual values during evaluation.

Supported Sources

The template system supports these sources:
  • {{agent.*}} - References agent attributes
  • {{test_case.*}} - References test case attributes
  • {{test_case.expected_behaviors}} - References test case criteria (used by Composite Evaluation)
  • {{customer_metadata.*}} - References metadata you provide at conversation upload time (see Customer Metadata below)

Basic Usage

The simplest form is accessing a top-level attribute:
{{agent.attribute_name}}
{{test_case.attribute_name}}
{{test_case.expected_behaviors}} // For criteria in your test case (used by Composite Evaluation)
Example: Imagine one agent has the attribute location with value “San Francisco”, and another agent has value “London”.
Scenario: You are a user calling for travel recommendations in {{agent.location}}
Criterion: The agent should only give travel recommendations in {{agent.location}}

Customer Metadata (Upload-Time Values)

When you submit a production call for conversation monitoring, you can attach arbitrary metadata to that specific conversation. Those values are stored with the conversation and can be referenced in any LLM judge metric prompt using {{customer_metadata.<key>}}. The value is substituted per conversation at evaluation time, so the same metric can be judged against a different ground-truth value for each uploaded call.
This is the recommended way to evaluate uploaded conversations against facts that are only known at upload time: order totals, prices, account balances, confirmation numbers, expected outcomes, and similar per-call ground truth.
  1. Provide the value when you upload the conversation (POST /v1/conversations:submit):
{
  "transcript": [
    { "role": "user", "content": "How much is a gallon of whole milk today?" },
    { "role": "assistant", "content": "A gallon of whole milk is $2.50 today." }
  ],
  "metadata": { "milk_price": "$2.50" },
  "metrics": ["<your-metric-id>"]
}
You can also add metadata key/value pairs in the Upload to Monitoring dialog in the app, or with coval conversations submit --metadata milk_price='$2.50'.
  2. Reference the value in your metric prompt:
The verified price of a gallon of whole milk for this conversation is
{{customer_metadata.milk_price}}. Answer YES only if the price the agent
quoted matches {{customer_metadata.milk_price}}; otherwise answer NO.
For the example above, the metric resolves to “…the verified price… is $2.50…” and returns YES. Uploading a different call with "metadata": { "milk_price": "$3.50" }, even with the identical transcript, resolves to “$3.50” and returns NO, because the agent quoted $2.50. The upload-time value, not the transcript alone, drives the judgment.
If a metric prompt references a {{customer_metadata.<key>}} that a conversation did not supply, that metric fails for that conversation with a clear error. Make sure every conversation you attach the metric to includes the key, or scope the metric to the conversations that carry it.
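The per-conversation substitution and the missing-key failure described above can be pictured with a small Python sketch. This is not the platform's implementation: render_metric_prompt and the PLACEHOLDER pattern are hypothetical names introduced only for illustration, and the sketch assumes keys are made of letters, digits, and underscores.

```python
import re

# Hypothetical pattern for illustration; the real resolver is not shown in this guide.
PLACEHOLDER = re.compile(r"\{\{customer_metadata\.([A-Za-z0-9_]+)\}\}")


def render_metric_prompt(prompt: str, metadata: dict) -> str:
    """Replace each {{customer_metadata.<key>}} with this conversation's value."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in metadata:
            # Mirrors the documented behavior: a missing key fails the metric.
            raise KeyError(f"customer_metadata key not supplied: {key}")
        return str(metadata[key])

    return PLACEHOLDER.sub(substitute, prompt)


prompt = "The verified price is {{customer_metadata.milk_price}}."

# Same prompt, two uploads with different ground-truth values.
print(render_metric_prompt(prompt, {"milk_price": "$2.50"}))
print(render_metric_prompt(prompt, {"milk_price": "$3.50"}))
```

Because the value comes from the upload rather than the prompt, one metric definition can serve every conversation that carries the key.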

Nested Paths

You can access nested attributes using dot notation:
{{agent.users.sam.flight_number}}     → agent.attributes["users"]["sam"]["flight_number"]
Example: If your agent has a nested structure:
{
  "users": {
    "sam": {
      "flight_number": "UA123",
      "email": "sam@example.com"
    }
  }
}
You can access it in your prompt:
Did the agent identify the flight as: {{agent.users.sam.flight_number}}.

Array Indexing

Access specific elements in arrays using bracket notation:
{{test_case.test[0]}}     → test_case.attributes["test"][0]
Example: If your test case has an array:
{
  "flight_options": ["United Airlines", "Delta", "American Airlines"]
}
You can reference specific flights:
The first flight option is {{test_case.flight_options[0]}}.
The assistant should mention {{test_case.flight_options[1]}} as an alternative.
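As a rough mental model, dot notation and bracket indexing can be thought of as walking a parsed JSON structure one segment at a time. The sketch below is an assumption about how such a lookup could work, not the system's actual code. lookup and SEGMENT are hypothetical names, and the path passed in is the part after the source prefix (agent. or test_case.).

```python
import re

# One path segment: a name optionally followed by [index] parts, e.g. flight_options[0]
SEGMENT = re.compile(r"^([A-Za-z0-9_]+)((?:\[\d+\])*)$")


def lookup(attributes: dict, path: str):
    """Walk a path such as 'users.sam.flight_number' or 'flight_options[1]'."""
    value = attributes
    for segment in path.split("."):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"malformed path segment: {segment!r}")
        value = value[match.group(1)]  # dictionary key lookup
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            value = value[int(index)]  # list index lookup
    return value


agent_attributes = {
    "users": {"sam": {"flight_number": "UA123", "email": "sam@example.com"}}
}
test_case_attributes = {
    "flight_options": ["United Airlines", "Delta", "American Airlines"]
}

print(lookup(agent_attributes, "users.sam.flight_number"))  # UA123
print(lookup(test_case_attributes, "flight_options[1]"))    # Delta
```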

Array Access Without Indexing

Reference an array without an index to insert the entire array, rendered as a string:
{{test_case.test}}   → test_case.expected_output_json["test"] (entire array as string)
{{agent.items}}      → agent.attributes["items"] (entire array as string)
Example:
The available options are: {{test_case.flight_options}}

Dynamic Keys via Multi-Pass Resolution

The system supports dynamic key resolution through multiple passes, allowing you to use one template variable to determine another:
{{agent.users.{{test_case.username}}.email}}
How it works:
  1. First pass: Resolves {{test_case.username}} → “user1”
  2. Second pass: Resolves {{agent.users.user1.email}} → “user1@example.com”
Example: If your test case specifies a username:
{
  "username": "user1"
}
And your agent has user-specific data:
{
  "users": {
    "user1": {
      "email": "user1@example.com",
      "tier": "premium"
    }
  }
}
You can use dynamic resolution:
The user {{test_case.username}} has email {{agent.users.{{test_case.username}}.email}}
and is a {{agent.users.{{test_case.username}}.tier}} member.
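One way to picture multi-pass resolution is a loop that repeatedly replaces only the innermost placeholders until nothing changes. The following is a minimal sketch under that assumption; resolve, get_path, and the max_passes limit are hypothetical and are not taken from the product.

```python
import re

# Matches placeholders that contain no braces inside, i.e. the innermost ones.
INNERMOST = re.compile(r"\{\{([^{}]+)\}\}")


def get_path(sources: dict, path: str):
    """Follow a dotted path such as 'agent.users.user1.email'."""
    value = sources
    for key in path.strip().split("."):
        value = value[key]
    return value


def resolve(template: str, sources: dict, max_passes: int = 5) -> str:
    """Resolve innermost placeholders pass by pass until the text is stable."""
    for _ in range(max_passes):
        resolved = INNERMOST.sub(
            lambda m: str(get_path(sources, m.group(1))), template
        )
        if resolved == template:
            break
        template = resolved
    return template


sources = {
    "test_case": {"username": "user1"},
    "agent": {"users": {"user1": {"email": "user1@example.com", "tier": "premium"}}},
}

# Pass 1 turns the inner variable into "user1"; pass 2 resolves the outer path.
print(resolve("{{agent.users.{{test_case.username}}.email}}", sources))
```

The pass limit is a design choice in this sketch: it guards against templates whose resolved values keep introducing new placeholders.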

Complete Example

Here’s a comprehensive example combining multiple features.
Agent attributes:
{
  "location": "San Francisco",
  "users": {
    "sam": {
      "tier": "premium",
      "perks": ["early_checkin", "room_upgrade"]
    }
  }
}
Test case:
{
  "username": "sam",
  "requested_perks": ["early_checkin"]
}
Metric prompt:
Given the transcript, did the assistant properly handle the reservation request?

Hotel Location: {{agent.location}}
Customer: {{test_case.username}}
Customer Tier: {{agent.users.{{test_case.username}}.tier}}
Available Perks: {{agent.users.{{test_case.username}}.perks}}
Requested Perks: {{test_case.requested_perks}}

Return YES if:
• The assistant confirmed the reservation is for {{agent.location}}
• The assistant recognized {{test_case.username}} as a {{agent.users.{{test_case.username}}.tier}} member
• The assistant mentioned available perks: {{agent.users.{{test_case.username}}.perks}}
• The assistant processed the requested perk: {{test_case.requested_perks[0]}}

Return NO if:
• The assistant provided incorrect location information
• The assistant failed to recognize the customer's tier status
• The assistant couldn't access the requested perk information

Use Cases

In Metric Prompts

Attributes are commonly used in metric prompts to create context-aware evaluations. See Metric Prompting Guide for examples of using attributes in evaluation metrics.

In Test Scenarios

You can embed agent attributes directly into test case scenarios and expected behaviors. This allows the same test set to work with different agents that have different attribute values. See Test Sets for more information.

In Criteria

Use attributes in criteria definitions to create dynamic validation that adapts to the specific agent or test case being evaluated. These criteria are used by the Composite Evaluation metric.