Expected Behavior and Attributes let you attach test-case-specific data to evaluate how the agent responds to each individual test case.

Expected behavior

Expected Behaviors describe how your agent should respond. They are the criteria the conversation is graded against after it finishes. Examples:
  • “the agent should ask the user for their phone number”
  • “the agent should repeat the phone number back to the user”
Expected Behaviors are an evaluation input, not a simulation input. They are never shown to the simulated user, so they don’t steer the conversation — the simulated user is driven only by the test case Simulation Input and the persona. Expected Behaviors are read after the run to score the resulting transcript.
Use the Composite Evaluation metric to evaluate whether the agent followed the expected behaviors. Configure it with From Test Case as the criteria source to automatically pull behaviors from each test case. Each behavior is judged independently against the transcript as met or not met, and the score is the fraction met. With Percentage of Criteria Met reporting, the example above would return 0.5 if the agent asks for the phone number but does not repeat it back.
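The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration of "fraction of criteria met", not the platform's implementation: the function name `fraction_of_criteria_met` and the `verdicts` dictionary are hypothetical, and the met / not met verdicts are hard-coded here rather than produced by a judge.

```python
def fraction_of_criteria_met(verdicts: dict[str, bool]) -> float:
    """Return the share of expected behaviors judged as met (0.0 to 1.0)."""
    if not verdicts:
        raise ValueError("at least one expected behavior is required")
    # True counts as 1 and False as 0, so the sum is the number of behaviors met.
    return sum(verdicts.values()) / len(verdicts)


# Hypothetical verdicts for the two example behaviors: the agent asked for
# the phone number but did not repeat it back.
verdicts = {
    "the agent should ask the user for their phone number": True,
    "the agent should repeat the phone number back to the user": False,
}

score = fraction_of_criteria_met(verdicts)
print(score)  # 0.5
```

Because each behavior is judged independently, adding a third behavior that the agent meets would move this score to 2/3 without changing the verdicts on the first two.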

Attributes

Attributes are structured fields you attach to a test case. Think of each attribute as a column in your test set — every test case carries its own value for that column. Putting data in an attribute, instead of only describing it in free text inside the Simulation Input, lets you:
  • Deterministically validate — reference the attribute in a metric to grade the agent against a known-correct value, rather than relying on an LLM to infer it from the scenario wording.
  • Sort and filter — organize and slice your test cases by their attribute values.
  • Keep track of anything — ticket numbers, expected outcomes, customer tier, or any other data you want tied to the case.
You can enter attributes as key/value pairs or as JSON. Example: imagine an airline help desk where the test case contains these attributes:
{
  "source": "LAX",
  "destination": "SFO"
}
You can then write, for example, a binary Destination Identification metric with the question: Did the agent correctly identify the destination as: {{test_case.destination}}?
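To make the substitution concrete, here is a minimal Python sketch of how a `{{test_case.destination}}` placeholder could be filled from the attributes above. The `render` helper and its regular expression are assumptions made for illustration; the platform's actual templating rules (whitespace handling, missing keys, nested values) may differ.

```python
import json
import re

# Matches placeholders of the form {{namespace.key}}, e.g. {{test_case.destination}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\}\}")


def render(template: str, context: dict[str, dict[str, str]]) -> str:
    """Replace each {{namespace.key}} with context[namespace][key] (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        namespace, key = match.group(1), match.group(2)
        try:
            return str(context[namespace][key])
        except KeyError:
            # Fail loudly instead of sending an unfilled placeholder to a grader.
            raise KeyError(f"no value for {match.group(0)}") from None

    return PLACEHOLDER.sub(substitute, template)


# Attributes from the airline help desk example, entered as JSON.
attributes = json.loads('{"source": "LAX", "destination": "SFO"}')

question = render(
    "Did the agent correctly identify the destination as: {{test_case.destination}}?",
    {"test_case": attributes},
)
print(question)
```

Raising on a missing attribute is a design choice in this sketch: a question that still contains a literal placeholder would be graded against the wrong value.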

Utilizing Agent Attributes

You can set attributes on each agent and embed them in your scenarios with this format: {{agent.attribute_name}}
Example: Imagine one agent has the attribute location with the value “San Francisco”, and another agent has the value “London”. Embed those agent attributes in your scenarios and expected behaviors like this:
Scenario: You are a user calling for travel recommendations in {{agent.location}}
Expected Behavior: The agent should only give travel recommendations in {{agent.location}}
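The effect of agent attributes is that one scenario and one expected behavior expand into agent-specific text. The Python sketch below illustrates that expansion for the two agents in the example. The agent names `sf_agent` and `london_agent` and the helper `fill_agent_attributes` are hypothetical, and the substitution logic is an assumption rather than the platform's own.

```python
import re

# Matches {{agent.<attribute_name>}} placeholders only.
AGENT_PLACEHOLDER = re.compile(r"\{\{\s*agent\.(\w+)\s*\}\}")


def fill_agent_attributes(text: str, agent_attributes: dict[str, str]) -> str:
    """Substitute every {{agent.name}} placeholder with that agent's value."""
    return AGENT_PLACEHOLDER.sub(lambda m: agent_attributes[m.group(1)], text)


scenario = "You are a user calling for travel recommendations in {{agent.location}}"
expected_behavior = "The agent should only give travel recommendations in {{agent.location}}"

# Hypothetical agents, each carrying its own value for the "location" attribute.
agents = {
    "sf_agent": {"location": "San Francisco"},
    "london_agent": {"location": "London"},
}

# The same test case text rendered once per agent.
rendered = {
    name: (
        fill_agent_attributes(scenario, attrs),
        fill_agent_attributes(expected_behavior, attrs),
    )
    for name, attrs in agents.items()
}

for name, (agent_scenario, agent_expectation) in rendered.items():
    print(f"{name}\n  Scenario: {agent_scenario}\n  Expected Behavior: {agent_expectation}")
```

Keeping the location in an agent attribute means the test case is written once and stays correct when run against either agent.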