Before versioning, editing a resource silently changed how every past result that referenced it would be read: there was no record of what changed, and no way to see which configuration a run actually executed against. Versioning fixes that. Coval keeps a version history for four resources:
| Resource | What a version captures |
| --- | --- |
| Metrics | The scoring configuration (judge prompt, type-specific settings, manager). |
| Test sets | The test-set configuration and a snapshot of its test-case grid. |
| Personas | The persona prompt, voice, language, and behavioral metadata. |
| Agents | The connection and behavior configuration (model type, endpoint, prompt, workflows, attached metrics and test sets, and more). |

Copy-on-save

Each resource has a current version (its live state) and a history of prior versions, kept newest first. The history grows automatically as you work — you don’t create versions by hand:
  • Every config-changing save snapshots the prior state. When you save a change that affects how the resource behaves, Coval records the previous configuration as a version and advances the current version to your new state. The history is the trail of states the resource has moved through.
  • Cosmetic and no-op edits don’t create a version. Only behavior-affecting configuration is versioned. Renaming a resource, editing its description, or re-saving without changing anything that matters leaves the history untouched — so the history stays meaningful and isn’t cluttered with edits that don’t change results.
  • Existing resources start versioning on their next save or run. Resources that predate versioning pick up their first version the first time they’re saved or used in a run, with nothing to migrate.
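The save rules above can be sketched in a few lines of Python. This is an illustrative model only, not Coval's implementation: the `VersionedResource` class, the `COSMETIC_FIELDS` set, and the integer version numbers are all assumptions made for the example.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any

# Hypothetical: fields treated as cosmetic, so changing them alone
# does not create a version.
COSMETIC_FIELDS = {"name", "description"}


def behavioral(config: dict[str, Any]) -> dict[str, Any]:
    """Return only the fields assumed to affect behavior."""
    return {k: v for k, v in config.items() if k not in COSMETIC_FIELDS}


@dataclass
class VersionedResource:
    config: dict[str, Any]
    current_version: int = 1
    # Prior states as (version_number, config) pairs, newest first.
    history: list[tuple[int, dict[str, Any]]] = field(default_factory=list)

    def save(self, new_config: dict[str, Any]) -> bool:
        """Apply a save; return True if a version was recorded."""
        changed = behavioral(new_config) != behavioral(self.config)
        if changed:
            # Snapshot the prior state, then advance the current version.
            self.history.insert(0, (self.current_version, deepcopy(self.config)))
            self.current_version += 1
        # Cosmetic edits still update the live state, just without a version.
        self.config = deepcopy(new_config)
        return changed
```

In this model, a rename-only save returns `False` and leaves `history` empty, while a save that changes the judge prompt pushes the previous configuration onto the front of `history` and bumps `current_version`.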
Test sets version the grid, not just the config. A test set’s behavior is defined by its test cases, so a test-set version snapshots an ordered copy of its test cases along with the test-set configuration. The snapshot is taken when you save the test set — editing cells in the editor and then saving records one version for the batch, not one per cell. Re-saving without changing any test-case content creates no new version.
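As a rough illustration of the grid behavior, the sketch below compares an ordered fingerprint of the test cases at save time, so a batch of cell edits yields one snapshot and an unchanged re-save yields none. The `GridHistory` class and `grid_fingerprint` helper are hypothetical names, the test cases are assumed to be JSON-serializable dictionaries, and the test-set configuration is left out for brevity.

```python
import hashlib
import json


def grid_fingerprint(test_cases: list[dict]) -> str:
    """Hash the test-case content; list order is part of the identity."""
    payload = json.dumps(test_cases, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class GridHistory:
    """Hypothetical model of a test set's grid snapshots."""

    def __init__(self, test_cases: list[dict]):
        self.current = [dict(tc) for tc in test_cases]
        self.snapshots: list[list[dict]] = []  # prior grids, newest first

    def save(self, edited_cases: list[dict]) -> bool:
        """Record one snapshot per save, and only if content changed."""
        if grid_fingerprint(edited_cases) == grid_fingerprint(self.current):
            return False  # no-op re-save: history untouched
        self.snapshots.insert(0, self.current)
        self.current = [dict(tc) for tc in edited_cases]
        return True
```

Editing two cells and then calling `save` once records a single snapshot of the prior grid, which mirrors the "one version for the batch" behavior described above.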

Reviewing version history

You can review a resource’s history in two places:

Runs record the version they used

When you launch a run, Coval records which version of each resource the run executed against. Because the run is pinned to that snapshot, editing the resource afterward never changes how an existing run is interpreted — past results stay anchored to the configuration that produced them. The run view surfaces this so you can tell when a run is no longer on the latest configuration — see Runs.
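One way to picture the pinning is a run that stores a copy of each resource's version number at launch. The sketch below is a minimal model under that assumption; `Run`, `launch_run`, `stale_resources`, and the resource id strings are hypothetical and do not reflect Coval's API or data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    run_id: str
    # Hypothetical shape: resource id -> version number recorded at launch.
    pinned_versions: dict[str, int]


def launch_run(run_id: str, current_versions: dict[str, int]) -> Run:
    # Copy the mapping so later edits to resources do not affect the run.
    return Run(run_id, dict(current_versions))


def stale_resources(run: Run, current_versions: dict[str, int]) -> list[str]:
    """Resource ids whose current version differs from what the run used."""
    return sorted(
        rid
        for rid, version in run.pinned_versions.items()
        if current_versions.get(rid) != version
    )
```

If a metric moves from version 1 to version 2 after the run launches, the run still reports version 1 for that metric, and `stale_resources` lists it as no longer on the latest configuration.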