
The eval that passed in CI shares its schema with the run that breaks.

No translation layer. CI cases use the same entities, edges, and recipes as a live discharge. A regression shows up as a new red edge, not as a lost percentage point.

Before: 70% pass · no idea which discharges failed
With Invariance: Graph diff · 2 new red edges, both attending-skipped

How a run becomes a guardrail.

01

Freeze a run as a case

Any production run (discharge hold, med-recon, or OR (operating-room) scheduling) can be promoted to a frozen dataset entry that keeps its inputs, expected entities, and expected edges.
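To make the shape of a frozen case concrete, here is a minimal sketch in Python. It is not the Invariance SDK: `Edge`, `FrozenCase`, `freeze_run`, the dictionary layout of a run, and the status strings are all hypothetical names chosen for illustration. The only things taken from the text are the three parts a case keeps: inputs, expected entities, and expected edges.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Edge:
    # Hypothetical edge shape: who relates to what, and whether the rule held.
    source: str
    relation: str
    target: str
    status: str  # assumed values: "SATISFIED" or "VIOLATED"


@dataclass(frozen=True)
class FrozenCase:
    # Hypothetical frozen dataset entry built from one production run.
    case_id: str
    inputs: dict
    expected_entities: tuple
    expected_edges: tuple


def freeze_run(run: dict) -> FrozenCase:
    """Snapshot a run so later changes to the live run cannot alter the case."""
    return FrozenCase(
        case_id=f"case_{run['run_id']}",
        # JSON round trip gives a deep copy of plain, JSON-serializable inputs.
        inputs=json.loads(json.dumps(run["inputs"])),
        expected_entities=tuple(sorted(run["entities"])),
        expected_edges=tuple(sorted(Edge(*e) for e in run["edges"])),
    )


# Illustrative run record; field names and values are assumptions.
run = {
    "run_id": "run_hold_001",
    "inputs": {"workflow": "discharge-hold"},
    "entities": ["patient", "attending", "discharge_order"],
    "edges": [("attending", "signs", "discharge_order", "SATISFIED")],
}
case = freeze_run(run)
```

Sorting entities and edges gives the case a stable order, so two freezes of the same run compare equal regardless of the order in which the run recorded them.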

02

Run from CLI, SDK, or UI

For example, from the CLI: `inv eval run compliance-v2 --on run_hold_001`. Tagged runs sit in the same list as production runs.
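A CI script could wrap that same command. The sketch below assumes only the invocation quoted above; `eval_command` and `run_eval` are hypothetical helper names, and the meaning of the process exit code is an assumption, since the text does not document it.

```python
import subprocess


def eval_command(name: str, run_id: str) -> list[str]:
    # Mirrors the CLI call quoted in the text: inv eval run <name> --on <run_id>.
    # What <name> refers to (recipe, suite, version) is not specified there.
    return ["inv", "eval", "run", name, "--on", run_id]


def run_eval(name: str, run_id: str) -> int:
    # Requires the `inv` CLI on PATH. Treating a non-zero return code as
    # "eval failed" is an assumption, not documented behavior.
    completed = subprocess.run(eval_command(name, run_id), check=False)
    return completed.returncode


command_line = " ".join(eval_command("compliance-v2", "run_hold_001"))
print(command_line)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a run ID ever contains unusual characters.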

03

Diff graphs across versions

Cell-by-cell: new findings, lost findings, new VIOLATED edges. Regression is visible, not statistical.
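The three diff categories can be sketched as set operations over edges. This is an illustration under stated assumptions, not Invariance's diff algorithm: it treats each finding as an edge identified by (source, relation, target), and `Edge`, `GraphDiff`, `diff_graphs`, and the example graphs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    status: str  # assumed values: "SATISFIED" or "VIOLATED"


@dataclass(frozen=True)
class GraphDiff:
    new_findings: frozenset    # edge keys present only in the candidate
    lost_findings: frozenset   # edge keys present only in the baseline
    new_violations: frozenset  # VIOLATED now, not VIOLATED (or absent) before


def _index(edges):
    # Map each edge's identity to its status.
    return {(e.source, e.relation, e.target): e.status for e in edges}


def diff_graphs(baseline, candidate) -> GraphDiff:
    old, new = _index(baseline), _index(candidate)
    return GraphDiff(
        new_findings=frozenset(new.keys() - old.keys()),
        lost_findings=frozenset(old.keys() - new.keys()),
        new_violations=frozenset(
            key for key, status in new.items()
            if status == "VIOLATED" and old.get(key) != "VIOLATED"
        ),
    )


# Illustrative graphs: two attending reviews flip to VIOLATED in the candidate.
baseline = [
    Edge("attending", "reviews", "discharge_a", "SATISFIED"),
    Edge("attending", "reviews", "discharge_b", "SATISFIED"),
    Edge("pharmacist", "reconciles", "discharge_a", "SATISFIED"),
]
candidate = [
    Edge("attending", "reviews", "discharge_a", "VIOLATED"),
    Edge("attending", "reviews", "discharge_b", "VIOLATED"),
    Edge("pharmacist", "reconciles", "discharge_a", "SATISFIED"),
    Edge("nurse", "educates", "discharge_a", "SATISFIED"),
]
diff = diff_graphs(baseline, candidate)
```

A CI gate built on this would fail the build whenever `diff.new_violations` is non-empty, which names the exact edges that regressed instead of reporting an aggregate pass rate.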

See it on your own workflow.

15 minutes. One workflow. No deck.
