Scorers

Create reusable scoring configurations for LLM judges and human annotation rubrics used in eval cases.

import { Invariance } from '@invariance/sdk';

Overview

Scorers are reusable configurations that define how judge-based eval cases are scored. There are two types:

**LLM Scorers** use an AI model (such as Claude or GPT) to evaluate agent output against defined criteria. You configure the prompt, the criteria, and the model.

**Human Scorers** define annotation rubrics with weighted criteria. Results are queued for human review and scored manually.

Quick Example

Create LLM and human scorers (TypeScript)
const inv = Invariance.init({ apiKey: process.env.INVARIANCE_API_KEY! });

// LLM scorer for automated judge evaluation
const llmScorer = await inv.scorers.create({
  name: 'Response Quality',
  type: 'llm',
  config: {
    prompt: 'Evaluate the quality of the agent response',
    criteria: ['accuracy', 'completeness', 'clarity'],
    model: 'claude-sonnet-4-20250514',
  },
});

// Human scorer with annotation rubric
const humanScorer = await inv.scorers.create({
  name: 'Safety Review',
  type: 'human',
  config: {
    rubric: [
      { criterion: 'safety', description: 'No harmful content', weight: 2 },
      { criterion: 'compliance', description: 'Follows guidelines', weight: 1 },
    ],
  },
});
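
The human scorer above gives `safety` twice the weight of `compliance`, but this page does not say how per-criterion ratings are combined into one score. The sketch below assumes a weighted average over ratings on a 0 to 1 scale. `RubricCriterion`, `weightedScore`, and the rating scale are hypothetical names and conventions chosen for illustration. They are not part of the SDK.

```typescript
// Hypothetical shape mirroring the rubric entries in the example above.
interface RubricCriterion {
  criterion: string;
  description: string;
  weight: number;
}

// Assumption: each criterion is rated on a 0..1 scale and the final
// score is the weight-normalized average of those ratings.
function weightedScore(
  rubric: RubricCriterion[],
  ratings: Record<string, number>,
): number {
  let total = 0;
  let weightSum = 0;
  for (const { criterion, weight } of rubric) {
    const rating = ratings[criterion];
    if (rating === undefined) {
      throw new Error(`Missing rating for criterion "${criterion}"`);
    }
    total += rating * weight;
    weightSum += weight;
  }
  if (weightSum === 0) {
    throw new Error('Rubric weights sum to zero');
  }
  return total / weightSum;
}

const safetyRubric: RubricCriterion[] = [
  { criterion: 'safety', description: 'No harmful content', weight: 2 },
  { criterion: 'compliance', description: 'Follows guidelines', weight: 1 },
];

// (1.0 * 2 + 0.5 * 1) / (2 + 1) = 2.5 / 3
const exampleScore = weightedScore(safetyRubric, { safety: 1, compliance: 0.5 });
```

Under this convention, a low `safety` rating pulls the total down twice as hard as an equally low `compliance` rating.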

Type Definitions

Scorer
interface Scorer {
  id: string;
  name: string;
  type: 'llm' | 'human';
  config: Record<string, unknown>;
  owner_id: string;
  created_at: string;
  updated_at: string;
}
A reusable scoring configuration.
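
Because `config` is typed as `Record<string, unknown>`, callers need to narrow it before reading fields. A minimal sketch of one way to do that with type guards follows. `LlmScorerConfig`, `HumanScorerConfig`, `isLlmConfig`, and `isHumanConfig` are hypothetical names whose shapes are inferred from the Quick Example. The SDK may define or validate these differently.

```typescript
// Assumed config shapes, inferred from the Quick Example on this page.
type LlmScorerConfig = {
  prompt: string;
  criteria: string[];
  model: string;
};

type HumanScorerConfig = {
  rubric: { criterion: string; description: string; weight: number }[];
};

// Narrow an unknown config to the assumed LLM shape.
function isLlmConfig(config: unknown): config is LlmScorerConfig {
  if (typeof config !== 'object' || config === null) return false;
  const c = config as Record<string, unknown>;
  const criteria = c['criteria'];
  return (
    typeof c['prompt'] === 'string' &&
    typeof c['model'] === 'string' &&
    Array.isArray(criteria) &&
    criteria.every((item) => typeof item === 'string')
  );
}

// Narrow an unknown config to the assumed human-rubric shape.
function isHumanConfig(config: unknown): config is HumanScorerConfig {
  if (typeof config !== 'object' || config === null) return false;
  const rubric = (config as Record<string, unknown>)['rubric'];
  if (!Array.isArray(rubric)) return false;
  return rubric.every((entry) => {
    if (typeof entry !== 'object' || entry === null) return false;
    const e = entry as Record<string, unknown>;
    return (
      typeof e['criterion'] === 'string' &&
      typeof e['description'] === 'string' &&
      typeof e['weight'] === 'number'
    );
  });
}

// Produce a one-line summary from whichever shape the config matches.
function describeConfig(config: Record<string, unknown>): string {
  if (isLlmConfig(config)) {
    return `LLM judge (${config.model}) on ${config.criteria.join(', ')}`;
  }
  if (isHumanConfig(config)) {
    return `Human rubric with ${config.rubric.length} criteria`;
  }
  return 'Unrecognized scorer config';
}
```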

API Reference

scorers.list
List all scorers.
async list(): Promise<Scorer[]>
Returns: Promise<Scorer[]>

scorers.get
Get a scorer by ID.
async get(id: string): Promise<Scorer>
Parameters
  • id (string): Scorer ID
Returns: Promise<Scorer>

scorers.create
Create a new scorer.
async create(body: CreateScorerBody): Promise<Scorer>
Parameters
  • name (string): Scorer name
  • type ('llm' | 'human'): Scorer type
  • config (Record<string, unknown>): Scoring configuration (prompt and criteria for LLM scorers, rubric for human scorers)
Returns: Promise<Scorer>

scorers.update
Update a scorer.
async update(id: string, body: UpdateScorerBody): Promise<Scorer>
Parameters
  • id (string): Scorer ID
Returns: Promise<Scorer>

scorers.delete
Delete a scorer.
async delete(id: string): Promise<{ ok: boolean }>
Parameters
  • id (string): Scorer ID
Returns: Promise<{ ok: boolean }>
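
Since scorers are meant to be reused, a setup script may want to look one up before creating a duplicate. The sketch below combines `list` and `create` into a hypothetical `ensureScorer` helper. `ScorersApi` is an assumed structural slice of `inv.scorers` that restates only the signatures documented above, and `makeInMemoryScorers` is a stand-in used so the example runs without network access. Matching on name and type is an assumption, since this page does not say whether scorer names are unique.

```typescript
interface Scorer {
  id: string;
  name: string;
  type: 'llm' | 'human';
  config: Record<string, unknown>;
  owner_id: string;
  created_at: string;
  updated_at: string;
}

// Fields taken from the scorers.create parameter list above.
interface CreateScorerBody {
  name: string;
  type: 'llm' | 'human';
  config: Record<string, unknown>;
}

// Assumed minimal slice of the scorers client (hypothetical interface).
interface ScorersApi {
  list(): Promise<Scorer[]>;
  create(body: CreateScorerBody): Promise<Scorer>;
}

// Return an existing scorer with the same name and type, or create one.
async function ensureScorer(
  scorers: ScorersApi,
  body: CreateScorerBody,
): Promise<Scorer> {
  const all = await scorers.list();
  const existing = all.find(
    (s) => s.name === body.name && s.type === body.type,
  );
  return existing ?? (await scorers.create(body));
}

// In-memory stand-in for local experimentation; not the real SDK client.
function makeInMemoryScorers(): ScorersApi {
  const store: Scorer[] = [];
  return {
    async list() {
      return [...store];
    },
    async create(body) {
      const now = new Date().toISOString();
      const scorer: Scorer = {
        id: `scorer_${store.length + 1}`,
        ...body,
        owner_id: 'local',
        created_at: now,
        updated_at: now,
      };
      store.push(scorer);
      return scorer;
    },
  };
}
```

With the real client, the call would presumably look like `await ensureScorer(inv.scorers, { name: 'Response Quality', type: 'llm', config })`, provided `inv.scorers` is structurally compatible with `ScorersApi`.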

Use Cases

  • Define reusable LLM judge criteria for consistent scoring across eval suites
  • Create human annotation rubrics for subjective quality metrics
  • Share scorer configurations across teams and eval suites