Create reusable scoring configurations for LLM judges and human annotation rubrics used in eval cases.
Scorers are reusable configurations that define how judge-based eval cases are scored. There are two types:
**LLM Scorers** use an AI model (e.g., Claude or GPT) to evaluate agent output against defined criteria. You configure the prompt, criteria, and model.
**Human Scorers** define annotation rubrics with weighted criteria. Results are queued for human review and scored manually.
```typescript
// Assumed import path; the actual package name may differ
import { Invariance } from '@invariance/sdk';

const inv = Invariance.init({ apiKey: process.env.INVARIANCE_API_KEY! });

// LLM scorer for automated judge evaluation
const llmScorer = await inv.scorers.create({
  name: 'Response Quality',
  type: 'llm',
  config: {
    prompt: 'Evaluate the quality of the agent response',
    criteria: ['accuracy', 'completeness', 'clarity'],
    model: 'claude-sonnet-4-20250514',
  },
});
```
```typescript
// Human scorer with annotation rubric
const humanScorer = await inv.scorers.create({
  name: 'Safety Review',
  type: 'human',
  config: {
    rubric: [
      { criterion: 'safety', description: 'No harmful content', weight: 2 },
      { criterion: 'compliance', description: 'Follows guidelines', weight: 1 },
    ],
  },
});
```
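The `weight` values imply some form of weighted aggregation across criteria. As a minimal sketch, assuming per-criterion scores on a 0–1 scale and weights normalized by their sum (the actual rollup performed by the review workflow is not specified here):

```typescript
interface RubricCriterion {
  criterion: string;
  description: string;
  weight: number;
}

// Weighted average of per-criterion scores. Assumes scores are on a 0-1
// scale and weights are normalized by their sum -- an illustrative
// assumption, not documented behavior.
function weightedScore(
  rubric: RubricCriterion[],
  scores: Record<string, number>,
): number {
  const totalWeight = rubric.reduce((sum, c) => sum + c.weight, 0);
  const weightedSum = rubric.reduce(
    (sum, c) => sum + c.weight * (scores[c.criterion] ?? 0),
    0,
  );
  return weightedSum / totalWeight;
}

// With the rubric above: (2 * 1.0 + 1 * 0.5) / 3 ≈ 0.83
const overall = weightedScore(
  [
    { criterion: 'safety', description: 'No harmful content', weight: 2 },
    { criterion: 'compliance', description: 'Follows guidelines', weight: 1 },
  ],
  { safety: 1.0, compliance: 0.5 },
);
```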
The returned `Scorer` object has the following shape:

```typescript
interface Scorer {
  id: string;
  name: string;
  type: 'llm' | 'human';
  config: Record<string, unknown>;
  owner_id: string;
  created_at: string;
  updated_at: string;
}
```
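For illustration, a created LLM scorer might come back shaped like this (all field values below, including the ID formats, are made up for the example):

```typescript
const example: Scorer = {
  id: 'scorer_abc123',       // hypothetical ID format
  name: 'Response Quality',
  type: 'llm',
  config: {
    prompt: 'Evaluate the quality of the agent response',
    criteria: ['accuracy', 'completeness', 'clarity'],
    model: 'claude-sonnet-4-20250514',
  },
  owner_id: 'user_xyz789',   // hypothetical
  created_at: '2025-01-15T10:00:00Z',
  updated_at: '2025-01-15T10:00:00Z',
};
```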