Skip to main content
Version: Next

Quickstart

This path walks through the same loop shown in the Rubric Studio Open app: author, preview, calibrate, diff, and export.

1. Open the sample project

Launch the browser IDE or desktop app and open Helpful Response Evaluation. The sample project contains 12 criteria, 3 sample model outputs, local mock judges, and calibration fixtures.

2. Review a criterion

Select:

Evidence quality / cites-uncertainty.toml

Confirm the criterion has a clear description, examples, evidence requirements, weight, and status. The inline validation panel should show schema and style checks before you promote the criterion.

3. Score a sample

Use the local mock judge first. That keeps the first run deterministic and avoids sending content or keys to a model provider.

rubric score --judge mock --sample sample-001 --out runs/first-score

Expected shape:

rubric-spec: valid
samples: 3
criteria: 12
readiness: 100%
warnings: 0

4. Calibrate against gold labels

Open the Calibration tab and load the gold JSONL fixture. Review agreement metrics for criteria with low Cohen kappa or disagreement clusters. Criteria that need work should stay in draft until examples and wording are tightened.

5. Inspect the semantic diff

Open the Diff tab before committing a rubric change. The score overlay should show which criteria changed, how held-out samples moved, and whether the change is cosmetic or behavior-shaping.

6. Export artifacts

Export the portable files a reviewer can audit later:

rubric export manifest --run runs/first-score --out exports/eval-run-manifest.json
rubric export judge-card --run runs/first-score --out exports/judge-card.md

Use AuraOne intake export only when you intentionally want to hand the project to Rubric Studio Cloud or AuraOne reviewers.