Independent plan/spec reviewer for AI coding agents — Multi-judge LLM-as-a-Judge with structured verdicts
Analyzes implementation plans the way Hari Seldon analyzed civilizations — by checking structural assumptions against reality before things go wrong.
Install • Usage • Focus Modes • External Judges • Output Schema
Feed Seldon a plan, spec, or proposal. It reads the document, inspects your workspace for evidence, and returns a verdict — approve, approve_with_changes, or request_major_revision — with a confidence score (0–1) and concrete findings tagged by severity and file references.
Works out of the box as an inline skill for Claude Code. Plug in Codex, OpenAI, or Anthropic as an external judge for true model independence — or run all three and compare verdicts side by side.
This plugin ships one skill: Seldon, with three pluggable judges.
Independent plan/spec reviewer with workspace verification.
- Multi-judge LLM-as-a-Judge: Plug in Anthropic (Claude), OpenAI (GPT), or Codex (via the Codex plugin) — or fall back to inline review using the current agent.
- Structured Verdicts: Every response conforms to a strict JSON Schema (`approve`, `approve_with_changes`, `request_major_revision`, plus confidence and findings).
- Workspace Verification: The inline reviewer and codex judge can traverse your codebase to verify claims; the API runners get every file you pass as arguments.
- Focus Modes: Six pre-baked weighting profiles — `balanced`, `architecture`, `evaluation`, `product`, `operations`, `safety`.
- Severity-Tagged Findings: Each finding lists severity (`critical`/`high`/`medium`/`low`), why it matters, evidence from the workspace, and `path:line` references.
- Confidence Score: A numeric 0–1 score from the judge, rendered as a 20-segment visual bar with semantic color labels.
- Schema-Validated Output: A bundled `scripts/validate.sh` smoke-test harness checks runner output against `seldon.schema.json` before you trust it.
- Claude Code (CLI) or Claude Desktop — no API keys required for the inline reviewer
- Optional, per external judge:
  - `codex` → install the Codex plugin (`/codex:setup`)
  - `anthropic` → export `ANTHROPIC_API_KEY`
  - `openai` → export `OPENAI_API_KEY`
- For schema validation tooling (optional): `python3` with `jsonschema` (see Testing)
Install via Claude Code's built-in plugin system:
```shell
# Add the marketplace
/plugin marketplace add proyecto26/seldon

# Install the plugin
/plugin install seldon
```

After installing, the `/seldon` skill triggers automatically on phrases like "review my plan", "judge this spec", and "second opinion on this RFC".
```shell
# Install all skills
npx skills add proyecto26/seldon

# List available skills
npx skills add proyecto26/seldon --list
```

This installs to your `.claude/skills/` directory.
```shell
git clone https://github.com/proyecto26/seldon.git
cp -r seldon/skills/* .claude/skills/
```

Or add it as a git submodule:

```shell
git submodule add https://github.com/proyecto26/seldon.git .claude/seldon
```

Reference the skill from `.claude/seldon/skills/seldon/`.
- Fork this repository
- Customize `skills/seldon/SKILL.md` for your house style (rubric weights, finding format)
- Add or modify judge runners in `skills/seldon/scripts/`
- Clone your fork into your projects
Trigger phrases that fire the skill:
```text
/seldon my-plan.md
review this plan: docs/migration-plan.md
judge this spec: docs/auth-redesign.md
second opinion on RFC-042
```
- Seldon reads your plan file (and any supporting files you pass)
- Resolves the judge:
  - `auto` (default) — probes for codex → `ANTHROPIC_API_KEY` → `OPENAI_API_KEY` → falls back to inline
  - explicit — say "judge with codex" / "use the openai judge" / "use anthropic"
- Inspects the workspace to verify claims — file paths, APIs, dependencies, config, schema
- Evaluates against a rubric: repo fit, correctness, sequencing, evaluation, safety
- Returns a structured verdict with a visual confidence bar:
🟡 Confidence ████████████████░░░░ 0.82 (moderate)
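The judge-resolution order above (codex → `ANTHROPIC_API_KEY` → `OPENAI_API_KEY` → inline) can be sketched as a small function. This is an illustrative sketch, not the skill's actual code — in particular, probing for the Codex plugin by looking for a `codex` binary on `PATH` is an assumption:

```python
import os
import shutil

def resolve_judge(explicit=None, env=os.environ, has_codex=None):
    """Pick a judge following Seldon's documented auto order.

    explicit  -- a judge named in the prompt ("judge with codex", etc.)
    has_codex -- override for the Codex probe; by default we guess by
                 checking PATH, which is an assumption for illustration.
    """
    if explicit:
        return explicit
    if has_codex is None:
        has_codex = shutil.which("codex") is not None
    if has_codex:
        return "codex"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return "inline"  # no external judge available: review inline
```

An explicit request always wins; otherwise the first available external judge is chosen, falling back to the inline reviewer.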
"Review my migration plan"
Triggers the inline reviewer (or auto-detected external judge) on the file you provide.
"Judge this RFC with anthropic, focus on safety"
Routes to `scripts/anthropic.sh` with `--focus safety`.
"Get a second opinion from codex on docs/plan.md"
Routes to `scripts/codex.sh` via the Codex plugin companion.
"Run all three judges and compare"
Invokes each runner in turn and renders a side-by-side comparison of verdicts and confidence scores.
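The side-by-side comparison can be sketched as a formatter over parsed runner outputs. The field names (`verdict`, `confidence`) come from `seldon.schema.json`; the layout itself is illustrative, not the skill's exact rendering:

```python
def compare_verdicts(results):
    """Render a side-by-side summary of several judges' verdicts.

    results -- dict mapping judge name -> parsed verdict JSON
    """
    lines = [f"{'Judge':<12} {'Verdict':<26} Confidence"]
    for judge, v in results.items():
        lines.append(f"{judge:<12} {v['verdict']:<26} {v['confidence']:.2f}")
    return "\n".join(lines)
```

Feeding it the parsed stdout of each runner yields one row per judge, making disagreements between models easy to spot.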
```text
Judge: codex (scripts/codex.sh)
Verdict: approve_with_changes
Summary: Plan is sound but assumes a migration path that does not exist yet.

🟡 Confidence ████████████████░░░░ 0.82 (moderate)

Strengths:
- Clear phasing with realistic scope per step
- Good rollback strategy for the data migration

Blocking findings:
high — Migration depends on schema v3 which hasn't been created
  Why it matters: Step 2 cannot begin without this prerequisite
  Evidence: No v3 migration file exists in prisma/migrations/
  Refs: prisma/schema.prisma:42, docs/plan.md:18

Open questions:
- Is the external billing API rate limit sufficient for the proposed batch size?
```
Focus modes weight the review toward specific concerns. Default is balanced.
| Mode | Emphasis |
|---|---|
| `balanced` | All rubric dimensions evenly |
| `architecture` | Service boundaries, dependencies, migration risk, hidden integration work |
| `evaluation` | Success criteria, regression detection, testability of quality claims |
| `product` | User-visible failure modes, sequencing, scope realism |
| `operations` | Rollout, alerting, rollback, failure handling, maintenance burden |
| `safety` | Privacy, security, hallucination controls, access assumptions |
```shell
/seldon --focus safety docs/auth-redesign.md
```

By default, Seldon runs inline — the current agent performs the review using the workspace. To get a model-independent second opinion, plug in one of three external judges. The skill auto-detects which is available.
| Runner | LLM | Workspace access | Required setup |
|---|---|---|---|
| `scripts/codex.sh` | gpt-5.4 (default) via Codex companion | ✅ Read-only sandbox | Install the Codex plugin and run `/codex:setup` |
| `scripts/anthropic.sh` | claude-sonnet-4-6 (default) | ❌ Sees only files passed as args | `export ANTHROPIC_API_KEY=…` |
| `scripts/openai.sh` | gpt-4o (default) | ❌ Sees only files passed as args | `export OPENAI_API_KEY=…` |
Routes through the Codex plugin's `codex-companion.mjs` task runner. The Codex agent can read other workspace files to verify claims.
| Environment Variable | Default | Description |
|---|---|---|
| `JUDGE_MODEL` | gpt-5.4 | Codex model |
| `JUDGE_REASONING` | xhigh | Reasoning effort |
Direct call to OpenAI Chat Completions with `response_format=json_object`. Sends plan content in the prompt — only files passed as arguments are visible.
| Environment Variable | Default | Description |
|---|---|---|
| `JUDGE_MODEL` | gpt-4o | Model to use |
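The payload `scripts/openai.sh` builds can be sketched as follows. This is a minimal sketch, not the script's actual code — the prompt wording and the way the schema text is inlined are assumptions; only the `response_format` setting and message structure follow the Chat Completions API:

```python
def build_openai_request(plan_text, schema_text, focus="balanced", model="gpt-4o"):
    """Build a Chat Completions payload (POST /v1/chat/completions)
    that forces a JSON object reply, as described for scripts/openai.sh.
    The system-prompt wording here is illustrative."""
    return {
        "model": model,
        "response_format": {"type": "json_object"},  # model must emit JSON
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are an independent plan reviewer. "
                    f"Focus mode: {focus}. "
                    "Reply with JSON matching this schema:\n" + schema_text
                ),
            },
            {"role": "user", "content": plan_text},
        ],
    }
```

Because the plan travels inside the prompt, the model never sees anything you did not pass as an argument — which is exactly the independence/blindness trade-off the table above describes.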
Direct call to the Anthropic Messages API. Useful for a second opinion within the Anthropic ecosystem (e.g., judging a Claude Code session with a fresh Claude instance).
| Environment Variable | Default | Description |
|---|---|---|
| `JUDGE_MODEL` | claude-sonnet-4-6 | Model to use |
Add a new `scripts/<name>.sh` that:

- Accepts `[--focus <mode>] <plan-file> [supporting-files...]`
- Reads `seldon.schema.json` from the skill root (or `scripts/` as fallback)
- Emits JSON matching that schema on stdout
- Exits non-zero with diagnostics on stderr for any failure (auth error, schema not found, malformed model output)

See `scripts/codex.sh` for a fully worked example including markdown-fence stripping and API-level error detection.
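The contract is language-agnostic — any executable that honors it works. A minimal Python sketch of a runner skeleton (the bundled runners are shell; the placeholder verdict and helper names here are illustrative, not part of the repo):

```python
import json
import sys

def parse_args(argv):
    """Parse [--focus <mode>] <plan-file> [supporting-files...]."""
    focus, files = "balanced", []
    args = list(argv)
    while args:
        arg = args.pop(0)
        if arg == "--focus":
            if not args:
                raise ValueError("--focus requires a mode")
            focus = args.pop(0)
        else:
            files.append(arg)
    if not files:
        raise ValueError("a plan file is required")
    return focus, files

def main(argv):
    try:
        focus, files = parse_args(argv)
        plan = open(files[0]).read()
        # ... call your model here, validate its output against
        # seldon.schema.json, then print the verdict on stdout ...
        verdict = {  # placeholder shaped like the schema
            "verdict": "approve_with_changes",
            "summary": f"Reviewed {len(plan)} chars with focus={focus}.",
            "confidence": 0.5,
            "strengths": [],
            "blocking_findings": [],
            "non_blocking_findings": [],
            "open_questions": [],
        }
        print(json.dumps(verdict))
        return 0
    except Exception as exc:
        print(f"error: {exc}", file=sys.stderr)  # diagnostics to stderr
        return 1  # non-zero exit on any failure, per the contract

# entrypoint in a real script: sys.exit(main(sys.argv[1:]))
```

Everything on stdout must be schema-shaped JSON; everything diagnostic goes to stderr, so `validate.sh` can pipe stdout straight into the schema check.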
Every runner returns JSON conforming to `skills/seldon/seldon.schema.json` (JSON Schema Draft 2020-12):
```json
{
  "verdict": "approve | approve_with_changes | request_major_revision",
  "summary": "1–3 sentence assessment",
  "confidence": 0.82,
  "strengths": ["..."],
  "blocking_findings": [
    {
      "severity": "critical | high | medium | low",
      "title": "Short description of the issue",
      "why_it_matters": "Impact if unaddressed",
      "evidence": "What was found in the workspace",
      "references": ["src/api.ts:42", "docs/plan.md:18"]
    }
  ],
  "non_blocking_findings": [],
  "open_questions": ["Things that couldn't be verified locally"]
}
```

| Range | Label |
|---|---|
| 0.90 – 1.00 | 🟢 High confidence |
| 0.70 – 0.89 | 🟡 Moderate confidence |
| 0.50 – 0.69 | 🟠 Low confidence |
| 0.00 – 0.49 | 🔴 Very low confidence |
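The 20-segment bar and the label bands above can be reproduced with a small renderer — a sketch based on the documented ranges, not the skill's actual rendering code:

```python
def confidence_bar(score, segments=20):
    """Render a confidence score as the 20-segment bar with the
    label bands from the table (0.90 high, 0.70 moderate, 0.50 low)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    filled = round(score * segments)
    bar = "█" * filled + "░" * (segments - filled)
    if score >= 0.90:
        emoji, label = "🟢", "high"
    elif score >= 0.70:
        emoji, label = "🟡", "moderate"
    elif score >= 0.50:
        emoji, label = "🟠", "low"
    else:
        emoji, label = "🔴", "very low"
    return f"{emoji} Confidence {bar} {score:.2f} ({label})"
```

For example, a score of 0.82 fills 16 of 20 segments and lands in the moderate band, matching the sample output shown earlier.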
```text
seldon/
├── .claude-plugin/
│   ├── plugin.json          # Plugin manifest
│   └── marketplace.json     # Marketplace configuration
└── skills/
    └── seldon/              # The reviewer skill
        ├── SKILL.md             # Skill instructions (third-person trigger phrases)
        ├── seldon.schema.json   # JSON Schema for verdict objects
        ├── requirements.txt     # Optional: jsonschema for validate.sh
        ├── examples/
        │   ├── demo_plan.md         # Short runnable plan for smoke tests
        │   └── sample_verdict.json  # Schema-conforming verdict fixture
        └── scripts/
            ├── codex.sh         # Judge runner: Codex plugin companion
            ├── anthropic.sh     # Judge runner: Anthropic Messages API
            ├── openai.sh        # Judge runner: OpenAI Chat Completions
            └── validate.sh      # E2E harness: run a judge + validate JSON
```
To exercise a real LLM round-trip end-to-end and validate the JSON output against the schema:
```shell
# One-time: set up a venv with jsonschema (avoids PEP 668 on macOS)
python3 -m venv skills/seldon/.venv
skills/seldon/.venv/bin/pip install -r skills/seldon/requirements.txt

# Auto-detect: codex → ANTHROPIC_API_KEY → OPENAI_API_KEY
bash skills/seldon/scripts/validate.sh skills/seldon/examples/demo_plan.md

# Force a specific judge
bash skills/seldon/scripts/validate.sh --judge codex skills/seldon/examples/demo_plan.md
bash skills/seldon/scripts/validate.sh --judge anthropic skills/seldon/examples/demo_plan.md
bash skills/seldon/scripts/validate.sh --judge openai skills/seldon/examples/demo_plan.md
```

Each invocation prints a one-line verdict + confidence summary, then the full JSON if validation passed.
Seldon works with any agent that supports SKILL.md skills:
- Claude Code (recommended — full plugin support)
- Claude Desktop via Cowork plugin installation
- Gemini CLI
- Any agent supporting the skills.sh ecosystem
Named after Hari Seldon from Isaac Asimov's Foundation series. Seldon developed psychohistory — a science that predicted the future of civilizations by analyzing structural assumptions against reality. At critical decision points, a holographic Seldon would appear and say:
"If you're seeing this, here's what you got wrong."
That's what /seldon does for your implementation plans.
This project is free and open source. Sponsors help keep it maintained and growing.
Become a Sponsor | Sponsorship Program
When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated ❤️.
You can learn more about how you can contribute to this project in the contribution guide.
- Originally authored by @degrammer — the Hari Seldon analogy and the core inline-review concept.
- Inspired by the broader LLM-as-a-Judge research line.
Made with ❤️ by Proyecto 26 - Changing the world with small contributions.
One hand can accomplish great things, but many can take you into space and beyond! 🌌
Together we do more, together we are more ❤️
MIT — see LICENSE
