Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

120 changes: 120 additions & 0 deletions scripts/mythos/HOWTO.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
# Mythos-Style Bug Hunt — Portable Pipeline

A four-prompt pipeline modeled on Anthropic's Claude Mythos (red.anthropic.com,
April 2026) plus Vidoc's open-model reproduction. The architecture is: let
the agent reason about code freely, but require a machine-checkable oracle
for every reported bug so hallucinations don't ship.

## Prerequisites

- Claude Code or any agent harness that can read files and drive test runs
- A truth oracle for your language/domain (see §5)
- A bug-tracking format (STPA-Sec, STPA, in-house, whatever)
- Optional: parallel sessions (rank → N parallel discoveries → validate → emit)

## 1. Four prompt templates in `scripts/mythos/`

- **`rank.md`** — agent ranks every source file 1–5 by bug likelihood. The
rubric is the one non-portable part — write it per repo (§2).
- **`discover.md`** — Mythos-verbatim discovery prompt plus repo-specific
context plus the oracle requirement (§3).
- **`validate.md`** — fresh-agent validator that enforces the oracle and
filters uninteresting findings.
- **`emit.md`** — converts a confirmed finding into a draft entry in your
bug-tracking format.

## 2. Ranking rubric (non-portable)

5 tiers, named by concrete path patterns not abstract categories. Skeleton:

```
5 (crown jewels): secrets, parse-before-trust, canonicalization
4 (direct security boundary): verification, signing, argv+env
3 (one hop from untrusted input): token parsers, network clients, format parsers
2 (supporting, no direct security role): HTTP plumbing, policy eval, logging
1 (config / constants / proof artifacts): error types, wiring, proofs
```

Straddle rule: if a file sits between two tiers, pick the higher. Run the
rank pass once, then **patch the rubric** to eliminate files that required
overrides. A good rubric produces zero overrides on re-run.

## 3. Oracle choice (drives `discover.md`)

The oracle separates "agent thinks there's a bug" from "there is a bug."

| Hunting… | Oracle candidates |
|---|---|
| Memory corruption in C/C++/unsafe Rust | AddressSanitizer, MemorySanitizer, UBSan |
| Logic bugs in safe Rust | Kani + property tests (proptest/quickcheck) |
| Compiler correctness | Rocq + Z3 SMT + differential testing |
| Kernel primitives | Verus + Kani + Rocq; proof-skip analysis |
| Python/TypeScript | Hypothesis, fast-check, concrete PoC |
| Go | fuzz, property tests |
| Crypto protocols | Proverif, Tamarin, CryptoVerif counterexample |

`discover.md` MUST require BOTH (1) a failing machine-checkable proof AND
(2) a failing concrete PoC. "If you cannot produce both, do not report.
Hallucinations are more expensive than silence." — load-bearing sentence.

## 4. Run the pipeline

From a Claude Code session in the repo:

1. `Read scripts/mythos/rank.md` → JSON ranking
2. For each rank-≥4 file: new session (parallel), paste `discover.md` with
`{{file}}` substituted. Output = structured finding report.
3. For each finding: fresh session with `validate.md`. Both oracle halves
must fail on unfixed code. Reject anything that doesn't confirm.
4. For each confirmed: `emit.md` produces a `draft` tracking entry. Human
promotes to `approved`.

One agent per file in step 2 is Mythos's parallelism trick. Don't run one
agent across the whole codebase.

## 5. Per-project customization

- **`rank.md`**: your threat model in 5 tiers
- **`discover.md`**: repo context paragraph + oracle requirement + optional
hypothesis priors (e.g., wasmtime 2026-04-09 CVE wave for any WASM tool)
- **`validate.md`**: reject against your known-mitigations / system
constraints / existing scenarios. Swap threat-agent checks for
hazard-only checks if the repo is safety not security.
- **`emit.md`**: match the exact YAML/JSON shape of your artifact store.

## 6. Gotchas

- **Failing tests directly in source break CI.** Use `#[ignore]` / `@skip`
and put the rerun command in the ignore reason.
- **The rubric is wrong the first time.** Expect to patch after pass 1.
Sign you need to patch: "straddle rule → promoted X" lines in output.
- **Validators must be fresh sessions.** Reusing discovery context lets
the agent defend its own hypothesis.
- **One agent per file, not per codebase.** Parallel agents on different
files find diverse bugs; a single agent converges on surface issues.
- **Keep the discovery prompt minimal.** Mythos's "Please find a security
vulnerability" outperforms elaborate CWE checklists because the agent
has tools (oracle, debugger, runtime) and the environment filters truth.

## 7. Worked example — sigil `signature/sig_sections.rs`

First tier-5 file produced a finding:

```rust
let certificate_chain = if let Ok(cert_count) = varint::get32(&mut reader) {
// ... read chain
} else {
None // ← silently swallows ALL parse errors, not just EOF
};
```

Intent: backward-compat (missing cert_count → None). Bug: any error —
including malformed bytes — gets converted to "no chain," downgrading a
cert-based signature to a bare-key signature.

- **PoC test**: append 5 MSB-set bytes after a valid prefix; expect `Err`;
current code returns `Ok { certificate_chain: None }`. **Confirmed failing.**
- **Kani harness**: symbolic 5-byte cert_count with MSB-set constraint;
`assert!(result.is_err())`.

Maps to STPA-Sec UCA-6. Emitted as `draft AS-N` under UCA-6.
68 changes: 68 additions & 0 deletions scripts/mythos/discover.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
Please find a correctness-relevant vulnerability in this program.

Context you must use:
- This is spar, a toolchain for AADL v2.2/v2.3 (Architecture Analysis
and Design Language, SAE AS-5506D) with a deployment solver. The
pipeline is: AADL text → parser → syntax → HIR → analysis →
transform → solver → codegen/render. The invariant: every valid
AADL input produces a semantically faithful model, and every
emitted deployment satisfies the input's constraints.
- Focus on ONE file: {{file}}. You may read any other file to
confirm/refute, but do not report bugs outside {{file}}.
- Spar's output drives deployment decisions on embedded safety-
critical systems. A solver that emits an infeasible schedule as
feasible is a functional-safety event.

Hypothesis priors (AADL-specific + classical compiler-frontend classes):
Use as anchors, NOT recipes.
- Parser: accepts non-AADL input; rejects a valid AADL construct
in edge cases (mixed-case keywords, unusual Unicode in
identifiers, nested comments).
- Syntax: typed-tree invariants violated (a node has fewer
children than the grammar guarantees — unwrap panics later).
- HIR: sub-component references resolved against the wrong scope;
feature-group inversion not handled correctly.
- Annex handling: unknown annex body text silently skipped or
forwarded to a no-op handler.
- Analysis: a property that is spec-required-to-be-static
computed dynamically; a latency bound that assumes a specific
scheduler policy not checked against the model.
- Solver: feasibility check uses float arithmetic where rational
is required; integer overflow on task period × priority
product; solver reports UNSAT for SAT inputs due to a wrong
encoding of deadline constraints.
- Incremental DB: stale cache returned after source change
(salsa-style bugs); query cycle that should be detected as
an error silently returns default.
- Proof-code drift: a Kani harness proves property P for a
function F, but F has since been refactored such that P no
longer matches the function's actual contract.

To help us appropriately triage any bug you find, please write a
minimal AADL input that triggers the bug.

Oracle requirement (non-negotiable):
For every bug you report you MUST produce both:
(1) A failing Kani harness using `#[kani::proof]` — spar has 180
existing proofs, style is established. Harness must fail on
unfixed code.
(2) A failing property or unit test (standard `#[test]` or
`proptest!`) that reproduces the bug with a concrete AADL
input or IR fragment. For solver bugs, a natural PoC is a
differential check: solver result vs. hand-solved reference.

If you cannot produce both, the finding does not count.
Do not report it. Hallucinations are more expensive than silence.

Output format:
- FILE: {{file}}
- FUNCTION / LINES: ...
- HYPOTHESIS: one sentence
- KANI HARNESS: fenced Rust block
- POC TEST: fenced Rust block (AADL input or IR fragment)
- IMPACT: which hazard this enables; whether it's parser
over-acceptance/under-acceptance, HIR semantic drift, analysis
unsoundness, solver miscount, or proof-code drift
- CANDIDATE UCA: the single most likely `UCA-N` from
`safety/stpa/ucas.yaml` (consult the actual UCA set there).
Cite the AS-5506D section the bug violates.
33 changes: 33 additions & 0 deletions scripts/mythos/emit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
You are emitting a new entry to spar's safety artifact store.
Consult `safety/stpa/` for the existing shape. Spar's safety
directory uses several analysis files (`analysis.yaml`,
`validation.yaml`, `security.yaml`, `solver-analysis.yaml`) — pick
the one whose category best fits the finding, or the equivalent of
loss-scenarios if it exists.

Input:
- Confirmed bug report (below)
- Chosen `UCA-N` from the validator
---
{{confirmed_report}}
UCA: {{uca_id}}
---

Rules:
1. Read `safety/stpa/` first. Match the existing field shape
exactly. Do not invent fields.
2. Grouping invariant: new entries are siblings of existing ones
under the same UCA.
3. In the prose, reference the Kani harness and the PoC test by
fully-qualified Rust path. Cite the AS-5506D section the bug
violates (e.g., "AADL v2.2 §11.2.1 feature-group mapping").
4. If this is proof-code drift, say so explicitly — drift requires
re-verification or code reversion, not a primitive fix.
5. Optional: classify the finding under `category` with one of:
parser-over-acceptance, parser-under-acceptance, hir-drift,
analysis-unsoundness, solver-miscount, proof-drift,
codegen-divergence. Use the existing schema if present.
6. Set `status: draft`. Deployments built on spar output must not
consume draft findings until a human promotes them.

Emit ONLY the artifact YAML block, nothing else.
Loading
Loading