
docs(secrets): scoped-store v1 spec — secrets.<scope>.yaml with per-scope keys#358

Draft
Cre-eD wants to merge 11 commits into main from design/scoped-store-v1

Conversation

@Cre-eD Cre-eD commented Jul 3, 2026

Contributor

What

Implementation spec for the keyless-secrets RFC's Minimal v1: per-scope secret files (secrets.pr.yaml, secrets.staging.yaml, secrets.prod.yaml, …) with per-scope recipient keys, so a CI context holds a key that opens exactly the scope files it needs — replacing the single master SIMPLE_CONTAINER_CONFIG that decrypts everything.

Spec: docs/design/keyless-secrets/scoped-store-v1.md. Highlights:

  • SOPS file-per-scope (per the RFC decision), age recipients in v1 (KMS/OIDC = v2), committed-encrypted with diffable structure.
  • Governance: CODEOWNERS-gated .sc/scopes.yaml is the single scope→recipient surface; sc secrets lint fails on SOPS-metadata drift + plaintext leaks.
  • Deterministic ${secret:} resolution (scope file → legacy store), hard-fail on missing/undecryptable — never a partial deploy.
  • Strict mode-A compatibility: legacy store untouched; scoped files are additive and never opened by old binaries (fail-closed via the shipped schemaVersion guard pattern).
  • Consumer payoff: pull_request jobs (previews, provision-preview) get SC_KEY_PR only and stop receiving SC_CONFIG — closes the biggest remaining PR-reachable secret exposure.

Process

Draft for design review (panel review planned like Phase 1). Implementation lands in this same PR after sign-off — kept as one consolidated PR per maintainer preference.

RFC: #346 (merged). ClickUp: 86caf67c3.

…aml)

Concrete spec for the RFC's Minimal v1: SOPS file-per-scope
(.sc/stacks/<stack>/secrets.<scope>.yaml), age recipients governed by a
CODEOWNERS-gated .sc/scopes.yaml, scope-aware CLI verbs, deterministic
deploy-time resolution with hard-fail-on-missing, plaintext-leak lint, strict
mode-A backward compatibility (additive files old binaries never open), CI
demotion plan (pull_request jobs get a pr-scope key instead of the master
SC_CONFIG). Implementation lands in this same PR after design sign-off.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
@github-actions

github-actions Bot commented Jul 3, 2026


Semgrep Scan Results

Repository: api | Commit: a8b9438

| Check | Status | Details |
|---|---|---|
| ⚠️ Semgrep | Warning | 1 warning(s), 5 total |

Scanned at 2026-07-04 10:45 UTC

@github-actions

github-actions Bot commented Jul 3, 2026


Security Scan Results

Repository: api | Commit: a8b9438

| Check | Status | Details |
|---|---|---|
| ✅ Secret Scan | Pass | No secrets detected |
| ✅ Dependencies (Trivy) | Pass | 0 total (no critical/high) |
| ✅ Dependencies (Grype) | Pass | 0 total (no critical/high) |
| 📦 SBOM | Generated | 523 components (CycloneDX) |

Scanned at 2026-07-04 10:45 UTC

@github-actions

github-actions Bot commented Jul 3, 2026


📊 Statement coverage

Measured on the documented included set (see docs/TESTING.md → Coverage scope). Observe-only — no regression gate is enforced yet.

| Scope | This PR | main baseline | Δ |
|---|---|---|---|
| Included set (Gold-tier denominator) | 89.6% | 90.3% | -0.7 pp |
| Full set (whole repo, transparency) | 29.0% | 28.0% | +1.0 pp |

Baseline: main @ d64c472

Cre-eD added 10 commits July 4, 2026 00:49
Panel decision (Codex + Gemini + 2 Claude lenses):
- D1 scan-only: v1 scopes the 4 PR scan/lint jobs; deploy-shaped PR jobs and
  crossguard's Pulumi creds stay out (crossguard -> OIDC, never the pr scope).
- D2 SC_KEY_PR interim + committed KMS/OIDC v2 (reuses existing KMS+OIDC infra;
  recipient swap on the same files, stored key retired at v2).
- D3 scope files in the devops parent (integrail) store; resolver still supports
  consumer-repo scopes for the later deploy sweep.

Bakes in the 7 consensus P0s: parent/child merge constraint, real hard-fail
(not swallowed warn), scope-name + path:scope:key AAD binding, sc recipient-verify
lint, preview-deploy exclusion, secretScope PR-clamp, SC_KEY_PR blast-radius bound.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Implements the v1 scoped-secret store on sc's OWN cipher layer (RSA-OAEP +
X25519 sealed box) rather than pulling in SOPS/filippo.io-age — neither is a
current dependency and adding them would be a large supply-chain surface. Value
confidentiality, per-recipient sealing, and AEAD/OAEP associated-data binding
are all provided by the existing ciphers package. Recipients are SSH public keys
(ssh-ed25519 / ssh-rsa), consistent with the whole-file store.

- ciphers: add EncryptLargeStringWithAAD / DecryptLargeStringWithAAD /
  DecryptLargeStringWithEd25519AAD. A nil AAD reproduces the exact legacy wire
  format (RSA OAEP label / X25519 associated data), so the whole-file store is
  byte-for-byte unchanged and all existing cipher tests pass.
- scoped: scopes.yaml governance model (per-scope recipient sets, allow/disallow,
  fail-closed schemaVersion guard) + secrets.<scope>.yaml file format
  (committed-encrypted, structure readable / values opaque), per-recipient
  sealing keyed by SHA256 SSH fingerprint, set/get/delete, and offline
  VerifyConsistency for lint.
- Security bindings (P0-3): each value's AAD is domain-separated scope\x00key, so
  a ciphertext cannot be transplanted to another scope or key; LoadScopeFile
  rejects a file whose in-name scope != filename scope (rename attack).
- Tests: RSA + ed25519 round-trip (incl multi-chunk), multi-recipient, non-
  recipient rejection, scope+key transplant resistance, rename detection,
  fail-closed version guard, consistency drift, governance, name validation.
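
A minimal sketch of the scope\x00key binding described above. It uses stdlib AES-GCM as a stand-in for sc's own ciphers package (which is RSA-OAEP + X25519 sealed box), and the names `valueAAD`, `seal`, and `open` are hypothetical, not the package's API. The point is only that a ciphertext authenticated under one (scope, key) pair fails to open under another.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// valueAAD builds the domain-separated associated data "scope\x00key".
// The NUL separator keeps ("a", "bc") and ("ab", "c") distinct.
func valueAAD(scope, key string) []byte {
	return []byte(scope + "\x00" + key)
}

func newGCM(dataKey []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(dataKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and binds it to (scope, key) via the AAD.
// The random nonce is prepended to the returned blob.
func seal(dataKey []byte, scope, key, plaintext string) ([]byte, error) {
	gcm, err := newGCM(dataKey)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, []byte(plaintext), valueAAD(scope, key)), nil
}

// open succeeds only when scope and key match the ones used by seal.
func open(dataKey []byte, scope, key string, blob []byte) (string, error) {
	gcm, err := newGCM(dataKey)
	if err != nil {
		return "", err
	}
	if len(blob) < gcm.NonceSize() {
		return "", errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	pt, err := gcm.Open(nil, nonce, ct, valueAAD(scope, key))
	if err != nil {
		return "", err
	}
	return string(pt), nil
}

func main() {
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		panic(err)
	}
	blob, err := seal(dataKey, "pr", "SCAN_TOKEN", "s3cr3t")
	if err != nil {
		panic(err)
	}
	_, errSame := open(dataKey, "pr", "SCAN_TOKEN", blob)
	_, errMoved := open(dataKey, "prod", "SCAN_TOKEN", blob)
	fmt.Println("same scope opens:", errSame == nil)
	fmt.Println("transplant rejected:", errMoved != nil)
}
```

The same construction covers the key half of the binding: opening the blob as ("pr", "OTHER_KEY") fails in the same way.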

Whole-file store, CLI wiring, and deploy-time resolution are unchanged in this
commit and follow next.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
The RFC named SOPS, but the implementation reuses sc's own ciphers (no
getsops/sops or filippo.io/age dependency). Update the spec's crypto/format
layer, recipient type (SSH keys, not native age), integrity model (AEAD/OAEP
scope\x00key binding + filename/scope check, not SOPS MAC), lint gates, and
compatibility section to match what landed. Status -> IN PROGRESS.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…ow/lint/doctor)

Adds the `sc secrets scope` command group over the scoped store, namespaced so
it never collides with the whole-file store's existing allow/disallow/add/etc.

- set (value arg or stdin), get (ambient key), list, delete
- allow/disallow: update .sc/scopes.yaml then reseal every scope file of that
  scope to the new recipient set (Reencrypt: decrypt-with-current-key then
  re-seal, all-or-nothing); disallow prints the mandatory rotate-values warning
  since removal does not rewrite git history
- lint: per-file VerifyConsistency + recipient set == scopes.yaml (drift fails)
  + scope/filename binding — the CI gate
- doctor: which scopes the ambient key can open
- set refuses to write into a file whose recipients have drifted from scopes.yaml
  (reconcile via allow/disallow first), so a value is never sealed to a stale set
- package: ScopeFile.Reencrypt, path helpers (ScopesPath/StackDir/ScopeFilePath/
  ListScopeFiles), exported Fingerprint/SameRecipients; Save now MkdirAll's parents

Verified end-to-end on a real repo: allow->set(arg+stdin)->list->get(correct
plaintext)->lint OK->doctor YES; committed file is encrypted with readable
structure and zero plaintext leak. Package tests: 20 green.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…ini)

No P0s — both reviews confirmed the crypto core (transplant resistance,
backward-compat, per-file atomic reseal) is sound. Fixes for the CLI/lint findings:

- [Codex P1] scope get/doctor now resolve the decrypt key from --key-file, then
  SC_KEY_<SCOPE> / SC_SCOPE_KEY env, then the ambient config — so a pull_request
  scan job can hold ONLY SC_KEY_PR without a full SIMPLE_CONTAINER_CONFIG.
  Verified: get returns the value with only SC_KEY_PR set.
- [Gemini P1] reconcileRecipients (allow/disallow) is now two-phase: reseal every
  affected file in memory first, persist only after all succeed — a mid-way
  decrypt/parse failure can no longer leave files/scopes.yaml drifted.
- [Gemini P2] the private key is fetched lazily — declaring the first recipient of
  an empty scope needs no ambient key.
- [Codex P2] VerifyConsistency (lint) now base64-decodes every chunk and requires
  >= AEAD-tag length, rejecting a plaintext value smuggled under a valid recipient
  fingerprint — the offline plaintext-leak gate now has teeth.
- [Codex P2] Scopes.Allow rejects unsupported recipient key types (ECDSA, certs)
  that fingerprint but cannot be sealed to, failing fast at governance time.
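
The key lookup order from the first fix can be sketched as follows. This is an illustration, not the CLI's code: `resolveKey` is a hypothetical name, the upper-casing of the scope into `SC_KEY_<SCOPE>` is an assumption, and so is trying the scope-specific variable before `SC_SCOPE_KEY` (the commit message lists both at the same step).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// resolveKey returns the private key material and a label for where it
// came from. Order: --key-file, then SC_KEY_<SCOPE>, then SC_SCOPE_KEY,
// then the ambient config key. getenv is injected so the order is testable.
func resolveKey(scope, keyFile, ambientKey string, getenv func(string) string) (string, string, error) {
	if keyFile != "" {
		data, err := os.ReadFile(keyFile)
		if err != nil {
			return "", "", fmt.Errorf("read --key-file: %w", err)
		}
		return string(data), "--key-file", nil
	}
	// Assumption: the env name is the upper-cased scope name.
	envName := "SC_KEY_" + strings.ToUpper(scope)
	if v := getenv(envName); v != "" {
		return v, envName, nil
	}
	if v := getenv("SC_SCOPE_KEY"); v != "" {
		return v, "SC_SCOPE_KEY", nil
	}
	if ambientKey != "" {
		return ambientKey, "ambient config", nil
	}
	return "", "", errors.New("no decryption key available")
}

func main() {
	key, source, err := resolveKey("pr", "", "", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("using key from %s (%d bytes)\n", source, len(key))
}
```

With this order, a pull_request job that exports only `SC_KEY_PR` never needs the ambient config to be present.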

Tests: 22 green (added plaintext-chunk + unsupported-key cases). Re-verified
end-to-end: allow->set->second-allow(reseal)->get-via-SC_KEY_PR->lint->ecdsa-reject.
Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Wires scoped secrets into the deploy read path (provisioner
readSecretsDescriptorFromFile): after loading a stack's whole-file values, merge
in every secrets.<scope>.yaml the AMBIENT KEY is a recipient of.

Design improvement over the RFC's secretScope: field — resolution is key-driven,
not config-driven, which DROPS P0-6 (PR-manipulable scope selection). A job
holding only SC_KEY_PR is a recipient of the pr scope alone, so it cannot decrypt
secrets.prod.yaml — the pull_request clamp is cryptographic, with no config to
subvert.

scoped.ResolveScopedValues semantics:
- no key / no scope files -> empty, no error (repos without scopes unaffected;
  the key is not even parsed). Verified: provisioner tests unchanged.
- scope files the key can't open are skipped (least privilege).
- whole-file store wins on conflict; scoped values only ADD new keys, so an
  existing ${secret:} resolution can never change.
- HARD-FAIL (P0-2) on: a value the key IS a recipient of but can't decrypt
  (tamper), a corrupt/renamed scope file, or a key present in two openable scopes
  (ambiguous). Not being a recipient is not an error.

secret-get and ${secret:} both see merged scoped values transparently. Spec
updated to the key-driven model. Tests: key-determines-scope (A sees pr not prod,
B sees prod not pr, non-recipient sees nothing) + cross-scope-dup fail; scoped +
provisioner suites green; whole module builds.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
- [P1] Resolve scoped-only stacks: readSecretsDescriptor no longer bails on a
  missing legacy secrets.yaml before scope resolution — a stack with only
  secrets.<scope>.yaml now resolves (empty descriptor + scoped merge). If neither
  a legacy file nor any openable scoped value exists, it still returns the
  ignorable not-found (behavior preserved for secret-less stacks).
- [P1] Deploy path honors CI scope keys: ResolveScopedValues now takes multiple
  candidate keys, and the provisioner gathers the ambient config key PLUS
  SC_SCOPE_KEY and any SC_KEY_<SCOPE> (e.g. SC_KEY_PR) from the env — so a
  pull_request job holding only its scope key resolves scoped secrets via
  ${secret:} and sc stack secret-get, matching the CLI's key resolution.
- [P2] Integrity hard-fails are never swallowed: scoped resolver failures (tamper,
  corrupt/renamed file, ambiguous cross-scope key) are tagged scoped.ErrScopedIntegrity
  and ReadStacks propagates them even under IgnoreSecretsMissing — so a deploy can
  no longer proceed past a broken scope file just because the affected secret
  wasn't referenced. Not being a recipient remains a silent skip (least privilege).

Tests: added multi-candidate-key resolution + ErrScopedIntegrity assertions;
scoped + provisioner suites green; whole module builds.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Verifies the three Codex-review fixes end-to-end at the provisioner:
- scoped-only stack (no legacy secrets.yaml) resolves via a CI scope key from the
  environment (SC_SCOPE_KEY, no SIMPLE_CONTAINER_CONFIG) — P1a + P1b;
- a corrupt scope file surfaces as scoped.ErrScopedIntegrity even when unreferenced
  — P2 (never swallowed);
- a stack with neither legacy nor openable scoped secrets still reports plain
  not-found (os.ErrNotExist), preserving IgnoreSecretsMissing behavior.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…lleability) + P1s

Full multi-model panel (4 Claude lenses + Codex + Gemini). Two real P0s, both fixed
with acceptance tests; the panel's third P0 (ed25519 whole-file format changed) was a
false positive — main already used X25519 for ed25519, so the nil-AAD path is
byte-identical (now proven by TestAAD_NilRoundTrip_LegacyCompatible).

P0-A ed25519 downgrade attack (ciphers/encryption.go): DecryptLargeStringWithEd25519AAD
fell back to the legacy decryptWithEd25519 for non-X25519 blobs, which ignores AAD and
derives its key from public data — an attacker could forge a legacy blob with a chosen
plaintext, drop it in a scope file under a real recipient's fingerprint, and the deploy
would decrypt it, bypassing the scope/key binding. Now: refuse legacy blobs whenever an
AAD is supplied (the aad-less whole-file/migration path is unchanged).

P0-B RSA multi-chunk malleability (ciphers/encryption.go): every OAEP chunk used the
same label, so a hand-edited scope file could reorder/drop/splice chunks of an RSA
recipient's value and still decrypt to a permuted/truncated plaintext (lint didn't catch
it). Now: the OAEP label binds chunk index + count (chunkLabel), gated on aad != nil so
the legacy store stays byte-identical; empty-chunk-under-AAD is rejected.

P1s: Disallow no longer filters the recipient slice in place (backing-array aliasing);
scope/scopes files are written atomically (temp + rename, no torn writes that would
hard-fail deploys); disallowing the last recipient is refused with a clear error; the
provisioner's SC_KEY_* env scan is constrained to SC_KEY_<valid-scope> so an unrelated
env var is not tried as a decryption key.

Tests: TestAAD_MismatchFails (the core binding invariant, RSA + ed25519),
TestRSAChunkFraming_ReorderTruncateSpliceFails, TestEd25519Downgrade_RejectedUnderAAD,
TestAAD_NilRoundTrip_LegacyCompatible. All secrets + provisioner suites green.

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
… dup, gate, docs, tests)

Closes the should-fix items from the full review panel.

- Cross-stack transplant binding (Gemini P1): scope files now carry a Stack field,
  bound into every value's AAD (stack\0scope\0key) and verified against the parent
  directory on load. A ciphertext cannot be copied between stacks, scopes, or keys.
- Stronger offline gate (Security/Coherence/Codex P1/P2): VerifyConsistency checks
  each chunk against the recipient's key type via ciphers.ValidateCiphertextShape —
  an RSA chunk must be the modulus size, an ed25519 chunk must be an X25519 sealed box
  (magic + min length). Replaces the coarse 16-byte floor; a plaintext value smuggled
  under a fingerprint is rejected offline.
- lint duplicate detection (Codex/Gemini P2): lint now flags a key present in
  two scopes of a stack, or in both a scope and the legacy secrets.yaml (mode A) —
  the 'one key, one mode' guarantee, caught before deploy instead of hard-failing there.
- Docs: fixed the spec CLI block (secrets scope <verb>, dropped nonexistent edit/
  updatekeys, --key -> --key-file) + integrity section (stack + chunk-index binding,
  downgrade + shape gates); added a user-facing 'Per-scope secrets' section to
  secrets-management.md; added a v1 supersession note to the keyless-secrets RFC README.
- Tests: real CLI harness (set/get/list/lint/doctor/disallow + reconcile-fail-closed +
  undeclared-scope refusal); SameRecipients; Delete; passphrase-protected-key rejection;
  RSA scope/key transplant; cross-stack transplant (ed25519 + RSA).
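
The load-time binding checks can be sketched as a comparison between what the path implies and what the file declares, using the `.sc/stacks/<stack>/secrets.<scope>.yaml` layout from the spec. The helpers `bindingFromPath`, `verifyBinding`, and `valueAAD` are hypothetical names for this illustration; the actual check lives in `LoadScopeFile`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const (
	filePrefix = "secrets."
	fileSuffix = ".yaml"
)

// bindingFromPath derives (stack, scope) from a scope file path.
// The legacy "secrets.yaml" has no scope segment and is rejected.
func bindingFromPath(path string) (stack, scope string, err error) {
	base := filepath.Base(path)
	if !strings.HasPrefix(base, filePrefix) || !strings.HasSuffix(base, fileSuffix) ||
		len(base) <= len(filePrefix)+len(fileSuffix) {
		return "", "", fmt.Errorf("%q is not a scope file name", base)
	}
	scope = base[len(filePrefix) : len(base)-len(fileSuffix)]
	if strings.Contains(scope, ".") {
		return "", "", fmt.Errorf("invalid scope name %q", scope)
	}
	return filepath.Base(filepath.Dir(path)), scope, nil
}

// verifyBinding rejects a file whose declared stack or scope does not
// match its location (a renamed or copied file).
func verifyBinding(path, declaredStack, declaredScope string) error {
	stack, scope, err := bindingFromPath(path)
	if err != nil {
		return err
	}
	if scope != declaredScope {
		return fmt.Errorf("file declares scope %q but is named for %q", declaredScope, scope)
	}
	if stack != declaredStack {
		return fmt.Errorf("file declares stack %q but lives under %q", declaredStack, stack)
	}
	return nil
}

// valueAAD builds the stack\0scope\0key associated data for one value.
func valueAAD(stack, scope, key string) []byte {
	return []byte(stack + "\x00" + scope + "\x00" + key)
}

func main() {
	// A pr-scope file copied over the prod file name.
	err := verifyBinding(".sc/stacks/api/secrets.prod.yaml", "api", "pr")
	fmt.Println("rename detected:", err != nil)
	fmt.Printf("%q\n", valueAAD("api", "pr", "SCAN_TOKEN"))
}
```

The path check catches a moved file early, while the AAD makes an individually copied value fail at decryption even if the file header is edited to match its new location.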

All secrets + cmd_secrets + provisioner suites green; module builds; verified end-to-end
(committed file carries stack:, no plaintext, CI-path get via SC_SCOPE_KEY, lint clean).

Signed-off-by: Dmitrii Creed <creeed22@gmail.com>