
resize: EVE-patterned resume tests and SIGKILL chaos harness #15

Merged: deitch merged 13 commits into diskfs:main from eriknordmark:updatepartitions-idempotent on Jun 19, 2026

@eriknordmark eriknordmark commented Jun 10, 2026

Contributor

Mostly tests + instrumentation (a build-tagged, no-op-by-default backend hook
used only by the chaos test), plus one block-device production fix in run.go:
open the whole disk non-exclusively so child-partition e2fsck/resize2fs can
run, and accept e2fsck's "errors corrected" exit codes under fix-errors. That
fix is what introduces the go-diskfs dependency above.

What

Realistic, EVE-shaped resize tests plus a SIGKILL chaos harness that exercises
resume safety under crash injection.

  • EVE-patterned layout tests (eve_resume_test.go): a scaled-down EVE disk
    (FAT32 ESP, squashfs IMGA/IMGB, CONFIG, ext4 P9 at index 9) driven through the
    resume/idempotency harness — a Case-1 in-place P9 shrink and a Case-2 grow of
    ESP/IMGA/IMGB into the freed space — plus byte-integrity, atomic-failure,
    shrink-below-used, and combined shrink+grow invariants, with per-stage GPT
    table logging.
  • SIGKILL chaos (eve_stress_test.go): runs the full two-step EVE resize as
    a subprocess and group-SIGKILLs it at random points across all five pipeline
    steps, then re-runs to completion and asserts the result matches an
    uninterrupted run; records which step each kill hit.
  • GPT-write-delay hook (delaybackend_*.go, -tags chaos; run.go
    maybeWrapBackend): a no-op in normal builds. Under the chaos tag it delays
    around GPT-sector writes so kills can land inside the otherwise-instantaneous
    updatePartitions/createPartitions table writes.
  • Post-kill GPT integrity (gpt_integrity_test.go): after each kill, checks
    primary/backup header self-CRCs, entry-array CRCs, and primary↔backup entry
    equality, so torn tables and table divergence are observed — and shown to be
    healed by the resume.
  • Copy-dest state (copyprogress_test.go, opt-in): classifies a grow
    target's data region as empty/partial/complete after a kill.

Results

With the GPT-write delay, kills reach all five pipeline steps including
updatePartitions; the resume recovers from every observed torn-entry-array and
primary↔backup-mismatch state.

@eriknordmark eriknordmark marked this pull request as ready for review June 10, 2026 18:13
@eriknordmark

eriknordmark commented Jun 10, 2026

Contributor Author

@deitch ready for review when you have a chance — see the PR description for what it does.

@eriknordmark eriknordmark marked this pull request as draft June 11, 2026 13:42
@eriknordmark eriknordmark force-pushed the updatepartitions-idempotent branch from 2703d42 to 21c8d90 Compare June 11, 2026 17:32
@eriknordmark eriknordmark changed the title resize: idempotent updatePartitions finalize step resize: EVE-patterned resume tests and SIGKILL chaos harness Jun 11, 2026
@eriknordmark eriknordmark marked this pull request as ready for review June 11, 2026 21:46
@eriknordmark

Contributor Author

@deitch this is ready for review when you have a chance.

To back up the resume-safety claim, I ran an 8-hour SIGKILL chaos soak of TestEVEChaosKill with the GPT write-delay hook enabled (CHAOS_GPT_DELAY=5s), which widens the window around the otherwise-instantaneous GPT-table writes so kills can land inside them. Each iteration runs the full EVE-style two-step resize (shrink P9, then grow ESP/IMGA/IMGB) as a subprocess, SIGKILLs it at a random point, then re-runs to completion and asserts the result is byte-identical to an uninterrupted resize.

Result: 85 iterations, 0 failures. Every killed resize resumed cleanly.

Where the 584 kills landed:

| Step | Kills |
| --- | --- |
| shrinkPartitions | 186 |
| preflight | 167 |
| updatePartitions | 98 |
| createPartitions | 92 |
| copyFilesystems | 28 |
| shrinkFilesystems | 13 |

The delay did its job — 98 kills landed inside updatePartitions and 92 inside createPartitions, i.e. mid-GPT-write.

On-disk GPT integrity sampled at each kill (584 samples):

| State | Count |
| --- | --- |
| Header self-CRC failure (primary or backup) | 0 |
| Torn primary entry array (CRC mismatch) | 66 |
| Torn backup entry array (CRC mismatch) | 64 |
| Primary↔backup entry mismatch | 121 |
| Clean | 395 |

So both tear shapes occur in practice — a torn 32-sector entry array, and a primary/backup ordering skew (backup written first, primary not yet) — and the resume logic heals every one. Header self-CRCs never tear because the header is a single 512-byte sector and its write is atomic, confirmed across all 584 samples.
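For readers unfamiliar with the two CRCs being sampled here, the check can be sketched roughly as follows. This is a minimal illustration based on the GPT on-disk layout (signature at offset 0, header size at 12, header CRC at 16, entry count at 80, entry size at 84, entry-array CRC at 88), not the code in gpt_integrity_test.go; the names `headerCRCValid`, `entriesCRCValid`, and `newTestHeader` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

const (
	gptSignature  = "EFI PART"
	minHeaderSize = 92 // bytes covered by the header CRC in GPT revision 1.0
)

// headerCRCValid reports whether a GPT header sector carries a valid self-CRC.
// The CRC is computed over the header with its own CRC field zeroed.
func headerCRCValid(sector []byte) (bool, error) {
	if len(sector) < minHeaderSize || string(sector[:8]) != gptSignature {
		return false, errors.New("not a GPT header")
	}
	size := binary.LittleEndian.Uint32(sector[12:16])
	if size < minHeaderSize || uint64(size) > uint64(len(sector)) {
		return false, errors.New("implausible GPT header size")
	}
	stored := binary.LittleEndian.Uint32(sector[16:20])
	hdr := make([]byte, size)
	copy(hdr, sector[:size])
	binary.LittleEndian.PutUint32(hdr[16:20], 0)
	return crc32.ChecksumIEEE(hdr) == stored, nil
}

// entriesCRCValid checks an entry array against the CRC stored in its header.
// A torn (half-written) entry array fails this check.
func entriesCRCValid(header, entries []byte) bool {
	if len(header) < minHeaderSize {
		return false
	}
	count := binary.LittleEndian.Uint32(header[80:84])
	entrySize := binary.LittleEndian.Uint32(header[84:88])
	n := uint64(count) * uint64(entrySize)
	if n > uint64(len(entries)) {
		return false
	}
	return crc32.ChecksumIEEE(entries[:n]) == binary.LittleEndian.Uint32(header[88:92])
}

// newTestHeader builds a minimal, self-consistent header for an entry array.
// Hypothetical helper for illustration only; it fills just the fields used above.
func newTestHeader(entries []byte, entrySize uint32) []byte {
	sector := make([]byte, 512)
	copy(sector, gptSignature)
	binary.LittleEndian.PutUint32(sector[12:16], minHeaderSize)
	binary.LittleEndian.PutUint32(sector[80:84], uint32(len(entries))/entrySize)
	binary.LittleEndian.PutUint32(sector[84:88], entrySize)
	binary.LittleEndian.PutUint32(sector[88:92], crc32.ChecksumIEEE(entries))
	// The header CRC field is still zero at this point, as the algorithm requires.
	binary.LittleEndian.PutUint32(sector[16:20], crc32.ChecksumIEEE(sector[:minHeaderSize]))
	return sector
}
```

The primary↔backup row in the table would then be a plain byte comparison of the two entry arrays, which is independent of either CRC.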

@eriknordmark eriknordmark force-pushed the updatepartitions-idempotent branch from 79d1850 to 7b0c371 Compare June 12, 2026 06:14

@deitch deitch left a comment

Contributor

Some questions.

Comment thread README.md Outdated

### Chaos / resume soak test

`TestEVEChaosKill` is a stress test, not a CI gate. It repeatedly runs the full

Contributor

I am not particularly enamoured of calling these "EVE tests". I get that you are testing a layout similar (identical?) to what EVE has. But partitionresizer is no more EVE-specific than diskfs is linuxkit-specific. It was the original driver for the idea, but built to be more generic. Even if it is identical, calling it something else that addresses the general flow would be better.

Contributor Author

Agreed, the name overfits. What's actually under test is a representative mixed-filesystem layout (FAT32 ESP, raw squashfs read-only images, ext4 data) exercised through a shrink-the-tail / grow-the-head flow. EVE was just the layout that motivated it. I'll rename the files and symbols (eve_* → layout/resume, TestEVE* → TestMixedLayout* / TestChaos*, buildEVEFixture → buildMixedLayoutFixture) and reword the README so EVE appears only as the motivating example.

Comment thread copyprogress_test.go Outdated
eriknordmark and others added 5 commits June 12, 2026 14:25
Add a scaled-down sample disk built in-test -- FAT32 ESP, squashfs
IMGA/IMGB, a placeholder CONFIG, and an ext4 P9 at index 9 -- and drive two
operations through the interrupt/resume harness:

  Case 1 (TestRunResumeShrink): shrink P9 in place to free space at the end
  of the disk.
  Case 2 (TestRunResumeGrow): on the post-shrink disk, grow ESP/IMGA/IMGB
  into the freed space, copy their content, and finalize with updatePartitions,
  relocating them to the end of the disk under their original labels/indices.

Each interruption point is re-run to completion and the final layout and
content are verified (squashfs raw images by hash, the ESP/P9 marker files by
read-back). The squashfs images are produced with mksquashfs and written raw,
since go-diskfs cannot create squashfs and ext4 on a single 512-byte-sector
disk; go-diskfs detects them as unknown and raw-copies them, the same branch it
uses for squashfs. The tests are guarded by testing.Short and require
mksquashfs.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add TestLayoutStagesDump, which drives the shrink+grow scenario and logs the
GPT partition table (sorted by on-disk start, with free-space gaps) at four
points: initial layout, after the P9 ext4 shrink, after the *_resized2
partitions are created and filled, and after the final updatePartitions. Useful
for seeing how the resize rearranges the disk.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add a suite of stress tests on the scaled fixture:

- TestChaosKill: runs the real resizer binary doing the grow and SIGKILLs it
  at random points 1-4 times, then runs once to completion, asserting the final
  layout, IMG content, and -- critically -- that persist (P9) and CONFIG are
  never corrupted. Exercises resume from arbitrary interruption (partial GPT
  writes, half-copied content) and repeated crashes; it is what surfaced the
  grow-only CLI panic.
- TestUntouchedPartitionsByteIntegrity: persist and CONFIG are byte-identical
  across a grow.
- TestInsufficientSpaceIsAtomic: a grow that cannot fit fails without
  modifying the disk.
- TestShrinkBelowUsedAborts: shrinking below the ext4 used size is refused,
  partition and data intact.
- TestCombinedShrinkGrow: a single Run that shrinks P9 and grows ESP/IMGA/
  IMGB (larger P9 fixture, since the shrink is rounded up to a whole GB).
- TestShrinkPreservesP9Content: P9 filled with many files survives the shrink
  (block relocation).

The fixture builder is parameterized by disk/P9 size and gptDump now reads the
disk size from the file. All guarded by testing.Short; the chaos test also
needs the resizer binary (built in-test) and mksquashfs.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

TestChaosKill now drives the full shrink+grow pattern -- shrink P9 (Case 1) then
grow ESP/IMGA/IMGB (Case 2) -- each through random SIGKILLs, so kills can land
in every pipeline step (shrinkFilesystems/shrinkPartitions/createPartitions/
copyFilesystems/updatePartitions). It kills the whole process group (so a
resize2fs child dies too) and points the resizer's TMPDIR at a scratch dir
cleaned after each kill, so an interrupted shrink can't leak its temp copy.
Seeds from CHAOS_SEED (else the clock) and logs KILL_STEP=<step> per kill so a
long run can show whether kills hit all intermediate points.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
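The group kill described in this commit can be sketched in a few lines of Go. This is an illustrative, Unix-only sketch and not the test's actual code: `killGroupAfter` and `demoKill` are hypothetical names, and `sleep 30` stands in for the resizer subprocess.

```go
package main

import (
	"errors"
	"os/exec"
	"syscall"
	"time"
)

// killGroupAfter starts cmd as the leader of a new process group, waits for
// delay, then SIGKILLs the whole group so that any children die too.
// It returns the signal that ended cmd, or 0 if cmd exited on its own.
func killGroupAfter(cmd *exec.Cmd, delay time.Duration) (syscall.Signal, error) {
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	time.Sleep(delay)

	// A negative PID addresses the process group. With Setpgid and no
	// explicit Pgid, the group ID equals the child's PID.
	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL); err != nil {
		_ = cmd.Wait() // reap the child before reporting the failure
		return 0, err
	}

	err := cmd.Wait()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		if ws, ok := exitErr.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
			return ws.Signal(), nil
		}
	}
	return 0, err
}

// demoKill uses a long sleep as a stand-in for the resizer subprocess.
func demoKill() (syscall.Signal, error) {
	return killGroupAfter(exec.Command("sleep", "30"), 200*time.Millisecond)
}
```

Placing the child in its own group is what lets a single kill reach a resize2fs grandchild; signalling only the child's PID would leave it running against the disk image.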
updatePartitions (and createPartitions) are single fast GPT-table writes, so
random-timed SIGKILLs essentially never land inside them -- in 8h of chaos,
updatePartitions was hit 0 times. Add an optional, build-tagged backend wrapper
(`-tags chaos` + RESIZER_GPT_WRITE_DELAY, e.g. "5s") that sleeps after each
write touching a GPT metadata sector (LBA 0..33 or the last 33 sectors),
widening the window between go-diskfs's backup and primary GPT writes. run.go
calls maybeWrapBackend (a no-op in normal builds, so production carries none of
this). TestChaosKill builds with -tags chaos and widens its kill-timing
window when CHAOS_GPT_DELAY is set.

With this, kills reliably reach createPartitions/shrinkPartitions/
updatePartitions, and the resume still recovers cleanly from a kill inside
updatePartitions' table write -- confirming the idempotent finalize plus
go-diskfs's backup-first GPT write are crash-safe at the table-write level.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
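The shape of such a delay wrapper might look like the sketch below. It assumes a plain `io.WriterAt` and 512-byte sectors as stand-ins for the go-diskfs backend the real hook wraps, and it omits the build tag and environment handling; `delayWriter`, `newDelayWriter`, and `memDisk` are hypothetical names.

```go
package main

import (
	"io"
	"time"
)

const sectorSize = 512

// delayWriter pauses after every successful write that touches a GPT
// metadata sector, widening the window in which a kill can land.
type delayWriter struct {
	w        io.WriterAt
	diskSize int64
	pause    func() // injectable so tests need not really sleep
}

func newDelayWriter(w io.WriterAt, diskSize int64, delay time.Duration) *delayWriter {
	return &delayWriter{w: w, diskSize: diskSize, pause: func() { time.Sleep(delay) }}
}

// touchesGPT reports whether [off, off+n) overlaps LBA 0..33 (protective MBR,
// primary header, primary entries) or the last 33 sectors (backup entries
// and backup header).
func (d *delayWriter) touchesGPT(off int64, n int) bool {
	if n <= 0 {
		return false
	}
	end := off + int64(n) // exclusive
	headEnd := int64(34 * sectorSize)
	tailStart := d.diskSize - 33*sectorSize
	return off < headEnd || end > tailStart
}

func (d *delayWriter) WriteAt(p []byte, off int64) (int, error) {
	n, err := d.w.WriteAt(p, off)
	if err == nil && d.touchesGPT(off, len(p)) {
		d.pause()
	}
	return n, err
}

// memDisk is an in-memory disk used only to exercise the wrapper.
type memDisk struct{ data []byte }

func (m *memDisk) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 || off+int64(len(p)) > int64(len(m.data)) {
		return 0, io.ErrShortWrite
	}
	return copy(m.data[off:], p), nil
}
```

Sleeping after the write rather than before it is what leaves the disk in a "backup written, primary not yet" state for the duration of the pause.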
@eriknordmark eriknordmark force-pushed the updatepartitions-idempotent branch from 7b0c371 to d36fe90 Compare June 12, 2026 14:46
eriknordmark and others added 4 commits June 12, 2026 18:01
After a SIGKILL the chaos test now inspects the on-disk GPT before the
next resume run heals it, logging whether the primary (LBA 1) and backup
(last LBA) header self-CRCs are valid, whether each header's entry-array
CRC matches, and whether the primary and backup entry arrays are
byte-identical. Each table spans a header plus 32 entry sectors, so a kill
mid-write can leave them disagreeing or an entry array half-written with a
failing CRC; this captures exactly that for the steps that write the
table -- in particular a kill inside updatePartitions.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add copyDestState: after a kill, classify each grow target's data
region against its source to distinguish "nothing written" from a
partial or complete copy -- a certainty the KILL_STEP marker alone
cannot give for a copyFilesystems interruption. Raw-copied (squashfs)
targets compare head and tail bytes to the source; filesystem-copied
(FAT32/ext4) targets are checked for the on-disk signature that
CreateFilesystem writes before any file data. Gated behind
CHAOS_COPY_STATE so it is off unless explicitly requested.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
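For the raw-copied case, a head-and-tail comparison could be sketched as below. The three-way rule used here (both probes match is complete, one matches is partial, neither matches is empty) is an assumption made for illustration, as are the names `classifyRawCopy` and `copyState`; the actual copyDestState may decide differently.

```go
package main

import "bytes"

type copyState string

const (
	copyEmpty    copyState = "empty"
	copyPartial  copyState = "partial"
	copyComplete copyState = "complete"
)

// classifyRawCopy compares the first and last probe bytes of the source image
// against the same offsets in the destination region. The destination may be
// larger than the source, since it is a grown partition.
func classifyRawCopy(src, dst []byte, probe int) copyState {
	if probe > len(src) {
		probe = len(src)
	}
	if probe <= 0 || len(dst) < len(src) {
		return copyEmpty
	}
	headOK := bytes.Equal(dst[:probe], src[:probe])
	tailOK := bytes.Equal(dst[len(src)-probe:len(src)], src[len(src)-probe:])
	switch {
	case headOK && tailOK:
		return copyComplete
	case headOK || tailOK:
		return copyPartial
	default:
		return copyEmpty
	}
}
```

A probe of this kind is a cheap classifier, not a proof: matching head and tail bytes do not guarantee the middle of the image was written, which is why the final verification in the tests still hashes the whole image.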
TestChaosKill runs many full resizes under random SIGKILLs; it is a
soak/stress test, not a CI gate. Left in the default `go test ./...` it pushed
the suite past the 30m CI timeout. Skip it unless RESIZER_CHAOS is set (or
CHAOS_GPT_DELAY, which also turns on the GPT-write-delay hook). README now
documents the test and how to run it.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
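The opt-in gate can be expressed as a small helper like the one below. It is a sketch under the assumption that CHAOS_GPT_DELAY holds a Go duration string such as "5s"; `chaosConfig` is a hypothetical name, and the lookup function is passed in so the logic can be exercised without touching the real environment.

```go
package main

import "time"

// chaosConfig decides whether the chaos soak should run and, if a GPT write
// delay was requested, how long it is. Setting CHAOS_GPT_DELAY implies the
// test is enabled; RESIZER_CHAOS enables it without a delay.
func chaosConfig(getenv func(string) string) (enabled bool, gptDelay time.Duration, err error) {
	if v := getenv("CHAOS_GPT_DELAY"); v != "" {
		gptDelay, err = time.ParseDuration(v)
		if err != nil {
			return false, 0, err
		}
		return true, gptDelay, nil
	}
	return getenv("RESIZER_CHAOS") != "", 0, nil
}
```

In a test this would typically be called as `chaosConfig(os.Getenv)`, followed by `t.Skip` when `enabled` is false, so a default `go test ./...` never starts the soak.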
The mixed-layout end-to-end tests added here, together with the existing
restart-safety resize tests, fill nearly the whole 30m budget (one run landed
at ~29.5m of 30m). Raise the timeout to 45m for headroom against runner
variance. The chaos soak test remains opt-in and never runs in CI.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@eriknordmark eriknordmark force-pushed the updatepartitions-idempotent branch from d36fe90 to 73ed80b Compare June 12, 2026 16:08
@eriknordmark

Contributor Author

I cleaned up the commits, including removing intermediate references to old names (of scripts, etc.). No content changes in HEAD.

@eriknordmark eriknordmark requested a review from deitch June 12, 2026 16:14
eriknordmark and others added 3 commits June 15, 2026 23:45
The resume and SIGKILL-chaos tests describe their fixture as the
"EVE-patterned" A/B-image layout, but labeled the persist partition
"P9" and the ESP "ESP" — neither matches the real EVE-OS GPT, which
labels the ESP "EFI System" and the persist partition (at index 9)
"P3". Identifying partitions by label is exactly what the EVE-side
consumer of this tool will do, so the misleading labels are a trap
waiting for whoever ports this harness.

Rename the GPT partition names, filesystem volume labels, and the
associated Go identifiers (p9*→p3*, defaultP9MB→defaultP3MB,
shrinkP9→shrinkP3) so the fixture matches production reality. The
generic parta/partb/shrinker fixtures are left untouched — they do
not claim to be EVE-shaped.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The finalize step gives a relocated partition its original identity,
which must include the 64-bit GPT attribute field: consumers may store
boot-selection state there (for example, a GPT-priority boot loader's
priority/tries/successful bits), so those bits have to survive a resize.

Add a focused test covering both the renumber and preserveNumbers paths:
it sets a non-zero attribute on the original, leaves the relocated
target's attribute at zero, and asserts the target carries the
original's value after updatePartitions and that the original is removed.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Two fixes surfaced running against a real /dev/nbd0 (the file-based
tests cannot reach either path):

- Open the disk non-exclusively via OpenFromPathWithExclusive. The
  e2fsck/resize2fs/fsck.fat calls open the child partitions O_EXCL, which
  the kernel refuses while the parent disk is held O_EXCL.

- Accept e2fsck's "errors corrected" exit status (1/2) as success when
  running with fixErrors. e2fsck -y returns 1 after recovering a dirty
  journal / fixing counts -- exactly the power-loss-recovery case this
  resizer must survive -- which was being treated as a failure.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
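The exit-status rule in the second fix can be sketched as a small predicate. Per e2fsck(8) the exit status is a sum of flags: 1 means errors were corrected, 2 means errors were corrected and a reboot is advised, and 4 or higher means uncorrected errors or operational failures. Treating the status as a bitmask, and therefore also accepting 3, is an assumption of this sketch; `e2fsckSucceeded` is a hypothetical name, not the function in run.go.

```go
package main

// Exit status flags documented in e2fsck(8).
const (
	fsckErrorsCorrected = 1 // file system errors corrected
	fsckRebootAdvised   = 2 // errors corrected, system should be rebooted
)

// e2fsckSucceeded reports whether an e2fsck exit code should be treated as
// success. Without fixErrors only a clean 0 passes. With fixErrors, any
// combination of the two "corrected" flags also passes, while any higher
// flag (uncorrected errors, operational error, and so on) still fails.
func e2fsckSucceeded(exitCode int, fixErrors bool) bool {
	if exitCode == 0 {
		return true
	}
	if !fixErrors {
		return false
	}
	return exitCode&^(fsckErrorsCorrected|fsckRebootAdvised) == 0
}
```

With this rule a dirty journal replayed by `e2fsck -y` (status 1) no longer aborts the resize, while status 4 (errors left uncorrected) remains fatal.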
@deitch

deitch commented Jun 17, 2026

Contributor

How does this look? Ready to go?

When a partition-table commit reaches the disk but the kernel cannot be
made to re-read it live -- the boot disk is busy because the partition
being changed is mounted, as when repartitioning the disk we booted from
-- go-diskfs now reports disk.ErrReReadDeferred. The table is already on
disk, so translate that sentinel at each commit site into the exported
ErrRebootToApply, letting a caller reboot to apply the table on the next
boot instead of treating a committed-but-not-yet-live table as a failure.

Pin go-diskfs to the fork commit that adds ErrReReadDeferred via a replace
directive; the replace is temporary and will be dropped once
diskfs/go-diskfs#411 merges.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
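The translation at a commit site might look roughly like this. To keep the sketch self-contained, `errReReadDeferred` is a local stand-in for go-diskfs's `disk.ErrReReadDeferred`, and `commitTable`, `translateCommitErr`, and `needsReboot` are hypothetical names; only `ErrRebootToApply` is named in the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errReReadDeferred is a local stand-in for disk.ErrReReadDeferred.
var errReReadDeferred = errors.New("kernel re-read of partition table deferred")

// ErrRebootToApply tells the caller the table is on disk but not yet live.
var ErrRebootToApply = errors.New("partition table committed; reboot to apply")

// translateCommitErr maps the deferred re-read sentinel to ErrRebootToApply
// and passes every other error (including nil) through unchanged.
func translateCommitErr(err error) error {
	if errors.Is(err, errReReadDeferred) {
		return fmt.Errorf("%w: %v", ErrRebootToApply, err)
	}
	return err
}

// commitTable simulates a table write on a disk that may be busy.
func commitTable(busy bool) error {
	if busy {
		return fmt.Errorf("commit: %w", errReReadDeferred)
	}
	return nil
}

// needsReboot is how a caller would distinguish "reboot and carry on"
// from a genuine failure.
func needsReboot(err error) bool {
	return errors.Is(err, ErrRebootToApply)
}
```

A caller would check `needsReboot(err)` before treating a non-nil error as a failure, since in that case the new table is already durable and only the kernel's view is stale.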
@eriknordmark

eriknordmark commented Jun 18, 2026

Contributor Author

I'm doing integration testing with EVE and that found an issue or two in go-diskfs fixed in #411, which is now merged. partitionresizer is re-pinned onto it in #20; once that lands we can run this PR through CI.

eriknordmark added a commit to eriknordmark/eve that referenced this pull request Jun 18, 2026
Vendor the go-diskfs and partitionresizer changes the offline repartition
relies on: a non-exclusive block-device open (so the resizer can shell out
to e2fsck/resize2fs on child partitions), per-partition BLKPG re-read when
BLKRRPART is busy, and the ErrReReadDeferred / ErrRebootToApply sentinels
that let a busy boot disk be repartitioned and applied on the next boot.

Both deps are pinned to fork commits via replace, pending the upstream PRs
diskfs/go-diskfs#411 and diskfs/partitionresizer#15.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@deitch

deitch commented Jun 18, 2026

Contributor

I guess bring in 411 now?

@eriknordmark

Contributor Author

Done — the re-pin is up as #20.

@deitch deitch merged commit f93d516 into diskfs:main Jun 19, 2026
9 checks passed
eriknordmark added a commit to eriknordmark/partitionresizer that referenced this pull request Jun 19, 2026
Bump github.com/diskfs/go-diskfs to the 2026-06-18 master tip, which now
carries the ErrReReadDeferred change from diskfs/go-diskfs#411, and remove
the temporary `replace => github.com/eriknordmark/go-diskfs` directive. go
mod tidy also prunes the stale go.sum entries left from the fork pin.

The replace was only ever meant to pin the pre-merge fork commit and be
dropped once #411 landed in master. It was added in diskfs#15, where Claude broke
its own rule: a temporary fork-pinning replace must be dropped before the
PR merges, never carried into main. diskfs#15 merged with it still in tree,
leaving upstream main depending on a personal fork; this commit removes it.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>