resize: EVE-patterned resume tests and SIGKILL chaos harness #15
@deitch ready for review when you have a chance; see the PR description for what it does.
2703d42 to 21c8d90
@deitch this is ready for review when you have a chance. To back up the resume-safety claim, I ran an 8-hour SIGKILL chaos soak of
Result: 85 iterations, 0 failures. Every killed resize resumed cleanly.
Where the 584 kills landed:
The delay did its job: 98 kills landed inside
On-disk GPT integrity sampled at each kill (584 samples):
So both tear shapes occur in practice: a torn 32-sector entry array, and a primary/backup ordering skew (backup written first, primary not yet). The resume logic heals every one. Header self-CRCs never tear because the header is a single 512-byte sector and its write is atomic, confirmed across all 584 samples.
79d1850 to 7b0c371
### Chaos / resume soak test
`TestEVEChaosKill` is a stress test, not a CI gate. It repeatedly runs the full
I am not particularly enamoured of calling these "EVE tests". I get that you are testing a layout similar (identical?) to what EVE has. But partitionresizer is no more EVE-specific than diskfs is linuxkit-specific. It was the original driver for the idea, but built to be more generic. Even if it is identical, calling it something else that addresses the general flow would be better.
Agreed, the name overfits. What's actually under test is a representative mixed-filesystem layout — FAT32 ESP, raw squashfs read-only images, ext4 data — exercised through a shrink-the-tail / grow-the-head flow. EVE was just the layout that motivated it. I'll rename the files and symbols (eve_* → layout/resume, TestEVE* → TestMixedLayout* / TestChaos*, buildEVEFixture → buildMixedLayoutFixture) and reword the README so EVE appears only as the motivating example.
Add a scaled-down sample disk built in-test -- FAT32 ESP, squashfs IMGA/IMGB, a placeholder CONFIG, and an ext4 P9 at index 9 -- and drive two operations through the interrupt/resume harness: Case 1 (TestRunResumeShrink): shrink P9 in place to free space at the end of the disk. Case 2 (TestRunResumeGrow): on the post-shrink disk, grow ESP/IMGA/IMGB into the freed space, copy their content, and finalize with updatePartitions, relocating them to the end of the disk under their original labels/indices. Each interruption point is re-run to completion and the final layout and content are verified (squashfs raw images by hash, the ESP/P9 marker files by read-back). The squashfs images are produced with mksquashfs and written raw, since go-diskfs cannot create squashfs and ext4 on a single 512-byte-sector disk; go-diskfs detects them as unknown and raw-copies them, the same branch it uses for squashfs. The tests are guarded by testing.Short and require mksquashfs. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add TestLayoutStagesDump, which drives the shrink+grow scenario and logs the GPT partition table (sorted by on-disk start, with free-space gaps) at four points: initial layout, after the P9 ext4 shrink, after the *_resized2 partitions are created and filled, and after the final updatePartitions. Useful for seeing how the resize rearranges the disk. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a suite of stress tests on the scaled fixture:
- TestChaosKill: runs the real resizer binary doing the grow and SIGKILLs it at random points 1-4 times, then runs once to completion, asserting the final layout, IMG content, and -- critically -- that persist (P9) and CONFIG are never corrupted. Exercises resume from arbitrary interruption (partial GPT writes, half-copied content) and repeated crashes; it is what surfaced the grow-only CLI panic.
- TestUntouchedPartitionsByteIntegrity: persist and CONFIG are byte-identical across a grow.
- TestInsufficientSpaceIsAtomic: a grow that cannot fit fails without modifying the disk.
- TestShrinkBelowUsedAborts: shrinking below the ext4 used size is refused, partition and data intact.
- TestCombinedShrinkGrow: a single Run that shrinks P9 and grows ESP/IMGA/IMGB (larger P9 fixture, since the shrink is rounded up to a whole GB).
- TestShrinkPreservesP9Content: P9 filled with many files survives the shrink (block relocation).
The fixture builder is parameterized by disk/P9 size and gptDump now reads the disk size from the file. All guarded by testing.Short; the chaos test also needs the resizer binary (built in-test) and mksquashfs. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
TestChaosKill now drives the full shrink+grow pattern -- shrink P9 (Case 1) then grow ESP/IMGA/IMGB (Case 2) -- each through random SIGKILLs, so kills can land in every pipeline step (shrinkFilesystems/shrinkPartitions/createPartitions/copyFilesystems/updatePartitions). It kills the whole process group (so a resize2fs child dies too) and points the resizer's TMPDIR at a scratch dir cleaned after each kill, so an interrupted shrink can't leak its temp copy. Seeds from CHAOS_SEED (else the clock) and logs KILL_STEP=<step> per kill so a long run can show whether kills hit all intermediate points. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
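The process-group kill described in this commit can be sketched as follows. This is a minimal illustration, not the harness's actual code: `runAndKill`, `killedBySignal`, and `killSleepingShell` are hypothetical names, the shell command stands in for the resizer binary with a `resize2fs` child, and the sketch assumes a Unix-like system with `sh` on the PATH.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// runAndKill starts cmd in its own process group, waits for delay, then
// sends SIGKILL to the whole group so that children of cmd are signalled too.
// It returns the error from Wait, which is non-nil when the process was killed.
func runAndKill(cmd *exec.Cmd, delay time.Duration) error {
	// Setpgid with a zero Pgid makes the child the leader of a new group
	// whose group ID equals the child's pid.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return err
	}
	time.Sleep(delay)
	// A negative pid addresses the process group rather than one process.
	_ = syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	return cmd.Wait()
}

// killedBySignal reports whether err describes a process ended by SIGKILL.
func killedBySignal(err error) bool {
	var ee *exec.ExitError
	if !errors.As(err, &ee) {
		return false
	}
	ws, ok := ee.Sys().(syscall.WaitStatus)
	return ok && ws.Signaled() && ws.Signal() == syscall.SIGKILL
}

// killSleepingShell runs a shell with a background child (a stand-in for a
// resizer process that has spawned resize2fs) and kills the group.
func killSleepingShell() bool {
	cmd := exec.Command("sh", "-c", "sleep 30 & wait")
	return killedBySignal(runAndKill(cmd, 200*time.Millisecond))
}

func main() {
	fmt.Println("killed by SIGKILL:", killSleepingShell())
}
```

Killing only `cmd.Process` would leave the background `sleep` running, which is the leak the commit message is guarding against for `resize2fs`.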
updatePartitions (and createPartitions) are single fast GPT-table writes, so random-timed SIGKILLs essentially never land inside them -- in 8h of chaos, updatePartitions was hit 0 times. Add an optional, build-tagged backend wrapper (`-tags chaos` + RESIZER_GPT_WRITE_DELAY, e.g. "5s") that sleeps after each write touching a GPT metadata sector (LBA 0..33 or the last 33 sectors), widening the window between go-diskfs's backup and primary GPT writes. run.go calls maybeWrapBackend (a no-op in normal builds, so production carries none of this). TestChaosKill builds with -tags chaos and widens its kill-timing window when CHAOS_GPT_DELAY is set. With this, kills reliably reach createPartitions/shrinkPartitions/updatePartitions, and the resume still recovers cleanly from a kill inside updatePartitions' table write -- confirming the idempotent finalize plus go-diskfs's backup-first GPT write are crash-safe at the table-write level. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
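A rough sketch of the delay idea, under stated assumptions: it wraps a plain `io.WriterAt` rather than the go-diskfs backend interface the commit actually wraps, it assumes 512-byte sectors, and `touchesGPT`, `delayWriter`, and `delayFromEnv` are hypothetical names. Only the `RESIZER_GPT_WRITE_DELAY` variable and the sector ranges (LBA 0..33, last 33 sectors) come from the commit message.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

const (
	sectorSize    = 512 // assumption: 512-byte logical sectors
	primarySpan   = 34  // LBA 0..33: protective MBR, primary header, 32 entry sectors
	backupSectors = 33  // last 33 sectors: 32 entry sectors plus the backup header
)

// touchesGPT reports whether a write of n bytes at byte offset off overlaps
// the primary GPT region or the backup GPT region of a disk of diskSize bytes.
func touchesGPT(off, n, diskSize int64) bool {
	if n <= 0 {
		return false
	}
	primaryEnd := int64(primarySpan * sectorSize)
	backupStart := diskSize - int64(backupSectors*sectorSize)
	return off < primaryEnd || off+n > backupStart
}

// delayWriter sleeps after every write that touches GPT metadata, widening
// the window in which a SIGKILL can land between two table writes.
type delayWriter struct {
	w        io.WriterAt
	diskSize int64
	delay    time.Duration
}

func (d *delayWriter) WriteAt(p []byte, off int64) (int, error) {
	n, err := d.w.WriteAt(p, off)
	if d.delay > 0 && touchesGPT(off, int64(len(p)), d.diskSize) {
		time.Sleep(d.delay)
	}
	return n, err
}

// delayFromEnv parses RESIZER_GPT_WRITE_DELAY (for example "5s").
// An unset or unparsable value yields zero, which disables the delay.
func delayFromEnv() time.Duration {
	d, err := time.ParseDuration(os.Getenv("RESIZER_GPT_WRITE_DELAY"))
	if err != nil {
		return 0
	}
	return d
}

func main() {
	const diskSize = 1 << 30
	fmt.Println("delay:", delayFromEnv())
	fmt.Println("primary header write:", touchesGPT(512, 512, diskSize))
	fmt.Println("data write at 1 MiB:", touchesGPT(1<<20, 4096, diskSize))
	fmt.Println("backup header write:", touchesGPT(diskSize-512, 512, diskSize))
}
```

In the real change the wrapper is compiled in only under `-tags chaos`; this sketch leaves the build-tag split out so it stays a single runnable file.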
7b0c371 to d36fe90
After a SIGKILL the chaos test now inspects the on-disk GPT before the next resume run heals it, logging whether the primary (LBA 1) and backup (last LBA) header self-CRCs are valid, whether each header's entry-array CRC matches, and whether the primary and backup entry arrays are byte-identical. Each table spans a header plus 32 entry sectors, so a kill mid-write can leave them disagreeing or an entry array half-written with a failing CRC; this captures exactly that for the steps that write the table -- in particular a kill inside updatePartitions. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
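The two CRC checks named above can be sketched like this. The field offsets follow the GPT header layout in the UEFI specification (header size at byte 12, header CRC32 at 16, entry count at 80, entry size at 84, entry-array CRC32 at 88); the function names and the synthetic `buildHeader` helper are hypothetical and exist only to make the example self-contained.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// headerCRCValid checks a GPT header sector's self-CRC: CRC32 over the first
// HeaderSize bytes with the CRC field itself treated as zero.
func headerCRCValid(sector []byte) bool {
	if len(sector) < 92 || !bytes.Equal(sector[:8], []byte("EFI PART")) {
		return false
	}
	size := binary.LittleEndian.Uint32(sector[12:16])
	if size < 92 || int(size) > len(sector) {
		return false
	}
	want := binary.LittleEndian.Uint32(sector[16:20])
	hdr := make([]byte, size)
	copy(hdr, sector[:size])
	binary.LittleEndian.PutUint32(hdr[16:20], 0)
	return crc32.ChecksumIEEE(hdr) == want
}

// entriesCRCValid checks the entry array against the CRC stored in the header.
func entriesCRCValid(sector, entries []byte) bool {
	if len(sector) < 92 {
		return false
	}
	count := binary.LittleEndian.Uint32(sector[80:84])
	entrySize := binary.LittleEndian.Uint32(sector[84:88])
	total := uint64(count) * uint64(entrySize)
	if total > uint64(len(entries)) {
		return false
	}
	want := binary.LittleEndian.Uint32(sector[88:92])
	return crc32.ChecksumIEEE(entries[:total]) == want
}

// buildHeader makes a synthetic header for the demo; it fills in only the
// fields the two checks read.
func buildHeader(entries []byte, count, entrySize uint32) []byte {
	h := make([]byte, 512)
	copy(h, "EFI PART")
	binary.LittleEndian.PutUint32(h[8:12], 0x00010000) // revision 1.0
	binary.LittleEndian.PutUint32(h[12:16], 92)        // header size
	binary.LittleEndian.PutUint32(h[80:84], count)
	binary.LittleEndian.PutUint32(h[84:88], entrySize)
	binary.LittleEndian.PutUint32(h[88:92], crc32.ChecksumIEEE(entries))
	binary.LittleEndian.PutUint32(h[16:20], crc32.ChecksumIEEE(h[:92]))
	return h
}

func main() {
	entries := make([]byte, 128*128) // 128 entries of 128 bytes = 32 sectors
	entries[0] = 0xAB
	hdr := buildHeader(entries, 128, 128)
	fmt.Println("intact:", headerCRCValid(hdr), entriesCRCValid(hdr, entries))

	// Simulate a kill mid-write: a late entry sector holds different bytes.
	entries[31*512] ^= 0xFF
	fmt.Println("torn array:", headerCRCValid(hdr), entriesCRCValid(hdr, entries))
}
```

Comparing the primary and backup entry arrays for byte equality is then a plain `bytes.Equal` over the two 32-sector reads.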
Add copyDestState: after a kill, classify each grow target's data region against its source to distinguish "nothing written" from a partial or complete copy -- a certainty the KILL_STEP marker alone cannot give for a copyFilesystems interruption. Raw-copied (squashfs) targets compare head and tail bytes to the source; filesystem-copied (FAT32/ext4) targets are checked for the on-disk signature that CreateFilesystem writes before any file data. Gated behind CHAOS_COPY_STATE so it is off unless explicitly requested. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
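For the raw-copied case, the head/tail comparison might look like the sketch below. It is a heuristic under stated assumptions: the copy proceeds front to back, and the source's first and last `probe` bytes differ from whatever the target held before the copy. `classifyRawCopy`, `classifyBytes`, `pattern`, and the state names are hypothetical; only the empty/partial/complete distinction comes from the PR.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

type copyState string

const (
	copyEmpty    copyState = "empty"
	copyPartial  copyState = "partial"
	copyComplete copyState = "complete"
)

// readRange reads exactly n bytes starting at byte offset off.
func readRange(r io.ReaderAt, off, n int64) ([]byte, error) {
	buf := make([]byte, n)
	_, err := io.ReadFull(io.NewSectionReader(r, off, n), buf)
	return buf, err
}

// classifyRawCopy compares the first and last probe bytes of dst with src.
// Head and tail both match: complete. Only the head matches: partial.
// Head differs: nothing recognisable was written.
func classifyRawCopy(src, dst io.ReaderAt, size, probe int64) (copyState, error) {
	if probe > size {
		probe = size
	}
	if probe <= 0 {
		return copyEmpty, nil
	}
	same := func(off int64) (bool, error) {
		a, err := readRange(src, off, probe)
		if err != nil {
			return false, err
		}
		b, err := readRange(dst, off, probe)
		if err != nil {
			return false, err
		}
		return bytes.Equal(a, b), nil
	}
	head, err := same(0)
	if err != nil {
		return "", err
	}
	if !head {
		return copyEmpty, nil
	}
	tail, err := same(size - probe)
	if err != nil {
		return "", err
	}
	if tail {
		return copyComplete, nil
	}
	return copyPartial, nil
}

// classifyBytes is a convenience wrapper for in-memory buffers of equal length.
func classifyBytes(src, dst []byte, probe int64) copyState {
	st, err := classifyRawCopy(bytes.NewReader(src), bytes.NewReader(dst), int64(len(src)), probe)
	if err != nil {
		return copyState("error: " + err.Error())
	}
	return st
}

// pattern returns n bytes that are never zero, so a zeroed target never matches.
func pattern(n int) []byte {
	b := make([]byte, n)
	for i := range b {
		b[i] = byte(i%251) + 1
	}
	return b
}

func main() {
	src := pattern(4096)
	dst := make([]byte, len(src))
	fmt.Println(classifyBytes(src, dst, 512))
	copy(dst, src[:2048]) // a kill halfway through the copy
	fmt.Println(classifyBytes(src, dst, 512))
	copy(dst, src)
	fmt.Println(classifyBytes(src, dst, 512))
}
```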
TestChaosKill runs many full resizes under random SIGKILLs; it is a soak/stress test, not a CI gate. Left in the default `go test ./...` it pushed the suite past the 30m CI timeout. Skip it unless RESIZER_CHAOS is set (or CHAOS_GPT_DELAY, which also turns on the GPT-write-delay hook). README now documents the test and how to run it. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The mixed-layout end-to-end tests added here, together with the existing restart-safety resize tests, fill nearly the whole 30m budget (one run landed at ~29.5m of 30m). Raise the timeout to 45m for headroom against runner variance. The chaos soak test remains opt-in and never runs in CI. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d36fe90 to 73ed80b
I cleaned up the commits, including removing intermediate references to old names (of scripts etc). No content changes in HEAD.
The resume and SIGKILL-chaos tests describe their fixture as the "EVE-patterned" A/B-image layout, but labeled the persist partition "P9" and the ESP "ESP" — neither matches the real EVE-OS GPT, which labels the ESP "EFI System" and the persist partition (at index 9) "P3". Identifying partitions by label is exactly what the EVE-side consumer of this tool will do, so the misleading labels are a trap waiting for whoever ports this harness. Rename the GPT partition names, filesystem volume labels, and the associated Go identifiers (p9*→p3*, defaultP9MB→defaultP3MB, shrinkP9→shrinkP3) so the fixture matches production reality. The generic parta/partb/shrinker fixtures are left untouched — they do not claim to be EVE-shaped. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The finalize step gives a relocated partition its original identity, which must include the 64-bit GPT attribute field: consumers may store boot-selection state there (for example, a GPT-priority boot loader's priority/tries/successful bits), so those bits have to survive a resize. Add a focused test covering both the renumber and preserveNumbers paths: it sets a non-zero attribute on the original, leaves the relocated target's attribute at zero, and asserts the target carries the original's value after updatePartitions and that the original is removed. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two fixes surfaced running against a real /dev/nbd0 (the file-based tests cannot reach either path):
- Open the disk non-exclusively via OpenFromPathWithExclusive. The e2fsck/resize2fs/fsck.fat calls open the child partitions O_EXCL, which the kernel refuses while the parent disk is held O_EXCL.
- Accept e2fsck's "errors corrected" exit status (1/2) as success when running with fixErrors. e2fsck -y returns 1 after recovering a dirty journal / fixing counts -- exactly the power-loss-recovery case this resizer must survive -- which was being treated as a failure.
Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
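The exit-status rule in the second fix can be sketched as a small predicate. `e2fsckOK` and `exitCode` are hypothetical names, not the resizer's functions. The e2fsck man page documents the exit code as a sum of conditions (1 = errors corrected, 2 = errors corrected and a reboot is advised, 4 = errors left uncorrected, 8 and above = operational failures), so this sketch also accepts 3; that extension beyond the "1/2" stated in the commit is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

const (
	fsckErrorsCorrected       = 1 // file system errors corrected
	fsckErrorsCorrectedReboot = 2 // errors corrected, system should be rebooted
)

// exitCode extracts a process exit status from the error returned by
// (*exec.Cmd).Run or Wait. It returns -1 when the status is not available,
// for example when the command could not be started.
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var ee *exec.ExitError
	if errors.As(err, &ee) {
		return ee.ExitCode()
	}
	return -1
}

// e2fsckOK reports whether an e2fsck exit status should count as success.
// Zero always does. With fixErrors set, a status made up only of the
// "errors corrected" bits also does; any other bit means failure.
func e2fsckOK(code int, fixErrors bool) bool {
	if code == 0 {
		return true
	}
	if code < 0 || !fixErrors {
		return false
	}
	return code&^(fsckErrorsCorrected|fsckErrorsCorrectedReboot) == 0
}

func main() {
	// "sh -c 'exit 1'" stands in for an e2fsck -y run that repaired a journal.
	code := exitCode(exec.Command("sh", "-c", "exit 1").Run())
	fmt.Println("exit status:", code)
	fmt.Println("ok with fixErrors:", e2fsckOK(code, true))
	fmt.Println("ok without fixErrors:", e2fsckOK(code, false))
}
```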
How does this look? Ready to go?
When a partition-table commit reaches the disk but the kernel cannot be made to re-read it live -- the boot disk is busy because the partition being changed is mounted, as when repartitioning the disk we booted from -- go-diskfs now reports disk.ErrReReadDeferred. The table is already on disk, so translate that sentinel at each commit site into the exported ErrRebootToApply, letting a caller reboot to apply the table on the next boot instead of treating a committed-but-not-yet-live table as a failure. Pin go-diskfs to the fork commit that adds ErrReReadDeferred via a replace directive; the replace is temporary and will be dropped once diskfs/go-diskfs#411 merges. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
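The sentinel translation at a commit site might be sketched as below. `errReReadDeferred` is a local stand-in for `disk.ErrReReadDeferred` so the example compiles without the go-diskfs dependency, and `translateCommitErr`, `commitTable`, and `needsReboot` are hypothetical names; only `ErrRebootToApply` and the translation itself come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errReReadDeferred stands in for go-diskfs's disk.ErrReReadDeferred.
var errReReadDeferred = errors.New("table written, kernel re-read deferred")

// ErrRebootToApply tells the caller the table is on disk but not yet live.
var ErrRebootToApply = errors.New("partition table committed; reboot to apply")

// translateCommitErr maps the library sentinel to the exported one and
// passes every other error (including nil) through unchanged.
func translateCommitErr(err error) error {
	if errors.Is(err, errReReadDeferred) {
		return fmt.Errorf("%w: %v", ErrRebootToApply, err)
	}
	return err
}

// commitTable simulates a commit site; diskBusy models a mounted boot disk.
func commitTable(diskBusy bool) error {
	var err error
	if diskBusy {
		err = fmt.Errorf("writing table: %w", errReReadDeferred)
	}
	return translateCommitErr(err)
}

// needsReboot is how a caller would recognise the deferred-apply case.
func needsReboot(err error) bool {
	return errors.Is(err, ErrRebootToApply)
}

func main() {
	for _, busy := range []bool{false, true} {
		err := commitTable(busy)
		switch {
		case err == nil:
			fmt.Println("table applied live")
		case needsReboot(err):
			fmt.Println("table on disk, reboot needed:", err)
		default:
			fmt.Println("commit failed:", err)
		}
	}
}
```

Wrapping with `%w` keeps `errors.Is` working for callers while the original message stays visible in logs.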
I'm doing integration testing with EVE and that found an issue or two in go-diskfs fixed in #411, which is now merged. partitionresizer is re-pinned onto it in #20; once that lands we can run this PR through CI.
Vendor the go-diskfs and partitionresizer changes the offline repartition relies on: a non-exclusive block-device open (so the resizer can shell out to e2fsck/resize2fs on child partitions), per-partition BLKPG re-read when BLKRRPART is busy, and the ErrReReadDeferred / ErrRebootToApply sentinels that let a busy boot disk be repartitioned and applied on the next boot. Both deps are pinned to fork commits via replace, pending the upstream PRs diskfs/go-diskfs#411 and diskfs/partitionresizer#15. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
I guess bring in 411 now?
Done: the re-pin is up as #20.
Bump github.com/diskfs/go-diskfs to the 2026-06-18 master tip, which now carries the ErrReReadDeferred change from diskfs/go-diskfs#411, and remove the temporary `replace => github.com/eriknordmark/go-diskfs` directive. go mod tidy also prunes the stale go.sum entries left from the fork pin. The replace was only ever meant to pin the pre-merge fork commit and be dropped once #411 landed in master. It was added in diskfs#15, where Claude broke its own rule: a temporary fork-pinning replace must be dropped before the PR merges, never carried into main. diskfs#15 merged with it still in tree, leaving upstream main depending on a personal fork; this commit removes it. Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mostly tests + instrumentation (a build-tagged, no-op-by-default backend hook
used only by the chaos test), plus one block-device production fix in
`run.go`: open the whole disk non-exclusively so child-partition
`e2fsck`/`resize2fs` can run, and accept
e2fsck's "errors corrected" exit codes under fix-errors. That fix is what introduces the go-diskfs dependency above.
What
Realistic, EVE-shaped resize tests plus a SIGKILL chaos harness that exercises
resume safety under crash injection.
- `eve_resume_test.go`: a scaled-down EVE disk (FAT32 ESP, squashfs IMGA/IMGB, CONFIG, ext4 P9 at index 9) driven through the resume/idempotency harness: a Case-1 in-place P9 shrink and a Case-2 grow of ESP/IMGA/IMGB into the freed space, plus byte-integrity, atomic-failure, shrink-below-used, and combined shrink+grow invariants, with per-stage GPT table logging.
- `eve_stress_test.go`: runs the full two-step EVE resize as a subprocess and group-SIGKILLs it at random points across all five pipeline steps, then re-runs to completion and asserts the result matches an uninterrupted run; records which step each kill hit.
- `delaybackend_*.go`, `-tags chaos`; `run.go` `maybeWrapBackend`: a no-op in normal builds. Under the chaos tag it delays around GPT-sector writes so kills can land inside the otherwise-instantaneous `updatePartitions`/`createPartitions` table writes.
- `gpt_integrity_test.go`: after each kill, checks primary/backup header self-CRCs, entry-array CRCs, and primary↔backup entry equality, so torn tables and table divergence are observed and shown to be healed by the resume.
- `copyprogress_test.go`, opt-in: classifies a grow target's data region as empty/partial/complete after a kill.
Results
With the GPT-write delay, kills reach all five pipeline steps including
`updatePartitions`; the resume recovers from every observed torn-entry-array and primary↔backup-mismatch state.