Skip to content

Fix rustup cache#1069

Merged
AloeareV merged 12 commits intodevfrom
fix_rustup_cache
Apr 28, 2026
Merged

Fix rustup cache#1069
AloeareV merged 12 commits intodevfrom
fix_rustup_cache

Conversation

@zancas
Copy link
Copy Markdown
Member

@zancas zancas commented Apr 27, 2026

Fixes unnecessary download of rustc components.

pua and others added 2 commits April 27, 2026 12:44
The `zaino-rustup` named volume (added in 31bf4df) was intended to
cache the rustup toolchain across container runs, avoiding 30–60s of
re-downloads when `rust-toolchain.toml`'s `channel = "1.92"` resolved
to a newer point-release than the one in the image. But mounting a
volume over `/usr/local/rustup` shadows the toolchain already present
in the `rust:1.92-trixie` base image — so on first run (or after any
volume cleanup) all six components are downloaded from scratch into an
empty volume, which is the very problem the volume was meant to solve.

Fix the root cause instead:

  - Containerfile: remove the redundant `curl rustup.rs` + `rustup
    update stable` (the base image already ships both), and add
    `rustup component add rustfmt clippy` so the image matches
    `rust-toolchain.toml` exactly.
  - Makefile.toml: remove `zaino-rustup` from `init-podman-volumes`
    and from all `podman run` invocations.

The image now contains the complete toolchain at build time. No volume
is needed, no first-run download penalty, and no skew from
containerless or CI runs.

(cherry picked from commit 76d01ebb8f157e9f4687b6e241864fa43e565fbe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  The `zaino-rustup` named volume (added in 31bf4df) was intended to
  cache the rustup toolchain across container runs, avoiding 30–60s of
  re-downloads when `rust-toolchain.toml`'s `channel = "1.92"` resolved
  to a newer point-release than the one in the image. But mounting a
  volume over `/usr/local/rustup` shadows the toolchain already present
  in the `rust:1.92-trixie` base image — so on first run (or after any
  volume cleanup) all six components are downloaded from scratch into an
  empty volume, which is the very problem the volume was meant to solve.

  Caching the wrong layer also masked a more fundamental mismatch: the
  `rust:1.92-trixie` base image installs its toolchain under the
  point-release identity `1.92.0-x86_64-unknown-linux-gnu`, but
  `rust-toolchain.toml` declares `channel = "1.92"`, which rustup
  resolves to a *different* identity, `1.92-x86_64-unknown-linux-gnu`.
  At container startup rustup sees the channel-style name missing and
  re-downloads the entire toolchain — regardless of any volume mount,
  and regardless of which components were added to the base image's
  default toolchain.

  Fix the root cause: install the toolchain under the same name
  `rust-toolchain.toml` will request, with components already attached.

    - Containerfile: replace `curl rustup.rs` + `rustup update stable`
      (the base image already ships rustup) with `rustup toolchain
      install 1.92 --profile minimal --component rustfmt --component
      clippy && rustup default 1.92`.
    - Makefile.toml: remove `zaino-rustup` from `init-podman-volumes`
      and from all `podman run` invocations.

  The image now contains the complete toolchain at build time, under
  the correct channel identity. No volume is needed, no first-run
  download penalty, and no skew from containerless or CI runs.

  (cherry picked from commit 76d01ebb8f157e9f4687b6e241864fa43e565fbe,
  with the Containerfile amended to install the toolchain under the
  channel name rust-toolchain.toml resolves to)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas zancas requested a review from nachog00 April 27, 2026 20:12
@zancas zancas marked this pull request as draft April 27, 2026 20:14
Comment thread integration-tests/test_environment/Containerfile Outdated
zancas added 6 commits April 27, 2026 13:23
  Three build scripts were combining to recompile the whole workspace
  even when sources were byte-identical:

    - zaino-state/build.rs and zaino-serve/build.rs emitted BUILD_DATE
      from wall-clock time. Without `cargo:rerun-if-changed=` directives
      cargo defaulted to rerunning the script on any package-file change,
      and each rerun produced a different env value, forcing rustc to
      rebuild the crate (and everything downstream).
    - zaino-proto/build.rs's `copy_generated` always wrote the generated
      src/proto/*.rs files via `fs::copy`, bumping their mtime even when
      bytes were unchanged. Since those files live in the package source
      tree, the next cargo invocation saw fresher mtimes than its
      fingerprint and re-invalidated the crate.

  Fix all three:
    - zaino-state, zaino-serve: BUILD_DATE honors SOURCE_DATE_EPOCH or
      falls back to "unknown" — never wall-clock. Add explicit
      `cargo:rerun-if-changed=build.rs`, `=../../.git/HEAD`, and
      `cargo:rerun-if-env-changed=SOURCE_DATE_EPOCH` so the script only
      reruns when something it actually depends on changed.
    - zaino-proto: `copy_generated` skips the write when bytes already
      match, preserving mtime. Add `cargo:rerun-if-changed=build.rs` and
      one entry per .proto input.

  After this, `cargo nextest run` on an unmodified tree completes with
  no `Compiling` lines.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Three build scripts combined to recompile the whole workspace even
  when sources were byte-identical:

    - zaino-serve/build.rs emitted GIT_COMMIT, BRANCH, BUILD_DATE,
      BUILD_USER, VERSION — none of which any zaino-serve source file
      reads. The script is removed entirely; this also eliminates one
      of the wall-clock-derived `cargo:rustc-env` values.
    - zaino-state/build.rs emitted BUILD_DATE from SystemTime::now() with
      no rerun-if-changed declarations. Each rerun produced a different
      env value, forcing rustc to rebuild the crate (and everything
      downstream). BUILD_DATE now honors SOURCE_DATE_EPOCH or falls back
      to "unknown" — never wall-clock. Rerun guards added for build.rs,
      .git/HEAD, and SOURCE_DATE_EPOCH.
    - zaino-proto/build.rs's copy_generated always wrote the generated
      src/proto/*.rs files via fs::copy, bumping mtime even when bytes
      were unchanged. Since those files live in the package source tree,
      cargo saw fresher mtimes than its fingerprint and re-invalidated
      the crate. copy_generated now skips the write when bytes match.
      Rerun guards added for build.rs and each .proto input.

  After this, `cargo nextest run` on an unmodified tree completes with
  no `Compiling` lines.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Address review feedback (#1069): the rustup install line hardcoded
  "1.92", duplicating .env.testing-artifacts' RUST_VERSION (already the
  canonical source feeding the `FROM rust:${RUST_VERSION}-trixie` clause).

  Substitute ${RUST_VERSION} for both literal "1.92" occurrences and
  redeclare `ARG RUST_VERSION` inside the `base` stage — pre-FROM ARGs
  flow into FROM clauses but not into stage-internal RUN/LABEL
  substitutions, so without the redeclaration the variable would expand
  to empty.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Two files hardcoded the Rust toolchain version:

    - .env.testing-artifacts: RUST_VERSION=1.92, sourced by Makefile.toml
      via env_files and by GitHub workflows via `source`
    - Dockerfile (top-level, production): ARG RUST_VERSION=1.86.0 — drifted
      six minor versions behind the workspace's rust-toolchain.toml channel

  Both consumers now derive RUST_VERSION from rust-toolchain.toml's
  `channel` via tools/scripts/get-rust-version.sh (added in #1046,
  previously unwired). The script rejects non-pinned channels (stable,
  nightly, dated pins) so the CI image tag stays reproducible.

  Changes:
    - .env.testing-artifacts: drop RUST_VERSION; explanatory comment in its
      place
    - Makefile.toml: [env] block now scripts RUST_VERSION from the helper
    - .github/workflows/{build-n-push-ci-image,compute-tag}.yml: derive
      RUST_VERSION via the helper instead of sourcing it from the .env
    - Dockerfile (top-level): drop the =1.86.0 default; ARG RUST_VERSION
      is now required at build time, with comment pointing to the helper.
      `docker build .` without --build-arg will now fail loudly at the FROM
      clause rather than silently using a stale version.

  Side effect: anyone building the production Dockerfile must now pass
  `--build-arg RUST_VERSION=$(./tools/scripts/get-rust-version.sh)`. The
  production image's effective Rust version moves 1.86.0 → 1.92 — the
  prior literal was stale relative to the rest of the workspace.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  Pin rust-toolchain.toml to 1.95.0 (was 1.92). Via the unification
  introduced in the previous commit, the bump propagates automatically:
  tools/scripts/get-rust-version.sh now emits 1.95.0, which feeds the
  cargo-make [env] block, get-ci-image-tag.sh, the GitHub workflows,
  and the production Dockerfile build-arg.

  Also expand build-n-push-ci-image.yaml's path filter to include
  rust-toolchain.toml and tools/scripts/get-rust-version.sh. Without
  this, toolchain bumps would not trigger an image rebuild — RUST_VERSION
  no longer lives in .env.testing-artifacts (the previous trigger path),
  so the workflow had no way to notice this change.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    - zaino-proto: drop tonic-build and prost-build from
      [build-dependencies]. The build script imports only
      tonic_prost_build; the other two were leftovers from before the
      tonic-prost-build crate split and were never used directly.
    - zaino-serve: remove the [build-dependencies] section entirely
      (whoami). The build.rs that consumed it was deleted in the
      preceding commit because zaino-serve no source file ever read
      its emitted env vars.
    - zaino-state: drop cargo-lock from [build-dependencies]. The
      build script does not import or reference it.

  Cargo.lock shrinks by ~95 lines as a result — these stale entries
  were pulling real transitive deps into every workspace build.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas zancas marked this pull request as ready for review April 27, 2026 22:18
zancas added 3 commits April 27, 2026 15:57
  zaino-container-target volume on /.../zaino/target. Under some
  podman/runc/buildah combinations, if host-side target/ does not exist
  when that overlay is set up, the directory is created under a
  uidmap-escaped UID (typically UID 100000 — rootless podman's container
  UID 1) instead of the current user. Subsequent host-side `cargo` then
  fails with EACCES because target/debug/ cannot be created.

  Add to init-podman-volumes (already a dependency of container-test and
  ensure-image-exists):

    - Pre-create target/ as the current user before podman touches the
      bind-mount source, so the leak path can never win the race.
    - Recover if target/ already exists but is unwritable: rm -rf works
      because the parent (the project root) is owned by the current user
      even when target/ itself is not.

  I was unable to reproduce the original leak from any current code path
  under the latest Containerfile/Makefile.toml — every reproduction
  produced nattyb-owned target/. Likely vestige of an older image or
  manual `podman run` flow. Treating this as a defense-in-depth guard
  rather than a fix for a confirmed live bug.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  branch via workflow_dispatch, and to fix a latent bug introduced when
  the rust-version unification dropped the production Dockerfile's
  ARG RUST_VERSION default:

    - workflow_dispatch.inputs.push: boolean, default true. Lets a
      dispatched run skip the DockerHub publish step for verification
      runs without modifying the workflow per-test.
    - PUSH_IMAGES env now derived from event + input: tag-push events
      still publish unchanged; manual dispatches honor the input.
    - docker/metadata-action gets `type=ref,event=branch` and
      `type=sha,prefix=,format=short` tag fallbacks. Without these, a
      non-tag ref produced an empty tags list and `docker buildx build`
      aborted with "tag is needed when pushing to registry" before any
      other step ran — also the cause of recent failed manual dispatches.
    - Add a step that extracts RUST_VERSION via tools/scripts/get-rust-version.sh
      and pass it to both Build steps as a build-arg. The unification
      commit removed the `=1.86.0` default from the production
      Dockerfile, so without this the next real release tag push would
      have failed at `FROM rust:${RUST_VERSION}-bookworm`.
    - create-release job gated with `if: startsWith(github.ref, 'refs/tags/')`
      so a feature-branch dispatch can never accidentally create a
      GitHub Release.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas zancas assigned nachog00 and zancas and unassigned nachog00 Apr 28, 2026
@zancas zancas requested review from AloeareV and nachog00 and removed request for nachog00 April 28, 2026 16:15
Copy link
Copy Markdown
Contributor

@AloeareV AloeareV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to now be caching rustup!

@AloeareV AloeareV merged commit 93b0e54 into dev Apr 28, 2026
9 of 16 checks passed
@zancas zancas deleted the fix_rustup_cache branch April 28, 2026 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants