
chore(build): upgrade to Debian Trixie + Python 3.13, drop Balena base images #2779

Merged
vpetersson merged 18 commits into master from trixie-upgrade
Apr 30, 2026

@vpetersson vpetersson commented Apr 29, 2026

Summary

  • Move every container off balenalib/raspberrypi*-debian:bookworm onto vanilla debian:trixie. Balena hasn't published a trixie tag for any of the RPi repos and last refreshed in May 2025, so staying on Balena = staying on bookworm indefinitely. What Balena added (UDEV=off entrypoint, cross-build-* helpers, balena-idle, io.balena.* labels) is unused or replaceable in our flow.
  • Retire pi1 (no linux/arm/v6 in vanilla Debian) and pi4 32-bit (Pi 4 always has a 64-bit path; this drops the libssl1.1 / libgst-dev / libsqlite0-dev Qt 5 mess). Surviving build matrix: pi2, pi3, pi4-64, pi5, x86.
  • For the surviving 32-bit boards (pi2, pi3), Dockerfile.base.j2 conditionally writes Deb822 .sources entries for archive.raspberrypi.org/debian trixie main and archive.raspbian.org/raspbian trixie firmware and bootstraps the keyrings via the .deb packages (extracted with dpkg-deb -x; their key bindings are Sequoia-policy-compliant, unlike the standalone .public.key files). Architectures: armhf is pinned per source so apt doesn't query Pi mirrors on 64-bit/x86 builds.
  • Trixie package renames fixed: libgles2-mesa → libgles2, ttf-wqy-zenhei → fonts-wqy-zenhei, libpng16-16 → libpng16-16t64 (time64 transition; armhf has no Provides: fallback like amd64). Drop the gone-from-trixie Qt 5-only libgst-dev / libsqlite0-dev / libsrtp0-dev / libssl1.1 / git-core; their modern equivalents (libgstreamer1.0-dev, libsqlite3-dev, libsrtp2-dev, libssl3, git) are already in the lists.
  • Python 3.13 (Trixie's default) everywhere: pyproject.toml requires-python>=3.13 + mypy python_version=3.13, ruff.toml target-version=py313, .python-version, uv.lock (only material change: async-timeout dropped — its marker was python<3.11), uv-builder.j2 UV_PYTHON=/usr/bin/python3, Dockerfile.dev FROM python:3.13-trixie, bin/install.sh --python ">=3.13", every CI setup-python pin.
  • Cleanup falling out of the matrix shrink: cache_scope / device_type / version_suffix pi4 + arm64 → pi4-64 re-mapping is gone (board is self-identifying); the Balena-specific c_rehash workaround is gone; uv-builder.j2's arm/v6 + arm/v8 branches are gone (only arm/v7 remains as the 32-bit ARM target); webview/build_qt5.sh's pi1/pi4 branches and the Debian-version-gate are gone; docker/Dockerfile.celery deleted (left behind from refactor(docker): drop celery image, restore base apt layer dedup #2776).

Out-of-band prereq before any device picks up a new viewer build: cut a new WebView-v* release tag (so the existing build-webview.yaml workflow produces webview-{ver}-trixie-{board}.tar.gz and qt5-5.15.14-trixie-{pi2,pi3}.tar.gz for the surviving boards) and bump WEBVIEW_VERSION in tools/image_builder/utils.py:143. The webview Dockerfiles already point at debian:trixie, so triggering the workflow on the new tag should produce the artifacts.

Test plan

  • ruff check + ruff format --check on tools/image_builder/ clean
  • uv lock regen with Python 3.13 (only diff: async-timeout dropped)
  • Dockerfile generation runs cleanly for all 5 surviving boards (x86, pi5, pi4-64, pi3, pi2)
  • x86 server image: docker buildx --platform=linux/amd64 --load succeeds; running container reports Debian 13.4 + Python 3.13.5; Django 5.2.13 / channels 4.3.1 / uvicorn 0.32.1 import OK
  • x86 redis image: docker buildx --load succeeds; Redis 8.0.2 on trixie
  • pi3 server image under linux/arm/v7 qemu — full build green; Pi sources bootstrap works; libraspberrypi0 installs from raspbian/firmware/armhf with /opt/vc/lib/{libbcm_host,libbrcmEGL,...} present
  • pi3 viewer image under linux/arm/v7 qemu — 147s apt layer green end-to-end through libpulse-dev, libgstreamer1.0-dev, libsdl2-dev, libpng16-16t64 etc.; build proceeds through uv-builder + main stages and stops only at the (deliberately not-yet-published) WebView qt5 tarball fetch
  • WebView trixie release cut + WEBVIEW_VERSION bumped (separate PR, prereq for end-to-end viewer image build)
  • Real-hardware smoke post-merge — Pi 5 / Pi 4-64 (Qt 6 mpv viewer flow), Pi 3 / Pi 2 (Qt 5 webview), x86 — confirm vcgencmd works inside the 32-bit Pi viewer containers and asset playback is unchanged

🤖 Generated with Claude Code

…e images

Move every container off `balenalib/raspberrypi*-debian:bookworm` (Balena
hasn't published a `trixie` tag on any of those repos and last refreshed
in May 2025) onto vanilla `debian:trixie`. Pi 1 and 32-bit Pi 4 are
retired at the same time — Pi 1 has no `linux/arm/v6` variant in upstream
Debian, and Pi 4 always has a 64-bit path that avoids the messy
`libssl1.1` / `libgst-dev` / `libsqlite0-dev` Qt 5 deps. Surviving build
matrix: pi2, pi3, pi4-64, pi5, x86.

For the surviving 32-bit boards (pi2, pi3) the legacy Broadcom userland
(libraspberrypi0 → /opt/vc/lib/{libbcm_host,libmmal,libvchiq_arm}) is
still required at runtime by the Qt 5 webview. Trixie's
archive.raspberrypi.org/debian/main no longer ships those packages
(replaced by raspi-utils + libdtovl0, which actively break
libraspberrypi0), so Dockerfile.base.j2 conditionally writes Deb822
.sources entries pointing at archive.raspberrypi.org/debian trixie main
and archive.raspbian.org/raspbian trixie firmware (where the legacy
Raspbian builds of libraspberrypi0 still live, armhf only). The
.deb-form raspberrypi-archive-keyring + raspbian-archive-keyring packages
are extracted with `dpkg-deb -x` (their bundled keys carry trixie-policy-
compliant binding signatures, unlike the standalone .public.key files
which fail Sequoia/sqv's post-2026-02-01 SHA-1 ban). Architectures: armhf
on each .sources file keeps apt from querying the Pi mirrors for the
arm64 / x86 builds.
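
To make the Deb822 layout described above concrete, here is a minimal Python sketch that renders the two `.sources` stanzas. It is an illustration only, not the template code in Dockerfile.base.j2: the function name and the `Signed-By` keyring paths are hypothetical, and the `https://` scheme reflects the later review fix in this PR.

```python
def render_deb822_source(uri: str, suite: str, component: str,
                         signed_by: str, arch: str = "armhf") -> str:
    """Render one Deb822 stanza for a file under /etc/apt/sources.list.d/."""
    fields = {
        "Types": "deb",
        "URIs": uri,
        "Suites": suite,
        "Components": component,
        # Pinning the architecture keeps apt from querying this mirror
        # when the image is built for arm64 or amd64.
        "Architectures": arch,
        "Signed-By": signed_by,
    }
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


# The keyring paths below are hypothetical placeholders.
RASPI = render_deb822_source(
    "https://archive.raspberrypi.org/debian", "trixie", "main",
    "/usr/share/keyrings/raspberrypi-archive-keyring.gpg")
RASPBIAN = render_deb822_source(
    "https://archive.raspbian.org/raspbian", "trixie", "firmware",
    "/usr/share/keyrings/raspbian-archive-keyring.gpg")

print(RASPI)
print(RASPBIAN)
```

Each stanza would be written to its own `.sources` file, and only for the pi2/pi3 builds.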

Trixie package renames also fixed: libgles2-mesa → libgles2,
ttf-wqy-zenhei → fonts-wqy-zenhei, libpng16-16 → libpng16-16t64 (time64
transition; armhf has no `Provides:` fallback like amd64 does), and the
Qt 5-only libgst-dev / libsqlite0-dev / libsrtp0-dev / libssl1.1 are
dropped (libgstreamer1.0-dev, libsqlite3-dev, libsrtp2-dev, libssl3 take
their place — first added explicitly, the rest already in the main
list). The transitional `git-core` is gone in trixie; `git` covers it.
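
The renames and removals listed above can be summarised as a lookup table. The sketch below is a hypothetical helper, not code from tools/image_builder; it only shows how a bookworm-era package list maps onto trixie names without introducing duplicates.

```python
# Old name -> trixie name, taken from the commit message above.
TRIXIE_RENAMES = {
    "libgles2-mesa": "libgles2",
    "ttf-wqy-zenhei": "fonts-wqy-zenhei",
    "libpng16-16": "libpng16-16t64",
    "git-core": "git",
    "libgst-dev": "libgstreamer1.0-dev",
    "libsqlite0-dev": "libsqlite3-dev",
    "libsrtp0-dev": "libsrtp2-dev",
    "libssl1.1": "libssl3",
}


def migrate_packages(packages: list[str]) -> list[str]:
    """Map old names to trixie names, keeping order and dropping duplicates."""
    seen: set[str] = set()
    result: list[str] = []
    for name in packages:
        new_name = TRIXIE_RENAMES.get(name, name)
        if new_name not in seen:
            seen.add(new_name)
            result.append(new_name)
    return result


print(migrate_packages(["libpng16-16", "libsqlite3-dev", "libsqlite0-dev"]))
```

The duplicate check matters because several replacements, such as libsqlite3-dev, were already present in the package lists.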

Python 3.13 (Trixie's default) replaces the 3.11 pin everywhere:
pyproject.toml requires-python and mypy python_version, ruff.toml
target-version, .python-version, uv.lock (regenerated; only diff is
async-timeout dropped — its marker was python<3.11), uv-builder.j2's
UV_PYTHON, Dockerfile.dev's FROM, bin/install.sh's host check, and every
CI workflow's setup-python pin.

Cleanup that falls out: drop the cache_scope / device_type / version_suffix
`pi4 + arm64 → pi4-64` re-mapping (board is now self-identifying), drop
the `c_rehash` workaround in Dockerfile.base.j2 (specific to a Balena
curl bug, not vanilla Debian), drop the dead arm/v6 + arm/v8 branches in
uv-builder.j2 (only arm/v7 remains as the 32-bit ARM target), retire the
old build_qt5.sh `pi1`/`pi4` branches, and delete docker/Dockerfile.celery
(left behind from the celery-image removal in 5e00c8b).

Out-of-band prereq before merging anything that depends on a viewer
build: cut a new `WebView-v*` release with
webview-{ver}-trixie-{board}.tar.gz (and qt5-5.15.14-trixie-{pi2,pi3}.tar.gz)
for the surviving boards, then bump WEBVIEW_VERSION in
tools/image_builder/utils.py:143. The webview Dockerfiles already point
at debian:trixie, so triggering build-webview.yaml on the new tag should
produce the artifacts.

Verification (proven via real `docker buildx --platform=...` runs):
- x86 server image: full build, runs Debian 13.4 + Python 3.13.5; Django
  5.2.13, channels 4.3.1, uvicorn 0.32.1 all import.
- x86 redis image: Redis 8.0.2 on trixie.
- pi3 (linux/arm/v7 under qemu) server image: full build green — Pi
  apt sources bootstrap works, libraspberrypi0 installs from
  raspbian/firmware/armhf with /opt/vc/lib/* present.
- pi3 (linux/arm/v7 under qemu) viewer image: 147s apt layer green
  end-to-end through libpulse-dev, libgstreamer1.0-dev, libsdl2-dev,
  libpng16-16t64, etc.; build proceeds through uv-builder + main stages
  and stops only at the WebView qt5 tarball fetch (the trixie artifacts
  haven't been cut yet — that's the prereq above).
- ruff check + ruff format --check on tools/image_builder/: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

This PR migrates the project’s container/build tooling from Balena’s Debian Bookworm Raspberry Pi base images to vanilla debian:trixie, while also standardizing the repo on Python 3.13 and shrinking the supported device/build matrix.

Changes:

  • Switch Docker bases to debian:trixie and remove Balena-specific paths/workarounds; add conditional Raspberry Pi/Raspbian apt source bootstrapping for armhf (pi2/pi3).
  • Retire pi1 and the 32-bit pi4 stream; standardize identifiers on pi4-64 and update scripts/CI/Ansible accordingly.
  • Upgrade Python requirements/tooling to 3.13 across pyproject.toml, lockfile, Ruff/Mypy config, and GitHub workflows.

Reviewed changes

Copilot reviewed 25 out of 29 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `webview/scripts/build_webview.sh` | Updates WebView build script to target Trixie. |
| `webview/docker/Dockerfile.x86` | Switches x86 WebView build base to debian:trixie. |
| `webview/docker/Dockerfile.pi5` | Switches Pi5 WebView build base to debian:trixie. |
| `webview/docker/Dockerfile.pi4` | Switches Pi4 WebView build base to debian:trixie. |
| `webview/build_qt5.sh` | Drops retired boards/branches and updates Qt5 build flow comments. |
| `webview/Dockerfile` | Switches Qt builder + runtime to Trixie and bootstraps Pi apt sources/keyrings for armhf sysroot. |
| `uv.lock` | Regenerates lock for requires-python >=3.13 and marker simplification. |
| `tools/image_builder/utils.py` | Updates build params/board naming and viewer deps for Trixie package renames. |
| `tools/image_builder/constants.py` | Shrinks build target options to pi2/pi3/pi4-64/pi5/x86. |
| `tools/image_builder/__main__.py` | Removes pi4 remap logic, updates base tag to trixie, and gates Pi armhf specifics. |
| `ruff.toml` | Sets Ruff target to Python 3.13. |
| `pyproject.toml` | Sets requires-python and mypy python_version to 3.13. |
| `docker/uv-builder.j2` | Updates uv builder to use /usr/bin/python3 and simplifies 32-bit ARM detection. |
| `docker/Dockerfile.dev` | Moves dev image base to python:3.13-trixie. |
| `docker/Dockerfile.base.j2` | Adds armhf-only Pi apt source/keyring bootstrap; removes Balena-specific c_rehash. |
| `bin/upgrade_containers.sh` | Treats Pi4 as pi4-64 and errors on unsupported models. |
| `bin/install.sh` | Updates messaging + requires Python >=3.13; treats Pi4 as pi4-64 and errors on unsupported models. |
| `bin/deploy_to_balena.sh` | Updates supported board set and removes pi4→pi4-64 remap. |
| `ansible/site.yml` | Updates allowed device_type values to the new matrix. |
| `.python-version` | Pins repo Python version to 3.13. |
| `.github/workflows/test-runner.yml` | Default Python becomes 3.13. |
| `.github/workflows/python-mypy.yaml` | CI mypy runs on Python 3.13. |
| `.github/workflows/python-lint.yaml` | CI lint runs on Python 3.13. |
| `.github/workflows/generate-openapi-schema.yml` | OpenAPI schema generation uses Python 3.13. |
| `.github/workflows/docker-test.yaml` | Docker test workflow uses Python 3.13. |
| `.github/workflows/docker-build.yaml` | Updates build matrix boards + Python version; removes pi4 remap logic. |
| `.github/workflows/build-webview.yaml` | Shrinks WebView build matrix; updates Qt5 job boards. |
| `.github/workflows/build-balena-disk-image.yaml` | Updates Balena disk image build matrix/board mapping. |
| `.github/workflows/ansible-lint.yaml` | Ansible lint workflow uses Python 3.13. |


Comment thread docker/Dockerfile.base.j2 Outdated
Comment thread webview/Dockerfile Outdated
Comment thread tools/image_builder/__main__.py Outdated
vpetersson and others added 3 commits April 29, 2026 16:41
Two CI failures from the Trixie/3.13 bump fall out of stdlib & lint:

- `lib/utils.py:8` imported `from distutils.util import strtobool`,
  which is gone in Python 3.12+. mypy on 3.13 flagged it as
  import-not-found. Inline the original truthy/falsy table directly in
  `string_to_bool` so every caller keeps accepting the same
  y/yes/t/true/on/1 / n/no/f/false/off/0 set.
- actionlint/shellcheck SC2129 on `.github/workflows/docker-build.yaml`
  in the `Set Docker tag` step I added — three sequential
  `>> "$GITHUB_ENV"` redirects collapse into one `{ ...; } >> $GITHUB_ENV`
  block.
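
For reference, a minimal sketch of what an inlined replacement for `distutils.util.strtobool` can look like. The accepted value sets come from the commit message above; the exact signature and return type of the repo's `string_to_bool` are assumptions here (the original `strtobool` returned 1 or 0 rather than a bool).

```python
_TRUTHY = {"y", "yes", "t", "true", "on", "1"}
_FALSY = {"n", "no", "f", "false", "off", "0"}


def string_to_bool(value: str) -> bool:
    """Convert a truthy/falsy string to a bool, case-insensitively."""
    normalized = value.lower()
    if normalized in _TRUTHY:
        return True
    if normalized in _FALSY:
        return False
    # Mirror strtobool's behaviour of rejecting anything outside the table.
    raise ValueError(f"invalid truth value {value!r}")
```

Raising `ValueError` on unknown input keeps callers that relied on the old exception behaviour working.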

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address Copilot's review on PR 2779.

- docker/Dockerfile.base.j2 + webview/Dockerfile: switch the Pi/Raspbian
  keyring downloads (and the resulting Deb822 `URIs:` for both apt
  archives) from `http://` to `https://`. Both archives serve TLS
  cleanly today (verified with curl --proto '=https' --tlsv1.2). The
  keyring .deb is the trust anchor for everything fetched after it, so
  the .deb hash is now also pinned via `sha256sum -c -` before
  `dpkg-deb -x` extracts it — TLS alone wouldn't catch an upstream
  archive-side swap. Hashes match the
  raspberrypi-archive-keyring_2025.1+rpt1_all.deb and
  raspbian-archive-keyring_20120528.4_all.deb files served at the time
  this commit lands; bumping either filename is the signal to refresh
  the pin too.
- tools/image_builder/__main__.py: trim the trailing space from
  `'libcec-dev '` in `base_apt_dependencies`. apt is forgiving about it
  but it produces extra whitespace in the rendered Dockerfile and is
  easy to miss in diffs.

Verified by re-running the keyring bootstrap end-to-end on a fresh
debian:trixie linux/arm/v7 container: both .debs pass sha256sum -c, apt
update fetches over HTTPS, and libraspberrypi0 installs from
archive.raspbian.org/raspbian trixie/firmware as before.
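
The Dockerfiles do the pin check with `sha256sum -c -`. The same idea, sketched in Python under the assumption of a hypothetical `verify_sha256` helper, is to hash the downloaded file and refuse to continue on a mismatch before anything is extracted. The sample file and its hash below are made up for the demonstration and are not the real keyring pins.

```python
import hashlib
import tempfile
from pathlib import Path


def verify_sha256(path: Path, expected_hex: str, chunk_size: int = 1 << 16) -> None:
    """Raise ValueError unless the file's SHA-256 matches the pinned value."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(f"sha256 mismatch for {path.name}: got {actual}")


# Demonstration with a throwaway file standing in for a keyring .deb.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "keyring.deb"
    sample.write_bytes(b"not a real keyring")
    pinned = hashlib.sha256(b"not a real keyring").hexdigest()
    verify_sha256(sample, pinned)  # passes silently
    try:
        verify_sha256(sample, "0" * 64)
        tamper_detected = False
    except ValueError:
        tamper_detected = True

print(tamper_detected)
```

Because the keyring is the trust anchor for every later apt fetch, the check has to run before `dpkg-deb -x`, not after.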

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SonarCloud's docker:S6471 hotspot was already flagging this file on
master (the implicit-root warning lives on every `FROM debian:*` line
without a `USER` directive); my Trixie change shifted the original line
107 to 131 and Sonar re-emitted it as a "new in PR" finding. Resolve
with the rule's recommended escape hatch — declare the user explicitly,
which converts the implicit-default into an acknowledged choice and
silences the rule.

Both stages stay on `USER root`: the builder stage's `dpkg-deb -x` /
`dpkg --purge libraspberrypi-dev` and the runtime stage's writes to
/sysroot, /opt/vc, /root/.pyenv, /usr/local/bin all require root. This
image is a CI-local Qt 5 cross-compile builder that produces the
WebView tarball as a release artifact — it is never deployed, so the
"don't run as root" guidance behind S6471 doesn't apply in the way it
would for a published runtime image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 30 changed files in this pull request and generated 3 comments.



Comment thread tools/image_builder/utils.py
Comment thread webview/Dockerfile
Comment thread docker/Dockerfile.base.j2
vpetersson and others added 2 commits April 30, 2026 05:34
- Dockerfile.base.j2: comment said libraspberrypi0 comes from
  archive.raspbian.org's `rpi` component, but the Deb822 source
  below correctly declares `Components: firmware`. Verified via
  Packages.gz on archive.raspbian.org/dists/trixie/firmware/
  binary-armhf — that's the only component shipping
  libraspberrypi0 on trixie/armhf. Comment now matches reality.

- image_builder/utils.py: Qt 5 branch comment claimed the modern
  equivalents (libgstreamer1.0-dev, libsqlite3-dev, libsrtp2-dev)
  for the dropped trixie packages were "pulled by the main viewer
  apt list above". libsqlite3-dev / libsrtp2-dev are indeed in
  that list, but libgstreamer1.0-dev is Qt 5-only and is added by
  the extend() call right below — corrected the comment to point
  there instead.

Both are pure comment changes; behavior unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both Docker-build steps in build-webview.yaml had ad-hoc caching that
left the bulk of layer state on the floor:

* `build-docker-image` (Pi 1-4 / Qt 5 builder) used
  `--cache-from screenly/ose-qt-builder:latest`, which is the
  image-tag-as-cache trick — only reuses the final manifest, never the
  apt-install + Qt cross-build intermediate layers, and silently no-ops
  the first time after a Dockerfile reorder invalidates the tag.
* `compile-webview-part-2` (Qt 6 / pi5+pi4-64+x86) shipped with
  `docker compose build` and zero cache config, so every PR rebuilt the
  per-board Qt 6 builder image cold.

Switch both to BuildKit's registry cache backend, identical pattern to
docker-build.yaml's `buildx` job: cache pushed to
`ghcr.io/screenly/anthias-webview-qt5-builder:buildcache` (Qt 5) and
`ghcr.io/screenly/anthias-webview-qt6-builder:buildcache-<board>`
(Qt 6, scoped per-board because the three Dockerfiles share almost
nothing). `mode=max,image-manifest=true` because GHCR rejects the
legacy standalone-cache manifest format on `ghcr.io/screenly/*`, same
constraint that bit the main workflow.

Auth-side details:

* Both jobs gain `permissions: { contents: read, packages: write }`,
  scoped per-job so other jobs don't inherit GHCR push.
* New "Login to GitHub Container Registry" step on each, gated on
  `event_name != 'pull_request'`. Fork PRs hand out a read-only
  GITHUB_TOKEN — cache-to would 401 mid-build — so `cache-to` is
  pushed-only-on-push, while `cache-from` runs unconditionally and
  warm-starts PRs off the latest master cache once the buildcache
  package is flipped public (same convention as anthias-server etc.).
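
The gating rule above (always read the cache, write it only outside pull requests) can be sketched as a small argument builder. This is an illustrative Python helper, not the workflow's actual YAML; the function name is hypothetical, while the cache reference and the `mode=max,image-manifest=true` options are the ones named in this commit message.

```python
def buildx_cache_args(cache_ref: str, event_name: str) -> list[str]:
    """Build the buildx cache flags for a given GitHub event name."""
    # cache-from is safe for every event: it only needs read access.
    args = ["--cache-from", f"type=registry,ref={cache_ref}"]
    if event_name != "pull_request":
        # Fork PRs get a read-only token, so only push events export cache.
        args += [
            "--cache-to",
            f"type=registry,ref={cache_ref},mode=max,image-manifest=true",
        ]
    return args


QT5_CACHE = "ghcr.io/screenly/anthias-webview-qt5-builder:buildcache"
print(buildx_cache_args(QT5_CACHE, "pull_request"))
print(buildx_cache_args(QT5_CACHE, "push"))
```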

Qt 6 build step had to switch from `docker compose build` to
`docker buildx bake -f docker-compose.yml --load --set <target>.cache-*`
because compose's YAML can't carry env-var-conditional cache_to without
emitting an empty list entry that buildx rejects. To keep the
subsequent `docker compose run` happy, the three Qt 6 services in
webview/docker-compose.yml gain explicit `image:` tags
(`webview-builder-{x86,pi5,pi4-64}`) so bake's `--load` puts the image
under a name compose looks up by tag rather than rebuilding it.

The Qt 5 job's old `Set buildx arguments` step (which assembled a
quoted string in $GITHUB_OUTPUT) is gone — build args inline in the
final `docker buildx build` invocation now, no GITHUB_OUTPUT
round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 31 changed files in this pull request and generated 3 comments.



Comment thread webview/Dockerfile Outdated
Comment thread docker/Dockerfile.base.j2
Comment thread tools/image_builder/__main__.py Outdated
vpetersson and others added 4 commits April 30, 2026 05:51
Two intertwined fixes in webview/Dockerfile + the workflow that
publishes/consumes its image. CI never caught either because the
Docker-build step in build-webview.yaml is gated to push events, so
this Trixie-targeted Dockerfile has not yet built on master.

apt: drop the renamed-on-Trixie packages
  Stage 1 (armhf sysroot, archive.raspbian.org + deb.debian.org):
  * libgst-dev          → gone, libgstreamer1.0-dev (already listed)
                          replaces it
  * libsqlite0-dev      → gone, libsqlite3-dev (already listed) replaces
  * libsrtp0-dev        → gone in deb.debian.org/main; libsrtp2-dev
                          (already listed) is the trixie default
  * libpng16-16         → renamed libpng16-16t64 under the time_t
                          transition; old name is fully gone
  Stage 2 (amd64 runtime/builder, deb.debian.org):
  * libpng16-16         → libpng16-16t64
  Verified by GET on
  {deb.debian.org,archive.raspbian.org,archive.raspberrypi.org}/dists/
  trixie/main/binary-{armhf,amd64}/Packages.gz: every removed name is
  MISSING, every replacement is FOUND. Without this fix the first
  master push would die in stage 1's apt-get install.

GHCR migration: screenly/ose-qt-builder → ghcr.io/screenly/anthias-...
  Move the published Qt 5 builder image off Docker Hub and into the
  same GHCR namespace as the rest of the anthias-* artifacts. New ref
  is ghcr.io/screenly/anthias-webview-qt5-builder:latest (image) +
  :buildcache (cache, set up in eadd83d) — one repo, two tags, same
  auth flow.
  * build-docker-image: drop the Docker Hub login step, retag the
    push target to the GHCR ref via an IMAGE_REF env var.
  * compile-webview-part-1: declare permissions: { contents: read,
    packages: read }, add the GHCR login (gated on non-PR), point the
    `docker run` at the GHCR ref.
  Migration window: the GHCR package is created private on first push
  and needs to be flipped public so fork-PR runners (no GHCR auth) can
  pull. Same one-shot operational step as the existing anthias-*
  packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5e28919 fixed the same stale wording in docker/Dockerfile.base.j2
but missed the analogous comment block in
tools/image_builder/__main__.py — flagged by Copilot's second-pass
review.

The comment was a self-referential pointer to the apt-source bootstrap
in Dockerfile.base.j2, claiming libraspberrypi0 lives in
archive.raspbian.org's `rpi` component when in fact it ships under
`firmware` on trixie/armhf (the Deb822 entry written by the same code
correctly says `Components: firmware`). Reword to match reality and
add a note that this was verified against Packages.gz so a future
maintainer doesn't redo the lookup.

Pure comment change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
a9b9522 migrated the Qt 5 builder image from
screenly/ose-qt-builder:latest (Docker Hub) to
ghcr.io/screenly/anthias-webview-qt5-builder:latest (GHCR), but the
publish step (`build-docker-image`) is gated to push events. On PR
runs the GHCR image therefore never exists, and the consumer
(compile-webview-part-1) blew up trying to `docker pull` it:

    Error response from daemon: Head ...manifests/latest: denied

The image is a CI-internal build artifact — only consumed by the next
step in the same workflow, never deployed, never pulled by any
external user. Publishing it as a registry artifact is just inventory
the workflow has to manage. So instead:

* Delete the `build-docker-image` job entirely.
* Move the build into compile-webview-part-1 as a step that runs on
  every event (PR + push), produces the image with `--load`, and tags
  it locally as `webview-qt5-builder:latest` for the subsequent
  `docker run` to consume.
* Keep the registry-cache backend on
  ghcr.io/screenly/anthias-webview-qt5-builder:buildcache so cold
  builds remain fast: `cache-from` always, `cache-to` only on
  push events (fork PRs have a read-only GITHUB_TOKEN and would 401
  on cache write — same gating as docker-build.yaml).

Side benefits:
* Removes the chicken-and-egg of "PR can't run because GHCR image
  doesn't exist; GHCR image only gets pushed on master".
* Drops the cross-job artifact handoff (and the auth dance to read
  the published image), so fork PRs work without any GHCR public-flip
  step.
* Two matrix runners (pi2, pi3) build in parallel from the same
  registry cache — second-onward runs hit cache for everything once
  the first push to master warms it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eadd83d added BuildKit registry-cache backends to both webview build
steps; 3dc0a04 kept them when moving the Qt 5 build inline. The
caching is purely a speed optimization — none of it is load-bearing
for correctness, fork PRs can't write cache anyway, and the per-job
GHCR login + permissions block is real surface area in exchange for
saving a few minutes on warm runs.

Strip it all back out:

* compile-webview-part-1: drop the GHCR login + `permissions:
  packages: write`. The "Build Qt 5 builder image" step is a plain
  `docker buildx build --load` now — same inline-build architecture
  from 3dc0a04, just no `--cache-from` / `--cache-to`.
* compile-webview-part-2: drop the GHCR login + `permissions:`,
  revert "Build Docker Image" from `docker buildx bake -f
  docker-compose.yml --load --set <target>.cache-*` back to plain
  `docker compose build`. COMPOSE_BAKE=true stays so compose still
  uses the bake builder under the hood — no behavior change beyond
  removing the cache flags.

webview/docker-compose.yml's explicit `image:` tags from eadd83d
stay in place: they happen to match the compose default
(`<project>-<service>`) so plain `docker compose build` produces
the same image names the previous bake invocation did, and `compose
run` finds them either way.

Cold pi2/pi3 builds will be ~9 min on every run instead of getting
fast on warm runs. That's fine for now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vpetersson vpetersson requested a review from Copilot April 30, 2026 06:02
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 31 changed files in this pull request and generated no new comments.



build-webview.yaml's pi2/pi3 jobs fetch a pre-built Qt 5
cross-compile toolchain from a `WebView-v*` GitHub release
(webview/build_webview_with_qt5.sh:21 pins QT5_TOOLCHAIN_TAG to
WebView-v0.3.5). The trixie-targeted tarballs
qt5-5.15.14-trixie-{pi2,pi3}.tar.gz don't exist on any release yet —
the original Trixie commit (6531109) called out cutting them as an
out-of-band prereq. Until they exist, pi2/pi3 CI fails with
`sha256sum: no properly formatted checksum lines found` because curl
falls back to a 404 HTML page on the missing .sha256 URL.
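
The failure mode described above comes from feeding an HTML error page to `sha256sum`. A hedged sketch of a stricter check, assuming a hypothetical `parse_sha256_file` helper that is not part of the repo, validates the downloaded text against the `sha256sum` line format before using it:

```python
import re

# A sha256sum line is 64 hex digits, a space, a mode character
# (space for text, '*' for binary), then the file name.
_CHECKSUM_LINE = re.compile(r"^([0-9a-f]{64}) [ *](.+)$")


def parse_sha256_file(text: str) -> dict[str, str]:
    """Return {filename: hex digest}, or raise if the text is not a checksum file."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        match = _CHECKSUM_LINE.match(line)
        if match is None:
            raise ValueError("not a sha256sum file (possibly an HTML error page)")
        entries[match.group(2)] = match.group(1)
    if not entries:
        raise ValueError("checksum file is empty")
    return entries
```

With a check like this, a missing release asset would surface as a clear error instead of the less obvious `no properly formatted checksum lines found`.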

This helper produces those tarballs locally:

* Builds webview/Dockerfile (the same image CI's
  compile-webview-part-1 builds inline) once, --load only.
* Runs build_qt5.sh inside that image once per requested board (pi2
  and pi3 by default, or whichever boards are passed on the command
  line). Builds run sequentially because Qt 5 + QtWebEngine peaks at
  ~16 GB RAM per build and the Linaro cross-compile toolchain
  extracted into .qt5-toolchain-build/src/ is shared between boards.
* Drops outputs at .qt5-toolchain-build/release/qt5-5.15.14-trixie-
  {pi2,pi3}.tar.gz (+ .sha256), ready to upload via
  `gh release upload`.

Idempotent: existing release/<tarball>.tar.gz short-circuits the run
for that board. ccache state is preserved across runs at
.qt5-toolchain-build/ccache/. BUILD_WEBVIEW=0 in the env skips the
bonus webview-* tarball that build_qt5.sh otherwise produces (the
Dockerfile defaults BUILD_WEBVIEW=1 so the helper inherits that
default for parity with the previous CI flow).

The .qt5-toolchain-build/ directory is intentionally hidden + at
the repo root rather than ~/tmp so it's discoverable to whoever
runs this next without grep'ing scrollback for a path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 32 changed files in this pull request and generated 1 comment.



Comment thread .github/workflows/build-webview.yaml
vpetersson and others added 2 commits April 30, 2026 14:47
# Conflicts:
#	tools/image_builder/__main__.py
…ls on trixie

The webview/Dockerfile in this repo wasn't actually exercised end-to-end
before — master CI uses screenly/ose-qt-builder from Docker Hub, and the
inline-build path introduced for trixie only ran build_webview_with_qt5.sh
(which downloads prebuilt qt5 toolchains). Rebuilding those toolchains for
trixie surfaced four real bugs:

* python interpreter never on PATH for non-interactive shells. The pyenv
  block only wired itself up via ~/.bashrc, which doesn't load when the
  rebuild script does `docker run /webview/build_qt5.sh`. Replace pyenv
  with apt-pinned python2.7 from archive.debian.org bullseye (trixie main
  dropped py2 entirely; bullseye archive still ships 2.7.18). Pin only
  python2.7 + its libpython runtime libs, leave everything else on trixie.
  Symlink /usr/local/bin/python -> python2.7 so QtWebEngine's
  `/usr/bin/env python` resolves.

* QtWebEngine configure silently rejected fontconfig because the sysroot
  was missing /usr/share/pkgconfig/bzip2.pc. The Dockerfile only copies
  /lib, /usr/include, /usr/lib from the builder stage; on trixie's
  libbz2-dev the .pc file lives in /usr/share/pkgconfig (arch-indep),
  so freetype2.pc's `Requires.private: bzip2` failed to resolve, which
  cascaded into fontconfig: no, which silently dropped QtWebEngine from
  the build. Add the missing COPY.

* Several QtWebEngine-required dev libs missing from the sysroot
  (libharfbuzz-dev, liblcms2-dev, libre2-dev, libxml2-dev). Same libs
  also need to be installed on the *host* runtime stage because chromium
  pdfium evaluates `harfbuzz_from_pkgconfig` in the host toolchain
  context, where Qt's host_pkg_config="/usr/bin/pkg-config" drops the
  sysroot args from chromium's pkg_config template.

* `make -j$(nproc)+2` OOMs on >8-core hosts. cc1plus under qemu-arm
  peaks at ~3-4 GB during chromium compile, so the default formula
  needs ~50 GB on a 16-core box. Make MAKE_CORES env-overridable in
  build_qt5.sh and have rebuild_qt5_toolchain.sh cap at min(nproc, 8).
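
As a rough check on the numbers in the last bullet: a 16-core host runs 16 + 2 = 18 jobs, and at about 3 GB each that is roughly 54 GB, in line with the ~50 GB figure. The override-plus-cap logic can be sketched as below. The real change lives in shell (build_qt5.sh and rebuild_qt5_toolchain.sh); this Python version, including the `make_cores` name, is only an illustration that folds both pieces into one function.

```python
import os


def make_cores(environ=os.environ, cpu_count=os.cpu_count, cap: int = 8) -> int:
    """Pick a make -j value: honour MAKE_CORES if set, else min(nproc, cap)."""
    override = environ.get("MAKE_CORES")
    if override:
        return max(1, int(override))
    # os.cpu_count() may return None, so fall back to a single job.
    return max(1, min(cpu_count() or 1, cap))


print(make_cores({}, lambda: 16))                   # capped at 8
print(make_cores({"MAKE_CORES": "4"}, lambda: 16))  # explicit override wins
```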

Also: -webengine-proprietary-codecs in the configure args so the
resulting QtWebEngine supports H.264/AAC/MP3 (matches what Debian
qt6-webengine ships).

Verified on a 16-core/22GB+32GB-swap host: produces
qt5-5.15.14-trixie-{pi2,pi3}.tar.gz (88M, 98M) with 251 webengine entries
each, plus the matching webview-*.tar.gz apps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vpetersson and others added 3 commits April 30, 2026 14:52
Trixie qt5-5.15.14-trixie-{pi2,pi3} toolchain tarballs are published on
the new WebView-v2026.04.1 release; the previous WebView-v0.3.5 only
ships the bookworm tarballs and is now unreachable for trixie pi2/pi3 CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h hint

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_HASH

Both `.github/workflows/build-webview.yaml` and `bin/rebuild_qt5_toolchain.sh`
were populating the GIT_HASH build arg with the *short* hash, making
GIT_HASH and GIT_SHORT_HASH identical and stripping the unambiguous
SHA needed by `lib/diagnostics.py:os.getenv('GIT_HASH')` for downstream
traceability. Pass `git rev-parse HEAD` for GIT_HASH and reserve
`--short HEAD` for GIT_SHORT_HASH (which is already what
`tools/image_builder/__main__.py` does for the main service images).
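
A minimal sketch of the corrected behaviour, assuming a hypothetical `git_build_args` helper (the workflow and the shell script do this inline): the full SHA goes to GIT_HASH and only GIT_SHORT_HASH gets the abbreviated form. The `run` parameter exists purely so the function can be exercised without a git checkout.

```python
import subprocess


def git_build_args(run=subprocess.check_output) -> dict[str, str]:
    """Return the two hash build args, keeping the full SHA unambiguous."""
    full = run(["git", "rev-parse", "HEAD"], text=True).strip()
    short = run(["git", "rev-parse", "--short", "HEAD"], text=True).strip()
    return {"GIT_HASH": full, "GIT_SHORT_HASH": short}
```

Called with no arguments inside a checkout, it shells out to git; the result can then be passed to the image build as `--build-arg` values.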

Caught in Copilot review of #2779.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 34 changed files in this pull request and generated no new comments.



The viewer image's `COPY . /usr/src/app/` was slurping in 1.6 GB of
local Qt 5 cross-build state (`.qt5-toolchain-build/`) plus 69 MB of
`.mypy_cache/`, inflating every viewer/server image by ~1.7 GB even
though the build needs none of it. Add those plus `.ruff_cache`,
`.idea`, `.cursor`, `.claude`, `.cache`, and tighten the existing
`*.git` / `*.github` globs (which match files ending in `.git` /
`.github` but not the directories themselves on most matchers) to
the literal directory names.

Caught while validating the trixie 5-board matrix: x86 viewer was
6.28 GB and pi5 viewer 2.23 GB; both had the same 1.76 GB COPY layer
that's mostly `.qt5-toolchain-build/`. Fixed image should be ~5 MB
for COPY and ~1.5 GB for the viewer overall.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@vpetersson vpetersson merged commit d9ebc80 into master Apr 30, 2026
17 checks passed