ci: upgrade to Fission v1.25.0 and modernize workflows #436

Merged
sanketsudake merged 1 commit into master from ci-fission-1.25
Jun 6, 2026

@sanketsudake


What

First of a series of PRs updating all environments to latest secure versions. This one touches only CI/build infra (no env directories), so it should merge first.

Fission v1.25.0

  • FISSION_VERSION v1.21.0 → v1.25.0 in rules.mk, workflow env, setup-cluster action default.
  • skaffold helm remoteChart → fission-all-1.25.0 (note: new chart release tags drop the v prefix; URL verified live).
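
The tag-naming change above can be sketched in a few lines of Python. This is only an illustration: `chart_release_tag` is a hypothetical helper, not something in the repo, and it assumes the chart tag is always `fission-all-` followed by the version without its leading `v`.

```python
def chart_release_tag(fission_version: str) -> str:
    """Map a FISSION_VERSION such as 'v1.25.0' to a chart tag such as 'fission-all-1.25.0'."""
    # Newer chart releases drop the leading 'v'; accept input with or without it.
    # str.removeprefix needs Python 3.9 or later.
    return f"fission-all-{fission_version.removeprefix('v')}"


print(chart_release_tag("v1.25.0"))
```

Deriving the tag from a single version variable avoids the case where FISSION_VERSION is bumped but the chart reference keeps the old `v`-prefixed form.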

Cluster tooling

  • kind v0.23.0 → v0.32.0, kindest/node v1.27.16 (k8s 1.27 is EOL) → v1.34.8 digest-pinned — matches the kind versions fission v1.25.0 itself tests against.
  • helm v3.18.4 → v3.21.0.
  • Azure/setup-helm → v4.3.1, engineerd/setup-kind → v0.6.2 (both SHA-pinned), actions/setup-python v2 → v6, checkout/paths-filter/qemu/buildx bumps in release.yaml.

Workflow fixes & optimizations

  • concurrency + cancel-in-progress so superseded PR pushes don't burn runners.
  • Fixed python job gate: contains( needs.check.outputs, 'python' ) was missing .packages. Also switched python/jvm gates to exact-match ('"jvm"') so python-fastapi/jvm-jersey changes no longer cross-trigger the python/jvm jobs.
  • Added the missing jvm-jersey job (a path filter existed but no job consumed it — jvm-jersey changes were never CI-built).
  • jvm job: removed redundant paths-filter step, added Collect Fission Dump on failure.
  • Collect Fission Dump now runs only on failure in binary/go jobs (previously ran unconditionally).
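
To see why the exact-match gate matters, here is a small Python sketch of the two matching styles. It assumes the `packages` output is a JSON-encoded list of changed package names (the sample value is hypothetical), and it mirrors `contains()` with a plain substring test; the real GitHub Actions function also ignores case, which this sketch does not model.

```python
import json

# Hypothetical value of needs.check.outputs.packages after a python-fastapi-only change.
packages_output = json.dumps(["python-fastapi"])


def gate_substring(packages: str, name: str) -> bool:
    # Mirrors contains(packages, 'python'): a bare substring search.
    return name in packages


def gate_exact(packages: str, name: str) -> bool:
    # Mirrors contains(packages, '"python"'): the JSON quotes anchor both ends of the name.
    return f'"{name}"' in packages


print(gate_substring(packages_output, "python"))  # the python job would cross-trigger
print(gate_exact(packages_output, "python"))      # the python job stays skipped
```

The same reasoning applies to `jvm` versus `jvm-jersey`: including the surrounding quotes in the search string turns a prefix match into a whole-name match.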

Release pipeline repair (hack/release_check.py)

The release gate has been silently broken — evidence: python-env envconfig.json says 1.34.3 but GHCR only has 1.34.2. Three bugs fixed:

  1. GHCR tags/list requires a bearer token (anonymous token works for public images); the unauthenticated call always 401ed.
  2. tag in json_resp["images"] — the images key never existed (should be tags), so the check could never return "already published".
  3. ::set-output workflow command has been removed by GitHub; now writes to $GITHUB_OUTPUT. Also release_needed now emits/gates on 'true' consistently (previously json.dumps emitted true but the workflow compared against 'True').
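
Bugs 2 and 3 can be illustrated with a minimal Python sketch. This is not the code in hack/release_check.py: the function names and the sample payload are assumptions made for illustration, and the network call (including the bearer token from bug 1) is left out so the example runs offline.

```python
import json
import os
import tempfile


def tag_published(json_resp: dict, tag: str) -> bool:
    # A registry tags/list response carries a "tags" list; there is no "images" key.
    # "tags" can be null for an image with no tags, hence the "or []".
    return tag in (json_resp.get("tags") or [])


def write_output(name: str, value: str, path: str) -> None:
    # Replacement for the removed ::set-output command: append name=value to the
    # file that GitHub Actions exposes through $GITHUB_OUTPUT.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


# Hypothetical payload; the tag values mirror the python-env example above.
sample = {"name": "example/python-env", "tags": ["1.34.2", "latest"]}
release_needed = not tag_published(sample, "1.34.3")

# Outside Actions, fall back to a temp file so the sketch still runs.
out_path = os.environ.get("GITHUB_OUTPUT") or os.path.join(tempfile.mkdtemp(), "out")

# json.dumps(True) yields the lowercase string "true", which is what the workflow compares against.
write_output("release_needed", json.dumps(release_needed), out_path)
```

Emitting the flag with `json.dumps` and comparing against `'true'` in the workflow keeps both sides on one spelling; `str(True)` would produce `'True'` instead.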

Error handling is now fail-closed: auth/rate-limit/5xx responses raise and fail the check job instead of being misread as "image absent" (which previously re-pushed every image including latest).
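
A fail-closed check along these lines might look like the sketch below. The names `RegistryCheckError` and `is_published` are hypothetical, and treating 404 as the only "image absent" status is an assumption; the actual script may draw the line differently.

```python
class RegistryCheckError(RuntimeError):
    """Raised when the registry response cannot be trusted as a yes/no answer."""


def is_published(status: int, body: dict, tag: str) -> bool:
    if status == 200:
        return tag in (body.get("tags") or [])
    if status == 404:
        # Assumption: 404 is the only status read as "image not published yet".
        return False
    # 401/403 (auth), 429 (rate limit), 5xx and anything unexpected:
    # raise so the check job fails instead of triggering a re-push.
    raise RegistryCheckError(f"unexpected registry status {status}")


print(is_published(200, {"tags": ["1.34.2"]}, "1.34.3"))
```

Raising on ambiguous responses means a transient registry problem stops the release job rather than republishing every image.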

Verified by running the script live against GHCR: correctly detects published go-env tags and the unpublished python-env:1.34.3.

Note: once this merges, the next release run will publish any envconfig versions that were never released due to the broken gate (e.g. python-env:1.34.3).

Follow-up suggestions (not in this PR)

  • Add dependabot.yml ecosystem for github-actions.
  • Docker layer caching (type=gha) for CI image builds.
  • Functional smoke tests for build-only envs (perl/php7/ruby/tensorflow/jvm-jersey).

Verification

  • YAML validated; release_check.py exercised live against GHCR (both bracket-quoted and plain package-list formats).
  • This PR touches no env dirs, so no env jobs trigger; the workflow changes take effect for subsequent env PRs in this series.

🤖 Generated with Claude Code

- Bump FISSION_VERSION v1.21.0 -> v1.25.0 in rules.mk, workflow env,
  setup-cluster action default and skaffold helm chart URL (new chart
  tags drop the 'v' prefix: fission-all-1.25.0)
- Bump kind v0.23.0 -> v0.32.0 with kindest/node v1.34.8 (digest
  pinned; matches fission v1.25.0's own CI matrix), helm -> v3.21.0
- Action bumps: setup-helm v4.3.1, setup-kind v0.6.2 (SHA pinned),
  setup-python v6, checkout v4, paths-filter v3, qemu/buildx v3
- Add workflow concurrency with cancel-in-progress for PR runs
- Fix python job gate referencing needs.check.outputs instead of
  .packages; use exact-match quotes for python/jvm so python-fastapi
  and jvm-jersey changes no longer cross-trigger those jobs
- Add missing jvm-jersey build job (filter existed, job did not)
- jvm job: drop redundant paths-filter step, add Collect Fission Dump
- Run Collect Fission Dump only on failure in binary/go jobs
- release_check.py: fix GHCR tag check (anonymous bearer token; the
  unauthenticated call always 401ed and the 'images' key never
  existed, so every release re-pushed all images), fail closed on
  auth/rate-limit/5xx errors, write to GITHUB_OUTPUT instead of the
  removed ::set-output command, and gate release_needed on 'true'

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sanketsudake added a commit that referenced this pull request Jun 6, 2026
…r Hub image

The jvm e2e test relied on the utils.sh default JVM_RUNTIME_IMAGE
(fission/jvm-env from Docker Hub, years out of date), so the image
loaded into kind by `make jvm-test-images` was never exercised and the
freshly built example jar (release 17 bytecode) failed to load on the
ancient runtime. Pin JVM_RUNTIME_IMAGE=jvm-env in the test, matching
the go/binary/nodejs/python test convention.

Also add Collect Fission Dump on failure to the jvm job (same hunk as
in #436) so future e2e failures leave pod-level diagnostics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sanketsudake merged commit 9d80d1c into master on Jun 6, 2026
15 checks passed
sanketsudake added a commit that referenced this pull request Jun 6, 2026
* ci: fix broken setup-kind action with helm/kind-action

engineerd/setup-kind v0.6.2 fails with 'File not found:
dist/main/index.js' (the tagged commit does not ship the compiled
action), breaking the setup-cluster step for every env job. Switch to
the maintained helm/kind-action v1.14.0 with cluster_name kind so the
kind-kind kubectl context and `kind load docker-image` defaults keep
working.

Also make collect-fission-dump best effort: a missing fission CLI
(setup failed before installing it) or empty dump no longer masks the
original job failure.

The broken bump in #436 went unnoticed because that PR touched no env
directories, so no job exercised setup-cluster pre-merge; this PR
includes a perl README link fix so the perl job validates the action
end to end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(perl): run skaffold via make instead of the stale skaffold action

Same fix as #442/#443/#444: hiberbee/github-action-skaffold pins
skaffold 2.3.1, which cannot parse the repo's skaffold/v4beta13
config. Needed here so the perl job can validate the kind-action
hotfix end to end (identical hunk to #444, merges cleanly).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sanketsudake added a commit that referenced this pull request Jun 6, 2026
…n from master which supersedes this branch's jvm-job hunk
sanketsudake added a commit that referenced this pull request Jun 6, 2026
… source (#437)

* fix(jvm): restore Java env build by installing fission-java-core from source

io.fission:fission-java-core:0.0.2-SNAPSHOT was only ever published to
the oss.sonatype.org OSSRH snapshots repository, which has been
decommissioned, so the jvm env, its builder and the example function
could no longer resolve the dependency and CI failed.

- Add install-fission-java-core.sh which builds the artifact from a
  pinned commit of fission/fission-java-libs and installs it into the
  local Maven repository under 0.0.1, 0.0.2 and 0.0.2-SNAPSHOT (the
  SNAPSHOT version keeps pre-existing user functions building against
  the jvm-builder image working). The script normalizes the library
  pom's missing XML namespace declarations which strict plugin parsers
  reject. Duplicated into jvm/builder/ because the builder image is
  built with that directory as its Docker context.
- Reference fission-java-core 0.0.2 (non-SNAPSHOT) from jvm/pom.xml and
  the example pom, and drop the dead oss.sonatype.org <repositories>
  blocks.
- jvm/tests/test_java_env.sh installs the artifact inside the clean
  maven build container before packaging the example function.
- Upgrade Spring Boot 3.3.2 -> 3.5.14, Java 22 -> 25 LTS
  (maven 3.9.16-eclipse-temurin-25-alpine / eclipse-temurin:25-jdk-alpine),
  example spring-boot-starter-web 2.0.1.RELEASE -> 3.5.14 and
  surefire 2.22.1 -> 3.5.4.
- Bump envconfig version to 1.32.0 (runtimeVersion 25) and regenerate
  environments.json via make update-env-json (it was stale).

Verified locally: env + builder images build, the example jar builds
exactly as the CI test does, and a container smoke test passes
(/v2/specialize 200, invoke returns "Hello World!" on Java 25.0.3 /
Spring Boot 3.5.14).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(jvm): test the locally built jvm-env image instead of stale Docker Hub image

The jvm e2e test relied on the utils.sh default JVM_RUNTIME_IMAGE
(fission/jvm-env from Docker Hub, years out of date), so the image
loaded into kind by `make jvm-test-images` was never exercised and the
freshly built example jar (release 17 bytecode) failed to load on the
ancient runtime. Pin JVM_RUNTIME_IMAGE=jvm-env in the test, matching
the go/binary/nodejs/python test convention.

Also add Collect Fission Dump on failure to the jvm job (same hunk as
in #436) so future e2e failures leave pod-level diagnostics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sanketsudake added a commit that referenced this pull request Jun 6, 2026
…endencies (#439)

python:
- Base image 3.11-alpine -> 3.13-alpine everywhere (Dockerfiles,
  Makefile buildargs incl. the previously missed builder Makefiles,
  skaffold buildArgs, READMEs)
- requirements.txt: all pins to latest — Flask 2.1.1 -> 3.1.3,
  Werkzeug 2.2.2 -> 3.1.8, gevent 22.10.2 -> 26.5.0, greenlet 3.5.1,
  bjoern 3.2.2, redis 8.0.0, requests 2.34.2, sentry-sdk 2.61.1,
  urllib3 2.7.0 and friends
- flask_sockets.py: import parse_cookie from werkzeug.sansio.http
  (removed from werkzeug.http in Werkzeug 2.3+)

python-fastapi:
- Base image 3.13-alpine; fastapi 0.114.0 -> 0.136.3,
  uvicorn 0.30.6 -> 0.49.0

CI:
- Fix python job gate (contains was missing .packages, so the job
  never triggered); exact-match quotes to stop python-fastapi changes
  cross-triggering the python job (hunks mirror #436)
- setup-python @v2 -> @v6, python-version 3.13 in both python jobs

envconfig version 1.35.0 for both; environments.json regenerated.

Verified locally on python 3.13: full dependency install (bjoern
compiles), server in bjoern mode (all HTTP methods), GEVENT mode
(flask_sockets path) and fastapi server all pass healthz/specialize/
invoke; env + builder images build on 3.13-alpine.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>