feat(featureflags): add build-kernel-version flag for new builds#2592
feat(featureflags): add build-kernel-version flag for new builds#2592ValentaTomas merged 5 commits intomainfrom
Conversation
Mirror the existing build-firecracker-version flag with a build-kernel-version flag so the kernel baked into NEW template builds can be rolled forward without a code change. Existing snapshots/builds keep their pinned kernel version. Flag name matches what orbit (belt) already references.
PR SummaryMedium Risk Overview The API still includes kernel/firecracker versions in build requests while the orchestrator now resolves kernel via feature flags too, so inconsistent flag context or future partial removal on one side could cause mismatched/pinned versions and reduce build reproducibility. Reviewed by Cursor Bugbot for commit 127ca29. Bugbot is set up for automated code reviews on this repo. Configure here. |
❌ 9 Tests Failed:
View the full list of 10 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
There was a problem hiding this comment.
LGTM — straightforward mirror of the existing build-firecracker-version flag.
Extended reasoning...
Overview
This PR adds a BuildKernelVersion LaunchDarkly string flag that mirrors the existing BuildFirecrackerVersion flag, lets the API handlers and orchestrator resolve the kernel version baked into new template builds via LD instead of from Config.DefaultKernelVersion/featureflags.DefaultKernelVersion, and introduces an opt-in NewStringFlagFallbackOnEmpty constructor so an empty LD string also falls through to the fallback. Five files touched; the diff is mechanical.
Security risks
None identified. No auth, crypto, permission, or input-validation surfaces are touched. The feature flag plumbing is additive and the new fallbackWhenEmpty field is opt-in (only BuildKernelVersion uses it), so behavior of every existing StringFlag is unchanged.
Level of scrutiny
Low–medium. The change influences which kernel image new builds get pinned to, but existing snapshots/builds keep their stored env_builds.kernel_version (set at build time and not re-read on resume), and the LD default is wired through the same DEFAULT_KERNEL_VERSION env var → in-tree DefaultKernelVersion constant chain that the previous code used. The shape is identical to the already-shipped BuildFirecrackerVersion flag.
Other factors
The only bug surfaced is a nit (the Config.DefaultKernelVersion field — already marked // Deprecated — and its post-parse default-fill block are now write-only) and is explicitly out of scope. The integration-tests CI failure shown in the timeline is a 'matrix result: cancelled' aggregate, not a real test failure. Recent commits (refactor(featureflags): bake empty-fallback into StringFlag opt-in, fix(featureflags): fall back to DefaultKernelVersion when LD flag is empty) show the author already iterated on the empty-fallback semantics in response to earlier feedback.
| MemoryMB: body.MemoryMB, | ||
| Version: templates.TemplateV1Version, | ||
| KernelVersion: a.config.DefaultKernelVersion, | ||
| KernelVersion: kernelVersion, |
There was a problem hiding this comment.
🟡 Nit: this PR removes the only consumers of Config.DefaultKernelVersion (both a.config.DefaultKernelVersion reads in this file and in template_request_build_v3.go are replaced with the BuildKernelVersion feature flag). That leaves the DefaultKernelVersion field in packages/api/internal/cfg/model.go (line 71, already marked // Deprecated) and its default-fill block (lines 141-142) as dead code — worth a follow-up cleanup, not blocking this PR.
Extended reasoning...
What's now dead
After this PR, the API's Config.DefaultKernelVersion field has no remaining readers:
packages/api/internal/handlers/deprecated_template_request_build.go:210— wasKernelVersion: a.config.DefaultKernelVersion, nowKernelVersion: kernelVersion(resolved from theBuildKernelVersionfeature flag).packages/api/internal/handlers/template_request_build_v3.go:133— same swap.
A grep for DefaultKernelVersion across packages/api after this PR returns only three hits, all inside packages/api/internal/cfg/model.go itself:
packages/api/internal/cfg/model.go:71: DefaultKernelVersion string `env:"DEFAULT_KERNEL_VERSION"`
packages/api/internal/cfg/model.go:141: if config.DefaultKernelVersion == "" {
packages/api/internal/cfg/model.go:142: config.DefaultKernelVersion = featureflags.DefaultKernelVersion
The field already carries a // Deprecated: Template manager should use its own comment on line 70, suggesting the author was aware it was on its way out — this PR finishes that job at the call sites but leaves the struct field and its post-parse default-fill behind.
Why removal is safe
The DEFAULT_KERNEL_VERSION env var that this field reads is still functional, but it is now consumed independently in packages/shared/pkg/featureflags/flags.go:
BuildKernelVersion = NewStringFlagFallbackOnEmpty("build-kernel-version", env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion))So removing the API config field does not change observable behavior: the env var still flows into the kernel-version chosen for new builds, just via the feature-flag fallback path instead of via the API Config struct.
Step-by-step proof
- Before this PR: API service starts →
cfg.Parse()readsDEFAULT_KERNEL_VERSIONenv intoconfig.DefaultKernelVersion(and falls back tofeatureflags.DefaultKernelVersionon empty). Build handlers reada.config.DefaultKernelVersionto populateRegisterBuildData.KernelVersion. - After this PR: API service starts → same parse happens, but no handler reads the field. Build handlers call
a.featureFlags.StringFlag(ctx, featureflags.BuildKernelVersion)instead. TheBuildKernelVersionflag was constructed at package init fromenv.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion), so the same env var still drives the value. - Net effect: lines 71 and 141-142 of
cfg/model.goare write-only — nothing reads them, and removing them has no functional impact.
Impact / fix
Zero runtime impact — purely cleanup. Fix is to delete the DefaultKernelVersion field from the Config struct and the corresponding if config.DefaultKernelVersion == "" { ... } block in Parse(). Out of scope for this PR (the file isn't in the diff), so flagging as a follow-up nit.
Code Review by Qodo
1. BuildFirecrackerVersion missing empty-string fallback
|
| BuildFirecrackerVersion = NewStringFlag("build-firecracker-version", env.GetEnv("DEFAULT_FIRECRACKER_VERSION", DefaultFirecrackerVersion)) | ||
| BuildKernelVersion = NewStringFlagFallbackOnEmpty("build-kernel-version", env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion)) |
There was a problem hiding this comment.
1. Buildfirecrackerversion missing empty-string fallback 🐞 Bug ☼ Reliability
The PR adds FallbackWhenEmpty protection to BuildKernelVersion but not to BuildFirecrackerVersion.
If LaunchDarkly evaluates build-firecracker-version to an empty string, the empty value propagates
to fcversion.New(""), which returns an error and aborts every template build.
Agent Prompt
## Issue description
BuildFirecrackerVersion is created with NewStringFlag, which does not fall back when LaunchDarkly returns an empty string. The PR introduces NewStringFlagFallbackOnEmpty for BuildKernelVersion but omits the same protection for BuildFirecrackerVersion. An empty LD evaluation propagates to fcversion.New(""), which errors and aborts all template builds.
## Issue Context
The StringFlag.FallbackWhenEmpty mechanism was added in this PR specifically to handle the case where a flag is defined in LD without a value. Both kernel and firecracker version flags share the same risk profile, but only the kernel flag opts in.
## Fix Focus Areas
- packages/shared/pkg/featureflags/flags.go[278-278]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
The empty-string fallback mechanism (and NewStringFlagFallbackOnEmpty) was reverted in c9309f1, so BuildKernelVersion and BuildFirecrackerVersion now have identical behavior — both will propagate an empty LD value. This was an explicit decision to keep the kernel flag consistent with the FC flag.
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
Mirrors the existing
build-firecracker-versionflag with a newbuild-kernel-versionflag so the kernel baked into NEW template builds can be rolled forward via LaunchDarkly without a code change. Existing snapshots/builds keep their pinned kernel version (env_builds.kernel_versionis set at build time and not touched on resume).Flag name matches what orbit (belt) already references (
LD_FLAG_BUILD_KERNEL_VERSION = 'build-kernel-version').Default value falls back to
DEFAULT_KERNEL_VERSIONenv var, then to the in-treeDefaultKernelVersionconstant — same shape as the FC flag.