feat(featureflags): add build-kernel-version flag for new builds #2592

Merged
ValentaTomas merged 5 commits into main from feat/build-kernel-version-flag
May 8, 2026

Conversation

@ValentaTomas
Member

Mirrors the existing build-firecracker-version flag with a new build-kernel-version flag so the kernel baked into NEW template builds can be rolled forward via LaunchDarkly without a code change. Existing snapshots/builds keep their pinned kernel version (env_builds.kernel_version is set at build time and not touched on resume).

Flag name matches what orbit (belt) already references (LD_FLAG_BUILD_KERNEL_VERSION = 'build-kernel-version').

Default value falls back to DEFAULT_KERNEL_VERSION env var, then to the in-tree DefaultKernelVersion constant — same shape as the FC flag.
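The fallback chain for the flag default can be sketched roughly as below. This is a minimal illustration, not the repository's code: `kernelVersionDefault`, the injected `lookup` function, and the placeholder constant value are all assumptions, and whether the real `env.GetEnv` helper treats an empty variable as unset is also an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// defaultKernelVersion stands in for the in-tree DefaultKernelVersion constant.
// The value is a placeholder, not the real default.
const defaultKernelVersion = "kernel-placeholder"

// kernelVersionDefault returns the value used when the flag provider supplies
// nothing: the DEFAULT_KERNEL_VERSION environment variable if it is set and
// non-empty (an assumption), otherwise the in-tree constant.
// The lookup function is injected so the chain can be exercised without
// touching the process environment.
func kernelVersionDefault(lookup func(string) (string, bool)) string {
	if v, ok := lookup("DEFAULT_KERNEL_VERSION"); ok && v != "" {
		return v
	}
	return defaultKernelVersion
}

func main() {
	// os.LookupEnv has the signature the helper expects.
	fmt.Println("build-kernel-version default:", kernelVersionDefault(os.LookupEnv))
}
```

A value served by LaunchDarkly would take precedence over this default; the sketch only covers the two fallback steps named above.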

Mirror the existing build-firecracker-version flag with a build-kernel-version
flag so the kernel baked into NEW template builds can be rolled forward without
a code change. Existing snapshots/builds keep their pinned kernel version.

Flag name matches what orbit (belt) already references.
@cla-bot cla-bot Bot added the cla-signed label May 8, 2026
@cursor

cursor Bot commented May 8, 2026

PR Summary

Medium Risk
Changes how kernel versions are chosen for new template builds via LaunchDarkly, which can unintentionally roll forward kernels across clusters/teams if the flag is misconfigured. The API/orchestrator may now diverge on which source of truth is used, impacting build reproducibility.

Overview
Kernel version selection for new template builds is now driven by the build-kernel-version LaunchDarkly flag instead of the API config default; if the flag is unset/mis-scoped, builds may unexpectedly change kernels.

The API still includes kernel/firecracker versions in build requests while the orchestrator now resolves kernel via feature flags too, so inconsistent flag context or future partial removal on one side could cause mismatched/pinned versions and reduce build reproducibility.

Reviewed by Cursor Bugbot for commit 127ca29.

@codecov

codecov Bot commented May 8, 2026

❌ 9 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 2594 | 9 | 2585 | 7 |
View the full list of 10 ❄️ flaky test(s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/sandboxes::TestUpdateNetworkConfig

Flake rate in main: 72.91% (Passed 55 times, Failed 148 times)

Stack Traces | 192s run time
=== RUN   TestUpdateNetworkConfig
=== PAUSE TestUpdateNetworkConfig
=== CONT  TestUpdateNetworkConfig
--- FAIL: TestUpdateNetworkConfig (192.06s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/sandboxes::TestUpdateNetworkConfig/pause_resume_preserves_allow_internet_access_false

Flake rate in main: 73.10% (Passed 53 times, Failed 144 times)

Stack Traces | 1.32s run time
=== RUN   TestUpdateNetworkConfig/pause_resume_preserves_allow_internet_access_false
Executing command curl in sandbox iivs4uhd1nk54p855ea6v
    sandbox_network_update_test.go:372: Command [curl] output: event:{start:{pid:1366}}
    sandbox_network_update_test.go:372: Command [curl] output: event:{end:{exit_code:35  exited:true  status:"exit status 35"  error:"exit status 35"}}
    sandbox_network_update_test.go:372: Command [curl] output: event:{start:{pid:1367}}
    sandbox_network_update_test.go:372: Command [curl] output: event:{end:{exit_code:35  exited:true  status:"exit status 35"  error:"exit status 35"}}
    sandbox_network_update_test.go:391: Command [curl] output: event:{start:{pid:1368}}
    sandbox_network_update_test.go:391: Command [curl] output: event:{data:{stdout:"HTTP/2 302 \r\nx-content-type-options: nosniff\r\nlocation: https://dns.google/\r\ndate: Fri, 08 May 2026 01:10:32 GMT\r\ncontent-type: text/html; charset=UTF-8\r\nserver: HTTP server (unknown)\r\ncontent-length: 216\r\nx-xss-protection: 0\r\nx-frame-options: SAMEORIGIN\r\nalt-svc: h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000\r\n\r\n"}}
    sandbox_network_update_test.go:391: Command [curl] output: event:{end:{exited:true  status:"exit status 0"}}
    sandbox_network_update_test.go:391: Command [curl] completed successfully in sandbox iivs4uhd1nk54p855ea6v
    sandbox_network_update_test.go:391: 
        	Error Trace:	.../api/sandboxes/sandbox_network_out_test.go:74
        	            				.../api/sandboxes/sandbox_network_update_test.go:60
        	            				.../api/sandboxes/sandbox_network_update_test.go:391
        	Error:      	An error is expected but got nil.
        	Test:       	TestUpdateNetworkConfig/pause_resume_preserves_allow_internet_access_false
        	Messages:   	https://8.8.8.8 should be blocked
--- FAIL: TestUpdateNetworkConfig/pause_resume_preserves_allow_internet_access_false (1.32s)
github.com/e2b-dev/infra/tests/integration/internal/tests/api/sandboxes::TestUpdateNetworkConfig/pause_resume_preserves_rules

Flake rate in main: 52.27% (Passed 42 times, Failed 46 times)

Stack Traces | 0.04s run time
=== RUN   TestUpdateNetworkConfig/pause_resume_preserves_rules
Executing command curl in sandbox ikzb42n4qavd63i9ahj68
    sandbox_network_update_test.go:341: Command [curl] output: event:{start:{pid:1344}}
    sandbox_network_update_test.go:341: 
        	Error Trace:	.../api/sandboxes/sandbox_network_out_test.go:67
        	            				.../api/sandboxes/sandbox_network_update_test.go:58
        	            				.../api/sandboxes/sandbox_network_update_test.go:341
        	Error:      	Received unexpected error:
        	            	failed to execute command curl in sandbox ikzb42n4qavd63i9ahj68: invalid_argument: protocol error: incomplete envelope: unexpected EOF
        	Test:       	TestUpdateNetworkConfig/pause_resume_preserves_rules
        	Messages:   	https://8.8.8.8 should be reachable
--- FAIL: TestUpdateNetworkConfig/pause_resume_preserves_rules (0.04s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestBindLocalhost

Flake rate in main: 52.54% (Passed 84 times, Failed 93 times)

Stack Traces | 0s run time
=== RUN   TestBindLocalhost
=== PAUSE TestBindLocalhost
=== CONT  TestBindLocalhost
--- FAIL: TestBindLocalhost (0.00s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestBindLocalhost/bind_0_0_0_0

Flake rate in main: 58.12% (Passed 49 times, Failed 68 times)

Stack Traces | 8.01s run time
=== RUN   TestBindLocalhost/bind_0_0_0_0
=== PAUSE TestBindLocalhost/bind_0_0_0_0
=== CONT  TestBindLocalhost/bind_0_0_0_0
    localhost_bind_test.go:69: Command [python] output: event:{start:{pid:1251}}
    localhost_bind_test.go:90: 
        	Error Trace:	.../tests/envd/localhost_bind_test.go:90
        	Error:      	Not equal: 
        	            	expected: 200
        	            	actual  : 502
        	Test:       	TestBindLocalhost/bind_0_0_0_0
        	Messages:   	Unexpected status code 502 for bind address 0.0.0.0
--- FAIL: TestBindLocalhost/bind_0_0_0_0 (8.01s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestBindLocalhost/bind_127_0_0_1

Flake rate in main: 52.83% (Passed 50 times, Failed 56 times)

Stack Traces | 7.19s run time
=== RUN   TestBindLocalhost/bind_127_0_0_1
=== PAUSE TestBindLocalhost/bind_127_0_0_1
=== CONT  TestBindLocalhost/bind_127_0_0_1
Executing command python in sandbox ikkwjmc6miekcjcykzgg5
    localhost_bind_test.go:69: Command [python] output: event:{start:{pid:1250}}
    localhost_bind_test.go:90: 
        	Error Trace:	.../tests/envd/localhost_bind_test.go:90
        	Error:      	Not equal: 
        	            	expected: 200
        	            	actual  : 502
        	Test:       	TestBindLocalhost/bind_127_0_0_1
        	Messages:   	Unexpected status code 502 for bind address 127.0.0.1
--- FAIL: TestBindLocalhost/bind_127_0_0_1 (7.19s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestBindLocalhost/bind_::1

Flake rate in main: 59.50% (Passed 49 times, Failed 72 times)

Stack Traces | 8.99s run time
=== RUN   TestBindLocalhost/bind_::1
=== PAUSE TestBindLocalhost/bind_::1
=== CONT  TestBindLocalhost/bind_::1
    localhost_bind_test.go:69: Command [python] output: event:{start:{pid:1250}}
Executing command python in sandbox i9lvnn731t7uq2t21t97n
    localhost_bind_test.go:90: 
        	Error Trace:	.../tests/envd/localhost_bind_test.go:90
        	Error:      	Not equal: 
        	            	expected: 200
        	            	actual  : 502
        	Test:       	TestBindLocalhost/bind_::1
        	Messages:   	Unexpected status code 502 for bind address ::1
--- FAIL: TestBindLocalhost/bind_::1 (8.99s)
github.com/e2b-dev/infra/tests/integration/internal/tests/envd::TestBindLocalhost/bind_localhost

Flake rate in main: 59.50% (Passed 49 times, Failed 72 times)

Stack Traces | 7.67s run time
=== RUN   TestBindLocalhost/bind_localhost
=== PAUSE TestBindLocalhost/bind_localhost
=== CONT  TestBindLocalhost/bind_localhost
Executing command python in sandbox ic9gt0zwa01pgz1vj251x
    localhost_bind_test.go:69: Command [python] output: event:{start:{pid:1250}}
    localhost_bind_test.go:90: 
        	Error Trace:	.../tests/envd/localhost_bind_test.go:90
        	Error:      	Not equal: 
        	            	expected: 200
        	            	actual  : 502
        	Test:       	TestBindLocalhost/bind_localhost
        	Messages:   	Unexpected status code 502 for bind address localhost
--- FAIL: TestBindLocalhost/bind_localhost (7.67s)
Executing command python in sandbox it7dzp0pxbhjwjzxhtbex
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity

Flake rate in main: 60.67% (Passed 59 times, Failed 91 times)

Stack Traces | 83.5s run time
=== RUN   TestSandboxMemoryIntegrity
=== PAUSE TestSandboxMemoryIntegrity
=== CONT  TestSandboxMemoryIntegrity
    sandbox_memory_integrity_test.go:26: Build completed successfully
--- FAIL: TestSandboxMemoryIntegrity (83.52s)
github.com/e2b-dev/infra/tests/integration/internal/tests/orchestrator::TestSandboxMemoryIntegrity/tmpfs_hash

Flake rate in main: 63.43% (Passed 49 times, Failed 85 times)

Stack Traces | 50.6s run time
=== RUN   TestSandboxMemoryIntegrity/tmpfs_hash
=== PAUSE TestSandboxMemoryIntegrity/tmpfs_hash
=== CONT  TestSandboxMemoryIntegrity/tmpfs_hash
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{start:{pid:1264}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Total memory: 985 MB\nUsed memory before tmpfs mount: 186 MB\nFree memory before tmpfs mount: 798 MB\nMemory to use in integrity test (80% of free, min 64MB): 638 MB\n"}}
Executing command bash in sandbox il0k8oggtzjkpkrj9bkyf (user: root)
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"638+0 records in\n638+0 records out\n668991488 bytes (669 MB, 638 MiB) copied, 20.6529 s, 32.4 MB/s\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\tCommand bei"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ng timed: \"dd if=/dev/urandom of=/mnt/testfile bs=1M count=638\"\n\tUser t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ime (seconds): "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"0.00\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"\tSystem time"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" (seconds): 20.29\n\tPercent of CPU this job got: 97%\n\tElapsed (wall cloc"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"k) time (h:mm:ss or m:ss): 0:20.71\n\tAverage shared t"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ext size (kbytes): 0\n\tAverage unshared d"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ata size (k"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"bytes): 0\n\tAverage stack si"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"ze (kbytes): 0\n\tAverage total size (kbytes): 0\n\tMaximum "}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"resident s"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:"et size (kbytes): 2604\n\tAverage resident set"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stderr:" size (kbytes): 0\n\tMajor (requiring I/O) page faults: 2\n\tMinor (reclaiming a frame) page faults: 343\n\tVoluntary context switches: 3\n\tInvoluntary context switches: 113\n\tSwaps: 0\n\tFile system inputs: 176\n\tFile system outputs: 0\n\tSocket messages sent: 0\n\tSocket messages received: 0\n\tSignals delivered: 0\n\tPage size (bytes): 4096\n\tExit status: 0\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{data:{stdout:"Used memory after tmpfs mount and file fill: 832 MB\n"}}
    sandbox_memory_integrity_test.go:70: Command [bash] output: event:{end:{exited:true status:"exit status 0"}}
    sandbox_memory_integrity_test.go:70: Command [bash] completed successfully in sandbox i0xxor1tcz716oy57uiwa
Executing command bash in sandbox i0xxor1tcz716oy57uiwa (user: root)
    sandbox_memory_integrity_test.go:74: Command [bash] output: event:{start:{pid:1281}}
    sandbox_memory_integrity_test.go:74: Command [bash] output: event:{data:{stdout:"9167c138e9179cedaabdd7a92d68b9a60e6b84175a5ea1970ac0844a05075d1a\n"}}
    sandbox_memory_integrity_test.go:74: Command [bash] output: event:{end:{exited:true status:"exit status 0"}}
    sandbox_memory_integrity_test.go:74: Command [bash] completed successfully in sandbox i0xxor1tcz716oy57uiwa
Executing command bash in sandbox i0xxor1tcz716oy57uiwa (user: root)
    sandbox_memory_integrity_test.go:99: Command [bash] output: event:{start:{pid:1284}}
    sandbox_memory_integrity_test.go:99: Command [bash] output: event:{data:{stdout:"9167c138e9179cedaabdd7a92d68b9a60e6b84175a5ea1970ac0844a05075d1a\n"}}
    sandbox_memory_integrity_test.go:99: Command [bash] output: event:{end:{exited:true status:"exit status 0"}}
    sandbox_memory_integrity_test.go:99: Command [bash] completed successfully in sandbox i0xxor1tcz716oy57uiwa
    sandbox_memory_integrity_test.go:84: 
        	Error Trace:	.../tests/orchestrator/sandbox_memory_integrity_test.go:84
        	Error:      	Not equal: 
        	            	expected: 204
        	            	actual  : 409
        	Test:       	TestSandboxMemoryIntegrity/tmpfs_hash
--- FAIL: TestSandboxMemoryIntegrity/tmpfs_hash (50.64s)


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

I have no feedback to provide as there are no critical findings in this pull request.

@ValentaTomas ValentaTomas marked this pull request as ready for review May 8, 2026 00:37
@ValentaTomas ValentaTomas enabled auto-merge (squash) May 8, 2026 00:38
@claude claude Bot left a comment

LGTM — straightforward mirror of the existing build-firecracker-version flag.

Extended reasoning...

Overview

This PR adds a BuildKernelVersion LaunchDarkly string flag that mirrors the existing BuildFirecrackerVersion flag, lets the API handlers and orchestrator resolve the kernel version baked into new template builds via LD instead of from Config.DefaultKernelVersion/featureflags.DefaultKernelVersion, and introduces an opt-in NewStringFlagFallbackOnEmpty constructor so an empty LD string also falls through to the fallback. Five files touched; the diff is mechanical.

Security risks

None identified. No auth, crypto, permission, or input-validation surfaces are touched. The feature flag plumbing is additive and the new fallbackWhenEmpty field is opt-in (only BuildKernelVersion uses it), so behavior of every existing StringFlag is unchanged.
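The opt-in empty-string fallback described in this review can be pictured roughly as follows. This is a sketch under stated assumptions, not the code from the PR: the lower-case type, constructors, and `resolve` method are hypothetical stand-ins, the default strings are placeholders, and the author's later reply in this thread says the mechanism was reverted.

```go
package main

import "fmt"

// stringFlag is a hypothetical stand-in for the StringFlag type discussed in
// the review. Only the fields needed for the illustration are modeled.
type stringFlag struct {
	key               string
	fallback          string
	fallbackWhenEmpty bool
}

// newStringFlag builds a flag that passes an empty evaluated value through.
func newStringFlag(key, fallback string) stringFlag {
	return stringFlag{key: key, fallback: fallback}
}

// newStringFlagFallbackOnEmpty builds a flag that opts in to replacing an
// empty evaluated value with the fallback.
func newStringFlagFallbackOnEmpty(key, fallback string) stringFlag {
	return stringFlag{key: key, fallback: fallback, fallbackWhenEmpty: true}
}

// resolve applies the flag's policy to a value that the flag provider has
// already evaluated.
func (f stringFlag) resolve(evaluated string) string {
	if evaluated == "" && f.fallbackWhenEmpty {
		return f.fallback
	}
	return evaluated
}

func main() {
	fc := newStringFlag("build-firecracker-version", "fc-default")
	kernel := newStringFlagFallbackOnEmpty("build-kernel-version", "kernel-default")
	fmt.Printf("fc=%q kernel=%q\n", fc.resolve(""), kernel.resolve(""))
}
```

Because the boolean defaults to false, flags built with the plain constructor keep their existing behavior, which matches the review's point that the change is additive.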

Level of scrutiny

Low–medium. The change influences which kernel image new builds get pinned to, but existing snapshots/builds keep their stored env_builds.kernel_version (set at build time and not re-read on resume), and the LD default is wired through the same DEFAULT_KERNEL_VERSION env var → in-tree DefaultKernelVersion constant chain that the previous code used. The shape is identical to the already-shipped BuildFirecrackerVersion flag.

Other factors

The only bug surfaced is a nit (the Config.DefaultKernelVersion field — already marked // Deprecated — and its post-parse default-fill block are now write-only) and is explicitly out of scope. The integration-tests CI failure shown in the timeline is a 'matrix result: cancelled' aggregate, not a real test failure. Recent commits (refactor(featureflags): bake empty-fallback into StringFlag opt-in, fix(featureflags): fall back to DefaultKernelVersion when LD flag is empty) show the author already iterated on the empty-fallback semantics in response to earlier feedback.

  MemoryMB: body.MemoryMB,
  Version: templates.TemplateV1Version,
- KernelVersion: a.config.DefaultKernelVersion,
+ KernelVersion: kernelVersion,

🟡 Nit: this PR removes the only consumers of Config.DefaultKernelVersion (both a.config.DefaultKernelVersion reads in this file and in template_request_build_v3.go are replaced with the BuildKernelVersion feature flag). That leaves the DefaultKernelVersion field in packages/api/internal/cfg/model.go (line 71, already marked // Deprecated) and its default-fill block (lines 141-142) as dead code — worth a follow-up cleanup, not blocking this PR.

Extended reasoning...

What's now dead

After this PR, the API's Config.DefaultKernelVersion field has no remaining readers:

  • packages/api/internal/handlers/deprecated_template_request_build.go:210 — was KernelVersion: a.config.DefaultKernelVersion, now KernelVersion: kernelVersion (resolved from the BuildKernelVersion feature flag).
  • packages/api/internal/handlers/template_request_build_v3.go:133 — same swap.

A grep for DefaultKernelVersion across packages/api after this PR returns only three hits, all inside packages/api/internal/cfg/model.go itself:

packages/api/internal/cfg/model.go:71:	DefaultKernelVersion string `env:"DEFAULT_KERNEL_VERSION"`
packages/api/internal/cfg/model.go:141:	if config.DefaultKernelVersion == "" {
packages/api/internal/cfg/model.go:142:		config.DefaultKernelVersion = featureflags.DefaultKernelVersion

The field already carries a // Deprecated: Template manager should use its own comment on line 70, suggesting the author was aware it was on its way out — this PR finishes that job at the call sites but leaves the struct field and its post-parse default-fill behind.

Why removal is safe

The DEFAULT_KERNEL_VERSION env var that this field reads is still functional, but it is now consumed independently in packages/shared/pkg/featureflags/flags.go:

BuildKernelVersion = NewStringFlagFallbackOnEmpty("build-kernel-version", env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion))

So removing the API config field does not change observable behavior: the env var still flows into the kernel-version chosen for new builds, just via the feature-flag fallback path instead of via the API Config struct.

Step-by-step proof

  1. Before this PR: API service starts → cfg.Parse() reads DEFAULT_KERNEL_VERSION env into config.DefaultKernelVersion (and falls back to featureflags.DefaultKernelVersion on empty). Build handlers read a.config.DefaultKernelVersion to populate RegisterBuildData.KernelVersion.
  2. After this PR: API service starts → same parse happens, but no handler reads the field. Build handlers call a.featureFlags.StringFlag(ctx, featureflags.BuildKernelVersion) instead. The BuildKernelVersion flag was constructed at package init from env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion), so the same env var still drives the value.
  3. Net effect: lines 71 and 141-142 of cfg/model.go are write-only — nothing reads them, and removing them has no functional impact.
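The before/after steps above can be sketched as a handler that stops reading its config field and asks a flag client instead. Everything here is a simplified assumption for illustration: the `flagClient` interface takes a string key rather than the flag value the real call passes, and `staticFlags`, `handler`, and `demo` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// config models only the deprecated field discussed above.
type config struct {
	DefaultKernelVersion string
}

// flagClient is a hypothetical, minimal view of a feature-flag client.
type flagClient interface {
	StringFlag(ctx context.Context, key string) string
}

// staticFlags is a test double that returns canned values per key.
type staticFlags map[string]string

func (s staticFlags) StringFlag(_ context.Context, key string) string { return s[key] }

type handler struct {
	config config
	flags  flagClient
}

// kernelVersionBefore mirrors step 1: the handler reads the parsed config.
func (h handler) kernelVersionBefore() string {
	return h.config.DefaultKernelVersion
}

// kernelVersionAfter mirrors step 2: the handler asks the flag client and no
// longer touches the config field, which leaves that field write-only.
func (h handler) kernelVersionAfter(ctx context.Context) string {
	return h.flags.StringFlag(ctx, "build-kernel-version")
}

// demo builds a handler whose config and flag values differ so the two paths
// are distinguishable.
func demo() (before, after string) {
	h := handler{
		config: config{DefaultKernelVersion: "from-config"},
		flags:  staticFlags{"build-kernel-version": "from-flag"},
	}
	return h.kernelVersionBefore(), h.kernelVersionAfter(context.Background())
}

func main() {
	before, after := demo()
	fmt.Println("before:", before, "after:", after)
}
```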

Impact / fix

Zero runtime impact — purely cleanup. Fix is to delete the DefaultKernelVersion field from the Config struct and the corresponding if config.DefaultKernelVersion == "" { ... } block in Parse(). Out of scope for this PR (the file isn't in the diff), so flagging as a follow-up nit.

@qodo-code-review

qodo-code-review Bot commented May 8, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

Action required

1. BuildFirecrackerVersion missing empty-string fallback 🐞
Description
The PR adds FallbackWhenEmpty protection to BuildKernelVersion but not to BuildFirecrackerVersion.
If LaunchDarkly evaluates build-firecracker-version to an empty string, the empty value propagates
to fcversion.New(""), which returns an error and aborts every template build.
Code

packages/shared/pkg/featureflags/flags.go[R278-279]

	BuildFirecrackerVersion     = NewStringFlag("build-firecracker-version", env.GetEnv("DEFAULT_FIRECRACKER_VERSION", DefaultFirecrackerVersion))
+	BuildKernelVersion          = NewStringFlagFallbackOnEmpty("build-kernel-version", env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion))
Evidence
Line 278 creates BuildFirecrackerVersion with NewStringFlag (fallbackWhenEmpty=false), while line
279 creates BuildKernelVersion with NewStringFlagFallbackOnEmpty (fallbackWhenEmpty=true). The
StringFlag method in client.go only applies the empty-string fallback when FallbackWhenEmpty() is
true, so an empty LD evaluation for the FC flag propagates unchecked. Both create_template.go files
call fcversion.New(firecrackerVersion) without a prior empty-string guard, and fcversion.New errors
on an empty/invalid version string.

packages/shared/pkg/featureflags/flags.go[278-279]
packages/shared/pkg/featureflags/client.go[112-119]
packages/orchestrator/pkg/template/server/create_template.go[60-65]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
BuildFirecrackerVersion is created with NewStringFlag, which does not fall back when LaunchDarkly returns an empty string. The PR introduces NewStringFlagFallbackOnEmpty for BuildKernelVersion but omits the same protection for BuildFirecrackerVersion. An empty LD evaluation propagates to fcversion.New(""), which errors and aborts all template builds.

## Issue Context
The StringFlag.FallbackWhenEmpty mechanism was added in this PR specifically to handle the case where a flag is defined in LD without a value. Both kernel and firecracker version flags share the same risk profile, but only the kernel flag opts in.

## Fix Focus Areas
- packages/shared/pkg/featureflags/flags.go[278-278]


Comment on lines +278 to +279
BuildFirecrackerVersion = NewStringFlag("build-firecracker-version", env.GetEnv("DEFAULT_FIRECRACKER_VERSION", DefaultFirecrackerVersion))
BuildKernelVersion = NewStringFlagFallbackOnEmpty("build-kernel-version", env.GetEnv("DEFAULT_KERNEL_VERSION", DefaultKernelVersion))

Action required

1. BuildFirecrackerVersion missing empty-string fallback 🐞 Bug ☼ Reliability

The PR adds FallbackWhenEmpty protection to BuildKernelVersion but not to BuildFirecrackerVersion.
If LaunchDarkly evaluates build-firecracker-version to an empty string, the empty value propagates
to fcversion.New(""), which returns an error and aborts every template build.

Member Author

The empty-string fallback mechanism (and NewStringFlagFallbackOnEmpty) was reverted in c9309f1, so BuildKernelVersion and BuildFirecrackerVersion now have identical behavior — both will propagate an empty LD value. This was an explicit decision to keep the kernel flag consistent with the FC flag.
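To make the consequence of propagating an empty value concrete, the sketch below contrasts an unguarded parse with a caller-side guard. It is purely illustrative and not part of this PR: `parseVersion` is a hypothetical stand-in for a parser like the `fcversion.New` mentioned in the review (only its rejection of an empty string is modeled), and `versionOrFallback` and the placeholder strings are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// parseVersion is a hypothetical stand-in for a version parser. It models
// only one behavior: an empty string is rejected with an error.
func parseVersion(v string) (string, error) {
	if v == "" {
		return "", errors.New("empty version string")
	}
	return v, nil
}

// versionOrFallback is a hypothetical caller-side guard: it substitutes the
// fallback before parsing when the evaluated flag value is empty.
func versionOrFallback(evaluated, fallback string) (string, error) {
	if evaluated == "" {
		evaluated = fallback
	}
	return parseVersion(evaluated)
}

func main() {
	// Without a guard, an empty flag value surfaces as a parse error.
	_, err := parseVersion("")
	fmt.Println("unguarded:", err)

	// With the guard, the fallback is parsed instead.
	v, err := versionOrFallback("", "fc-default")
	fmt.Println("guarded:", v, err)
}
```

Keeping both flags on the same plain constructor, as the reply describes, means any such guard would have to live with the caller or in the flag configuration rather than in the flag type.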

@qodo-code-review

CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: integration-tests / integration_tests

Failed stage: Aggregate matrix result [❌]

Failed test name: ""

Failure summary:

The action failed because the matrix job result was cancelled instead of success.
- A guard step checks if [[ "cancelled" != "success" ]]; then ... exit 1, which triggered and exited with code 1.
- No specific test failure is shown in this log; the workflow failed due to the cancelled matrix shard/job.

Relevant error logs:
1:  ##[group]Runner Image Provisioner
2:  Hosted Compute Agent
...

25:  Prepare all required actions
26:  Uses: e2b-dev/infra/.github/workflows/integration_tests.yml@refs/pull/2592/merge (b3ed17baa2341c289c6e5d1d274a9c105a184b80)
27:  ##[group] Inputs
28:  publish: true
29:  ##[endgroup]
30:  Complete job name: integration-tests / integration_tests
31:  ##[group]Run if [[ "cancelled" != "success" ]]; then
32:  if [[ "cancelled" != "success" ]]; then
33:    echo "matrix result: cancelled"
34:    exit 1
35:  fi
36:  echo "all matrix shards succeeded"
37:  shell: /usr/bin/bash -e {0}
38:  ##[endgroup]
39:  matrix result: cancelled
40:  ##[error]Process completed with exit code 1.
41:  Cleaning up orphan processes

@ValentaTomas ValentaTomas disabled auto-merge May 8, 2026 00:57
@ValentaTomas ValentaTomas enabled auto-merge (squash) May 8, 2026 01:00
@ValentaTomas ValentaTomas merged commit cec0e81 into main May 8, 2026
49 checks passed
@ValentaTomas ValentaTomas deleted the feat/build-kernel-version-flag branch May 8, 2026 01:19

3 participants