
test(rivetkit): cover metrics endpoint #5009

Closed
NathanFlurry wants to merge 1 commit into sqlite-vfs/fatal-only-fence-mismatch from metrics/endpoint-sanity-test

Conversation

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app

railway-app Bot commented May 9, 2026

🚅 Deployed to the rivet-pr-5009 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| website | 😴 Sleeping (View Logs) | Web | May 9, 2026 at 6:40 am |
| frontend-inspector | ❌ Build Failed (View Logs) | Web | May 9, 2026 at 6:29 am |
| ladle | ❌ Build Failed (View Logs) | Web | May 9, 2026 at 6:29 am |
| frontend-cloud | ❌ Build Failed (View Logs) | Web | May 9, 2026 at 6:29 am |
| mcp-hub | ✅ Success (View Logs) | Web | May 9, 2026 at 6:28 am |
| kitchen-sink | ❌ Build Failed (View Logs) | Web | May 9, 2026 at 6:28 am |

@railway-app railway-app Bot temporarily deployed to rivet-frontend / rivet-pr-5009 May 9, 2026 06:28 Destroyed
@claude
Contributor

claude Bot commented May 9, 2026

Code Review: test(rivetkit): cover metrics endpoint

Overview

This PR adds test coverage for the Prometheus metrics endpoint by:

  1. Extending SharedTestEngine with a metricsEndpoint field
  2. Allocating a dedicated metricsPort and wiring it via RIVET__METRICS__PORT
  3. Adding isMetricsEndpointHealthy for reuse-validation of cached engine state
  4. Adding a new metrics-endpoint.test.ts that fetches the endpoint and validates metric presence

The implementation is clean and follows existing patterns well. A few items worth considering:


Issues / Suggestions

1. No startup wait for the metrics endpoint (potential flakiness)

waitForEngineHealth only polls the guard endpoint (/health). Once it returns, the engine is considered ready, but the metrics server (a separate Hyper server listening on its own port) might not be bound yet. In practice the metrics server likely binds shortly after the guard, so this may never surface, but there is no hard guarantee: the test hits the metrics endpoint immediately after getOrStartSharedTestEngine resolves.

Consider extending spawnSharedEngine to also wait for the metrics endpoint after the guard health check passes:

await waitForEngineHealth(engine, logs, endpoint, 90_000);
// Also wait for the metrics endpoint to be ready.
await waitForMetricsHealth(engine, logs, metricsEndpoint, 10_000);
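
One possible shape for the wait suggested above, as a minimal sketch. waitForMetricsHealth does not exist in the PR, and the signature here is simplified (it omits the engine and logs arguments). pollUntilHealthy, the Probe type, and the 100 ms polling interval are assumptions made for illustration.

```typescript
// Hypothetical helper, not part of the PR: polls a probe until it reports
// healthy or the timeout elapses.
type Probe = () => Promise<boolean>;

async function pollUntilHealthy(
  probe: Probe,
  timeoutMs: number,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      if (await probe()) return true;
    } catch {
      // A refused connection is expected while the port is not bound yet.
    }
    if (Date.now() >= deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Assumed wiring: any 2xx response from the metrics endpoint counts as ready.
async function waitForMetricsHealth(
  metricsEndpoint: string,
  timeoutMs: number,
): Promise<void> {
  const ready = await pollUntilHealthy(
    async () => (await fetch(metricsEndpoint)).ok,
    timeoutMs,
  );
  if (!ready) {
    throw new Error(`metrics endpoint not ready after ${timeoutMs}ms`);
  }
}
```

Taking the probe as a parameter keeps the polling loop testable without a running engine.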

2. Asymmetric value assertions in the test

The test checks the actual numeric value for rivet_tokio_thread_count but only checks # HELP / # TYPE lines for rivet_tokio_task_total with no assertion that the counter value line itself appears. A counter at zero is still emitted with value 0, so /^rivet_tokio_task_total \d+$/m would be a safe addition and make the two metric assertions symmetric.
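
A sketch of how both metrics could share one assertion helper. hasIntegerSample and escapeRegExp are hypothetical names, and the helper assumes what the regex above assumes: the sample line carries no labels and an integer value. A metric emitted with labels or a float value would need a wider pattern.

```typescript
// Hypothetical helper, not part of the PR.
// Escapes regex metacharacters so a metric name can be embedded in a pattern.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns true only if the exposition text contains the HELP line, the TYPE
// line, and an unlabeled integer sample line for the given metric.
function hasIntegerSample(body: string, metric: string): boolean {
  const name = escapeRegExp(metric);
  const help = new RegExp(`^# HELP ${name} `, "m");
  const type = new RegExp(`^# TYPE ${name} `, "m");
  const sample = new RegExp(`^${name} \\d+$`, "m");
  return help.test(body) && type.test(body) && sample.test(body);
}
```

The test could then call hasIntegerSample(body, "rivet_tokio_thread_count") and hasIntegerSample(body, "rivet_tokio_task_total") with identical expectations.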

3. isMetricsEndpointHealthy duplicates isEngineHealthy logic

Both functions do exactly fetch(url) + return response.ok. The only difference is the /health suffix in isEngineHealthy. Consolidating into a shared helper is optional given the small size, but worth noting for future maintainers.
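
If the consolidation is ever wanted, it might look roughly like this. isUrlHealthy and the FetchLike type are hypothetical, and treating a thrown network error as unhealthy is an assumption; the existing helpers may handle errors differently.

```typescript
// Hypothetical consolidation, not part of the PR.
// Minimal fetch-shaped type so a stub can be injected in tests.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Shared core: one request, healthy only on a 2xx response.
async function isUrlHealthy(
  url: string,
  fetchFn: FetchLike = fetch,
): Promise<boolean> {
  try {
    return (await fetchFn(url)).ok;
  } catch {
    // Assumption: a refused or failed connection means "not healthy".
    return false;
  }
}

// The guard check appends /health; the metrics check uses the URL as-is.
const isEngineHealthy = (endpoint: string, fetchFn?: FetchLike) =>
  isUrlHealthy(`${endpoint}/health`, fetchFn);

const isMetricsEndpointHealthy = (metricsEndpoint: string, fetchFn?: FetchLike) =>
  isUrlHealthy(metricsEndpoint, fetchFn);
```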

4. Backward-compatibility guard is correct

typeof existing.metricsEndpoint === "string" correctly handles cached engine state files written before this PR — they'll lack metricsEndpoint and the engine will be respawned cleanly. Good defensive check.
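
The guard can be pictured as a type predicate over the parsed state file. This is a sketch only: the SharedEngineState shape and the endpoint field are assumptions, and only metricsEndpoint is named in the PR.

```typescript
// Hypothetical shape of the cached state; the real file likely has more fields.
interface SharedEngineState {
  endpoint: string;
  metricsEndpoint: string;
}

// Accepts only state that carries a string metricsEndpoint. State written
// before the field existed fails the check, which signals a respawn.
function isCurrentState(value: unknown): value is SharedEngineState {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.endpoint === "string" && typeof v.metricsEndpoint === "string";
}
```

Parsing the file as unknown and narrowing it this way avoids trusting a stale file's shape.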

5. Port exclusion list is correct

metricsPort is excluded when allocating later ports, and guardPort/apiPeerPort are excluded when allocating metricsPort. No issues here.
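
The exclusion pattern described here can be illustrated with a small sketch. pickPort and allocateEnginePorts are hypothetical and choose from a fixed candidate list; the PR's actual allocator is not shown in this review and may probe the OS for free ports instead.

```typescript
// Hypothetical allocator, not the PR's implementation: returns the first
// candidate port that is not in the exclusion list.
function pickPort(
  candidates: readonly number[],
  exclude: readonly number[],
): number {
  const taken = new Set(exclude);
  const port = candidates.find((p) => !taken.has(p));
  if (port === undefined) throw new Error("no free port among candidates");
  return port;
}

interface EnginePorts {
  guardPort: number;
  apiPeerPort: number;
  metricsPort: number;
}

// Each allocation excludes every port chosen before it, so the three ports
// can never collide.
function allocateEnginePorts(candidates: readonly number[]): EnginePorts {
  const guardPort = pickPort(candidates, []);
  const apiPeerPort = pickPort(candidates, [guardPort]);
  const metricsPort = pickPort(candidates, [guardPort, apiPeerPort]);
  return { guardPort, apiPeerPort, metricsPort };
}
```

Any port allocated after metricsPort would add it to its own exclusion list in the same way.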


Verdict

Small, well-scoped addition that follows the established test-infrastructure patterns. The only concrete concern is item 1 (startup race for the metrics port) — the same class of startup sequencing that waitForEngineHealth was introduced to solve applies to the metrics server too. Worth a small fix before merging.

