fix(sqlite): fence remote requests by generation #5014

Draft
NathanFlurry wants to merge 1 commit into counter-latency/core-metrics-endpoint from counter-latency/sqlite-generation-fence

@NathanFlurry
Member

@railway-app

railway-app Bot commented May 11, 2026

🚅 Deployed to the rivet-pr-5014 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| kitchen-sink | ❌ Build Failed (View Logs) | Web | May 14, 2026 at 5:27 pm |
| website | 😴 Sleeping (View Logs) | Web | May 14, 2026 at 8:04 am |
| frontend-cloud | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| frontend-inspector | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| ladle | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| mcp-hub | ✅ Success (View Logs) | Web | May 11, 2026 at 12:06 pm |

@claude
Contributor

claude Bot commented May 11, 2026

Code Review: fix(sqlite): fence remote requests by generation

Overview

This PR adds generation fencing to remote SQLite get_pages and commit requests by:

  1. Introducing a GenerationFencedTransport decorator that stamps expected_generation on every outgoing request.
  2. Routing requests with a generation to validate_remote_sqlite_generation instead of the older validate_sqlite_actor on the envoy side.
  3. Removing the expected_generation.is_none() guard from is_startup_database_miss (needed because generation is now always present).
  4. Adding "actor does not exist" to is_initial_main_page_missing in the VFS to treat a not-yet-registered actor as a fresh database.
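
For readers without the diff open, the decorator in item 1 can be pictured roughly as follows. This is a minimal sketch, not the PR's code: the `Transport` trait, the `GetPagesRequest` struct, the `EchoTransport` inner type, and the `u64` generation type are all assumptions made for illustration. Only the `GenerationFencedTransport` name, the `expected_generation` field, and the use of `get_or_insert` come from the review.

```rust
/// Hypothetical request shape; the real request types are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct GetPagesRequest {
    pub page_numbers: Vec<u32>,
    pub expected_generation: Option<u64>,
}

/// Hypothetical transport trait standing in for the real one.
pub trait Transport {
    fn get_pages(&self, req: GetPagesRequest) -> GetPagesRequest;
}

/// Inner transport that echoes the request back so the stamped
/// generation is observable by the caller.
pub struct EchoTransport;

impl Transport for EchoTransport {
    fn get_pages(&self, req: GetPagesRequest) -> GetPagesRequest {
        req
    }
}

/// Decorator that stamps a generation on requests that do not carry one.
pub struct GenerationFencedTransport<T> {
    inner: T,
    generation: u64,
}

impl<T: Transport> GenerationFencedTransport<T> {
    pub fn new(inner: T, generation: u64) -> Self {
        Self { inner, generation }
    }
}

impl<T: Transport> Transport for GenerationFencedTransport<T> {
    fn get_pages(&self, mut req: GetPagesRequest) -> GetPagesRequest {
        // get_or_insert writes only when the field is None, so a
        // generation set by the caller is left untouched.
        let _ = req.expected_generation.get_or_insert(self.generation);
        self.inner.get_pages(req)
    }
}
```

Wrapping a transport with `GenerationFencedTransport::new(inner, generation)` leaves call sites unchanged, which is the main appeal of the decorator approach noted under "Minor notes".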

Issues

1. Potential production failure

In vfs.rs:

|| message == "actor does not exist"

This string originates from bail!("actor does not exist") in validate_remote_sqlite_generation. That error is a plain anyhow error (not a SqliteStorageError), so depot_error(err) returns None and the error goes through the plain-anyhow path of rivet_error::RivetError::extract. Per CLAUDE.md, build_internal sanitizes raw messages in production.

The two existing checks are caught by depot_error as structured SqliteStorageError variants and survive serialization intact. This one does not.

If this check silently stops matching in production, new actors will fail to initialize their SQLite database on startup.

Recommended fix: Add a dedicated SqliteStorageError variant (e.g. ActorGenerationNotFound) so the condition is caught by depot_error on the client side and survives error serialization intact.
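
A small model of why the structured variant is more robust than string matching. Everything here is a simplified stand-in: `RemoteError`, `sanitize`, and both predicate functions are hypothetical, and the real `SqliteStorageError`, `depot_error`, and `build_internal` are not reproduced. The sketch assumes, as the review suspects, that plain messages are replaced in production while structured variants pass through unchanged.

```rust
/// Stand-in for the real enum; `ActorGenerationNotFound` is the variant
/// proposed in this review, `Other` is a placeholder for existing variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SqliteStorageError {
    ActorGenerationNotFound,
    Other,
}

/// Hypothetical model of an error after it crosses the wire.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RemoteError {
    /// Structured errors keep their identity across serialization.
    Storage(SqliteStorageError),
    /// Plain errors carry only a message, which may be rewritten.
    Internal(String),
}

/// Assumed production behavior: raw messages are replaced with a generic
/// one, structured variants are passed through.
pub fn sanitize(err: RemoteError) -> RemoteError {
    match err {
        RemoteError::Internal(_) => RemoteError::Internal("internal error".to_string()),
        structured => structured,
    }
}

/// Current approach: match on the message text.
pub fn is_missing_by_string(err: &RemoteError) -> bool {
    matches!(err, RemoteError::Internal(m) if m.as_str() == "actor does not exist")
}

/// Proposed approach: match on the structured variant.
pub fn is_missing_by_variant(err: &RemoteError) -> bool {
    matches!(
        err,
        RemoteError::Storage(SqliteStorageError::ActorGenerationNotFound)
    )
}
```

Under these assumptions, the string check matches only while the message is unsanitized, whereas the variant check gives the same answer before and after `sanitize`.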


2. Dead parameter: _expected_generation should be removed

is_startup_database_miss now ignores the first parameter but still accepts it. All callers still pass it. Drop the parameter and update call sites.


3. validate_sqlite_actor_for_request skips get_for_kv

When expected_generation is Some, only validate_remote_sqlite_generation is called. The original path also calls get_for_kv and checks actor.namespace_id == conn.namespace_id. Namespace safety is maintained via the FDB key, but the difference is worth a comment for future readers.
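
The kind of comment being asked for could sit directly on the branch that picks the validation path. The sketch below is illustrative only: `ValidationPath` and `choose_validation_path` are hypothetical names, and the real function performs the validation rather than returning a tag.

```rust
/// Hypothetical tag for which validation routine a request is routed to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationPath {
    /// Corresponds to validate_remote_sqlite_generation.
    RemoteGeneration(u64),
    /// Corresponds to the older validate_sqlite_actor.
    LegacyActor,
}

pub fn choose_validation_path(expected_generation: Option<u64>) -> ValidationPath {
    match expected_generation {
        // NOTE: this branch does not call get_for_kv and does not compare
        // actor.namespace_id with conn.namespace_id. Namespace isolation
        // relies on the storage key being scoped to the namespace.
        Some(generation) => ValidationPath::RemoteGeneration(generation),
        // Requests without a generation keep the original checks.
        None => ValidationPath::LegacyActor,
    }
}
```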


4. No tests

A driver test covering the new-actor-opens-SQLite path would be valuable before merging.


Minor notes

  • get_or_insert(self.generation) correctly preserves any already-set generation.
  • The decorator pattern for GenerationFencedTransport is clean and well-scoped.
  • The is_startup_database_miss logic change is correct.

Summary

The approach is sound. The main blocker before merging is issue 1: the plain-anyhow string is likely sanitized in production, so the VFS will silently fail to recognize a valid startup-database-miss. The fix is to promote this condition to a proper SqliteStorageError variant.

@MasterPtato MasterPtato force-pushed the counter-latency/core-metrics-endpoint branch from 87b147e to f21c5de Compare May 14, 2026 17:26
@MasterPtato MasterPtato force-pushed the counter-latency/sqlite-generation-fence branch from 9de6238 to 71dd250 Compare May 14, 2026 17:26