feat(migrations): add backfill command for message parts and update s…#3613
Open
pedrofrxncx wants to merge 6 commits into
…torage logic

- Introduced a new migration (098) to handle the backfill of `thread_messages.parts` into the new `message_parts` table.
- Added a CLI command `backfill-message-parts` to facilitate copying and reconciling existing message parts.
- Updated `SqlThreadStorage` to dual-write message parts during message saves, ensuring consistency between `thread_messages` and `message_parts`.
- Defined a new `MessagePartsTable` interface to represent the structure of the normalized message parts in the database.
60c4cb1 to ffed030
3 issues found across 6 files
…prove documentation

- Updated the default batch size for the `backfill-message-parts` command to 200.
- Enhanced comments and documentation for clarity on command usage and performance considerations.
- Introduced utility functions for better handling of message parts and logging progress during execution.
- Improved memory management by ensuring that each message's write operation is wrapped in its own transaction.
…tion

- Updated the logic for inserting and deleting message parts in the `message_parts` table to improve efficiency and maintain consistency.
- Simplified the upsert operation by using a single multi-row insert followed by a combined delete for trailing rows.
- Enhanced comments for better clarity on the migration process and transaction handling.
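The "single multi-row insert followed by a combined delete for trailing rows" described in this commit could look roughly like the sketch below. It only builds parameterized SQL text and is not taken from the PR: `PartInput`, `Statement`, and `buildSyncStatements` are hypothetical names, the column list is reduced to four columns, and it assumes part indexes are contiguous from 0.

```typescript
// Hypothetical input shape: one part of a message, content already JSON text.
interface PartInput {
  type: string;
  content: string;
}

// A parameterized statement: SQL text plus positional bind values.
interface Statement {
  sql: string;
  params: unknown[];
}

// Builds the statements that make message_parts match `parts` for one message:
// 1) a multi-row upsert keyed on (message_id, idx),
// 2) a delete of any rows whose idx is past the new last part.
// Assumption: idx is the array position, so trailing rows are idx >= parts.length.
function buildSyncStatements(messageId: string, parts: PartInput[]): Statement[] {
  const statements: Statement[] = [];

  if (parts.length > 0) {
    const params: unknown[] = [];
    const tuples = parts.map((part, idx) => {
      params.push(messageId, idx, part.type, part.content);
      const n = params.length; // index of the last placeholder in this tuple
      return `($${n - 3}, $${n - 2}, $${n - 1}, $${n})`;
    });
    statements.push({
      sql:
        "INSERT INTO message_parts (message_id, idx, type, content) VALUES " +
        tuples.join(", ") +
        " ON CONFLICT (message_id, idx) DO UPDATE SET" +
        " type = EXCLUDED.type, content = EXCLUDED.content",
      params,
    });
  }

  // Runs even for an empty parts array, which clears all rows for the message.
  statements.push({
    sql: "DELETE FROM message_parts WHERE message_id = $1 AND idx >= $2",
    params: [messageId, parts.length],
  });

  return statements;
}
```

Both statements would presumably run inside the same per-message transaction so readers never observe a half-synced message.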
1 issue found across 2 files (changes from recent commits).
- Introduced a constant to limit the number of rows inserted into the `message_parts` table, ensuring compliance with Postgres' bind-parameter ceiling.
- Updated the insertion logic in `SqlThreadStorage` to chunk message parts into manageable sizes during batch inserts, enhancing performance and stability.
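For context, a single Postgres statement can carry at most 65535 bind parameters, so a multi-row insert has to cap its row count at that ceiling divided by the number of columns bound per row. A minimal sketch of that chunking follows; the names `PG_MAX_BIND_PARAMS`, `maxRowsPerInsert`, and `chunk` are illustrative, and the actual constant chosen in the PR is not shown here.

```typescript
// Postgres encodes the bind-parameter count as a 16-bit value: 65535 at most.
const PG_MAX_BIND_PARAMS = 65535;

// Largest number of rows one INSERT can carry when each row binds
// `columnsPerRow` parameters. Illustrative helper, not the PR's constant.
function maxRowsPerInsert(columnsPerRow: number): number {
  if (!Number.isInteger(columnsPerRow) || columnsPerRow <= 0) {
    throw new RangeError("columnsPerRow must be a positive integer");
  }
  return Math.floor(PG_MAX_BIND_PARAMS / columnsPerRow);
}

// Splits `items` into consecutive slices of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

A caller would issue one insert per chunk, for example `chunk(rows, maxRowsPerInsert(6))` for a six-column row. In practice a much smaller cap is a reasonable choice, since very large statements are slow to parse and plan.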
- Added `created_at` and `updated_at` columns to the `message_parts` table to track part-level timestamps.
- Updated the backfill logic to inherit the message's `created_at` timestamp for backfilled parts.
- Improved the insertion logic in `SqlThreadStorage` to maintain the integrity of timestamps during updates.
- Enhanced comments for clarity on the handling of timestamps and dual-write behavior.
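One way to read the timestamp rule (keep a part's `created_at`, and move `updated_at` only when its content actually changes) is the in-memory sketch below. It models the merge decision only and is not the SQL from the PR; `StoredPart` and `mergePart` are hypothetical names, and timestamps are plain ISO strings for simplicity.

```typescript
// Hypothetical in-memory view of one message_parts row.
interface StoredPart {
  content: string; // JSON text
  createdAt: string;
  updatedAt: string;
}

// Decides what a row should look like after a write at time `now`.
// - New part: both timestamps start at `now`.
// - Same content: row is returned untouched (no-op rewrite is skipped).
// - Changed content: createdAt is preserved, updatedAt moves to `now`.
function mergePart(
  existing: StoredPart | undefined,
  content: string,
  now: string,
): StoredPart {
  if (existing === undefined) {
    return { content, createdAt: now, updatedAt: now };
  }
  if (existing.content === content) {
    return existing;
  }
  return { content, createdAt: existing.createdAt, updatedAt: now };
}
```

In SQL this would typically be expressed with a conditional in the `ON CONFLICT ... DO UPDATE` clause rather than in application code, but the decision table is the same.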
What is this contribution about?
Screenshots/Demonstration
How to Test
Migration Notes
Review Checklist
Summary by cubic

Normalizes `thread_messages.parts` into a new `message_parts` table and adds a `deco backfill-message-parts` CLI to copy and reconcile existing data. Dual-write in `SqlThreadStorage` uses chunked upserts and minimal locking, with accurate per-part timestamps.

New Features

- `message_parts` with PK `(message_id, idx)`, JSON-text `content`, `type`, and per-part `created_at`/`updated_at`.
- `deco backfill-message-parts` with `--reconcile`, `--dry-run`, `--batch` (default 200), `--limit`, `--after-id`; keyset pagination; per-message transactions; chunked inserts; idempotent copy and atomic reconcile. Backfilled parts inherit the message’s `created_at`.
- Dual-write preserves `created_at` and bumps `updated_at` only when `content` changes (skips no-op rewrites).
- New `MessagePartsTable` type.

Migration

- Run `deco backfill-message-parts` to copy existing parts; resume with `--after-id` if needed.
- Run `deco backfill-message-parts --reconcile` before switching reads to `message_parts` to verify/repair drift.

Written for commit 72237f6. Summary will update on new commits.
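The keyset pagination and `--after-id` resume behaviour mentioned in the summary can be illustrated with a small sketch. This is an assumption-laden model rather than the command's code: `MessageRow`, `FetchPage`, `keysetBatches`, and `makeInMemoryFetch` are hypothetical, and the in-memory fetcher stands in for a query of the form `WHERE id > $1 ORDER BY id LIMIT $2`.

```typescript
// Hypothetical minimal row: only the ordering key matters for pagination.
interface MessageRow {
  id: string;
}

// Returns up to `limit` rows with id strictly greater than `afterId`,
// ordered by id. `null` means "start from the beginning".
type FetchPage = (afterId: string | null, limit: number) => MessageRow[];

// Yields pages of rows, advancing the cursor to the last id seen.
// Passing `afterId` resumes a previous run, as --after-id would.
function* keysetBatches(
  fetchPage: FetchPage,
  batchSize: number,
  afterId: string | null = null,
): Generator<MessageRow[]> {
  let cursor = afterId;
  while (true) {
    const page = fetchPage(cursor, batchSize);
    if (page.length === 0) return;
    yield page;
    cursor = page[page.length - 1]!.id;
    if (page.length < batchSize) return; // short page: nothing left to read
  }
}

// In-memory stand-in for the database query, for illustration only.
function makeInMemoryFetch(ids: string[]): FetchPage {
  const sorted = [...ids].sort();
  return (afterId, limit) =>
    sorted
      .filter((id) => afterId === null || id > afterId)
      .slice(0, limit)
      .map((id) => ({ id }));
}
```

Unlike `OFFSET`-based paging, the cursor does not rescan skipped rows, and a run interrupted after a given id can continue from that id without reprocessing earlier messages.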