feat(migrations): add backfill command for message parts and update s…#3613

Open
pedrofrxncx wants to merge 6 commits into main from feat/message-parts

@pedrofrxncx
Collaborator

@pedrofrxncx pedrofrxncx commented Jun 1, 2026

…torage logic

  • Introduced a new migration (098) to handle the backfill of thread_messages.parts into the new message_parts table.
  • Added a CLI command backfill-message-parts to facilitate copying and reconciling existing message parts.
  • Updated SqlThreadStorage to dual-write message parts during message saves, ensuring consistency between thread_messages and message_parts.
  • Defined a new MessagePartsTable interface to represent the structure of the normalized message parts in the database.
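
As a rough illustration of what one normalized row might look like, here is a minimal sketch. The column names follow the cubic summary further down this page (`message_id`, `idx`, `type`, JSON-text `content`, per-part timestamps); the TypeScript types, the `MessagePartsRow` and `MessagePart` names, and the `toPartRows` helper are assumptions made for illustration and are not taken from the PR's code.

```typescript
// Hypothetical row shape: column names come from the PR summary, types are assumed.
interface MessagePartsRow {
  message_id: string;
  idx: number; // zero-based position of the part within its message
  type: string;
  content: string; // the whole part serialized as JSON text
  created_at: string; // ISO-8601 timestamp
  updated_at: string;
}

// Assumed minimal shape of an entry in thread_messages.parts.
type MessagePart = { type: string; [key: string]: unknown };

// Flatten one message's parts array into one row per part.
function toPartRows(
  messageId: string,
  parts: MessagePart[],
  now: string,
): MessagePartsRow[] {
  return parts.map((part, idx) => ({
    message_id: messageId,
    idx,
    type: part.type,
    content: JSON.stringify(part),
    created_at: now,
    updated_at: now,
  }));
}

const exampleRows = toPartRows(
  "msg_1",
  [
    { type: "text", text: "hi" },
    { type: "tool-call", name: "search" },
  ],
  "2026-06-01T00:00:00.000Z",
);
```

Using `(message_id, idx)` as the key keeps the original ordering of parts recoverable without a separate sequence column.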

Summary by cubic

Normalizes thread_messages.parts into a new message_parts table and adds a deco backfill-message-parts CLI to copy and reconcile existing data. Dual-write in SqlThreadStorage uses chunked upserts and minimal locking, with accurate per-part timestamps.

  • New Features

    • Migration 098 creates message_parts with PK (message_id, idx), JSON-text content, type, and per-part created_at/updated_at.
    • CLI: deco backfill-message-parts with --reconcile, --dry-run, --batch (default 200), --limit, --after-id; keyset pagination; per-message transactions; chunked inserts; idempotent copy and atomic reconcile. Backfilled parts inherit the message’s created_at.
    • Dual-write on save: one chunked multi-row upsert then one combined delete of trailing rows; preserves created_at and bumps updated_at only when content changes (skips no-op rewrites).
    • Adds MessagePartsTable type.
  • Migration

    • Deploy migration 098.
    • Run deco backfill-message-parts to copy existing parts; resume with --after-id if needed.
    • Run deco backfill-message-parts --reconcile before switching reads to message_parts to verify/repair drift.
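
The copy pass described above (keyset pagination, resumable via `--after-id`, idempotent) could look roughly like the sketch below. It runs against in-memory stand-ins rather than a database, and everything in it is an assumption for illustration: the `StoredMessage` shape, the `fetchPage` and `backfill` names, and the reading of `--limit` as a cap on the total number of messages processed.

```typescript
// Hypothetical shapes; the real command reads from thread_messages.
interface StoredMessage {
  id: string;
  parts: unknown[];
}

interface BackfillOptions {
  batch: number; // page size (--batch)
  afterId?: string; // resume cursor (--after-id)
  limit?: number; // assumed: max messages to process (--limit)
  dryRun?: boolean; // scan without writing (--dry-run)
}

// Stand-in for: SELECT ... WHERE id > $1 ORDER BY id LIMIT $2
function fetchPage(all: StoredMessage[], afterId: string, size: number): StoredMessage[] {
  return all
    .filter((m) => m.id > afterId)
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .slice(0, size);
}

function backfill(
  source: StoredMessage[],
  target: Map<string, string>,
  opts: BackfillOptions,
): { processed: number; lastId: string } {
  let cursor = opts.afterId ?? "";
  let processed = 0;
  while (opts.limit === undefined || processed < opts.limit) {
    const size =
      opts.limit === undefined ? opts.batch : Math.min(opts.batch, opts.limit - processed);
    const page = fetchPage(source, cursor, size);
    if (page.length === 0) break;
    for (const message of page) {
      if (!opts.dryRun) {
        message.parts.forEach((part, idx) => {
          const key = `${message.id}:${idx}`;
          // Idempotent copy: rows that already exist are left untouched.
          if (!target.has(key)) target.set(key, JSON.stringify(part));
        });
      }
      processed++;
      cursor = message.id; // keyset cursor: the next page starts after this id
    }
  }
  return { processed, lastId: cursor };
}
```

Because each page is selected by `id > cursor` rather than by offset, an interrupted run can be resumed by passing the last reported id back in, and re-running over already copied messages writes nothing new.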

Written for commit 72237f6. Summary will update on new commits.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

| Reaction | Action |
| --- | --- |
| 👍 | Run quick benchmark (10 & 128 tools) |

Benchmark will run on the next push after you react.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

Release Options

Suggested: Minor (2.381.0) — based on feat: prefix

React with an emoji to override the release type:

| Reaction | Type | Next Version |
| --- | --- | --- |
| 👍 | Prerelease | 2.380.6-alpha.1 |
| 🎉 | Patch | 2.380.6 |
| ❤️ | Minor | 2.381.0 |
| 🚀 | Major | 3.0.0 |

Current version: 2.380.5

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

pedrofrxncx force-pushed the feat/message-parts branch from 60c4cb1 to ffed030 on June 1, 2026 20:22
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 6 files

Reply with feedback, questions, or to request a fix.

Comment thread apps/mesh/src/cli.ts Outdated
Comment thread apps/mesh/src/storage/threads.ts Outdated
Comment thread apps/mesh/src/cli/commands/backfill-message-parts.ts Outdated
…prove documentation

- Updated the default batch size for the backfill-message-parts command to 200.
- Enhanced comments and documentation for clarity on command usage and performance considerations.
- Introduced utility functions for better handling of message parts and logging progress during execution.
- Improved memory management by ensuring that each message's write operation is wrapped in its own transaction.
…tion

- Updated the logic for inserting and deleting message parts in the `message_parts` table to improve efficiency and maintain consistency.
- Simplified the upsert operation by using a single multi-row insert followed by a combined delete for trailing rows.
- Enhanced comments for better clarity on the migration process and transaction handling.
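
The "single multi-row insert followed by a delete of trailing rows" pattern can be sketched as two parameterized Postgres statements. This is an illustration for one message only, not the code in `SqlThreadStorage`: the `PartRow` and `SqlStatement` types and both builder functions are hypothetical, and the sketch assumes `created_at` and `updated_at` default to `now()` in the table definition.

```typescript
// Hypothetical types for illustration; the PR's own query layer may differ.
interface PartRow {
  messageId: string;
  idx: number;
  type: string;
  content: string; // JSON text
}

interface SqlStatement {
  text: string;
  values: unknown[];
}

// One INSERT covering every part. On conflict, the row is rewritten only when
// something changed, so created_at is preserved and no-op saves are skipped.
function buildUpsert(rows: PartRow[]): SqlStatement {
  if (rows.length === 0) throw new Error("buildUpsert needs at least one row");
  const values: unknown[] = [];
  const tuples = rows.map((row) => {
    const base = values.length;
    values.push(row.messageId, row.idx, row.type, row.content);
    return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4})`;
  });
  const text =
    `INSERT INTO message_parts (message_id, idx, type, content) ` +
    `VALUES ${tuples.join(", ")} ` +
    `ON CONFLICT (message_id, idx) DO UPDATE SET ` +
    `type = EXCLUDED.type, content = EXCLUDED.content, updated_at = now() ` +
    `WHERE message_parts.content IS DISTINCT FROM EXCLUDED.content ` +
    `OR message_parts.type IS DISTINCT FROM EXCLUDED.type`;
  return { text, values };
}

// Remove rows left over from a previous, longer version of the message.
function buildTrailingDelete(messageId: string, partCount: number): SqlStatement {
  return {
    text: "DELETE FROM message_parts WHERE message_id = $1 AND idx >= $2",
    values: [messageId, partCount],
  };
}
```

Running both statements in the same transaction as the `thread_messages` write keeps the two tables in step: the upsert covers indexes `0..n-1`, and the delete drops anything at index `n` or above.
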
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 2 files (changes from recent commits).

Comment thread apps/mesh/src/storage/threads.ts Outdated
- Introduced a constant to limit the number of rows inserted into the `message_parts` table, ensuring compliance with Postgres' bind-parameter ceiling.
- Updated the insertion logic in `SqlThreadStorage` to chunk message parts into manageable sizes during batch inserts, enhancing performance and stability.
- Added `created_at` and `updated_at` columns to the `message_parts` table to track part-level timestamps.
- Updated the backfill logic to inherit the message's `created_at` timestamp for backfilled parts.
- Improved the insertion logic in `SqlThreadStorage` to maintain the integrity of timestamps during updates.
- Enhanced comments for clarity on the handling of timestamps and dual-write behavior.
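
The bind-parameter ceiling mentioned in the first bullet above comes from the Postgres wire protocol, which carries the parameter count of a statement as a 16-bit value, so a single statement can bind at most 65535 parameters. A minimal sketch of the chunking arithmetic follows; the constant and function names are hypothetical, and the row cap actually chosen in the PR is not shown on this page.

```typescript
// Postgres allows at most 65535 bind parameters per statement.
const MAX_BIND_PARAMS = 65535;

// Largest number of rows that fits in one multi-row INSERT.
function maxRowsPerInsert(columnsPerRow: number): number {
  if (!Number.isInteger(columnsPerRow) || columnsPerRow < 1) {
    throw new Error("columnsPerRow must be a positive integer");
  }
  return Math.floor(MAX_BIND_PARAMS / columnsPerRow);
}

// Split items into consecutive slices of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

With six bound columns per row, for example, `maxRowsPerInsert(6)` caps each statement at 10922 rows; a fixed, lower constant works equally well and keeps individual statements small.
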