Skip to content

feat(connections): private connections + typed agent slots#3518

Open
tlgimenes wants to merge 101 commits into
mainfrom
tlgimenes/list-github-tools
Open

feat(connections): private connections + typed agent slots#3518
tlgimenes wants to merge 101 commits into
mainfrom
tlgimenes/list-github-tools

Conversation

@tlgimenes
Copy link
Copy Markdown
Contributor

@tlgimenes tlgimenes commented May 28, 2026

Summary

  • Per-user private connections: every new mcp_connection defaults to access='user' (private to its creator); only the creator + org-shared rows are visible to other tools/UI.
  • Typed slots on agents: connection_aggregations.slot_app_id lets an agent declare "I need an mcp-github" without binding to a specific connection. At runtime the slot resolves to the calling user's connection of that shape (or an org-shared one as fallback). Triggers/automations resolve via the automation's created_by.
  • GitHub import bug fix: the "Import from GitHub" picker no longer leaks teammates' personal GitHub accounts. It now calls a new app-only CONNECTION_RESOLVE_FOR_USER tool that returns only the caller's own mcp-github connection.

Why

/user/installations on GitHub returns installations the authenticated user has any read access to — including teammates' personal accounts when collaborators are shared. Because Studio kept a single OAuth token per connection (migration 017 explicitly consolidated to this), opening the picker as one teammate showed installations belonging to another. This PR fixes that at the source: each user mints their own private connection and the resolver hands the picker the right one.

What landed

Phase Commit What
1 f97d1201e Migration 097: connections.access + backfill, connection_aggregations.slot_app_id + XOR CHECK, R4 partial unique index. Schema-only.
2 f3bb9e6f0 Surface slots array on VirtualMCPEntity (+ fix Phase 1 RawAggregationRow drift).
2 59403a0bb Fix VirtualMCPStorage.update preserving the untouched field across single-field updates.
2 6e65b4bb0 apps/mesh/src/core/slot-resolver.tsresolveSlot, SlotResolutionCache, SlotUnresolvedError.
2 d12fc9059 Wire resolver into createVirtualClientFrom, emit OTel span attributes.
3 d79ddec9f ConnectionEntity.access field + ConnectionStorage.list/findById take a viewerUserId and hide other users' private rows.
3 bf9c6ef1e CONNECTION_RESOLVE_FOR_USER app-only tool.
4 63e5cadd1 github-repo-picker and add-storefront-modal use the resolver instead of useConnections ambiguity.

Test plan

  • Unit + integration tests pass (~170 added/touched cases across migrations, storage, resolver, virtual-MCP client, connection tools).
  • bun run check clean across all workspaces.
  • bun run fmt:check clean.
  • Manual browser test: opened "Import from GitHub" → new mcp-github connection created with access='user' in DB. Seeded a second teammate's private mcp-github connection in the same org and verified from the authenticated session that both CONNECTION_RESOLVE_FOR_USER and COLLECTION_CONNECTIONS_LIST return only the calling user's row.
  • OAuth handoff to github.com → full end-to-end install (requires real GitHub login; out of scope for the automated check).

Known follow-ups (non-blocking)

  • Phase 3b UI: connection list tabs (My / Org-shared), 🔒/🌐 row badges, promote/demote modal that deletes the downstream token on demote.
  • Phase 5: org-admin nudge banner listing connections still on access='org'.
  • Polish: OTel attribute key cardinality in createVirtualClientFrom (currently embeds app_id in the key — switch to span events or stable keys); extract a partitionAggregations helper to dedupe the partition loop between deserializeVirtualMCPEntity and update; extract a resolveSlotsForAgent helper to slim down the inlined ~70 LOC in createVirtualClientFrom.
  • Latent: RawAggregationRow.child_connection_id is honest about nullability now; the local-type cast at virtual.ts:50 for the raw selectAll() results survives an extra as RawAggregationRow[] cast — safe today, worth tightening when slot-aware read paths multiply.

🤖 Generated with Claude Code


Update — Just-in-time connection gate for agents & subagents (commits 29b5b78d839c624ee7)

The earlier slot work gated the parent agent up front (block the composer until the user's slots resolve). But a parent can delegate to any agent via the subtask tool, and those subagents' slots were never gated — so they blew up mid-conversation with a generic "Subtask failed". This follow-up replaces the up-front gate with one runtime mechanism shared by both flows.

Design/Plan: docs/superpowers/specs/2026-05-31-just-in-time-connection-gate-design.md, docs/superpowers/plans/2026-05-31-just-in-time-connection-gate.md.

What changed

  • SlotUnresolvedError now collects all missing app_ids + agent identity; createVirtualClientFrom resolves every slot and throws once.
  • Both boundaries catch it and emit a single data-connect-required stream chunk: the subagent boundary (subtask.ts, keyed by toolCallId) and the parent boundary (decopilot harness index.ts). The model also gets plain-text so it doesn't blindly retry.
  • Frontend: new ConnectCard (reuses ConnectSlotRow + useSlotAppDisplays; Retry re-runs the last turn) rendered inline as a visible part; the up-front ConnectAgentGate + useUnresolvedSlots are removed.
  • Critical fix (found in adversarial review): the parent path's bare return emitted no finish chunk → the client hung in "streaming" forever and the run logged as failed. It now emits a well-formed start → text → data-connect-requiredfinish envelope, and resolveThreadStatus maps a connect-required part to requires_action (no false failure / failure sound). Also fixed a ConnectCard crash in the read-only monitoring route (useOptionalChatStream).

Test plan

  • Unit: SlotUnresolvedError collect-all + agent identity; subtask SlotUnresolvedError catch path; resolveThreadStatus connect-required → requires_action. bun test green (44 cases across the touched files).
  • bun run check clean (all workspaces); bun run fmt:check clean; lint 0 errors; knip clean.
  • Whole-branch adversarial review (multi-agent) + independent re-verification of the critical finish-envelope path against the AI SDK internals.
  • E2E (apps/mesh/e2e/tests/connect-card.spec.ts): deterministic parent-gate path (slot unresolved → data-connect-required part + finish chunk + clean requires_action status) — passes. The subagent path depends on the real LLM choosing subtask (non-deterministic) and is covered by the subtask.ts unit test instead.

🤖 Generated with Claude Code


Summary by cubic

Adds per-user private connections and typed agent slots with runtime resolution and an inline Connect card for agents and subagents. Fixes the GitHub import privacy leak and stuck installs by resolving the caller’s own GitHub connection by app_id; also merges latest main and links @decocms/std.

  • New Features

    • Runtime slot resolution for Virtual MCPs via resolveSlot with SlotResolutionCache; SlotUnresolvedError now collects all missing app_ids and agent identity.
    • Just‑in‑time connect: unresolved slots emit a data-connect-required part rendered as a chat ConnectCard with registry icon/name and inline OAuth (connectApp + useConnectApp + useSlotAppDisplays); thread status is requires_action. Subagents (via subtask) use the same flow.
    • Connections UI: settings and AddConnectionDialog split into All/Shared/Personal tabs; lightweight per‑item enable/disable toggle replaces the dependency dialog; “Import from GitHub” always shown.
    • Private-by-default connections (connections.access); ConnectionStorage.findById/list require a viewer and support INTERNAL_VIEWER. Centralized token save (persistDownstreamToken) and OAuth identity decoration (oauth-identity) update titles like “Gmail (alice@acme.com)”.
    • Agents/data/infra: Virtual MCPs are org‑scoped; private connections can’t attach as concrete children; connectionAttachTarget auto‑picks slot vs child. Migrations 097–098 add connections.access, connection_aggregations.slot_app_id, flip existing rows to org access, and backfill/stabilize app_ids via deriveAppId. New tool CONNECTION_RESOLVE_FOR_USER; hooks useResolveConnectionForUser; constant GITHUB_APP_ID.
  • Bug Fixes

    • GitHub import resolves by app_id (GITHUB_APP_ID) through CONNECTION_RESOLVE_FOR_USER and attaches as a typed slot, fixing cross‑user leaks and the stuck picker.
    • Parent/subagent connect-required paths emit a well‑formed start/text/data-connect-required/finish envelope; resolveThreadStatus maps to requires_action. Standardized @/ imports keep instanceof checks stable under hot reload.
    • Cross-user safety: all routes/tools pass a viewer to connections.findById/list; trusted infra uses INTERNAL_VIEWER.

Written for commit 649c4f8. Summary will update on new commits.

Review in cubic

@github-actions
Copy link
Copy Markdown
Contributor

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

Reaction Action
👍 Run quick benchmark (10 & 128 tools)

Benchmark will run on the next push after you react.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

Release Options

Suggested: Minor (2.376.0) — based on feat: prefix

React with an emoji to override the release type:

Reaction Type Next Version
👍 Prerelease 2.375.3-alpha.1
🎉 Patch 2.375.3
❤️ Minor 2.376.0
🚀 Major 3.0.0

Current version: 2.375.2

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 issues found

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/add-storefront-modal.tsx">

<violation number="1" location="apps/mesh/src/web/components/add-storefront-modal.tsx:246">
P2: Duplicate connection resolver query logic in BrowseFlow and UrlEntry; extract a shared hook to reduce drift risk in this privacy-sensitive path.</violation>

<violation number="2" location="apps/mesh/src/web/components/add-storefront-modal.tsx:302">
P1: BrowseFlow can get stuck in a perpetual "installing" UI when CONNECTION_RESOLVE_FOR_USER fails because resolver error states are not handled.</violation>
</file>

Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.

Re-trigger cubic

Comment thread apps/mesh/src/storage/connection.ts Outdated
@@ -20,19 +20,17 @@ import {
} from "@deco/ui/components/dialog.tsx";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: BrowseFlow can get stuck in a perpetual "installing" UI when CONNECTION_RESOLVE_FOR_USER fails because resolver error states are not handled.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/add-storefront-modal.tsx, line 302:

<comment>BrowseFlow can get stuck in a perpetual "installing" UI when CONNECTION_RESOLVE_FOR_USER fails because resolver error states are not handled.</comment>

<file context>
@@ -279,7 +299,7 @@ function BrowseFlow({
   }
 
-  if (githubConnections.length === 0 && autoInstall.status === "idle") {
+  if (resolvedConnectionId === null) {
     return (
       <AutoInstallGitHubUI
</file context>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a real issue (and cubic mislabeled the file — the diff_hunk actually targets apps/mesh/src/web/components/github-repo-picker.tsx, not add-storefront-modal.tsx which does not exist in this repo).

In PickerContent the resolver error state is handled explicitly before the "installing" fallback:

if (resolveQuery.isError) {
  return (
    <AutoInstallGitHubUI
      status="error"
      error={resolveQuery.error instanceof Error ? resolveQuery.error.message : "Failed to resolve GitHub connection."}
      retry={() => { resolveQuery.refetch(); }}
    />
  );
}

if (resolveQuery.isLoading || resolvedConnectionId === null) {
  return <AutoInstallGitHubUI status="installing" ... />;
}

InstallationPicker does the same (lines 529-543). On resolver failure the user sees the "Connection failed" UI with a Try again button — no perpetual "installing" state. Additionally, useAutoInstallGitHub is gated on resolveQuery.isSuccess && resolvedConnectionId === null, so it never fires on resolver error.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback.

Comment thread apps/mesh/src/storage/virtual.ts Outdated
Comment thread apps/mesh/src/web/components/github-repo-picker.tsx Outdated
Comment thread packages/mesh-sdk/src/types/virtual-mcp.ts
Comment thread apps/mesh/src/tools/connection/delete.ts
// Resolve the caller's own mcp-github connection (user-private preferred,
// org-shared fallback). Avoids the cross-user leak in
// useConnections({ slug: 'mcp-github' }).
const resolveQuery = useQuery({
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Duplicate connection resolver query logic in BrowseFlow and UrlEntry; extract a shared hook to reduce drift risk in this privacy-sensitive path.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/add-storefront-modal.tsx, line 246:

<comment>Duplicate connection resolver query logic in BrowseFlow and UrlEntry; extract a shared hook to reduce drift risk in this privacy-sensitive path.</comment>

<file context>
@@ -239,17 +234,42 @@ function BrowseFlow({
+  // Resolve the caller's own mcp-github connection (user-private preferred,
+  // org-shared fallback). Avoids the cross-user leak in
+  // useConnections({ slug: 'mcp-github' }).
+  const resolveQuery = useQuery({
+    queryKey: KEYS.connectionResolveForUser(org.id, "mcp-github"),
+    queryFn: async () => {
</file context>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No-op here: apps/mesh/src/web/components/add-storefront-modal.tsx doesn't exist in this PR or the codebase. The two CONNECTION_RESOLVE_FOR_USER useQuery blocks live in github-repo-picker.tsx (lines 189 and 483), which is being handled by the sibling comment 3314944426.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've saved this as a new learning to improve future reviews.

Comment thread apps/mesh/src/tools/connection/resolve-for-user.ts Outdated
@tlgimenes tlgimenes force-pushed the tlgimenes/list-github-tools branch 8 times, most recently from 2cef9bf to 3c51820 Compare May 30, 2026 12:17
tlgimenes and others added 19 commits May 30, 2026 09:58
Spec for per-user connection ownership with typed slot resolution in
agents. Replaces the single-token-per-connection model that causes
cross-contaminated GitHub installations in the import picker. New
connections default to user-private; existing rows backfill to
org-shared.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implementation plan for the schema-only first PR: connections.access
column with M4 backfill, connection_aggregations.slot_app_id + XOR
CHECK, partial unique indexes for R4 and slot uniqueness, plus Kysely
type updates. Two tasks (migration + regression check), TDD-style.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds connections.access ('user' | 'org'; existing rows → 'org', new
rows → 'user') and connection_aggregations.slot_app_id (with relaxed
child_connection_id and XOR CHECK). Adds partial unique indexes for
R4 (one user-private connection per app_id per user) and slot
uniqueness per agent.

Schema-only — no runtime behavior change. Foundation for the
private-connections design (docs/superpowers/specs/2026-05-27-private-connections-design.md).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implementation plan for slot resolution at runtime. Four tasks:
(1) surface slot rows on VirtualMCPEntity + clean up Phase 1 type
drift, (2) implement the pure-function resolver + cache + typed error,
(3) wire into createVirtualClient with OTel attributes, (4) regression
sweep. Trigger/automation T1 wiring is already in place via
meshContextFactory — no extra code needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds VirtualMCPSlot to the mesh-sdk type and a parallel `slots` array
on VirtualMCPEntity / Create / Update inputs. VirtualMCPStorage
partitions connection_aggregations rows into concrete `connections`
(child_connection_id set) and `slots` (slot_app_id set), using the
XOR CHECK from migration 097. Also fixes the Phase 1 follow-ups:
RawAggregationRow.child_connection_id is now properly typed as
string | null, and the deserialization read path no longer treats
null as a concrete connection_id.

The update path now preserves the un-touched field (loadExistingConnections /
loadExistingSlots) so a caller that updates only connections does not
wipe slots, and vice versa. Slot inserts ride the same direct-row
delete/re-insert cycle as concrete connections.

Test fixtures (passthrough-client.test, github-repo.test) and the
well-known agent helper (defineWellKnownAgentVMCP) updated to
include `slots: []` where required by the new entity shape.

Storage layer only — runtime resolution lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous implementation tried to preserve slots when only
connections were updated (and vice versa) via loadExistingSlots /
loadExistingConnections helpers — but these helpers read from
connection_aggregations AFTER the DELETE had already wiped the rows,
so they always returned []. Update with only one field silently
dropped the other.

Fix: expand the pre-DELETE currentAggs snapshot to .selectAll() and
partition it client-side into existingConnections / existingSlots.
Use those snapshots as the fallback when the corresponding field is
omitted from the input.

Adds two tests covering the single-field-update path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure-function resolver that translates a typed slot
(organizationId, invokerUserId, app_id) into a concrete connection_id:
prefers caller's user-private connection, falls back to org-shared,
returns null when neither matches. Includes SlotUnresolvedError for
callers that want to throw, and SlotResolutionCache for per-run
deduplication.

No runtime caller yet — wiring lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
createVirtualClient now resolves every slot on the agent before
constructing the PassthroughClient. Slot resolution uses
ctx.auth.user.id (or ctx.auth.apiKey.userId) as the invoker. Resolved
connections are appended to the passthrough's children, so downstream
tool dispatch is unchanged. Unresolved slots throw SlotUnresolvedError
carrying the missing app_id so the UI can prompt the user to connect.

OpenTelemetry span attributes (slot.<app>.connection_id,
slot.<app>.access) are emitted per resolution for debuggability.

Trigger/automation runs already pass automation.created_by as the
identity into meshContextFactory (dbos-workflow.ts:171), so the T1
rule (slot resolves to trigger owner) requires no extra wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…er fix

Tight cut focused on landing the motivating bug fix. Adds access to
the tool surface and storage list filtering (Phase 3a), then
CONNECTION_RESOLVE_FOR_USER and the github-repo-picker refactor
(Phase 4). Tabs/badges/promote-demote UI (Phase 3b) and the admin
banner (Phase 5) are deferred to a follow-up plan — not required for
the demo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… filtering

ConnectionEntitySchema now has access ('user' | 'org', default 'user').
ConnectionStorage.list and findById accept a viewerUserId; user-private
connections are hidden from other users. CONNECTION_CREATE and
CONNECTION_UPDATE accept access via the existing partial schemas.

Backend foundation for the GitHub-import refactor — the picker can now
trust that 'connections this user can see' excludes teammates' private
GitHub accounts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thin wrapper around the Phase 2 slot resolver that lets UI surfaces
fetch the caller's user-private connection for a given app_id (with
org-shared fallback). Powers the upcoming GitHub-import picker
refactor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 'Import from GitHub' picker now resolves the caller's own
mcp-github connection via the new app-only resolver tool, instead of
picking ambiguously from useConnections({slug:'mcp-github'}). Each
user sees only their own GitHub installations — the cross-contamination
bug where teammates' personal accounts appeared in the picker is now
fixed at the source.

New mcp-github connections are created with the DB default
access='user' (private to creator) so this fix takes effect without
any explicit caller-side change.

The same refactor is applied to add-storefront-modal (BrowseFlow and
UrlEntry), which had a parallel ambiguous-connection bug. The
multi-connection picker UI is removed: the resolver makes the choice
unambiguous, and slot semantics keep it that way going forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The spec and phase plans were working documents that drove the
implementation; the change itself is captured by the feature commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Make ConnectionViewer required (string | null | INTERNAL_VIEWER symbol)
  on findById/list to close the fail-open per-user visibility gap
- Extract partitionAggregations helper in virtual.ts storage
- Extract useResolveConnectionForUser hook for github-repo-picker
- Share SelectionFilterFieldsShape between connection and slot schemas
- Use getUserId(ctx) consistently in connection delete
- Enforce non-empty app_id in CONNECTION_RESOLVE_FOR_USER

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
These DB-backed tests were written against the pre-#3528 PGlite test API
(`../database/test-db`, `createTestSchema`, `seedCommonTestFixtures`),
which main removed when it moved storage tests to real Postgres. After
rebasing, the dead imports broke knip, typecheck, and the unit test job.

Convert all five to the real-Postgres harness, matching the established
087–092 / connection.integration.test.ts pattern:
- Rename DB-backed files to *.integration.test.ts so they route to the
  storage-integration workflow and are excluded from the unit job.
- Use connectTestPgDatabase / resetTestPgDatabase /
  seedCommonTestPgFixtures / closeTestPgDatabase.
- Split slot-resolver: pure SlotResolutionCache/SlotUnresolvedError tests
  stay as a unit *.test.ts; the resolveSlot DB tests move to
  slot-resolver.integration.test.ts.
- 097 migration test now asserts the already-migrated schema's behavior
  (default, CHECK, R4 unique index, aggregation XOR, slot uniqueness).
  Drop the backfill and `down` rollback cases: rolling the shared
  integration DB to 096 / dropping the access column would corrupt the
  schema for sibling integration files. Mirrors 087–092 (up-only).

Verified: 31 tests pass against real Postgres; knip, fmt:check, and tsc
all clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The GitHub import button was gated behind the client-side
experimental_vibecode preference. Render it unconditionally in both the
create-agent dropdown and the agent browser grid, and remove the now-dead
flag definition along with its (only) Experimental settings toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tlgimenes and others added 8 commits May 30, 2026 09:58
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… gate

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…/disable toggle

Drop the heavyweight DependencySelectionDialog in favor of a simple
EnableToggle switch on each connection/slot card. Enabled = expose
everything (all-null), disabled = expose nothing (all empty arrays),
with isSelectionEnabled tolerant of legacy subset selections.

Extract connectionAttachTarget so the slot-vs-child decision (org →
concrete child, user-private → typed slot keyed by app_id) is shared
between the agent settings tab and the bulk add-to-agent flow. Slot
cards now mirror the concrete connection layout with the same toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…elper

Route the six duplicated oauth-token save flows — connectApp, the add-connection
dialog clone/custom-create paths, the connections page connect, connection
settings re-auth, and GitHub auto-install — through a single best-effort
persistDownstreamToken() helper. Removes ~190 lines of drift-prone duplicated
fallback logic and normalizes GitHub auto-install to the same success-refresh
and raw-token fallback behavior as the other call sites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tles with email

Extract reusable OAuth identity helpers (JWT decoding, email extraction from token
info, title decoration) into a new oauth-identity module. This consolidates
identity-related logic and enables showing the remote account identity (email) in
connection titles, making multiple instances of the same app distinguishable in
the UI (e.g., "Google Gmail (alice@acme.com)" vs "Google Gmail (bob@acme.com)").

The decoration is best-effort/cosmetic and rides along with the token persistence
update, requiring no extra round-trip. Idempotent across re-auth: existing trailing
parenthesized segments are replaced rather than appended.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…r handler

The auth-error recovery path in the per-tool proxy route called
connections.findById with only (connectionId, organizationId), but the
ConnectionStoragePort signature now requires a third ConnectionViewer
argument. Pass the authenticated viewer to match the other call sites in
this file, fixing the TS2554 typecheck failure.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tlgimenes tlgimenes force-pushed the tlgimenes/list-github-tools branch from e34f650 to c954a64 Compare May 30, 2026 12:58
tlgimenes and others added 16 commits May 31, 2026 10:08
…ents

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verifies that createVirtualClientFrom collects all unresolved slot
app_ids and reports every missing one in a single SlotUnresolvedError,
satisfying the primary contract of the collect-all change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ecute

Verifies that when createVirtualClientFrom throws SlotUnresolvedError
(triggered by a null invokerUserId with a virtualMcp that has typed slots),
the execute generator emits a data-connect-required chunk via writer.write
and yields exactly one { text: '', error: '...connect...app-x...', finishReason: 'stop' }
value before terminating cleanly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove the `unresolvedSlots` parameterized overload from query-keys.ts
  (its only caller `use-unresolved-slots.ts` was deleted in Task 9); only
  the `unresolvedSlotsPrefix()` variant remains and is still used.
- Rewrite the `useSlotAppDisplays` JSDoc to drop references to the now-
  deleted `useUnresolvedSlots` hook and the old upfront gate; describes
  the inline ConnectCard context instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The parent connect-gate catch block emitted only a bare data-connect-required
chunk then `return`ed, producing no `start` and no `finish` chunk. Three
defects resulted:

- The client (thread-connection.ts foldSubStream) only closes the SSE
  sub-stream and flips status streaming→ready on a `finish` chunk. With none,
  the UI hung in "streaming" forever.
- dispatch-run.ts onFinish saw finishReason: undefined, so resolveThreadStatus
  fell through to "failed" — persisting a clean connect-card outcome as failed
  and firing the failure sound.
- Without a `start` chunk the assistant message got id "", duplicating the
  card on SSE reconnect (mergeAndSort keys by id).

Emit a complete UI-message-stream envelope: `start` (stable messageId) → a
short terminal text part → data-connect-required → `finish`. Special-case
resolveThreadStatus so a response carrying a data-connect-required part
resolves to "requires_action" regardless of finishReason. Add unit tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dead gate code

Switch ConnectCardInner from useChatStream() (throwing) to
useOptionalChatStream() so persisted assistant messages render read-only
in the monitoring route (no ActiveTaskProvider). The Retry button is only
shown when a stream is active. Also remove the now-dead
unresolvedSlotsPrefix invalidation and key from use-connect-app /
query-keys, and update stale JSDoc in use-connect-app and connect-slot-row
to reflect the JIT ConnectCard model.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9 issues found across 110 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/connect-slot-row.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/connect-slot-row.tsx:50">
P3: `ready` is rendered as "Connecting…" with a spinner, which shows a false in-progress state after a successful connect. Render a completed label for `ready` instead.</violation>
</file>

<file name="apps/mesh/src/storage/derive-app-id.ts">

<violation number="1" location="apps/mesh/src/storage/derive-app-id.ts:80">
P1: `canonicalizeUrl` drops the URL scheme, so `http` and `https` endpoints can collapse to the same derived `app_id`. Include protocol in the canonical form to avoid false service collisions.</violation>

<violation number="2" location="apps/mesh/src/storage/derive-app-id.ts:88">
P2: `npx` package detection is fragile: it can pick flag values instead of the executed positional package.</violation>
</file>

<file name="apps/mesh/src/tools/virtual/assert-concrete-children-org-scoped.ts">

<violation number="1" location="apps/mesh/src/tools/virtual/assert-concrete-children-org-scoped.ts:17">
P3: Connection privacy checks are executed sequentially; run the lookups in parallel to avoid linear DB round-trip latency on create/update.</violation>
</file>

<file name="apps/mesh/migrations/097-connection-access-and-slots.ts">

<violation number="1" location="apps/mesh/migrations/097-connection-access-and-slots.ts:79">
P2: Rollback can fail after slot-based rows are created because `down()` sets `child_connection_id` back to NOT NULL before cleaning up rows where that column is intentionally NULL.</violation>
</file>

<file name="apps/mesh/src/storage/ports.ts">

<violation number="1" location="apps/mesh/src/storage/ports.ts:259">
P1: `findById` allows `organizationId` to be `undefined`, which enables unscoped cross-organization lookups when callers pass optional org context.</violation>
</file>

<file name="apps/mesh/src/api/app.ts">

<violation number="1" location="apps/mesh/src/api/app.ts:256">
P2: Avoid INTERNAL_VIEWER in this request-path lookup; it can pull another user’s private registry connection and use its `project_locator`.

(Based on your team's feedback about restricting INTERNAL_VIEWER to trusted infra and using user/null viewers in user-facing flows.) [FEEDBACK_USED]</violation>
</file>

<file name="apps/mesh/migrations/098-org-scope-connections-and-derive-app-id.ts">

<violation number="1" location="apps/mesh/migrations/098-org-scope-connections-and-derive-app-id.ts:21">
P2: This migration depends on mutable runtime code (`src/storage/derive-app-id`), which makes historical migration results non-deterministic across environments. Snapshot the derivation logic inside the migration (or a versioned migration-local helper) instead.</violation>

<violation number="2" location="apps/mesh/migrations/098-org-scope-connections-and-derive-app-id.ts:56">
P2: Guard the UPDATE with `app_id IS NULL` to avoid clobbering concurrent writes between the read and write steps.</violation>
</file>

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.

Re-trigger cubic

scheme === "https" ? "443" : scheme === "http" ? "80" : "";
const port = u.port && u.port !== defaultPort ? `:${u.port}` : "";
const path = u.pathname.replace(/\/+$/, "");
return `${host}${port}${path}`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: canonicalizeUrl drops the URL scheme, so http and https endpoints can collapse to the same derived app_id. Include protocol in the canonical form to avoid false service collisions.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/storage/derive-app-id.ts, line 80:

<comment>`canonicalizeUrl` drops the URL scheme, so `http` and `https` endpoints can collapse to the same derived `app_id`. Include protocol in the canonical form to avoid false service collisions.</comment>

<file context>
@@ -0,0 +1,99 @@
+    scheme === "https" ? "443" : scheme === "http" ? "80" : "";
+  const port = u.port && u.port !== defaultPort ? `:${u.port}` : "";
+  const path = u.pathname.replace(/\/+$/, "");
+  return `${host}${port}${path}`;
+}
+
</file context>

findById(id: string): Promise<ConnectionEntity | null>;
findById(
id: string,
organizationId: string | undefined,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: findById allows organizationId to be undefined, which enables unscoped cross-organization lookups when callers pass optional org context.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/storage/ports.ts, line 259:

<comment>`findById` allows `organizationId` to be `undefined`, which enables unscoped cross-organization lookups when callers pass optional org context.</comment>

<file context>
@@ -228,18 +228,47 @@ export interface AsyncResearchJobStoragePort {
-  findById(id: string): Promise<ConnectionEntity | null>;
+  findById(
+    id: string,
+    organizationId: string | undefined,
+    viewer: ConnectionViewer,
+  ): Promise<ConnectionEntity | null>;
</file context>

`.execute(db);
await sql`
ALTER TABLE connection_aggregations
ALTER COLUMN child_connection_id SET NOT NULL
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Rollback can fail after slot-based rows are created because down() sets child_connection_id back to NOT NULL before cleaning up rows where that column is intentionally NULL.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/migrations/097-connection-access-and-slots.ts, line 79:

<comment>Rollback can fail after slot-based rows are created because `down()` sets `child_connection_id` back to NOT NULL before cleaning up rows where that column is intentionally NULL.</comment>

<file context>
@@ -0,0 +1,95 @@
+  `.execute(db);
+  await sql`
+    ALTER TABLE connection_aggregations
+      ALTER COLUMN child_connection_id SET NOT NULL
+  `.execute(db);
+  await sql`
</file context>

Comment thread apps/mesh/src/api/app.ts
value: `${DECO_STORE_URL}%`,
},
limit: 1,
viewer: INTERNAL_VIEWER,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Avoid INTERNAL_VIEWER in this request-path lookup; it can pull another user’s private registry connection and use its project_locator.

(Based on your team's feedback about restricting INTERNAL_VIEWER to trusted infra and using user/null viewers in user-facing flows.)

View Feedback

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/api/app.ts, line 256:

<comment>Avoid INTERNAL_VIEWER in this request-path lookup; it can pull another user’s private registry connection and use its `project_locator`.

(Based on your team's feedback about restricting INTERNAL_VIEWER to trusted infra and using user/null viewers in user-facing flows.) </comment>

<file context>
@@ -248,6 +253,7 @@ async function getDecoStoreProjectLocator(
         value: `${DECO_STORE_URL}%`,
       },
       limit: 1,
+      viewer: INTERNAL_VIEWER,
     },
   );
</file context>
Suggested change
viewer: INTERNAL_VIEWER,
viewer: null,

app_id: row.app_id,
});
if (appId) {
await sql`UPDATE connections SET app_id = ${appId} WHERE id = ${row.id}`.execute(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Guard the UPDATE with app_id IS NULL to avoid clobbering concurrent writes between the read and write steps.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/migrations/098-org-scope-connections-and-derive-app-id.ts, line 56:

<comment>Guard the UPDATE with `app_id IS NULL` to avoid clobbering concurrent writes between the read and write steps.</comment>

<file context>
@@ -0,0 +1,71 @@
+      app_id: row.app_id,
+    });
+    if (appId) {
+      await sql`UPDATE connections SET app_id = ${appId} WHERE id = ${row.id}`.execute(
+        db,
+      );
</file context>

*/

import { sql, type Kysely } from "kysely";
import { deriveAppId } from "../src/storage/derive-app-id";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: This migration depends on mutable runtime code (src/storage/derive-app-id), which makes historical migration results non-deterministic across environments. Snapshot the derivation logic inside the migration (or a versioned migration-local helper) instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/migrations/098-org-scope-connections-and-derive-app-id.ts, line 21:

<comment>This migration depends on mutable runtime code (`src/storage/derive-app-id`), which makes historical migration results non-deterministic across environments. Snapshot the derivation logic inside the migration (or a versioned migration-local helper) instead.</comment>

<file context>
@@ -0,0 +1,71 @@
+ */
+
+import { sql, type Kysely } from "kysely";
+import { deriveAppId } from "../src/storage/derive-app-id";
+
+interface ConnRow {
</file context>

const command = (stdio.command ?? "").trim();
const args = (stdio.args ?? []).map((a) => a.trim());
if (command === "npx" || command === "bunx") {
const pkg = args.find((a) => a.length > 0 && !a.startsWith("-"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: npx package detection is fragile: it can pick flag values instead of the executed positional package.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/storage/derive-app-id.ts, line 88:

<comment>`npx` package detection is fragile: it can pick flag values instead of the executed positional package.</comment>

<file context>
@@ -0,0 +1,99 @@
+  const command = (stdio.command ?? "").trim();
+  const args = (stdio.args ?? []).map((a) => a.trim());
+  if (command === "npx" || command === "bunx") {
+    const pkg = args.find((a) => a.length > 0 && !a.startsWith("-"));
+    if (pkg) return `npx:${pkg}`;
+  }
</file context>

disabled={busy}
onClick={() => connect(registryItem)}
>
{status === "connecting" || status === "ready" ? (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: ready is rendered as "Connecting…" with a spinner, which shows a false in-progress state after a successful connect. Render a completed label for ready instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/web/components/chat/connect-slot-row.tsx, line 50:

<comment>`ready` is rendered as "Connecting…" with a spinner, which shows a false in-progress state after a successful connect. Render a completed label for `ready` instead.</comment>

<file context>
@@ -0,0 +1,80 @@
+          disabled={busy}
+          onClick={() => connect(registryItem)}
+        >
+          {status === "connecting" || status === "ready" ? (
+            <>
+              <Loading01 size={12} className="animate-spin" />
</file context>

connectionStorage: ConnectionStoragePort,
organizationId: string,
): Promise<void> {
for (const conn of connections) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: Connection privacy checks are executed sequentially; run the lookups in parallel to avoid linear DB round-trip latency on create/update.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/src/tools/virtual/assert-concrete-children-org-scoped.ts, line 17:

<comment>Connection privacy checks are executed sequentially; run the lookups in parallel to avoid linear DB round-trip latency on create/update.</comment>

<file context>
@@ -0,0 +1,29 @@
+  connectionStorage: ConnectionStoragePort,
+  organizationId: string,
+): Promise<void> {
+  for (const conn of connections) {
+    const c = await connectionStorage.findById(
+      conn.connection_id,
</file context>

tlgimenes and others added 4 commits June 1, 2026 12:38
Adds an end-to-end test for the just-in-time connection gate (parent path).
Creates a Virtual MCP agent with a synthetic unresolved typed slot, seeds a
network-free smart tier so the decopilot run reaches tool assembly, posts a
message, and asserts the run's SSE stream carries a data-connect-required part
(with the missing app_id + agent title) and a finish chunk, and that the thread
resolves to requires_action (not failed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nstanceof

The parent connect-gate catch relies on `err instanceof SlotUnresolvedError`.
The throw site imported via a relative path while the catch imported via the
@/ alias; under Bun --hot these can resolve to duplicate module identities,
making instanceof intermittently false (error escapes -> run logged as failed
instead of showing the connect card). Standardize every importer on the @/
alias so all sites share one module instance.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolved conflicts:
- apps/mesh/migrations/index.ts: union of both sides' migrations. Kept
  097-connection-access-and-slots + 098-org-scope-connections-and-derive-app-id
  (this branch) alongside main's 097-drop-local-docker-sandbox-state. Distinct
  names (no key collision), independent concerns; not renumbered since this
  branch's migrations have already applied to existing environments.
- apps/mesh/src/api/routes/oauth-proxy.ts: additive import conflict — kept both
  this branch's INTERNAL_VIEWER import and main's oauth-proxy-metadata helpers +
  retry/RetryError (all used).

Ran bun install to link main's new @decocms/std workspace package.
Verified: bun run check (all workspaces), 55 unit tests, lint 0 errors, fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… was removed)

The connect-card e2e POSTed sandboxProviderKind: "local-docker", which the
local-docker-sandbox drop (merged from main) removed from the schema enum
(now only "cluster" | "user-desktop"), so CI failed the message POST with
400 invalid_value before reaching the gate under test.

Pin "cluster" instead: it's schema-valid and resolveDispatchTarget routes it
to the in-cluster (loopback) path that needs no user-desktop link daemon — the
same no-link path CI uses — so the decopilot run reaches assembleDecopilotTools
and the parent connect-gate fires. Omitting the field is not safe: the default
can resolve to "user-desktop" and 409.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant