Conversation
There was a problem hiding this comment.
19 issues found across 13 files
Tip: instead of fixing issues one by one fix them all with cubic
Re-trigger cubic
There was a problem hiding this comment.
5 issues found across 12 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/db/prisma/schema.prisma">
<violation number="1" location="packages/db/prisma/schema.prisma:77">
P1: Schema change is missing a matching Prisma migration for the new `Conversation.budget_reset_at` column. Deployments using migrations will keep an older table shape and break `Conversation` reads/writes.</violation>
</file>
Tip: instead of fixing issues one by one fix them all with cubic
Re-trigger cubic
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
|
@greptile-apps Is every thing fine and the pr is safe to merge ? |
|
No, this PR is not safe to merge yet. There are several issues that need to be resolved first, ranging from a critical security vulnerability to correctness bugs: 🔴 Blocker — Must fix before merge1. Unauthenticated approve/reject endpoints (
2. No mutual presence check for A request with neither 3. No LLM fetch timeout ( There is no timeout on the Gemini API call. A slow or hung upstream response will stall the agent loop indefinitely with no way to recover. 🟡 Should fix — Correctness issues4. When a tool hits PENDING, the 5. No status double-check on resume ( When resuming with an 🟢 Lower priority — Clean up before or after merge6. 7. Hardcoded 8. Out-of-order import ( Summary: Items 1–3 are hard blockers. Item 1 in particular is a fundamental security flaw — the human-approval gate is the core value proposition of this feature, and it can be bypassed trivially without auth on the approve/reject endpoints. Tip: You can customize Greptile's behavior for this repo with |
There was a problem hiding this comment.
1 issue found across 3 files (changes from recent commits).
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
Summary by cubic
Adds a policy-gated Gemini agent that plans and runs MCP tools with optional human approval. Ships
/agent/runwith strict validation, a Gemini client with request timeouts, and guardrails like a 30-iteration cap and 3-minute budget auto-reset.New Features
tool_call/final_answer, MCP discovery/exec, schema checks, token tracking, 30-iteration cap; resume enforces approvals (APPROVED runs, PENDING returns PENDING, others DENY).AbortSignal.timeoutwithGEMINI_TIMEOUT_MSoverride; tests cover abort-on-timeout and invalid-timeout fallback.POST /agent/run: validates inputs, requires message or approvalId, role allowlist for history (cap 100); returns updated history.npm run clifor chat and approve/reject; per-session UUIDconversationId; tool execution trace.Migration
GEMINI_API_KEYto.env(auto-loaded viaapps/api/src/utils/env); optionally setGEMINI_TIMEOUT_MS(default 30000).budget_reset_attoConversation. Configure MCP tools.Written for commit e2c1ab7. Summary will update on new commits.
Greptile Summary
This PR introduces a policy-gated Gemini agent that plans and executes MCP tools with optional human approval, wiring it up via a new
POST /agent/runendpoint and supportingapprove/rejectendpoints on the policy router.loop.ts): 30-iteration cap, 3-minute budget auto-reset viabudget_reset_at, approval resume path that checks DB status before reconstructing the tool call step, and accumulated-token tracking flushed to the DB on every exit path.llm.ts): Geminigemini-2.5-flashcall with configurableAbortSignal.timeout, JSON schema validation of tool arguments, and clear error types for malformed responses.index.ts,router.ts):/agent/runvalidatesmessage,conversationId,approvalId(type + non-empty), andhistory(role allowlist, 100-item cap); approve/reject endpoints use atomicupdateManywithstatus: PENDINGcondition to prevent double-transitions.Confidence Score: 3/5
The core agent loop is not safe to merge without fixing the approval record lifecycle: an executor failure after approval leaves the user with no recovery path.
The approval resume path in loop.ts delegates the delete of the approval record to decide() (decision.ts), which removes it atomically before mcpExecutor.execute() is called. A transient MCP failure after the record is deleted leaves the client with a 500 response, no approvalId in the body, and no way to retry — the user must restart the full request and go through human approval again. This is a real, exercisable failure mode on the critical happy path of the new feature.
apps/api/src/agent/loop.ts — the ALLOW branch (lines 176–193) needs attention around executor failure handling after the approval record is deleted by decide().
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant Client participant API as POST /agent/run participant AgentLoop as runAgent participant LLM as Gemini LLM participant Policy as decide participant DB as Database participant MCP as mcpExecutor Client->>API: message + conversationId + history API->>AgentLoop: runAgent(message, conversationId) AgentLoop->>DB: upsert conversation (budget check/reset) AgentLoop->>MCP: discoverTools AgentLoop->>LLM: nextStep(memory, tools) LLM-->>AgentLoop: tool_call response AgentLoop->>Policy: decide(tool_name, args, conversationId) alt PENDING Policy->>DB: approval.create PENDING Policy-->>AgentLoop: PENDING + approvalId AgentLoop-->>Client: status PENDING + approvalId Client->>API: POST /policies/approvals/:id/approve API->>DB: updateMany PENDING to APPROVED Client->>API: resume with approvalId API->>AgentLoop: runAgent(null, conversationId, approvalId) AgentLoop->>DB: approval.findUnique check APPROVED AgentLoop->>Policy: decide with approvalId Policy->>DB: approval.delete Policy-->>AgentLoop: ALLOW AgentLoop->>MCP: execute tool MCP-->>AgentLoop: result AgentLoop->>LLM: nextStep with tool result LLM-->>AgentLoop: final_answer AgentLoop-->>Client: status SUCCESS + answer else ALLOW Policy-->>AgentLoop: ALLOW AgentLoop->>MCP: execute tool MCP-->>AgentLoop: result AgentLoop->>LLM: nextStep continues else DENY Policy-->>AgentLoop: DENY AgentLoop-->>Client: status DENY + reason end%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant Client participant API as POST /agent/run participant AgentLoop as runAgent participant LLM as Gemini LLM participant Policy as decide participant DB as Database participant MCP as mcpExecutor Client->>API: message + conversationId + history API->>AgentLoop: runAgent(message, conversationId) AgentLoop->>DB: upsert conversation (budget check/reset) AgentLoop->>MCP: discoverTools AgentLoop->>LLM: nextStep(memory, tools) LLM-->>AgentLoop: tool_call response AgentLoop->>Policy: decide(tool_name, args, conversationId) alt PENDING Policy->>DB: approval.create PENDING Policy-->>AgentLoop: PENDING + approvalId AgentLoop-->>Client: status PENDING + approvalId Client->>API: POST /policies/approvals/:id/approve API->>DB: updateMany PENDING to APPROVED Client->>API: resume with approvalId API->>AgentLoop: runAgent(null, conversationId, approvalId) AgentLoop->>DB: approval.findUnique check APPROVED AgentLoop->>Policy: decide with approvalId Policy->>DB: approval.delete Policy-->>AgentLoop: ALLOW AgentLoop->>MCP: execute tool MCP-->>AgentLoop: result AgentLoop->>LLM: nextStep with tool result LLM-->>AgentLoop: final_answer AgentLoop-->>Client: status SUCCESS + answer else ALLOW Policy-->>AgentLoop: ALLOW AgentLoop->>MCP: execute tool MCP-->>AgentLoop: result AgentLoop->>LLM: nextStep continues else DENY Policy-->>AgentLoop: DENY AgentLoop-->>Client: status DENY + reason endComments Outside Diff (4)
apps/api/src/policy/router.ts, line 1-2 (link)import { ApprovalStatus }declaration appears mid-file (after all route handler code) rather than at the top. While JS/TS hoisting means this is syntactically valid, it is non-standard and can confuse static analysis tools and readers alike.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
packages/db/prisma/dev.db, line 1 (link)packages/db/prisma/dev.dbis a binary SQLite file that should not be tracked in git. Committing it bloats the repository history on every schema change and could expose local test data to anyone cloning the repo. Add*.db(or specificallyprisma/dev.db) to.gitignoreand rungit rm --cached packages/db/prisma/dev.dbto stop tracking it.apps/api/src/agent/cli.ts, line 374 (link)conversationIdshares state across all CLI sessionsconversationIdis always"cli-conversation-session". Every CLI invocation upserts or reuses the same conversation row in the database, meaning separate test runs share token budgets and conversation history. If two CLI instances run simultaneously they will also interleave their messages into the same conversation. Consider generating a random UUID per session (or accepting it as a CLI argument) to give each session an independent context.apps/api/src/policy/router.ts, line 1352-1375 (link)POST /policies/approvals/:id/approveandPOST /policies/approvals/:id/rejecthave no authentication or authorization middleware. Any unauthenticated caller that can reach the API can approve an arbitrary pending tool execution, entirely bypassing the human-in-the-loop policy gate that is the core purpose of this feature. An attacker or misconfigured client just needs to know (or guess) a validapprovalId— which is a UUID, but IDs are often observable from prior API responses.Reviews (5): Last reviewed commit: "fix(gemimi-timeout) : fixed timout issue" | Re-trigger Greptile