diff --git a/README.md b/README.md index 9ac7b29..98b6002 100644 --- a/README.md +++ b/README.md @@ -16,11 +16,14 @@ This Turborepo includes the following packages/apps: ### Apps and Packages -- `docs`: a [Next.js](https://nextjs.org/) app -- `web`: another [Next.js](https://nextjs.org/) app -- `@repo/ui`: a stub React component library shared by both `web` and `docs` applications -- `@repo/eslint-config`: `eslint` configurations (includes `eslint-config-next` and `eslint-config-prettier`) -- `@repo/typescript-config`: `tsconfig.json`s used throughout the monorepo +- `apps/api`: Express API server managing MCP server registrations, tool executions, logging, and policy evaluations. +- `apps/web`: Next.js web application interface. +- `apps/file-manager-mcp`: Custom Model Context Protocol (MCP) server providing secure sandboxed filesystem operations. +- `packages/db` ([@repo/db](./packages/db/README.md)): Prisma database library managing schemas, migrations, and SQLite instance. (See the [Schema Architecture Documentation](./packages/db/README.md)). +- `packages/shared` ([@repo/shared](./packages/shared)): Shared utility functions and formatting helpers. +- `packages/ui` ([@repo/ui](./packages/ui)): React UI components shared by the applications. +- `packages/eslint-config` ([@repo/eslint-config](./packages/eslint-config)): Monorepo ESLint configurations. +- `packages/typescript-config` ([@repo/typescript-config](./packages/typescript-config)): TSConfig setups. Each package/app is 100% [TypeScript](https://www.typescriptlang.org/). diff --git a/apps/api/mcp/errors.md b/apps/api/mcp/errors.md index 4c10ef4..42ba114 100644 --- a/apps/api/mcp/errors.md +++ b/apps/api/mcp/errors.md @@ -7,17 +7,18 @@ This document explains the architecture and implementation of error handling, st The gateway implements a custom error class, `AppError` (defined in [types.ts](../types.ts)), which extends the native `Error` class and introduces a `statusCode` field. This model enables: + 1. 
**Decoupled Error Classification**: The core logic (e.g., input validation, tool execution, and registry lookup) determines the semantic meaning of a failure and assigns the appropriate HTTP status code at the throw site. 2. **Safe Route Mapping**: The Express routing layer does not parse message strings or substrings. Instead, it inspects whether the caught error is an instance of `AppError` and maps it directly using `error.statusCode`. ### Status Code Mapping Table -| Error Scenario | HTTP Status Code | Thrown From | Exception Class | -|---|---|---|---| -| Invalid toolName type | 400 Bad Request | `ToolExecutor.execute` | `AppError` | -| Empty toolName value | 400 Bad Request | `ToolExecutor.execute` | `AppError` | -| Policy decision is not ALLOW | 403 Forbidden | `ToolExecutor.execute` | `AppError` | -| Requested tool is not registered | 404 Not Found | `ToolExecutor.execute` | `AppError` | +| Error Scenario | HTTP Status Code | Thrown From | Exception Class | +| -------------------------------- | ------------------------- | ---------------------------- | ---------------- | +| Invalid toolName type | 400 Bad Request | `ToolExecutor.execute` | `AppError` | +| Empty toolName value | 400 Bad Request | `ToolExecutor.execute` | `AppError` | +| Policy decision is not ALLOW | 403 Forbidden | `ToolExecutor.execute` | `AppError` | +| Requested tool is not registered | 404 Not Found | `ToolExecutor.execute` | `AppError` | | Internal service crash / timeout | 500 Internal Server Error | Subprocess spawn / transport | Standard `Error` | --- @@ -25,16 +26,21 @@ This model enables: ## Security Mitigations ### 1. Substring Collision Prevention + Using `instanceof AppError` and `error.statusCode` prevents routing bugs where user-supplied parameters (like a tool name consisting of the substring `"must be a"` or a decision value containing `"Tool not found"`) would accidentally match Express error handling filters. ### 2. 
Information Leakage Prevention (CWE-209) + Any exception that is not a subclass of `AppError` (such as a database connection pool timeout, subprocess exit code error, or network failure) is treated as an internal error. The Express route handler intercepts these, logs the raw error to `stderr` for internal auditing, and returns a generic response payload to the client: + ```json { "error": "Failed to execute tool" } ``` + This masks server internals and prevents stack trace details from leaking to external clients. ### 3. Execution Timeout and Resource Cleanups + If a tool execution takes longer than the configured timeout, the request is aborted using `Promise.race()` and a standard Timeout `Error` is thrown, which naturally maps to a `500` response. The pending timer is cleaned up in a `finally` block to prevent timer leakages. diff --git a/apps/api/mcp/execute.ts b/apps/api/mcp/execute.ts index e6cb20c..e5695c4 100644 --- a/apps/api/mcp/execute.ts +++ b/apps/api/mcp/execute.ts @@ -46,7 +46,10 @@ export class ToolExecutor { } if (decision !== "ALLOW") { - const error = new AppError(403, `Tool execution rejected with decision: ${decision}`); + const error = new AppError( + 403, + `Tool execution rejected with decision: ${decision}`, + ); logger.error("Tool execution failed: Denied by policy", { tool_name: toolName, decision, diff --git a/apps/api/mcp/mcp.test.ts b/apps/api/mcp/mcp.test.ts index 8fa9462..703edc2 100644 --- a/apps/api/mcp/mcp.test.ts +++ b/apps/api/mcp/mcp.test.ts @@ -9,7 +9,6 @@ import { fileURLToPath } from "url"; import { StdioMCPServer } from "./stdio-server.js"; import { logger } from "./logger.js"; - const __filename = fileURLToPath(import.meta.url); const __dirname = path.dirname(__filename); @@ -277,7 +276,6 @@ describe("MCP Production-Ready Module", () => { }); }); - describe("ToolExecutor Execution & Safeness", () => { it("should execute a registered tool successfully (Happy Path)", async () => { const mockTool: Tool = { @@ -318,10 +316,12 
@@ describe("MCP Production-Ready Module", () => { "mathAdd", { a: 2, b: 3 }, { decision: "DENY", conversationId: "convDenied" }, - ) + ), ).rejects.toThrow("Tool execution rejected with decision: DENY"); - const errLog = loggedItems.find((log) => log.level === "error" && log.conversation_id === "convDenied"); + const errLog = loggedItems.find( + (log) => log.level === "error" && log.conversation_id === "convDenied", + ); expect(errLog).toBeDefined(); expect(errLog.message).toBe("Tool execution failed: Denied by policy"); expect(errLog.decision).toBe("DENY"); @@ -506,9 +506,21 @@ describe("MCP Production-Ready Module", () => { describe("Logger Metadata Protection", () => { it("should prevent overriding core properties via meta argument", () => { - logger.info("Main message", { level: "hacked", message: "spoofed message", extra: "valid" }); - logger.warn("Main message", { level: "hacked", message: "spoofed message", extra: "valid" }); - logger.error("Main message", { level: "hacked", message: "spoofed message", extra: "valid" }); + logger.info("Main message", { + level: "hacked", + message: "spoofed message", + extra: "valid", + }); + logger.warn("Main message", { + level: "hacked", + message: "spoofed message", + extra: "valid", + }); + logger.error("Main message", { + level: "hacked", + message: "spoofed message", + extra: "valid", + }); const infoLogs = loggedItems.filter((log) => log.extra === "valid"); expect(infoLogs.length).toBe(3); @@ -524,4 +536,3 @@ describe("MCP Production-Ready Module", () => { }); }); }); - diff --git a/apps/api/src/index.ts b/apps/api/src/index.ts index 047d1c1..186bd6e 100644 --- a/apps/api/src/index.ts +++ b/apps/api/src/index.ts @@ -4,6 +4,7 @@ import dotenv from "dotenv"; import { formatDate } from "@repo/shared"; import { mcpDiscovery, mcpExecutor } from "../mcp/bootstrap.js"; import { AppError } from "../types.js"; +import policiesRouter from "./policy/router.js"; dotenv.config(); @@ -12,6 +13,7 @@ const port = process.env.PORT 
|| 3001; app.use(cors()); app.use(express.json({ limit: "1mb" })); +app.use(policiesRouter); // Health check endpoint app.get("/health", (req, res) => { diff --git a/apps/api/src/policy/decision.ts b/apps/api/src/policy/decision.ts new file mode 100644 index 0000000..b22b282 --- /dev/null +++ b/apps/api/src/policy/decision.ts @@ -0,0 +1,108 @@ +import { db, ApprovalStatus } from "@repo/db"; +import PolicyEngine from "./engine.js"; +import type { ApprovalRequest, ConversationRequest } from "../../types.js"; + +export type Decision = "ALLOW" | "DENY" | "PENDING"; + +export interface DecisionResult { + decision: Decision; + reason?: string; +} + +export async function decide( + context: ApprovalRequest, + conversation: ConversationRequest, +): Promise<DecisionResult> { + try { + // Step 1: Call PolicyEngine + const policy = await PolicyEngine(context, conversation); + + // Step 2: Policy denied and does not require approval + if (!policy.allowed && !policy.requiresApproval) { + return { + decision: "DENY", + reason: policy.reason, + }; + } + + // Step 3: Policy requires human approval + if (policy.requiresApproval) { + if (!context.approvalId) { + const created = await db.approval.create({ + data: { + tool_name: context.tool_name, + arguments: context.arguments as any, + status: ApprovalStatus.PENDING, + }, + }); + + return { + decision: "PENDING", + reason: created.id, + }; + } + + const approval = await db.approval.findUnique({ + where: { + id: context.approvalId, + }, + }); + + if (!approval) { + return { + decision: "DENY", + reason: "Approval not found", + }; + } + + if (approval.tool_name !== context.tool_name) { + return { + decision: "DENY", + reason: "Approval tool name mismatch", + }; + } + + switch (approval.status) { + case ApprovalStatus.APPROVED: + await db.approval.delete({ + where: { id: approval.id }, + }); + return { + decision: "ALLOW", + }; + case ApprovalStatus.PENDING: + return { + decision: "PENDING", + }; + case ApprovalStatus.REJECTED: + return { + decision:
"DENY", + reason: "Approval rejected", + }; + default: + return { + decision: "DENY", + reason: "Unrecognized approval status", + }; + } + } + + // Step 4: Policy allowed + if (policy.allowed) { + return { + decision: "ALLOW", + }; + } + + // Fallback/Safety return + return { + decision: "DENY", + reason: "Unrecognized policy state", + }; + } catch (error) { + return { + decision: "DENY", + reason: "Decision engine failure", + }; + } +} diff --git a/apps/api/src/policy/engine.ts b/apps/api/src/policy/engine.ts new file mode 100644 index 0000000..4e5f712 --- /dev/null +++ b/apps/api/src/policy/engine.ts @@ -0,0 +1,107 @@ +import needsApproval from "./rules/approval.js"; +import isblocked from "./rules/block.js"; +import budgetExceeded from "./rules/budget.js"; +import type { ApprovalRequest, ConversationRequest } from "../../types.js"; +import { db } from "@repo/db"; +import { logger } from "../../mcp/logger.js"; + +export interface PolicyEngineResult { + allowed: boolean; + requiresApproval: boolean; + reason?: string; +} + +export default async function PolicyEngine( + context: ApprovalRequest, + conversation: ConversationRequest, +): Promise<PolicyEngineResult> { + let policy; + try { + const tool_name = context.tool_name; + + // Fetch the policy record once to prevent double lookup and TOCTOU race conditions + policy = await db.policy.findUnique({ + where: { tool_name }, + }); + } catch (error: any) { + logger.error("Failed to query policy table in PolicyEngine pre-fetch", { + tool_name: context.tool_name, + error_message: error.message || String(error), + }); + return { + allowed: false, + requiresApproval: false, + reason: "Failed to query policy table", + }; + } + + try { + const tool_name = context.tool_name; + + // 1.
Block Check + const blockedResult = await isblocked(tool_name, policy); + if (!blockedResult.success) { + return { + allowed: false, + requiresApproval: false, + reason: blockedResult.reason, + }; + } + if (blockedResult.result) { + return { + allowed: false, + requiresApproval: false, + reason: blockedResult.reason, + }; + } + + // 2. Budget Check + const budgetResult = await budgetExceeded( + conversation.conversationId, + conversation.token, + ); + if (!budgetResult.success) { + return { + allowed: false, + requiresApproval: false, + reason: budgetResult.reason, + }; + } + if (budgetResult.result) { + return { + allowed: false, + requiresApproval: false, + reason: budgetResult.reason, + }; + } + + // 3. Approval Check + const approvalResult = await needsApproval(tool_name, policy); + if (!approvalResult.success) { + return { + allowed: false, + requiresApproval: false, + reason: approvalResult.reason, + }; + } + if (approvalResult.result) { + return { + allowed: false, + requiresApproval: true, + reason: approvalResult.reason, + }; + } + + // 4. Default Success (Allowed) + return { + allowed: true, + requiresApproval: false, + }; + } catch (error: any) { + return { + allowed: false, + requiresApproval: false, + reason: error instanceof Error ? error.message : String(error), + }; + } +} diff --git a/apps/api/src/policy/policy.test.ts b/apps/api/src/policy/policy.test.ts new file mode 100644 index 0000000..3f39832 --- /dev/null +++ b/apps/api/src/policy/policy.test.ts @@ -0,0 +1,452 @@ +import { vi, describe, it, expect, beforeEach } from "vitest"; +import { Request, Response } from "express"; + +// 1. 
Mock @repo/db before imports to prevent real DB queries +vi.mock("@repo/db", () => { + const PolicyAction = { + ALLOW: "ALLOW", + APPROVAL: "APPROVAL", + DENY: "DENY", + }; + const ApprovalStatus = { + PENDING: "PENDING", + APPROVED: "APPROVED", + REJECTED: "REJECTED", + }; + return { + PolicyAction, + ApprovalStatus, + db: { + policy: { + findUnique: vi.fn(), + findMany: vi.fn(), + create: vi.fn(), + update: vi.fn(), + delete: vi.fn(), + }, + conversation: { + findUnique: vi.fn(), + }, + approval: { + findUnique: vi.fn(), + findFirst: vi.fn(), + create: vi.fn(), + delete: vi.fn(), + }, + }, + }; +}); + +// Import db and PolicyAction/ApprovalStatus from the mocked package +import { db, PolicyAction, ApprovalStatus } from "@repo/db"; + +// Import modules to test +import isblocked from "./rules/block.js"; +import budgetExceeded from "./rules/budget.js"; +import needsApproval from "./rules/approval.js"; +import PolicyEngine from "./engine.js"; +import { decide } from "./decision.js"; +import policiesRouter from "./router.js"; + +// Helper to construct Express Response mock +function mockResponse() { + const res: any = {}; + res.status = vi.fn().mockReturnValue(res); + res.json = vi.fn().mockReturnValue(res); + res.end = vi.fn().mockReturnValue(res); + return res as Response; +} + +// Extract express handler helpers +const getHandler = (path: string, method: string) => { + const layer = (policiesRouter as any).stack.find( + (l: any) => + l.route?.path === path && l.route?.methods?.[method.toLowerCase()], + ); + return layer?.route?.stack[0]?.handle; +}; + +describe("Policy Engine Rules & Orchestrator", () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + describe("Rule: isblocked", () => { + it("should return blocked=true if PolicyAction is DENY", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.DENY, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + + 
const res = await isblocked("test_tool"); + expect(res.success).toBe(true); + expect(res.result).toBe(true); + expect(res.reason).toBe("Forbidden policy"); + }); + + it("should return blocked=false if PolicyAction is not DENY", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.ALLOW, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await isblocked("test_tool"); + expect(res.success).toBe(true); + expect(res.result).toBe(false); + }); + + it("should fail gracefully and return success:false on db error", async () => { + vi.mocked(db.policy.findUnique).mockRejectedValue(new Error("DB error")); + const res = await isblocked("test_tool"); + expect(res.success).toBe(false); + expect(res.result).toBe(false); + expect(res.reason).toBe("Failed to query policy table"); + }); + }); + + describe("Rule: budgetExceeded", () => { + it("should return exceeded=false when total tokens are within budget", async () => { + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 100, + budget_limit: 1000, + createdAt: new Date(), + }); + + const res = await budgetExceeded("conv-1", 50); + expect(res.success).toBe(true); + expect(res.result).toBe(false); + }); + + it("should return exceeded=true when total tokens exceed budget", async () => { + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 950, + budget_limit: 1000, + createdAt: new Date(), + }); + + const res = await budgetExceeded("conv-1", 100); + expect(res.success).toBe(true); + expect(res.result).toBe(true); + expect(res.reason).toBe("Token budget exceeded"); + }); + + it("should return success:false if conversation is not found", async () => { + vi.mocked(db.conversation.findUnique).mockResolvedValue(null); + const res = await budgetExceeded("conv-missing", 10); + expect(res.success).toBe(false); + expect(res.result).toBe(false); + 
expect(res.reason).toBe("Conversation conv-missing not found"); + }); + + it("should return success:false if conversationId is missing or unknown", async () => { + const res = await budgetExceeded("unknown", 10); + expect(res.success).toBe(false); + expect(res.result).toBe(false); + expect(res.reason).toBe("Conversation context is missing or unknown"); + }); + }); + + describe("Rule: needsApproval", () => { + it("should return result=true if PolicyAction is APPROVAL", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.APPROVAL, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await needsApproval("test_tool"); + expect(res.success).toBe(true); + expect(res.result).toBe(true); + }); + + it("should return result=false if PolicyAction is not APPROVAL", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.ALLOW, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await needsApproval("test_tool"); + expect(res.success).toBe(true); + expect(res.result).toBe(false); + }); + }); + + describe("PolicyEngine Orchestrator", () => { + it("should allow tool execution when all checks pass", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.ALLOW, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 10, + budget_limit: 100, + createdAt: new Date(), + }); + + const res = await PolicyEngine( + { tool_name: "test_tool", arguments: {} }, + { conversationId: "conv-1", token: 10 }, + ); + + expect(res.allowed).toBe(true); + expect(res.requiresApproval).toBe(false); + }); + + it("should fail closed and return allowed:false if block check fails", async () => { + 
vi.mocked(db.policy.findUnique).mockRejectedValue(new Error("DB error")); + + const res = await PolicyEngine( + { tool_name: "test_tool", arguments: {} }, + { conversationId: "conv-1", token: 10 }, + ); + + expect(res.allowed).toBe(false); + expect(res.requiresApproval).toBe(false); + expect(res.reason).toBe("Failed to query policy table"); + }); + }); +}); + +describe("Decision Orchestration (decide)", () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + it("should return ALLOW when policy is allowed", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.ALLOW, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 10, + budget_limit: 100, + createdAt: new Date(), + }); + + const res = await decide( + { tool_name: "test_tool", arguments: {} }, + { conversationId: "conv-1", token: 5 }, + ); + + expect(res.decision).toBe("ALLOW"); + }); + + it("should create a new pending approval and return PENDING with generated ID if requiresApproval:true and no approvalId", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.APPROVAL, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 10, + budget_limit: 100, + createdAt: new Date(), + }); + vi.mocked(db.approval.create).mockResolvedValue({ + id: "generated-app-id", + tool_name: "test_tool", + arguments: {}, + status: ApprovalStatus.PENDING, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await decide( + { tool_name: "test_tool", arguments: {} }, + { conversationId: "conv-1", token: 5 }, + ); + + expect(res.decision).toBe("PENDING"); + expect(res.reason).toBe("generated-app-id"); + 
expect(db.approval.create).toHaveBeenCalledWith({ + data: { + tool_name: "test_tool", + arguments: {}, + status: ApprovalStatus.PENDING, + }, + }); + }); + + it("should fetch approval state and return ALLOW if status is APPROVED", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "test_tool", + action: PolicyAction.APPROVAL, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 10, + budget_limit: 100, + createdAt: new Date(), + }); + vi.mocked(db.approval.findUnique).mockResolvedValue({ + id: "app-id-123", + tool_name: "test_tool", + arguments: {}, + status: ApprovalStatus.APPROVED, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await decide( + { tool_name: "test_tool", arguments: {}, approvalId: "app-id-123" }, + { conversationId: "conv-1", token: 5 }, + ); + + expect(res.decision).toBe("ALLOW"); + expect(db.approval.delete).toHaveBeenCalledWith({ + where: { id: "app-id-123" }, + }); + }); + + it("should return DENY when retrieved approval tool_name does not match the requesting tool_name", async () => { + vi.mocked(db.policy.findUnique).mockResolvedValue({ + id: "1", + tool_name: "high_risk_tool", + action: PolicyAction.APPROVAL, + sandbox_path: null, + createdAt: new Date(), + updatedAt: new Date(), + }); + vi.mocked(db.conversation.findUnique).mockResolvedValue({ + id: "conv-1", + tokens_used: 10, + budget_limit: 100, + createdAt: new Date(), + }); + vi.mocked(db.approval.findUnique).mockResolvedValue({ + id: "app-id-123", + tool_name: "low_risk_tool", + arguments: {}, + status: ApprovalStatus.APPROVED, + createdAt: new Date(), + updatedAt: new Date(), + }); + + const res = await decide( + { tool_name: "high_risk_tool", arguments: {}, approvalId: "app-id-123" }, + { conversationId: "conv-1", token: 5 }, + ); + + expect(res.decision).toBe("DENY"); + expect(res.reason).toBe("Approval 
tool name mismatch"); + }); +}); + +describe("Policy Engine REST Endpoints", () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + describe("GET /policies", () => { + it("should return list of stored policies", async () => { + const getPolicies = getHandler("/policies", "GET"); + expect(getPolicies).toBeDefined(); + + vi.mocked(db.policy.findMany).mockResolvedValue([ + { tool_name: "tool1", action: PolicyAction.ALLOW }, + ] as any); + + const req = {} as Request; + const res = mockResponse(); + + await getPolicies(req, res, () => {}); + + expect(res.json).toHaveBeenCalledWith([ + { tool_name: "tool1", action: PolicyAction.ALLOW }, + ]); + }); + }); + + describe("GET /policies/:toolName", () => { + it("should return the stored policy if it exists", async () => { + const getSinglePolicy = getHandler("/policies/:toolName", "GET"); + expect(getSinglePolicy).toBeDefined(); + + vi.mocked(db.policy.findUnique).mockResolvedValue({ + tool_name: "tool1", + action: PolicyAction.DENY, + } as any); + + const req = { params: { toolName: "tool1" } } as any as Request; + const res = mockResponse(); + + await getSinglePolicy(req, res, () => {}); + + expect(res.json).toHaveBeenCalledWith({ + tool_name: "tool1", + action: PolicyAction.DENY, + }); + }); + + it("should return implicit APPROVAL if it does not exist in DB", async () => { + const getSinglePolicy = getHandler("/policies/:toolName", "GET"); + vi.mocked(db.policy.findUnique).mockResolvedValue(null); + + const req = { params: { toolName: "unknown_tool" } } as any as Request; + const res = mockResponse(); + + await getSinglePolicy(req, res, () => {}); + + expect(res.json).toHaveBeenCalledWith({ + tool_name: "unknown_tool", + action: "APPROVAL", + implicit: true, + }); + }); + }); + + describe("POST /policies", () => { + it("should return 409 if policy already exists", async () => { + const postPolicy = getHandler("/policies", "POST"); + vi.mocked(db.policy.findUnique).mockResolvedValue({ + tool_name: "tool1", + action: 
PolicyAction.ALLOW, + } as any); + + const req = { + body: { tool_name: "tool1", action: "DENY" }, + } as any as Request; + const res = mockResponse(); + + await postPolicy(req, res, () => {}); + + expect(res.status).toHaveBeenCalledWith(409); + expect(res.json).toHaveBeenCalledWith({ error: "Policy already exists" }); + }); + }); +}); diff --git a/apps/api/src/policy/router.ts b/apps/api/src/policy/router.ts new file mode 100644 index 0000000..a0b225c --- /dev/null +++ b/apps/api/src/policy/router.ts @@ -0,0 +1,189 @@ +import { Router, Request, Response } from "express"; +import { db, PolicyAction } from "@repo/db"; + +const router = Router(); + +// GET /policies +router.get("/policies", async (req: Request, res: Response): Promise<void> => { + try { + const policies = await db.policy.findMany({ + select: { + tool_name: true, + action: true, + }, + }); + res.json(policies); + } catch (error) { + res.status(500).json({ error: "Internal server error" }); + } +}); + +// GET /policies/:toolName +router.get( + "/policies/:toolName", + async (req: Request, res: Response): Promise<void> => { + const { toolName } = req.params; + if (!toolName || !toolName.trim()) { + res.status(400).json({ error: "Missing or invalid toolName parameter" }); + return; + } + const normalizedToolName = toolName.trim(); + try { + const policy = await db.policy.findUnique({ + where: { tool_name: normalizedToolName }, + select: { + tool_name: true, + action: true, + }, + }); + + if (!policy) { + res.json({ + tool_name: normalizedToolName, + action: "APPROVAL", + implicit: true, + }); + return; + } + + res.json(policy); + } catch (error) { + res.status(500).json({ error: "Internal server error" }); + } + }, +); + +// POST /policies +router.post("/policies", async (req: Request, res: Response): Promise<void> => { + const { tool_name, action } = req.body; + + if (!tool_name || typeof tool_name !== "string" || !tool_name.trim()) { + res.status(400).json({ error: "Missing or invalid tool_name" }); + return; + } + +
const normalizedToolName = tool_name.trim(); + + if ( + !action || + !Object.values(PolicyAction).includes(action as PolicyAction) + ) { + res.status(400).json({ + error: "Invalid action. Accepted values are ALLOW, APPROVAL, DENY", + }); + return; + } + + try { + const existing = await db.policy.findUnique({ + where: { tool_name: normalizedToolName }, + }); + + if (existing) { + res.status(409).json({ error: "Policy already exists" }); + return; + } + + const created = await db.policy.create({ + data: { + tool_name: normalizedToolName, + action: action as PolicyAction, + }, + select: { + tool_name: true, + action: true, + }, + }); + + res.status(201).json(created); + } catch (error: any) { + if (error.code === "P2002") { + res.status(409).json({ error: "Policy already exists" }); + return; + } + res.status(500).json({ error: "Internal server error" }); + } +}); + +// PATCH /policies/:toolName +router.patch( + "/policies/:toolName", + async (req: Request, res: Response): Promise<void> => { + const { toolName } = req.params; + if (!toolName || !toolName.trim()) { + res.status(400).json({ error: "Missing or invalid toolName parameter" }); + return; + } + const { action } = req.body; + const normalizedToolName = toolName.trim(); + + if ( + !action || + !Object.values(PolicyAction).includes(action as PolicyAction) + ) { + res.status(400).json({ + error: "Invalid action.
Accepted values are ALLOW, APPROVAL, DENY", + }); + return; + } + + try { + const existing = await db.policy.findUnique({ + where: { tool_name: normalizedToolName }, + }); + + if (!existing) { + res.status(404).json({ error: "Policy not found" }); + return; + } + + const updated = await db.policy.update({ + where: { tool_name: normalizedToolName }, + data: { + action: action as PolicyAction, + }, + select: { + tool_name: true, + action: true, + }, + }); + + res.json(updated); + } catch (error) { + res.status(500).json({ error: "Internal server error" }); + } + }, +); + +// DELETE /policies/:toolName +router.delete( + "/policies/:toolName", + async (req: Request, res: Response): Promise<void> => { + const { toolName } = req.params; + if (!toolName || !toolName.trim()) { + res.status(400).json({ error: "Missing or invalid toolName parameter" }); + return; + } + const normalizedToolName = toolName.trim(); + try { + const existing = await db.policy.findUnique({ + where: { tool_name: normalizedToolName }, + }); + + if (!existing) { + res.status(404).json({ error: "Policy not found" }); + return; + } + + await db.policy.delete({ + where: { tool_name: normalizedToolName }, + }); + + res.status(204).end(); + } catch (error) { + res.status(500).json({ error: "Internal server error" }); + } + }, +); + +export default router; diff --git a/apps/api/src/policy/rules/approval.ts b/apps/api/src/policy/rules/approval.ts new file mode 100644 index 0000000..12080e3 --- /dev/null +++ b/apps/api/src/policy/rules/approval.ts @@ -0,0 +1,39 @@ +import { db, PolicyAction } from "@repo/db"; +import type { RuleResult } from "../../../types.js"; +import { logger } from "../../../mcp/logger.js"; + +export default async function needsApproval( + tool_name: string, + preFetchedPolicy?: any, +): Promise<RuleResult<boolean>> { + try { + const policy = + preFetchedPolicy !== undefined + ?
preFetchedPolicy + : await db.policy.findUnique({ + where: { tool_name }, + }); + + // Implicit fallback: if no policy is registered or has invalid/missing action, default to APPROVAL + const action = + policy && Object.values(PolicyAction).includes(policy.action) + ? policy.action + : PolicyAction.APPROVAL; + + return { + success: true, + result: action === PolicyAction.APPROVAL, + }; + } catch (error: any) { + logger.error("Database query failed in needsApproval rule", { + tool_name, + error_message: error instanceof Error ? error.message : String(error), + }); + + return { + success: false, + result: false, + reason: "Failed to query policy table", + }; + } +} diff --git a/apps/api/src/policy/rules/block.ts b/apps/api/src/policy/rules/block.ts new file mode 100644 index 0000000..9b2c57a --- /dev/null +++ b/apps/api/src/policy/rules/block.ts @@ -0,0 +1,40 @@ +import { RuleResult } from "../../../types.js"; +import { db, PolicyAction } from "@repo/db"; +import { logger } from "../../../mcp/logger.js"; + +export default async function isblocked( + tool_name: string, + preFetchedPolicy?: any, +): Promise<RuleResult<boolean>> { + try { + const policy = + preFetchedPolicy !== undefined + ? preFetchedPolicy + : await db.policy.findUnique({ + where: { tool_name }, + }); + + if (policy?.action === PolicyAction.DENY) { + return { + success: true, + result: true, + reason: "Forbidden policy", + }; + } + return { + success: true, + result: false, + }; + } catch (error: any) { + logger.error("Database query failed in isblocked rule", { + tool_name, + error_message: error instanceof Error ?
error.message : String(error), + }); + + return { + success: false, + result: false, + reason: "Failed to query policy table", + }; + } +} diff --git a/apps/api/src/policy/rules/budget.ts b/apps/api/src/policy/rules/budget.ts new file mode 100644 index 0000000..956e1b7 --- /dev/null +++ b/apps/api/src/policy/rules/budget.ts @@ -0,0 +1,51 @@ +import { RuleResult } from "../../../types.js"; +import { db } from "@repo/db"; +import { logger } from "../../../mcp/logger.js"; + +export default async function budgetExceeded( + conversationId: string, + token: number, +): Promise<RuleResult<boolean>> { + try { + // Reject executions without valid conversation context to prevent bypassing budget limits + if (!conversationId || conversationId === "unknown") { + return { + success: false, + result: false, + reason: "Conversation context is missing or unknown", + }; + } + + const conversation = await db.conversation.findUnique({ + where: { id: conversationId }, + }); + + if (!conversation) { + return { + success: false, + result: false, + reason: `Conversation ${conversationId} not found`, + }; + } + + const isExceeded = + conversation.tokens_used + token > conversation.budget_limit; + + return { + success: true, + result: isExceeded, + reason: isExceeded ? "Token budget exceeded" : undefined, + }; + } catch (error: any) { + logger.error("Database query failed in budgetExceeded rule", { + conversation_id: conversationId, + token, + error_message: error instanceof Error ?
error.message : String(error), + }); + return { + success: false, + result: false, + reason: "Failed to query conversation table", + }; + } +} diff --git a/apps/api/types.ts b/apps/api/types.ts index 4d6e253..57585b7 100644 --- a/apps/api/types.ts +++ b/apps/api/types.ts @@ -29,4 +29,22 @@ export class AppError extends Error { } } } +import { ApprovalStatus } from "@repo/db"; +export { ApprovalStatus }; + +export interface ApprovalRequest { + tool_name: string; + arguments: Record; + status?: ApprovalStatus; + approvalId?: string; +} +export interface RuleResult<T> { + success: boolean; + result: T; + reason?: string; +} +export interface ConversationRequest { + conversationId: string; + token: number; +} diff --git a/packages/db/README.md b/packages/db/README.md new file mode 100644 index 0000000..b3d2a76 --- /dev/null +++ b/packages/db/README.md @@ -0,0 +1,172 @@ +# Database Package (`@repo/db`) + +This package manages the database access, schemas, migrations, and Prisma Client generation for the **Gatekeeper** system. + +It uses **Prisma ORM** with a **SQLite** database to store policy definitions, log tool executions, manage manual approvals, and track conversation budgets. + +--- + +## Schema Architecture Overview + +The database contains four core tables designed to support policy enforcement, logging, human-in-the-loop approvals, and budget limits. They are independent tables structured to support high-throughput lookups and auditability.
+ +```mermaid +erDiagram + Log { + String id PK "UUID" + String tool_name "Name of the invoked tool" + Decision decision "Enforced decision (ALLOW | DENY | PENDING | FAILED)" + String reason "Nullable reason for decision" + DateTime createdAt "Timestamp of the execution" + } + + Approval { + String id PK "UUID" + String tool_name "Name of the tool awaiting approval" + Json arguments "JSON arguments passed to the tool" + ApprovalStatus status "Approval state (PENDING | APPROVED | REJECTED)" + DateTime createdAt "Timestamp of creation" + DateTime updatedAt "Timestamp of last update" + } + + Policy { + String id PK "UUID" + String tool_name UK "Unique tool identifier" + PolicyAction action "Policy decision (ALLOW | APPROVAL | DENY)" + String sandbox_path "Nullable restricted access path" + DateTime createdAt "Timestamp of creation" + DateTime updatedAt "Timestamp of last update" + } + + Conversation { + String id PK "UUID" + Int tokens_used "Total tokens used in conversation" + Int budget_limit "Upper limit for tokens" + DateTime createdAt "Timestamp of creation" + } +``` + +--- + +## Detailed Model Definitions + +### 1. `Policy` + +Defines the governing rule for a specific Model Context Protocol (MCP) tool. The policy engine queries this table during tool invocation to determine whether to execute, block, or request human approval. + +| Field | Type | Attributes | Description | +| :------------- | :------------- | :------------------------ | :------------------------------------------------------------------- | +| `id` | `String` | `@id`, `@default(uuid())` | Primary key (UUID). | +| `tool_name` | `String` | `@unique` | The target tool name (e.g., `file-manager-mcp/write_file`). | +| `action` | `PolicyAction` | Enforced Enum | The policy action applied when this tool is requested. | +| `sandbox_path` | `String?` | Optional | Directory path constraint if the tool requires filesystem isolation. | +| `createdAt` | `DateTime` | `@default(now())` | Creation timestamp. 
| +| `updatedAt` | `DateTime` | `@updatedAt` | Auto-updated modification timestamp. | + +#### Associated Enum: `PolicyAction` + +Controls execution behavior: + +- `ALLOW`: Execute the tool immediately without manual user intervention. +- `APPROVAL`: Suspend execution and queue a human-in-the-loop approval request. +- `DENY`: Block tool execution outright. + +--- + +### 2. `Approval` + +Manages the state of asynchronous human-in-the-loop confirmation requests for tools configured with the `APPROVAL` action. + +| Field | Type | Attributes | Description | +| :---------- | :--------------- | :------------------------ | :---------------------------------------------------- | +| `id` | `String` | `@id`, `@default(uuid())` | Primary key (UUID). | +| `tool_name` | `String` | - | The name of the tool requesting approval. | +| `arguments` | `Json` | - | The structured parameters / inputs sent by the model. | +| `status` | `ApprovalStatus` | Enforced Enum | The current resolution state of the request. | +| `createdAt` | `DateTime` | `@default(now())` | Creation timestamp. | +| `updatedAt` | `DateTime` | `@updatedAt` | Resolution/modification timestamp. | + +#### Associated Enum: `ApprovalStatus` + +- `PENDING`: Waiting for user response (Accept / Deny). +- `APPROVED`: Confirmed by user; tool will proceed to run. +- `REJECTED`: Denied by user; execution aborted. + +--- + +### 3. `Log` + +Acts as an audit trail, keeping records of every decision made by the policy engine and execution outcomes. + +| Field | Type | Attributes | Description | +| :---------- | :--------- | :------------------------ | :--------------------------------------------------------------- | +| `id` | `String` | `@id`, `@default(uuid())` | Primary key (UUID). | +| `tool_name` | `String` | - | Name of the tool evaluated/executed. | +| `decision` | `Decision` | Enforced Enum | The result of the policy engine evaluation. 
| +| `reason` | `String?` | Optional | Contextual details (e.g., reason for denial, validation errors). | +| `createdAt` | `DateTime` | `@default(now())` | Execution timestamp. | + +#### Associated Enum: `Decision` + +- `ALLOW`: Tool was allowed and executed. +- `DENY`: Tool execution was blocked. +- `PENDING`: Tool execution is paused, waiting for user approval. +- `FAILED`: Execution failed due to a system error or timeout. + +--- + +### 4. `Conversation` + +Tracks API usage tokens and enforces token budgets to prevent runaway loops or excessive resource spend. + +| Field | Type | Attributes | Description | +| :------------- | :--------- | :------------------------ | :------------------------------------------------------- | +| `id` | `String` | `@id`, `@default(uuid())` | Primary key (UUID/Custom conversation identifier). | +| `tokens_used` | `Int` | `@default(0)` | Running counter of tokens consumed by the session. | +| `budget_limit` | `Int` | - | Upper limit of allowed token spend for the conversation. | +| `createdAt` | `DateTime` | `@default(now())` | Session creation timestamp. | + +--- + +## Database Configuration + +The database configuration is managed inside [schema.prisma](./prisma/schema.prisma): + +- **Provider**: SQLite (`sqlite`) +- **Connection URL**: `file:./dev.db` (local development SQLite file inside the `prisma` directory) + +--- + +## Common Workflows & Commands + +To manage and inspect the database, run the following commands from the project root or the package folder: + +### 1. Build and Generate Client + +Generate the type-safe Prisma Client package: + +```sh +npm run build -- --filter=@repo/db +``` + +_Or directly within `/packages/db`:_ + +```sh +npx prisma generate +``` + +### 2. Run Database Migrations + +Apply any schema changes to the local SQLite database: + +```sh +npx prisma migrate dev --name +``` + +### 3. 
Database Inspection (Prisma Studio) + +Open a visual database explorer locally: + +```sh +npx prisma studio +``` diff --git a/skills/error-handling.md b/skills/error-handling.md index 49f96cf..05dbefa 100644 --- a/skills/error-handling.md +++ b/skills/error-handling.md @@ -11,10 +11,12 @@ This skill establishes the standards for error handling across the Gatekeeper co ## Core Principles ### 1. Never Match Error Messages Against Substrings + - **Vulnerable**: Matching `errMsg.includes("Tool not found")` can be broken if a user inputs that substring inside parameters (such as `toolName` or `decision`). - **Secure**: Use typed and structured error classes that carry meta-properties (e.g. `statusCode`) and check their types or properties directly. ### 2. Use Structured Error Classes + - Throw instances of `AppError` for expected client validation, parameter mismatches, or policy rejections: - `400 Bad Request` for parameter validation (e.g. missing inputs, invalid types). - `403 Forbidden` for policy decision rejections (e.g. non-ALLOW decisions). @@ -22,9 +24,11 @@ This skill establishes the standards for error handling across the Gatekeeper co - Use standard `Error` classes for internal system failures. ### 3. Prevent Information Leakage (CWE-209) + - Do not return detailed internal exceptions, database stack traces, or process errors directly in response bodies for HTTP `500` status codes. - Mask unexpected errors with a generic message: `"Failed to execute tool"`. ### 4. Input Sanitization + - Enforce strict bounds validation (e.g. ranges for timers/timeouts) and apply comparisons against hardcoded constant boundaries rather than trusting client parameters. - Verify types and coerce inputs safely before doing logic checks.
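
The first three principles above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the `AppError` class here is a local stand-in whose constructor signature is assumed, and `toErrorResponse` and `ErrorResponse` are hypothetical names introduced only for this example.

```typescript
// Local stand-in for the gateway's AppError; the constructor signature is an assumption.
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.name = "AppError";
    this.statusCode = statusCode;
    // Keeps `instanceof` reliable even when compiling to older targets.
    Object.setPrototypeOf(this, AppError.prototype);
  }
}

// Hypothetical response shape used only in this sketch.
interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Maps a thrown value to a response by checking its type and properties,
// never by matching substrings of its message.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    // Expected client-facing error: safe to surface its message and status.
    return { status: err.statusCode, body: { error: err.message } };
  }
  // Unexpected internal failure: mask the details (CWE-209).
  return { status: 500, body: { error: "Failed to execute tool" } };
}
```

Because the branch depends on `instanceof` rather than on message text, a plain `Error` whose message happens to contain `"Tool not found"` still yields the generic `500` body.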