Draft
25 commits
9eb16a3
feat(agents-runtime): Sandbox primitive + tool refactor (PR 6a)
msfstef May 19, 2026
d9a5f31
feat(agents-runtime): nativeSandbox via @anthropic-ai/sandbox-runtime…
msfstef May 19, 2026
920b615
feat(agents-runtime): remoteSandbox + E2B adapter (PR 6c)
msfstef May 19, 2026
95b8f3b
feat(agents): chooseDefaultSandbox + Horton/Worker default to native …
msfstef May 19, 2026
e82a8e7
test(agents-runtime): cross-provider conformance + nativeSandbox OS n…
msfstef May 20, 2026
299fada
chore: update pnpm-lock.yaml for e2b dependency
msfstef May 20, 2026
7a49250
chore: rebase fixups against main
msfstef May 20, 2026
094b5e9
feat(agents-runtime): nativeSandbox.fetch routes through the library …
msfstef May 20, 2026
94633a3
fix(agents-runtime): handle missing native sandbox deps + process-tre…
msfstef May 20, 2026
17de008
fix(agents-runtime): gate sandbox-native.test outer describe on platform
msfstef May 20, 2026
f2b4c18
feat(agents-runtime): extend Sandbox with readdir/exists/remove/stat …
msfstef May 20, 2026
39af78b
fix(agents-runtime): sandbox exists() = false-on-denied + aborted flag
msfstef May 20, 2026
38f5c14
feat(agents-runtime): sandbox getUrl + updateNetworkPolicy (NetworkPo…
msfstef May 20, 2026
bf23e2d
feat(agents-runtime): dockerSandbox adapter (dockerode-based, hardened)
msfstef May 20, 2026
271e292
test(agents-runtime): conformance + KNOWN_ADAPTERS enforcement for do…
msfstef May 20, 2026
13025d8
fix(agents-runtime): docker read-policy, proxy SSRF guards, native al…
msfstef May 20, 2026
ceee2cd
refactor(agents-runtime): remove nativeSandbox adapter
msfstef May 20, 2026
7ea5b52
Remove plans
msfstef May 20, 2026
2468714
feat(agents-runtime): per-wake-session sandbox lifecycle
msfstef May 21, 2026
6babecf
refactor(agents-runtime): move docker exports to /sandbox/docker subpath
msfstef May 21, 2026
90dd2c8
feat(agents): sandbox profile picker + per-runner advertisement
msfstef May 21, 2026
d982b21
chore: reconcile pnpm-lock after rebase onto main
msfstef May 25, 2026
91b0613
docs(agents-runtime): drop references to removed sandbox-design.md
msfstef May 25, 2026
06aedc5
feat(agents): shared sandboxes for collaborating entities
msfstef May 25, 2026
ac1690a
feat(agents-desktop): offer the docker sandbox profile
msfstef May 25, 2026
13 changes: 13 additions & 0 deletions .changeset/agents-runtime-sandbox-primitive.md
@@ -0,0 +1,13 @@
---
'@electric-ax/agents-runtime': minor
'@electric-ax/agents': minor
'@electric-ax/agents-server-conformance-tests': patch
---

Adds the `Sandbox` primitive (`@electric-ax/agents-runtime/sandbox`) for isolating LLM-driven tool calls. Three providers ship: `unrestrictedSandbox()` (explicit pass-through), `remoteSandbox({provider: 'e2b'})` (E2B as an optional peer dep), and `dockerSandbox()` (container isolation via `dockerode` as an optional peer dep).

Built-in entities (Horton, Worker) default to `unrestrictedSandbox` via the new `chooseDefaultSandbox(workingDirectory)` helper. Stronger isolation is opt-in by constructing `dockerSandbox` or `remoteSandbox` directly — `dockerSandbox` is the recommended path for multi-entity hosting.

Behavior changes folded in: bash no longer forwards `process.env` to child processes (closing the `$ANTHROPIC_API_KEY` exfiltration path), tool descriptions are corrected, and read/write/edit reject symlink escapes from the workspace.

Runtimes advertise named **sandbox profiles** (e.g. `local`, `docker`) to the agents-server; spawn requests pick a profile by name, the server validates the choice against the target runner's advertised set, and the new-session UI surfaces a picker. `createFetchUrlTool` and the other tool factories now require a `Sandbox` parameter.
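
The profile check described above could look roughly like the following. This is a minimal sketch, not the server's actual code: `SandboxProfileDescriptor`, `ProfileCheck`, and `checkRequestedProfile` are hypothetical names introduced for illustration, and the descriptor fields are assumed to match the `name`/`label`/`description` shape the runtime advertises.

```typescript
// Hypothetical wire shape: the descriptive fields a runner advertises.
interface SandboxProfileDescriptor {
  name: string
  label: string
  description?: string
}

type ProfileCheck =
  | { ok: true; profile: string | undefined }
  | { ok: false; reason: string }

// Hypothetical helper (not part of this PR): check a spawn request's
// profile name against the set the target runner advertised.
function checkRequestedProfile(
  requested: string | undefined,
  advertised: ReadonlyArray<SandboxProfileDescriptor>
): ProfileCheck {
  // No profile requested: leave the choice to the runtime's fallback.
  if (requested === undefined) return { ok: true, profile: undefined }
  if (advertised.some((p) => p.name === requested)) {
    return { ok: true, profile: requested }
  }
  const names = advertised.map((p) => p.name).join(', ') || '(none)'
  return {
    ok: false,
    reason: `unknown sandbox profile "${requested}"; advertised: ${names}`,
  }
}

// Example advertisement with the two profile names mentioned above.
const advertised: Array<SandboxProfileDescriptor> = [
  { name: 'local', label: 'Local' },
  { name: 'docker', label: 'Docker', description: 'Container isolation' },
]
```

Returning a result object rather than throwing keeps the decision about the HTTP status or error type with the caller; that is a design choice of this sketch, not something the changeset specifies.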
1 change: 1 addition & 0 deletions packages/agents-desktop/package.json
@@ -29,6 +29,7 @@
"@electric-sql/client": "^1.5.18",
"@mixmark-io/domino": "^2.2.0",
"better-sqlite3": "^12.9.0",
"dockerode": "^5.0.0",
"fix-path": "^4.0.0",
"jsdom": "^28.1.0",
"pino": "^10.3.1",
6 changes: 6 additions & 0 deletions packages/agents-desktop/vite.config.ts
@@ -17,6 +17,12 @@ const MUST_EXTERNALIZE = new Set([
`jsdom`,
`pino`,
`pino-pretty`,
// `inlineDynamicImports` would inline the lazy `dockerode` import (and its
// native `ssh2`/`cpu-features` deps), which rollup can't bundle. Externalize
// the chain: it's an optional runtime dep, gracefully absent otherwise.
`dockerode`,
`ssh2`,
`cpu-features`,
])

function externalizeBareImports(
31 changes: 31 additions & 0 deletions packages/agents-runtime/package.json
@@ -64,10 +64,32 @@
"default": "./dist/tools.cjs"
}
},
"./sandbox": {
"import": {
"types": "./dist/sandbox.d.ts",
"default": "./dist/sandbox.js"
},
"require": {
"types": "./dist/sandbox.d.cts",
"default": "./dist/sandbox.cjs"
}
},
"./sandbox/docker": {
"import": {
"types": "./dist/sandbox-docker.d.ts",
"default": "./dist/sandbox-docker.js"
},
"require": {
"types": "./dist/sandbox-docker.d.cts",
"default": "./dist/sandbox-docker.cjs"
}
},
"./package.json": "./package.json"
},
"peerDependencies": {
"@tanstack/react-db": ">=0.1.78",
"dockerode": ">=4.0.0",
"e2b": ">=2.0.0",
"react": ">=18"
},
"peerDependenciesMeta": {
@@ -76,6 +98,12 @@
},
"@tanstack/react-db": {
"optional": true
},
"dockerode": {
"optional": true
},
"e2b": {
"optional": true
}
},
"dependencies": {
@@ -96,15 +124,18 @@
"pino-pretty": "^13.0.0",
"turndown": "^7.2.2",
"turndown-plugin-gfm": "^1.0.2",
"undici": "6.25.0",
"zod": "^4.3.6",
"zod-to-json-schema": "^3.25.2"
},
"devDependencies": {
"@durable-streams/server": "https://pkg.pr.new/durable-streams/durable-streams/@durable-streams/server@eac712f",
"@types/dockerode": "^4.0.1",
"@types/jsdom": "^27.0.0",
"@types/node": "^22.19.15",
"@types/turndown": "^5.0.6",
"@vitest/coverage-v8": "^3.2.4",
"dockerode": "^5.0.0",
"tsdown": "^0.9.0",
"typescript": "^5.7.0",
"vitest": "^3.2.4"
3 changes: 3 additions & 0 deletions packages/agents-runtime/src/context-factory.ts
@@ -15,6 +15,7 @@ import { createContextTools } from './tools/context-tools'
import { CACHE_TIERS } from './types'
import { composeToolsWithProviders } from './tool-providers'
import type { ChangeEvent } from '@durable-streams/state'
import type { Sandbox } from './sandbox/types'
import type {
AgentConfig,
AgentHandle,
@@ -63,6 +64,7 @@ export interface HandlerContextConfig<TState extends StateProxy = StateProxy> {
state: TState
actions: Record<string, (...args: Array<unknown>) => unknown>
electricTools: Array<AgentTool>
sandbox: Sandbox
events: Array<ChangeEvent>
writeEvent: (event: ChangeEvent) => void
wakeSession: WakeSession
@@ -526,6 +528,7 @@ export function createHandlerContext<TState extends StateProxy = StateProxy>(
actions: config.actions,
electricTools: config.electricTools,
signal: config.runSignal ?? new AbortController().signal,
sandbox: config.sandbox,
useAgent(cfg) {
agentConfig = cfg
return agent
56 changes: 52 additions & 4 deletions packages/agents-runtime/src/create-handler.ts
@@ -11,6 +11,7 @@ import { passthrough } from './entity-schema'
import { runtimeLog } from './log'
import { appendPathToUrl } from './url'
import { verifyWebhookSignature } from './webhook-signature'
import type { SandboxProfile } from './sandbox/types'
import type { EntityRegistry } from './define-entity'
import type { IncomingMessage, ServerResponse } from 'node:http'
import type { WebhookSignatureVerifierConfig } from './webhook-signature'
@@ -93,6 +94,15 @@ export interface RuntimeRouterConfig {
onWakeError?: (error: Error) => boolean | void
/** Max number of concurrent entity-type registrations (default: 8). */
registrationConcurrency?: number
/**
* Sandbox profiles registered by this runtime. Each profile is a
* `(name, label, description?, factory)` tuple — the factory stays
* local to the runtime; only the descriptive fields are advertised
* to the agents-server (via the runner registration) and surfaced
* in the UI picker. Spawn payloads pass `sandbox.profile` and the
* server validates against the target runner's advertised set.
*/
sandboxProfiles?: ReadonlyArray<SandboxProfile>
/**
* Public URL of this runtime, forwarded to the agents-server so it can be
* included in GET /api/runtimes. If omitted the runtime is registered but
@@ -149,6 +159,18 @@ export interface RuntimeRouter {
/** Names of all registered entity types */
readonly typeNames: Array<string>

/**
* Wire-shape descriptors for sandbox profiles registered on this
* runtime. Used by the runner registration to advertise the profile
* set to the agents-server (factory closures are intentionally not
* included).
*/
readonly sandboxProfileDescriptors: Array<{
name: string
label: string
description?: string
}>

/** Register all entity types with the durable streams server */
registerTypes: () => Promise<void>
}
@@ -191,17 +213,31 @@ export function createRuntimeRouter(
webhookSignature,
} = normalized

const getRegisteredType = (name: string) =>
registry ? registry.get(name) : getEntityType(name)
const getRegisteredTypes = () =>
registry ? registry.list() : listEntityTypes()

// Index the runtime's profiles by name. Duplicate names are a
// configuration bug — fail fast rather than silently dropping one.
const sandboxProfiles = new Map<string, SandboxProfile>()
for (const profile of config.sandboxProfiles ?? []) {
if (sandboxProfiles.has(profile.name)) {
throw new Error(
`[agent-runtime] duplicate sandbox profile name "${profile.name}" registered on createRuntimeRouter`
)
}
sandboxProfiles.set(profile.name, profile)
}

const wakeConfig: ProcessWakeConfig = {
baseUrl,
registry,
createElectricTools,
idleTimeout,
heartbeatInterval,
sandboxProfiles,
}
const getRegisteredType = (name: string) =>
registry ? registry.get(name) : getEntityType(name)
const getRegisteredTypes = () =>
registry ? registry.list() : listEntityTypes()
const debugRegistrationTiming =
process.env.ELECTRIC_AGENTS_DEBUG_REGISTRATION_TIMING === `1`
const pendingWakes = new Set<Promise<void>>()
@@ -536,6 +572,16 @@ export function createRuntimeRouter(
}
}

const sandboxProfileDescriptors = [...sandboxProfiles.values()].map(
(profile) => ({
name: profile.name,
label: profile.label,
...(profile.description !== undefined && {
description: profile.description,
}),
})
)

return {
handleRequest,
handleWebhookRequest,
@@ -548,6 +594,7 @@ export function createRuntimeRouter(
get typeNames() {
return getRegisteredTypes().map((entry) => entry.name)
},
sandboxProfileDescriptors,
registerTypes,
}
}
@@ -595,6 +642,7 @@ export function createRuntimeHandler(
get typeNames() {
return router.typeNames
},
sandboxProfileDescriptors: router.sandboxProfileDescriptors,
registerTypes: router.registerTypes,
}
}
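
The profile handling in `createRuntimeRouter` above (index by name, fail fast on duplicates, advertise only the descriptive fields) can be sketched in isolation as follows. The `SandboxProfile` interface here is a simplified stand-in: the real type lives in `./sandbox/types` and its factory signature is not shown in this hunk, so the one below is an assumption. `indexProfiles` and `toDescriptors` are hypothetical helper names.

```typescript
// Simplified stand-in for the PR's SandboxProfile. The factory's real
// signature is not visible in this diff, so this one is an assumption.
interface SandboxProfile {
  name: string
  label: string
  description?: string
  factory: (...args: Array<unknown>) => unknown
}

// Index profiles by name; a duplicate name is treated as a config bug.
function indexProfiles(
  profiles: ReadonlyArray<SandboxProfile>
): Map<string, SandboxProfile> {
  const byName = new Map<string, SandboxProfile>()
  for (const profile of profiles) {
    if (byName.has(profile.name)) {
      throw new Error(`duplicate sandbox profile name "${profile.name}"`)
    }
    byName.set(profile.name, profile)
  }
  return byName
}

// Build the wire-shape descriptors: the factory closure is left out,
// and `description` is only present when it was actually provided.
function toDescriptors(byName: Map<string, SandboxProfile>) {
  return Array.from(byName.values()).map((profile) => ({
    name: profile.name,
    label: profile.label,
    ...(profile.description !== undefined && {
      description: profile.description,
    }),
  }))
}

const profiles = indexProfiles([
  { name: 'local', label: 'Local', factory: () => ({}) },
  {
    name: 'docker',
    label: 'Docker',
    description: 'Container',
    factory: () => ({}),
  },
])
const descriptors = toDescriptors(profiles)
```

Because a `Map` preserves insertion order, the descriptors come out in registration order, which is presumably the order a picker would want to show them in.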
55 changes: 55 additions & 0 deletions packages/agents-runtime/src/process-wake.ts
@@ -9,7 +9,10 @@ import { createHandlerContext } from './context-factory'
import { createSetupContext } from './setup-context'
import { createEntityLogPrefix, runtimeLog } from './log'
import { createRuntimeServerClient } from './runtime-server-client'
import { unrestrictedSandbox } from './sandbox/unrestricted'
import { appendPathToUrl } from './url'
import { SandboxError } from './sandbox/types'
import type { Sandbox } from './sandbox/types'
import type {
CronObservationSource,
EntitiesObservationSource,
@@ -26,6 +29,7 @@ import type {
ProcessWakeConfig,
SendResult,
SharedStateSchemaMap,
SpawnSandboxOption,
Wake,
WakeEvent,
WakeMessage,
@@ -461,6 +465,10 @@ export async function processWake(
let finalError: Error | AggregateError | null = null
let shutdownRequested = shutdownSignal?.aborted ?? false
let ackCurrentWakeOnFailure = false
// Sandbox is acquired once per wake-session (after entityArgs is known)
// and released/disposed in the outer finally. Lives at function scope so
// both the try and finally can see it.
let sandbox: Sandbox | null = null

// Live event handler — wired after preload, processes child_status + inbox
let idleTimer: ReturnType<typeof setTimeout> | null = null
@@ -1129,6 +1137,35 @@ export async function processWake(

const entityArgs = Object.freeze(notification.entity?.spawnArgs ?? {})

// Sandbox is a per-runner concern: profiles live on the runner's
// advertisement (validated server-side at spawn time). The
// wake-time job is just to look up the chosen profile by name.
// When no profile was picked at spawn we fall back to an
// in-process unrestricted sandbox at the host's cwd — matches the
// pre-profiles default and keeps tests/dev simple.
const requestedProfileName = notification.entity?.sandbox?.profile
if (requestedProfileName) {
const profile = config.sandboxProfiles?.get(requestedProfileName)
if (!profile) {
throw new SandboxError(
`unavailable`,
`[agent-runtime] sandbox profile "${requestedProfileName}" requested for entity "${entityUrl}" is not registered on this runtime. Available profiles: ${[...(config.sandboxProfiles?.keys() ?? [])].join(`, `) || `(none)`}.`
)
}
// A shared key (resolved server-side) lets several entities share one
// sandbox; absent it, the identity is the entity's own URL.
const sharedKey = notification.entity?.sandbox?.key
sandbox = await profile.factory({
sandboxKey: sharedKey ?? entityUrl,
shared: sharedKey != null,
entityUrl,
entityType: typeName,
args: entityArgs,
})
} else {
sandbox = await unrestrictedSandbox({ workingDirectory: process.cwd() })
}

// ---- Send executor — ctx.send() calls this directly (no queue) ----
const executeSend = (send: {
targetUrl: string
@@ -1175,6 +1212,7 @@ export async function processWake(
initialMessage?: unknown
wake?: Wake
tags?: Record<string, string>
sandbox?: SpawnSandboxOption
}
): Promise<{ entityUrl: string; streamPath: string }> => {
const wakeOpt = opts?.wake
@@ -1200,13 +1238,19 @@ export async function processWake(
: undefined,
}
: undefined
// `'inherit'` is sugar for reusing the parent's shared sandbox.
const sandbox =
opts?.sandbox === `inherit`
? { inherit: true as const }
: opts?.sandbox
return serverClient.spawnEntity({
type: childType,
id: childId,
args: spawnArgs,
parentUrl,
initialMessage: opts?.initialMessage,
tags: opts?.tags,
sandbox,
wake: wakeOpt,
})
},
@@ -1840,6 +1884,10 @@ export async function processWake(
events: currentWakeEvents,
actions: setupCtx.actions,
electricTools,
// Non-null at this point: the sandbox was acquired earlier in
// this try block (after entityArgs); TS narrowing doesn't survive
// the surrounding for-loop, so assert.
sandbox: sandbox!,
writeEvent,
wakeSession,
wakeEvent: currentWakeEvent,
@@ -2083,6 +2131,13 @@ export async function processWake(
}
}
db.close()
if (sandbox) {
try {
await sandbox.dispose()
} catch (err) {
cleanupErrors.push(toError(err))
}
}
if (claimedWake) {
log.info(
doneOffset === `-1`
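
The wake-time sandbox selection in `processWake` above reduces to a small decision: no profile requested means the unrestricted fallback, an unknown profile is an error, and otherwise the factory receives either the shared key or the entity's own URL as its identity. A minimal sketch of that decision as a pure function follows. `FactoryArgs`, `Profile`, `Resolution`, and `resolveSandbox` are hypothetical names, the shapes are simplified, and a plain `Error` stands in for the PR's `SandboxError`.

```typescript
// Hypothetical, simplified shapes; the PR's real types live in ./sandbox/types.
interface FactoryArgs {
  sandboxKey: string
  shared: boolean
  entityUrl: string
}
interface Profile {
  name: string
}
type Resolution =
  | { kind: 'profile'; profile: Profile; factoryArgs: FactoryArgs }
  | { kind: 'unrestricted' }

function resolveSandbox(
  entityUrl: string,
  selection: { profile?: string; key?: string } | undefined,
  profiles: ReadonlyMap<string, Profile>
): Resolution {
  const requested = selection?.profile
  // No profile picked at spawn: fall back to the unrestricted sandbox.
  if (!requested) return { kind: 'unrestricted' }

  const profile = profiles.get(requested)
  if (!profile) {
    const available = Array.from(profiles.keys()).join(', ') || '(none)'
    // The PR throws SandboxError('unavailable', ...); a plain Error is used here.
    throw new Error(
      `sandbox profile "${requested}" is not registered. Available profiles: ${available}.`
    )
  }

  // A shared key lets several entities use one sandbox; without it the
  // sandbox identity is the entity's own URL.
  const sharedKey = selection?.key
  return {
    kind: 'profile',
    profile,
    factoryArgs: {
      sandboxKey: sharedKey ?? entityUrl,
      shared: sharedKey != null,
      entityUrl,
    },
  }
}

const registered = new Map<string, Profile>([['docker', { name: 'docker' }]])
```

Keeping the decision separate from the `await profile.factory(...)` call makes the three branches easy to test without constructing a real sandbox.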
4 changes: 4 additions & 0 deletions packages/agents-runtime/src/runtime-server-client.ts
@@ -38,6 +38,8 @@ export interface SpawnEntityOptions {
parentUrl?: string
initialMessage?: unknown
tags?: Record<string, string>
  /** Sandbox selection: a `profile` name, a shared `key`, and/or `inherit` to reuse the parent's sandbox. */
sandbox?: { profile?: string; key?: string; inherit?: boolean }
dispatch_policy?: DispatchPolicy
wake?: {
subscriberUrl: string
@@ -287,6 +289,7 @@ export function createRuntimeServerClient(
parentUrl,
initialMessage,
tags,
sandbox,
dispatch_policy,
wake,
}: SpawnEntityOptions): Promise<RuntimeEntityInfo> => {
@@ -295,6 +298,7 @@ export function createRuntimeServerClient(
if (parentUrl !== undefined) body.parent = parentUrl
if (initialMessage !== undefined) body.initialMessage = initialMessage
if (tags && Object.keys(tags).length > 0) body.tags = tags
if (sandbox !== undefined) body.sandbox = sandbox
if (dispatch_policy !== undefined) body.dispatch_policy = dispatch_policy
if (wake !== undefined) body.wake = wake
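
Taken together with the `'inherit'` sugar in `process-wake.ts`, the sandbox option travels from `ctx.spawn` to the request body roughly as sketched below. `WireSandbox`, `normalizeSandbox`, and `buildSpawnBody` are hypothetical names, the `SpawnSandboxOption` union is an assumption inferred from how the option is used in the diff, and the body is reduced to `args` plus `sandbox` for brevity.

```typescript
// Wire shape as declared on SpawnEntityOptions in this diff.
type WireSandbox = { profile?: string; key?: string; inherit?: boolean }

// Assumed shape of the caller-facing option: the 'inherit' shorthand
// or the wire object itself.
type SpawnSandboxOption = 'inherit' | WireSandbox

// Expand the 'inherit' shorthand; pass everything else through untouched.
function normalizeSandbox(
  opt: SpawnSandboxOption | undefined
): WireSandbox | undefined {
  return opt === 'inherit' ? { inherit: true } : opt
}

// Only attach `sandbox` to the body when the caller supplied one, so the
// server can tell "not specified" apart from an explicit selection.
function buildSpawnBody(
  args: Record<string, unknown>,
  sandbox?: SpawnSandboxOption
): Record<string, unknown> {
  const wire = normalizeSandbox(sandbox)
  return wire === undefined ? { args } : { args, sandbox: wire }
}
```

For example, `buildSpawnBody({}, 'inherit')` yields a body whose `sandbox` field is `{ inherit: true }`, while omitting the option leaves the field out entirely.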

16 changes: 16 additions & 0 deletions packages/agents-runtime/src/sandbox-docker.ts
@@ -0,0 +1,16 @@
/**
* Docker sandbox provider as a separate subpath export so callers that
* only need the in-process `unrestrictedSandbox` (e.g. desktop renderers
* bundled by Vite) don't pull `dockerode` and its native dependencies
* (`cpufeatures.node`, etc.) into their bundle. Import from
* `@electric-ax/agents-runtime/sandbox/docker` only when actually using
* the docker provider.
*/

export {
dockerSandbox,
reapIdleDockerSandboxes,
__resetPersistentRegistryForTests,
} from './sandbox/docker'
export type { DockerSandboxOpts } from './sandbox/docker'
export { isDockerAvailable } from './sandbox/docker/loader'