1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -45,6 +45,7 @@ jobs:
python examples/basic_cli.py
python examples/billing_demo.py
python examples/http_driver_demo.py
python examples/tutorial.py

conformance_stub:
name: "Weaver Spec Conformance Stub (v0.1.0)"
18 changes: 18 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Added
- "Secure your first MCP tool in 5 minutes" tutorial: new
[`docs/tutorial.md`](docs/tutorial.md) walks a new reader from install to a
working invocation, covering registration, principals, grants, the three
LLM-safe response modes (`summary` / `table` / `handle_only`), handle
expansion, policy denial with stable `reason_code`, and `explain()`
audit. The admin-only `raw` mode is not covered by the
walkthrough. Companion runnable example
[`examples/tutorial.py`](examples/tutorial.py) uses `InMemoryDriver`
(offline, zero external deps) and is exercised by `make example` and CI;
it now `assert`s that no PII field leaks into the LLM-safe Frame so a
firewall regression fails the build. (#46)
- README "How this relates to neighboring projects" section: a neutral
boundaries table covering `AgentFence` (external CLI/proxy gate),
`contextweaver` (context compilation library), `ChainWeaver`
(deterministic flow orchestrator), and `weaver-spec` (specification +
conformance suite), plus a "When *not* to use this" callout. (#71)

## [0.7.0] - 2026-05-20

### Added
1 change: 1 addition & 0 deletions Makefile
@@ -16,5 +16,6 @@ example:
	python examples/basic_cli.py
	python examples/billing_demo.py
	python examples/http_driver_demo.py
	python examples/tutorial.py

ci: fmt lint type test example
41 changes: 41 additions & 0 deletions README.md
@@ -43,6 +43,8 @@ pip install weaver-kernel

> **Note:** The PyPI package is `weaver-kernel` (Weaver ecosystem), but the Python import remains `agent_kernel`.

> **New here?** [docs/tutorial.md](docs/tutorial.md) walks through register → grant → invoke → expand → explain in five minutes.

```python
import asyncio, os
os.environ["AGENT_KERNEL_SECRET"] = "my-secret"
# ... (unchanged lines omitted between diff hunks)
```
@@ -110,6 +112,45 @@ asyncio.run(main())

`agent-kernel` sits **above** `contextweaver` (context compilation) and **above** raw tool execution. It provides the authorization, execution, and audit layer.

## How this relates to neighboring projects

`agent-kernel` is the embeddable runtime layer of the **Weaver ecosystem**. The
projects below solve adjacent problems and are designed to compose, not to
overlap.

| Project | Role | Where it runs | Use it when… |
|---|---|---|---|
| **agent-kernel** *(this repo)* | Embeddable library/runtime: capability registry, policy, HMAC tokens, context firewall, audit trace. | In-process inside your agent host. | You need authorization, redaction, and audit between an LLM loop and a large tool ecosystem. |
| [**AgentFence**](https://github.com/dgenio/AgentFence) | External CLI / local proxy that intercepts tool calls and applies a policy gate. | Out-of-process, alongside your agent. | You want a policy boundary without changing your agent code, or you need to gate a third-party agent host you can't modify. |
| [**contextweaver**](https://github.com/dgenio/contextweaver) | Library that selects and compiles the context an LLM receives. | In-process, before the LLM call. | You need to assemble relevant context for a prompt. It sits *under* the LLM loop; agent-kernel sits *between* the LLM and tools. |
| **ChainWeaver** | Orchestrator for deterministic tool chains. | In-process or as a separate service. | You need to run a multi-step deterministic flow rather than free-form LLM tool use. |
| [**weaver-spec**](https://github.com/dgenio/weaver-spec) | Specification: invariants, capability/token/frame contracts, conformance suite. | Not a runtime — it's docs + a contract test suite. | You're building another Weaver-compatible implementation, or you want to verify an existing one. |

A minimal architecture using `agent-kernel` as the central runtime:

```
LLM / agent loop
contextweaver ─► agent-kernel ─► driver ─► MCP / HTTP / A2A / internal API
ActionTrace
```

### When *not* to use this

- You only need a process-level policy gate around an existing agent host —
reach for `AgentFence` instead.
- You only need to compile context for a prompt — use `contextweaver`.
- You want a deterministic, scripted workflow with no LLM in the inner loop —
use `ChainWeaver`.
- You're writing a static analyzer or one-shot CLI scanner with no
per-invocation runtime — `agent-kernel` would be overkill.

See [docs/tutorial.md](docs/tutorial.md) for an end-to-end "secure your first
MCP tool in 5 minutes" walkthrough.

## Weaver Spec Compatibility: v0.1.0

agent-kernel is a compliant implementation of [weaver-spec v0.1.0](https://github.com/dgenio/weaver-spec).
280 changes: 280 additions & 0 deletions docs/tutorial.md
@@ -0,0 +1,280 @@
# Secure your first MCP tool in 5 minutes

This walkthrough takes a brand-new reader from `pip install` to a working,
authorized, audited tool invocation in roughly five minutes. Every code block
is copy-pasteable; the runnable companion is
[`examples/tutorial.py`](../examples/tutorial.py) (covered by CI).

> The PyPI package is **`weaver-kernel`** but the Python import is
> **`agent_kernel`**. We use both names in this document.

## What you'll learn

By the end of this page you will have seen, in this order:

1. How to register a **capability** and how its `safety_class`,
`sensitivity`, and `allowed_fields` shape authorization.
2. How a **principal** is created and why some attributes (like `tenant`)
are required for PII-tagged capabilities.
3. How to issue a signed **token** with `kernel.get_token(...)`.
4. How `kernel.invoke(...)` returns a bounded **Frame** in `summary`,
`table`, or `handle_only` modes — and why `email` never appears in any
of them.
5. How to retrieve filtered raw rows by expanding a **Handle**.
6. What a **policy denial** looks like and how to branch on its stable
`reason_code`.
7. How `kernel.explain(action_id)` returns an audit **ActionTrace**.
8. How to swap the in-process driver for a real **MCP** server.

## 0. Install

```bash
pip install weaver-kernel
```

For the MCP section near the end, also install the optional extra:

```bash
pip install "weaver-kernel[mcp]"
```

Set a stable HMAC secret for the process. In production this should come
from a real secret store; the example uses a fixed value so the output is
reproducible:

```python
import os
os.environ["AGENT_KERNEL_SECRET"] = "tutorial-secret-do-not-use-in-prod"
```
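
Outside a tutorial you would probably want the process to fail fast when the secret is missing or trivially short. The following is a minimal standard-library sketch of that check; `load_secret` and the 16-character minimum are illustrative assumptions, not part of `agent_kernel`:

```python
import os


def load_secret(env_var: str = "AGENT_KERNEL_SECRET", min_length: int = 16) -> str:
    """Return the HMAC secret from the environment, or raise if it is unusable.

    Hypothetical helper: both the name and the length threshold are illustrative.
    """
    secret = os.environ.get(env_var, "")
    if len(secret) < min_length:
        raise RuntimeError(f"{env_var} is unset or shorter than {min_length} characters")
    return secret


# Same fixed value the tutorial uses, so the sketch runs as written.
os.environ["AGENT_KERNEL_SECRET"] = "tutorial-secret-do-not-use-in-prod"
secret = load_secret()
```

Calling such a helper once at startup turns a missing secret into an immediate error instead of a confusing token failure later.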

## 1. Register a capability

A capability is the unit of authorization. The `safety_class` controls
which roles may call it. The `sensitivity` tag tells the policy and
firewall how to treat the data. `allowed_fields` is the projection the
firewall applies before any row reaches the LLM.

```python
from agent_kernel import (
    Capability,
    CapabilityRegistry,
    ImplementationRef,
    SafetyClass,
    SensitivityTag,
)

registry = CapabilityRegistry()
registry.register(
    Capability(
        capability_id="billing.invoices.list",
        name="List Invoices",
        description="List recent invoices",
        safety_class=SafetyClass.READ,
        sensitivity=SensitivityTag.PII,
        allowed_fields=["id", "customer_name", "amount", "status"],
        tags=["billing", "invoices", "list"],
        impl=ImplementationRef(driver_id="memory", operation="list_invoices"),
    )
)
```

> `email`, `phone`, and other non-listed columns are stripped from the
> LLM-safe Frame even if the driver returns them. Handle expansion is a
> separate path with its own caveat; see step 6.
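
To make the idea of an allow-list projection concrete, here is a minimal sketch in plain Python. It is not the firewall's actual implementation; `project` is a hypothetical helper that only illustrates the deny-by-default behavior described above:

```python
from typing import Any

ALLOWED_FIELDS = ["id", "customer_name", "amount", "status"]


def project(rows: list[dict[str, Any]], allowed: list[str]) -> list[dict[str, Any]]:
    """Keep only allow-listed keys; every other key is dropped (deny by default)."""
    return [{key: row[key] for key in allowed if key in row} for row in rows]


raw = [
    {
        "id": "INV-001",
        "customer_name": "Alice",
        "email": "alice@example.com",  # not in ALLOWED_FIELDS, so it is dropped
        "amount": 120.0,
        "status": "paid",
    }
]
safe = project(raw, ALLOWED_FIELDS)
print(safe)
# [{'id': 'INV-001', 'customer_name': 'Alice', 'amount': 120.0, 'status': 'paid'}]
```

Because the projection names what may pass rather than what must be removed, a new sensitive column added by the driver stays hidden until someone explicitly allows it.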

## 2. Wire a driver and the kernel

`InMemoryDriver` keeps the tutorial offline. The same pattern works with
`HTTPDriver` or `MCPDriver` (see step 9).

```python
from agent_kernel import HMACTokenProvider, InMemoryDriver, Kernel, StaticRouter
from agent_kernel.drivers.base import ExecutionContext

# Raw rows include `email`, which is deliberately absent from `allowed_fields`.
INVOICES = [
    {"id": "INV-001", "customer_name": "Alice", "email": "alice@example.com", "amount": 120.0, "status": "paid"},
    {"id": "INV-002", "customer_name": "Bob", "email": "bob@example.com", "amount": 540.0, "status": "unpaid"},
    {"id": "INV-003", "customer_name": "Carol", "email": "carol@example.com", "amount": 75.0, "status": "paid"},
]

driver = InMemoryDriver()
driver.register_handler("list_invoices", lambda ctx: list(INVOICES))

kernel = Kernel(
    registry=registry,
    token_provider=HMACTokenProvider(secret="tutorial-secret-do-not-use-in-prod"),
    router=StaticRouter(routes={"billing.invoices.list": ["memory"]}),
)
kernel.register_driver(driver)
```

## 3. Create a principal

The `DefaultPolicyEngine` requires a `tenant` attribute on the principal
for any PII-tagged capability. Without it, the grant is denied with
`reason_code="missing_tenant_attribute"`.

```python
from agent_kernel import Principal

alice = Principal(principal_id="alice", roles=["reader"], attributes={"tenant": "acme"})
```

## 4. Grant a token

`get_token` runs the policy engine and returns a signed
`CapabilityToken`. No token, no invocation.

```python
from agent_kernel.models import CapabilityRequest

request = CapabilityRequest(capability_id="billing.invoices.list", goal="list recent invoices")
token = kernel.get_token(request, alice, justification="")
print(token.token_id, token.expires_at)
```
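
For readers unfamiliar with HMAC-signed tokens, the sketch below shows the general technique using only the standard library. It is an illustration of sign-then-verify with an expiry check, not the format or code of `HMACTokenProvider`; the claim names and the `sign` / `verify` helpers are assumptions made for the example:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"tutorial-secret-do-not-use-in-prod"


def sign(claims: dict) -> str:
    """Serialize the claims and append an HMAC-SHA256 tag over the payload."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{tag}"


def verify(token: str, now: Optional[float] = None) -> dict:
    """Recompute the tag, compare in constant time, then check expiry."""
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["expires_at"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


demo_token = sign(
    {
        "principal_id": "alice",
        "capability_id": "billing.invoices.list",
        "expires_at": time.time() + 300,  # five-minute lifetime, chosen arbitrarily
    }
)
demo_claims = verify(demo_token)
```

The signature binds the principal and capability together, so a token issued for one capability cannot be edited to name another without failing verification.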

## 5. Invoke and observe the Frame

The default `response_mode` is `"summary"`. The Frame holds compact
facts about the data plus a Handle the LLM can expand later.

```python
import asyncio

frame = asyncio.run(kernel.invoke(token, principal=alice, args={"operation": "list_invoices"}))
for fact in frame.facts:
    print("•", fact)
print("handle:", frame.handle and frame.handle.handle_id)
```

Try `response_mode="table"` to get a row preview that respects
`allowed_fields`. Try `response_mode="handle_only"` to skip the preview
entirely — the LLM gets only a reference. In every mode, **`email` is
absent** from the Frame, because it is not in `allowed_fields`.

```python
table_frame = asyncio.run(
    kernel.invoke(
        kernel.get_token(request, alice, justification=""),
        principal=alice,
        args={"operation": "list_invoices"},
        response_mode="table",
    )
)
assert all("email" not in row for row in table_frame.table_preview)
```
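
The difference between the three modes can be sketched without the kernel at all. The frame shape, the `build_frame` helper, and the two-row preview size below are hypothetical simplifications, intended only to show that projection happens before any mode-specific shaping:

```python
from typing import Any

ROWS = [
    {"id": "INV-001", "customer_name": "Alice", "email": "alice@example.com", "amount": 120.0},
    {"id": "INV-002", "customer_name": "Bob", "email": "bob@example.com", "amount": 540.0},
    {"id": "INV-003", "customer_name": "Carol", "email": "carol@example.com", "amount": 75.0},
]
ALLOWED = ["id", "customer_name", "amount"]


def build_frame(rows, allowed, mode="summary", preview_rows=2) -> dict[str, Any]:
    """Hypothetical frame builder: project first, then shape the result by mode."""
    safe = [{key: row[key] for key in allowed if key in row} for row in rows]
    frame: dict[str, Any] = {
        "mode": mode,
        "handle_id": "demo-handle",  # placeholder reference to the stored rows
        "facts": [],
        "table_preview": [],
    }
    if mode == "summary":
        total = sum(row["amount"] for row in safe)
        frame["facts"] = [f"{len(safe)} rows", f"total amount: {total}"]
    elif mode == "table":
        frame["table_preview"] = safe[:preview_rows]
    elif mode != "handle_only":
        raise ValueError(f"unknown response mode: {mode}")
    return frame


frames = {mode: build_frame(ROWS, ALLOWED, mode) for mode in ("summary", "table", "handle_only")}
# No mode carries the disallowed column, as a key or as a value.
assert all("email" not in str(frame) for frame in frames.values())
```

In this sketch `handle_only` returns nothing but the reference, which is the cheapest option when the LLM may never need the rows.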

## 6. Expand a Handle

Handles let the LLM stay inside its context budget while still pulling
specific rows or fields on demand. The expand query supports `offset`,
`limit`, `fields`, and an equality `filter`.

```python
handle_frame = asyncio.run(
    kernel.invoke(
        kernel.get_token(request, alice, justification=""),
        principal=alice,
        args={"operation": "list_invoices"},
        response_mode="handle_only",
    )
)
expanded = kernel.expand(
    handle_frame.handle,
    query={"offset": 0, "limit": 2, "fields": ["id", "amount"]},
)
print(expanded.table_preview)
# [{'id': 'INV-001', 'amount': 120.0}, {'id': 'INV-002', 'amount': 540.0}]
```

> **Where the security boundary is today.** The `Firewall` enforces
> `allowed_fields` when it builds the `summary` and `table` previews, so
> disallowed columns never reach the LLM-safe Frame. `HandleStore.expand()`
> currently filters by whatever `fields` the caller passes in the query
> against the stored raw rows — it does **not yet** re-apply the
> capability's `allowed_fields` projection. Until the in-flight grant
> constraint work lands (tracking issue
> [#76](https://github.com/dgenio/agent-kernel/issues/76), PR
> [#79](https://github.com/dgenio/agent-kernel/pull/79)), treat handle
> expansion as authorized-but-field-unconstrained: only request `fields`
> the caller is allowed to see.
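
Until that work lands, one way to follow this advice is a caller-side guard that intersects the requested `fields` with the capability's allow-list before expanding. The sketch below works on plain lists of dicts; `guarded_expand` is a hypothetical helper and does not reflect how `HandleStore.expand()` is implemented:

```python
from typing import Any

ALLOWED = {"id", "customer_name", "amount", "status"}
STORED = [
    {"id": "INV-001", "customer_name": "Alice", "email": "alice@example.com", "amount": 120.0, "status": "paid"},
    {"id": "INV-002", "customer_name": "Bob", "email": "bob@example.com", "amount": 540.0, "status": "unpaid"},
    {"id": "INV-003", "customer_name": "Carol", "email": "carol@example.com", "amount": 75.0, "status": "paid"},
]


def guarded_expand(rows: list[dict[str, Any]], query: dict[str, Any], allowed: set[str]) -> list[dict[str, Any]]:
    """Apply filter, offset, and limit, returning only allow-listed fields."""
    requested = query.get("fields") or sorted(allowed)
    fields = [name for name in requested if name in allowed]  # silently drop the rest
    conditions = query.get("filter", {})
    blocked = set(conditions) - allowed
    if blocked:
        # Filtering on a hidden column would let a caller probe its values.
        raise ValueError(f"filter on disallowed fields: {sorted(blocked)}")
    matches = [row for row in rows if all(row.get(k) == v for k, v in conditions.items())]
    offset = query.get("offset", 0)
    limit = query.get("limit", len(matches))
    return [{name: row[name] for name in fields if name in row} for row in matches[offset:offset + limit]]


result = guarded_expand(STORED, {"offset": 0, "limit": 2, "fields": ["id", "email"]}, ALLOWED)
print(result)
# [{'id': 'INV-001'}, {'id': 'INV-002'}]
```

Rejecting filters on hidden columns is a design choice of this sketch: an equality filter on `email` would otherwise confirm or deny guesses about its value even though the column itself is never returned.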

## 7. Watch policy enforcement

Add a WRITE capability and try to get a token for it as the reader principal. The
denial carries both a human-readable `reason` and a stable
`reason_code` your code can branch on.

```python
from agent_kernel.errors import PolicyDenied

registry.register(
    Capability(
        capability_id="billing.invoices.create",
        name="Create Invoice",
        description="Create a new invoice",
        safety_class=SafetyClass.WRITE,
        tags=["billing", "invoices", "create"],
        impl=ImplementationRef(driver_id="memory", operation="create_invoice"),
    )
)

try:
    kernel.get_token(
        CapabilityRequest(capability_id="billing.invoices.create", goal="create an invoice"),
        alice,
        justification="reader trying a write; should fail",
    )
except PolicyDenied as exc:
    print(exc.reason_code)  # 'missing_role'
    print(str(exc))         # "WRITE capabilities require the 'writer' or 'admin' role..."
```

Stable reason codes come from `agent_kernel.policy_reasons.DenialReason`.
Tests should assert on the code, not on the human-readable string.
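
The pattern of branching on a stable code rather than on message text can be sketched in isolation. `DemoDenialReason` and `DemoPolicyDenied` below are stand-ins written for this example (only the two code strings come from this tutorial); they are not the classes exported by `agent_kernel`:

```python
from enum import Enum


class DemoDenialReason(str, Enum):
    """Stand-in enum; the two values mirror codes mentioned in this tutorial."""

    MISSING_ROLE = "missing_role"
    MISSING_TENANT_ATTRIBUTE = "missing_tenant_attribute"


class DemoPolicyDenied(Exception):
    """Stand-in exception carrying a human-readable reason and a stable code."""

    def __init__(self, reason: str, reason_code: DemoDenialReason) -> None:
        super().__init__(reason)
        self.reason_code = reason_code


def check_write(roles: list[str]) -> None:
    """Toy policy check: WRITE needs the 'writer' or 'admin' role."""
    if not {"writer", "admin"} & set(roles):
        raise DemoPolicyDenied(
            "WRITE capabilities require the 'writer' or 'admin' role",
            DemoDenialReason.MISSING_ROLE,
        )


def next_step(exc: DemoPolicyDenied) -> str:
    """Branch on the code, so rewording the message never breaks callers."""
    if exc.reason_code == DemoDenialReason.MISSING_ROLE:
        return "request the writer role"
    if exc.reason_code == DemoDenialReason.MISSING_TENANT_ATTRIBUTE:
        return "set a tenant attribute on the principal"
    return "escalate to an operator"


try:
    check_write(["reader"])
except DemoPolicyDenied as denied:
    advice = next_step(denied)
```

A test written against this shape would assert `denied.reason_code == "missing_role"` and leave the message free to change.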

## 8. Audit with `explain()`

Every successful invocation creates an `ActionTrace` keyed by
`frame.action_id`. The trace records who, what, when, and which driver
served the request — the auditable half of weaver-spec invariant I-02.

```python
trace = kernel.explain(frame.action_id)
print(trace.action_id, trace.capability_id, trace.principal_id, trace.driver_id)
```
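
Conceptually, an audit trail of this kind is an append-only record keyed by action ID. The sketch below is a toy in-memory version; `DemoTrace`, `DemoTraceStore`, and the choice of a UUID for the ID are assumptions for illustration and say nothing about how the kernel stores traces:

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DemoTrace:
    """Immutable record of one invocation: who, what, when, and which driver."""

    action_id: str
    capability_id: str
    principal_id: str
    driver_id: str
    recorded_at: datetime


class DemoTraceStore:
    """Toy in-memory store; a real deployment would persist traces durably."""

    def __init__(self) -> None:
        self._traces: dict[str, DemoTrace] = {}

    def record(self, capability_id: str, principal_id: str, driver_id: str) -> str:
        action_id = uuid.uuid4().hex
        self._traces[action_id] = DemoTrace(
            action_id, capability_id, principal_id, driver_id, datetime.now(timezone.utc)
        )
        return action_id

    def explain(self, action_id: str) -> DemoTrace:
        # Raises KeyError for an unknown ID rather than inventing a trace.
        return self._traces[action_id]


store = DemoTraceStore()
demo_action_id = store.record("billing.invoices.list", "alice", "memory")
demo_trace = store.explain(demo_action_id)
```

Freezing the dataclass keeps a recorded trace from being edited after the fact, which is the property an audit log most needs.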

## 9. Swap the driver for an MCP server

The kernel doesn't care whether the driver lives in-process, behind
HTTP, or behind an MCP server — capabilities, policy, tokens, and
firewall behave identically. To talk to a real MCP server, replace
`InMemoryDriver` with `MCPDriver` (full transport details, including
Streamable HTTP, live in [`docs/integrations.md`](integrations.md)):

```python
from agent_kernel.drivers.mcp import MCPDriver

driver = MCPDriver.from_stdio(
    command="python",
    args=["-m", "my_mcp_server"],
    server_name="local-tools",
)
kernel.register_driver(driver)

# Discover the MCP server's tools and register each as an agent-kernel
# capability under a namespace. Set safety_class/sensitivity/allowed_fields
# on the resulting Capability objects to apply policy and the firewall.
capabilities = asyncio.run(driver.discover(namespace="billing"))
registry.register_many(capabilities)
```

That's the whole tutorial. From here:

- [`docs/security.md`](security.md) — threat model, what HMAC tokens do
and do not protect against.
- [`docs/context_firewall.md`](context_firewall.md) — redaction,
summarization, and budget details.
- [`docs/capabilities.md`](capabilities.md) — designing capabilities
for large tool ecosystems.
- [`docs/integrations.md`](integrations.md) — full MCP and HTTP driver
integration patterns.