feat(registry,federation): namespaces, BM25 search, capability manifests #82

Open
dgenio wants to merge 2 commits into main from claude/triage-issues-5Bt4u

@dgenio dgenio commented May 21, 2026

Implements the capability-discovery group from triage: #45 (namespaces &
hierarchical discovery) + #52 (marketplace part 1 — manifest format &
local registry). Both extend the same discovery seam and share models.py.

#45 — CapabilityRegistry

  • Dot-notation capability_ids now expose list_namespaces() and
    list_namespace(prefix). Flat IDs continue to work unchanged.
  • register_namespace(prefix, loader=...) enables deferred registration
    for large tool ecosystems; the loader runs at most once on first
    access (search/list/get).
  • search() gained an offset kwarg, strips a small stop-word set,
    and now scores with a BM25-flavoured ranker that weights matches on
    capability_id and tags above description. Determinism preserved
    via stable capability_id tie-breaking — required by AGENTS.md.
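
As an illustration of the ranking described above, here is a minimal sketch of a BM25-style scorer with field weighting, stop-word stripping, `offset` pagination, and `capability_id` tie-breaking. It is not the code from `registry.py`: the `Cap` class, the stop-word set, the field weights, and the `K1`/`B` constants are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass, field

# Hypothetical values: the PR does not state the real stop words or weights.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "and"}
FIELD_WEIGHTS = {"capability_id": 3.0, "tags": 2.0, "description": 1.0}
K1, B = 1.2, 0.75  # conventional BM25 defaults


@dataclass
class Cap:
    capability_id: str
    description: str = ""
    tags: list = field(default_factory=list)


def tokenize(text):
    """Lowercase, split on whitespace/dots/underscores, drop stop words."""
    words = text.lower().replace(".", " ").replace("_", " ").split()
    return [w for w in words if w not in STOP_WORDS]


def weighted_tf(cap):
    """Term frequencies where each occurrence counts by its field weight."""
    fields = {
        "capability_id": cap.capability_id,
        "tags": " ".join(cap.tags),
        "description": cap.description,
    }
    tf = {}
    for name, text in fields.items():
        for tok in tokenize(text):
            tf[tok] = tf.get(tok, 0.0) + FIELD_WEIGHTS[name]
    return tf


def search(caps, query, max_results=10, offset=0):
    docs = [(cap, weighted_tf(cap)) for cap in caps]
    if not docs:
        return []
    avg_len = sum(sum(tf.values()) for _, tf in docs) / len(docs)
    scored = []
    for cap, tf in docs:
        doc_len = sum(tf.values())
        score = 0.0
        for term in tokenize(query):
            freq = tf.get(term, 0.0)
            if freq == 0.0:
                continue
            n = sum(1 for _, other in docs if term in other)
            idf = math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))
            norm = freq + K1 * (1 - B + B * doc_len / avg_len)
            score += idf * freq * (K1 + 1) / norm
        if score > 0:
            scored.append((score, cap))
    # Highest score first; equal scores fall back to capability_id order.
    scored.sort(key=lambda x: (-x[0], x[1].capability_id))
    return [cap for _, cap in scored[offset:offset + max_results]]
```

Sorting on the `(-score, capability_id)` pair is what keeps the output deterministic: two capabilities with identical scores always come back in the same order, regardless of registration order.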

#52 — Capability marketplace (local)

  • New CapabilityDescriptor and CapabilityManifest dataclasses with
    to_dict/from_dict for JSON portability. Internal driver IDs and
    operation names are stripped on serialisation.
  • New agent_kernel.federation module: build_manifest(),
    import_manifest(), merge_sensitivity(). Three trust policies
    honoured at import time (most_restrictive default, local_only,
    remote_deferred).
  • Kernel.advertise(endpoint=...) and Kernel.import_remote(manifest, driver=..., trust_policy=...) thin wrappers; Kernel gained a
    kernel_id kwarg used as the publisher identity (defaults to
    "agent-kernel").
  • Imported capabilities flow through the full local pipeline (policy
    → token → firewall → trace). HMAC tokens remain kernel-scoped — a
    token from a kernel with a different secret fails signature check.
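
To make the three trust policies concrete, the sketch below shows one plausible way a sensitivity merge could behave. The real `merge_sensitivity()` signature is not shown in this PR description, so the function name, the sensitivity ladder, and the reading of `local_only` and `remote_deferred` are assumptions, and `ValueError` stands in for `TrustPolicyError`.

```python
# Hypothetical ordering from least to most sensitive.
SENSITIVITY_ORDER = ["none", "low", "medium", "high"]


def merge_sensitivity_sketch(local, remote, trust_policy="most_restrictive"):
    """Pick the sensitivity for an imported capability.

    Assumed semantics (not confirmed by the source):
    - most_restrictive: take whichever side is stricter.
    - local_only: ignore the remote claim entirely.
    - remote_deferred: accept the publisher's claim as-is.
    """
    if trust_policy == "most_restrictive":
        return max(local, remote, key=SENSITIVITY_ORDER.index)
    if trust_policy == "local_only":
        return local
    if trust_policy == "remote_deferred":
        return remote
    # The PR introduces TrustPolicyError; ValueError is a stand-in here.
    raise ValueError(f"unknown trust policy: {trust_policy!r}")
```

Under these assumptions the default never lets a remote manifest lower the sensitivity the local kernel would otherwise apply.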

New errors in errors.py: NamespaceNotFound, FederationError,
ManifestError, TrustPolicyError.

Tests: +32 net (450 → 482). New tests/test_federation.py covers
manifest round-trip, descriptor stripping, all three trust policies,
end-to-end imported-capability invocation, and kernel-scoped HMAC
isolation. tests/test_registry.py extended with namespace ops,
deferred-loader semantics, BM25 ranking, pagination, stop-word
stripping, and a 500-capability scale sanity check.

Docs: new docs/federation.md, namespace section added to
docs/capabilities.md, CHANGELOG [Unreleased] entries.

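Out of scope (left for follow-ups, per the triage report):

  • Discovery over a network transport (#51, marketplace part 2).
  • Decomposition of policy_dsl.py / kernel.py / models.py
    beyond AGENTS.md's 300-line budget (#68).
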
Closes #45
Closes #52

https://claude.ai/code/session_015PFMci84T2TBNhqKsV3Hyo

claude added 2 commits May 21, 2026 05:46
Implements the capability-discovery group from triage: #45 (namespaces &
hierarchical discovery) + #52 (marketplace part 1 — manifest format &
local registry). Both extend the same discovery seam and share models.py.

#45 — CapabilityRegistry
- Dot-notation `capability_id`s now expose `list_namespaces()` and
  `list_namespace(prefix)`. Flat IDs continue to work unchanged.
- `register_namespace(prefix, loader=...)` enables deferred registration
  for large tool ecosystems; the loader runs at most once on first
  access (search/list/get).
- `search()` gained an `offset` kwarg, strips a small stop-word set,
  and now scores with a BM25-flavoured ranker that weights matches on
  `capability_id` and `tags` above `description`. Determinism preserved
  via stable `capability_id` tie-breaking — required by AGENTS.md.

#52 — Capability marketplace (local)
- New `CapabilityDescriptor` and `CapabilityManifest` dataclasses with
  `to_dict`/`from_dict` for JSON portability. Internal driver IDs and
  operation names are stripped on serialisation.
- New `agent_kernel.federation` module: `build_manifest()`,
  `import_manifest()`, `merge_sensitivity()`. Three trust policies
  honoured at import time (`most_restrictive` default, `local_only`,
  `remote_deferred`).
- `Kernel.advertise(endpoint=...)` and `Kernel.import_remote(manifest,
  driver=..., trust_policy=...)` thin wrappers; `Kernel` gained a
  `kernel_id` kwarg used as the publisher identity (defaults to
  `"agent-kernel"`).
- Imported capabilities flow through the full local pipeline (policy
  → token → firewall → trace). HMAC tokens remain kernel-scoped — a
  token from a kernel with a different secret fails signature check.

New errors in `errors.py`: `NamespaceNotFound`, `FederationError`,
`ManifestError`, `TrustPolicyError`.

Tests: +32 net (450 → 482). New `tests/test_federation.py` covers
manifest round-trip, descriptor stripping, all three trust policies,
end-to-end imported-capability invocation, and kernel-scoped HMAC
isolation. `tests/test_registry.py` extended with namespace ops,
deferred-loader semantics, BM25 ranking, pagination, stop-word
stripping, and a 500-capability scale sanity check.

Docs: new `docs/federation.md`, namespace section added to
`docs/capabilities.md`, CHANGELOG `[Unreleased]` entries.

Out of scope (left for follow-ups, per the triage report):
- Discovery over a network transport (#51, marketplace part 2).
- Decomposition of `policy_dsl.py` / `kernel.py` / `models.py`
  beyond AGENTS.md's 300-line budget (#68).

Closes #45
Closes #52

https://claude.ai/code/session_015PFMci84T2TBNhqKsV3Hyo
…_dsl decomposition

Implements the five remaining triage-report issues in one PR, building on
top of PR #81's registry/federation foundation (which closes #45 + #52
and is included transitively here).

Issue #38 — OpenTelemetry integration
- New `agent_kernel.otel` module: `instrument_kernel(kernel)` wraps
  `Kernel.invoke` and `Kernel.grant_capability` with OTel spans
  (`agent_kernel.invoke`, `agent_kernel.grant`) and metrics
  (`agent_kernel.invocations`, `agent_kernel.invocation_duration`,
  `agent_kernel.policy_denials`). Idempotent, instance-scoped.
- No-op (zero runtime cost) when the `[otel]` extra is not installed;
  `OTEL_AVAILABLE` exposes the runtime status.
- Optional `tracer_provider` / `meter_provider` kwargs let tests bypass
  the global-singleton lock.
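
A rough sketch of the pattern described in this list, under stated assumptions: an optional import guarded by an `OTEL_AVAILABLE` flag, and an idempotent wrapper that patches one instance rather than the class. The function name, the marker attribute, and the synchronous `invoke` are illustrative only; the actual `agent_kernel.otel` module also records metrics and may wrap async methods.

```python
import functools

try:
    from opentelemetry import trace  # present only with the optional extra
    OTEL_AVAILABLE = True
except ImportError:
    trace = None
    OTEL_AVAILABLE = False

_MARKER = "_otel_sketch_instrumented"  # hypothetical attribute name


def instrument_kernel_sketch(kernel):
    """Wrap kernel.invoke in a span; do nothing if OTel is missing."""
    if not OTEL_AVAILABLE or getattr(kernel, _MARKER, False):
        return kernel  # no-op path, also makes repeat calls idempotent

    tracer = trace.get_tracer("agent_kernel.sketch")
    original = kernel.invoke

    @functools.wraps(original)
    def traced_invoke(*args, **kwargs):
        with tracer.start_as_current_span("agent_kernel.invoke"):
            return original(*args, **kwargs)

    # Set on the instance only, so other kernels are unaffected.
    kernel.invoke = traced_invoke
    setattr(kernel, _MARKER, True)
    return kernel
```

Calling the function twice on the same object leaves a single wrapper in place, and objects that were never instrumented keep the plain class method.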

Issue #47 — Streaming firewall
- New `Firewall.apply_stream()` async iterator processes chunks
  one-at-a-time with per-chunk PII redaction.
- New `StreamingDriver` Protocol in `drivers/base.py` (runtime-checkable)
  adds optional `execute_stream()` to the driver surface.
- New `Kernel.invoke_stream()` yields `Frame` chunks; the last has
  `is_final=True`. Non-streaming drivers automatically fall back to a
  single-chunk stream via `execute()`.
- `Frame` gained `is_final: bool` (default `False`); single-shot
  `Kernel.invoke` returns `Frame(..., is_final=True)`.
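
The per-chunk behaviour can be sketched as an async generator that redacts each chunk independently and flags the last frame. `FrameSketch`, `apply_stream_sketch`, and the email-only regex are assumptions for illustration; the real `Firewall.apply_stream()` and `Frame` carry more fields and more redaction rules.

```python
import asyncio
import re
from dataclasses import dataclass

# Toy PII rule: email addresses only (hypothetical, for the sketch).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class FrameSketch:
    text: str
    is_final: bool = False


async def apply_stream_sketch(chunks):
    """Redact each chunk on its own and mark the last frame as final."""
    previous = None
    async for chunk in chunks:
        if previous is not None:
            yield FrameSketch(previous)
        # Hold one chunk back so the end of the stream can be detected.
        previous = EMAIL.sub("[REDACTED]", chunk)
    if previous is not None:
        yield FrameSketch(previous, is_final=True)


async def demo_source():
    for piece in ["contact alice@example.com ", "or call later"]:
        yield piece


async def collect():
    return [frame async for frame in apply_stream_sketch(demo_source())]
```

Running `asyncio.run(collect())` returns two frames, with only the second marked final. One limitation of any purely per-chunk approach like this sketch is that a value split across two chunks is not matched by the pattern.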

Issue #51 — Federated discovery (closes #49)
- New `agent_kernel.federation_discovery` module:
  `discover_peers(peer_urls=..., registry_url=...)` fetches manifests
  over HTTP, `sign_manifest` / `verify_manifest` for HMAC-SHA256 signed
  envelopes, `serve_manifest_payload` as a framework-agnostic helper for
  manifest-serving endpoints, `DiscoveryRateLimiter` (default 10/60s).
- New `Kernel.discover_peers()` method.
- Asymmetric signing strictness: passing a `secret` *requires* signed
  manifests; omitting it *rejects* signed envelopes — no silent
  downgrade attack surface.
- New errors `ManifestSignatureError` and `DiscoveryError`.
- HMAC tokens remain kernel-scoped (regression test in
  `test_federation_discovery.py::test_kernel_scoped_hmac_isolation_for_imported_capability`).
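
The asymmetric strictness rule can be sketched as follows. The envelope shape (`payload` string plus hex `signature`) mirrors the review excerpt of the verification code; everything else, including the function names, the canonical JSON encoding, and the stand-in exception class, is an assumption rather than the module's actual API.

```python
from __future__ import annotations

import hashlib
import hmac
import json


class SignatureErrorSketch(Exception):
    """Stand-in for ManifestSignatureError."""


def sign_manifest_sketch(manifest: dict, secret: str) -> dict:
    # Canonical encoding is an assumption; the signer and verifier only
    # need to agree on the exact payload string.
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(secret.encode("utf-8"), payload.encode("utf-8"), hashlib.sha256)
    return {"payload": payload, "signature": sig.hexdigest()}


def load_manifest_sketch(data: dict, secret: str | None = None) -> dict:
    signed = "payload" in data and "signature" in data
    if secret is None:
        if signed:
            # No secret configured: refuse rather than skip verification.
            raise SignatureErrorSketch("signed envelope but no secret given")
        return data
    if not signed:
        # Secret configured: unsigned manifests are not accepted.
        raise SignatureErrorSketch("secret given but manifest is unsigned")
    expected = hmac.new(
        secret.encode("utf-8"), data["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, data["signature"]):
        raise SignatureErrorSketch("signature mismatch")
    return json.loads(data["payload"])
```

Both mismatched cases fail closed, so a caller cannot end up trusting an envelope whose signature was never checked.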

Issue #68 — Tech debt
- D. Decompose `policy_dsl.py` (was 661 → 297 lines): parsing moved to
  `policy_dsl_parser.py` (277 lines), denial-explanation traversal to
  `policy_dsl_explain.py` (214). Public API surface unchanged —
  `PolicyMatch`, `PolicyRule`, `DeclarativePolicyEngine` re-exported.
  `RateLimiter` extracted from `policy.py` into `rate_limit.py`;
  `policy.py` re-exports backward-compatibly.
- E. Added dry-run regression tests for `HTTPDriver` and `MCPDriver` in
  `test_kernel.py` — pins the driver-agnostic short-circuit.

Mode B refactor
- `kernel.py` (was 581 → 705 after PR #81) split into the
  `agent_kernel.kernel` sub-package to honour AGENTS.md's ≤300-line
  budget. `Kernel` class lives in `kernel/__init__.py`; heavy method
  bodies delegate to `_invoke`, `_dry_run`, `_federation`, `_stream`.
  Public imports (`from agent_kernel import Kernel`,
  `from agent_kernel.kernel import Kernel`) are unchanged.

Tests: 482 → 510 (+28). Coverage 96%. `make ci` clean: ruff format
unchanged, ruff check passes, mypy strict on 40 source files, examples
all run. New test files: `test_otel.py` (4), `test_firewall_stream.py`
(5), `test_federation_discovery.py` (17); extended `test_kernel.py`
with two HTTP/MCP dry-run tests.

Closes #38, #47, #49, #51, #68
Copilot AI review requested due to automatic review settings May 21, 2026 11:43

Copilot AI left a comment

Pull request overview

This PR significantly expands agent-kernel’s discovery and federation surface area by introducing namespaced capability discovery (with deferred namespace loaders and BM25-style ranking), capability manifests + local import, HTTP-based federated discovery with signed manifests, plus new streaming invocation/firewall plumbing and optional OpenTelemetry instrumentation. It also performs a larger internal refactor to keep core modules under the AGENTS.md 300-line guideline by splitting kernel.py, policy_dsl.py, and rate limiting into smaller modules.

Changes:

  • Add CapabilityRegistry namespace operations, deferred namespace loading, and BM25-flavoured ranked search with pagination.
  • Introduce capability federation: JSON-serializable manifests/descriptors, local manifest import, and federated discovery with optional HMAC signing + rate limiting.
  • Add streaming invocation (Kernel.invoke_stream, StreamingDriver, per-chunk firewalling) and optional OpenTelemetry instrumentation; refactor kernel/policy modules into submodules to meet module-size constraints.

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| `tests/test_registry.py` | Adds tests for namespaces, deferred loaders, BM25 search, pagination, and a performance sanity check. |
| `tests/test_otel.py` | Adds tests for OpenTelemetry instrumentation behavior and idempotency. |
| `tests/test_kernel.py` | Adds regression tests ensuring dry-run short-circuits before driver dispatch for HTTP/MCP drivers. |
| `tests/test_firewall_stream.py` | Adds tests for per-chunk firewall redaction and Kernel.invoke_stream semantics. |
| `tests/test_federation.py` | Adds manifest round-trip, import, trust policy, and kernel-scoped token isolation tests. |
| `tests/test_federation_discovery.py` | Adds tests for signed manifests, discovery via peer/registry URLs, and rate limiting. |
| `src/agent_kernel/registry.py` | Implements namespaces, deferred namespace loaders, BM25-style scoring, and offset pagination. |
| `src/agent_kernel/rate_limit.py` | Extracts rate limiting into a dedicated module. |
| `src/agent_kernel/policy.py` | Uses extracted rate limiter and preserves backwards-compatible aliases. |
| `src/agent_kernel/policy_dsl.py` | Refactors declarative policy engine to delegate parsing/explanation to new modules; re-exports types. |
| `src/agent_kernel/policy_dsl_parser.py` | New module containing policy DSL schema + YAML/TOML loaders + parsing. |
| `src/agent_kernel/policy_dsl_explain.py` | New module for denial explanation traversal logic. |
| `src/agent_kernel/otel.py` | Adds optional OpenTelemetry instrumentation wrapper for Kernel methods. |
| `src/agent_kernel/models.py` | Adds NamespaceMetadata, CapabilityDescriptor, CapabilityManifest, and Frame.is_final. |
| `src/agent_kernel/kernel/_stream.py` | Implements streaming invocation pipeline and trace recording for streams. |
| `src/agent_kernel/kernel/_invoke.py` | Extracts non-dry-run invoke pipeline and shared helpers (mode resolution, tracing). |
| `src/agent_kernel/kernel/_federation.py` | Implements Kernel federation/discovery method bodies. |
| `src/agent_kernel/kernel/_dry_run.py` | Extracts dry-run result construction logic. |
| `src/agent_kernel/kernel/__init__.py` | Converts Kernel into a package and wires new invoke/stream/federation helpers. |
| `src/agent_kernel/kernel.py` | Removes the monolithic kernel module in favor of the new package structure. |
| `src/agent_kernel/firewall/transform.py` | Adds Firewall.apply_stream() to firewall chunks incrementally. |
| `src/agent_kernel/federation.py` | Adds manifest building/importing and trust-policy handling. |
| `src/agent_kernel/federation_discovery.py` | Adds signing/verification and HTTP discovery with rate limiting. |
| `src/agent_kernel/errors.py` | Adds namespace and federation/discovery error types. |
| `src/agent_kernel/drivers/base.py` | Adds the runtime-checkable StreamingDriver protocol. |
| `src/agent_kernel/__init__.py` | Re-exports new federation/discovery/OTel/streaming symbols and errors. |
| `pyproject.toml` | Adds mypy ignore override for opentelemetry.*. |
| `docs/integrations.md` | Documents OpenTelemetry integration usage and emitted telemetry. |
| `docs/federation.md` | Adds federation docs; now includes part 2 (discovery) content. |
| `docs/context_firewall.md` | Documents streaming behavior and OTel observability hooks. |
| `docs/capabilities.md` | Documents namespaces, deferred loading, and BM25 search behavior. |
| `CHANGELOG.md` | Adds [Unreleased] entries describing the new features and refactors. |

            NamespaceNotFound: If no declared namespace or registered capability
                lives under *prefix*.
        """
        self._maybe_load_namespace(prefix)
Comment on lines 287 to 293
            scored.append((score, cap))

        scored.sort(key=lambda x: (-x[0], x[1].capability_id))

        if offset:
            scored = scored[offset:]
        return [
Comment on lines +370 to +380
    def _maybe_load_namespace(self, prefix: str) -> None:
        """Invoke the deferred loader for *prefix* if it has not run yet."""
        meta = self._namespaces.get(prefix)
        if meta is None or meta.loaded or meta.loader is None:
            return
        loader = meta.loader
        # Mark as loaded *before* calling so a recursive load doesn't re-enter.
        meta.loaded = True
        for cap in loader():
            self.register(cap)
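
For context, the load-at-most-once behaviour of this excerpt can be exercised with a small stand-in registry. `RegistrySketch` and `NamespaceSketch` are hypothetical and store plain ID strings rather than capability objects; only `_maybe_load_namespace` follows the excerpt.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable
from dataclasses import dataclass


@dataclass
class NamespaceSketch:
    loader: Callable[[], Iterable[str]] | None = None
    loaded: bool = False


class RegistrySketch:
    def __init__(self) -> None:
        self._namespaces: dict[str, NamespaceSketch] = {}
        self._caps: set[str] = set()

    def register(self, capability_id: str) -> None:
        self._caps.add(capability_id)

    def register_namespace(self, prefix: str, loader: Callable[[], Iterable[str]]) -> None:
        self._namespaces[prefix] = NamespaceSketch(loader=loader)

    def _maybe_load_namespace(self, prefix: str) -> None:
        meta = self._namespaces.get(prefix)
        if meta is None or meta.loaded or meta.loader is None:
            return
        # Flag first, then load: a loader that touches the registry again
        # will not trigger a second load of the same namespace.
        meta.loaded = True
        for cap in meta.loader():
            self.register(cap)

    def list_namespace(self, prefix: str) -> list[str]:
        self._maybe_load_namespace(prefix)
        return sorted(c for c in self._caps if c.startswith(prefix + "."))
```

Because the flag is set before the loader runs, a loader that raises in this sketch is not retried on the next access. Whether that trade-off is acceptable depends on how loaders are expected to fail.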

Comment on lines +48 to +52
    def _hash_payload(payload: bytes) -> bytes:
        """Return the SHA-256 digest of *payload* for signing."""
        return hashlib.sha256(payload).digest()


Comment on lines +106 to +117
        payload_bytes = envelope["payload"].encode("utf-8")
        expected_sig = hmac.new(secret.encode("utf-8"), payload_bytes, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected_sig, envelope["signature"]):
            raise ManifestSignatureError(
                "Manifest signature mismatch — payload may be tampered, or the "
                "verification secret does not match the publisher's signing secret."
            )

        payload_data = json.loads(envelope["payload"])
        return CapabilityManifest.from_dict(payload_data)


Comment on lines +602 to +607
        return cls(
            kernel_id=data["kernel_id"],
            version=data["version"],
            endpoint=data["endpoint"],
            trust_level=data.get("trust_level", "unverified"),
            capabilities=[CapabilityDescriptor.from_dict(c) for c in data["capabilities"]],
Comment thread src/agent_kernel/otel.py
Comment on lines +4 to +5
``invoke``, ``invoke_stream``, ``grant_capability``, ``expand``,
``advertise``, and ``import_remote`` methods with OTel spans + metric
Comment thread docs/federation.md
Comment on lines +4 to +7
> format & local registry) is implemented here. Discovery over a network
> (issue [#51](https://github.com/dgenio/agent-kernel/issues/51)) is **not**
> part of this milestone — `agent-kernel` does not fetch manifests over
> HTTP or sign them on your behalf yet. Bring your own transport for now.
Comment thread tests/test_registry.py
Comment on lines +299 to +305
        start = time.perf_counter()
        results = reg.search("billing thing", max_results=10)
        elapsed = time.perf_counter() - start
        assert len(results) == 10
        # Generous bound: BM25 over 500 docs with ~5 tokens each should be
        # well under a second on any developer machine.
        assert elapsed < 1.0, f"search took {elapsed:.3f}s for 500 capabilities"
Comment thread tests/test_otel.py
Comment on lines +35 to +46
    if not OTEL_AVAILABLE:  # pragma: no cover - skipped without the [otel] extra
        pytest.skip(
            "opentelemetry-api not installed; install the [otel] extra to run.",
            allow_module_level=True,
        )

    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import InMemoryMetricReader
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import SimpleSpanProcessor
    from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter


Development

Successfully merging this pull request may close these issues.

Capability marketplace part 1: manifest format & local registry
Capability namespaces & hierarchical discovery

3 participants