feat(registry,federation): namespaces, BM25 search, capability manifests#82
Open
dgenio wants to merge 2 commits into
Implements the capability-discovery group from triage: #45 (namespaces & hierarchical discovery) + #52 (marketplace part 1: manifest format & local registry). Both extend the same discovery seam and share models.py.

#45: CapabilityRegistry
- Dot-notation `capability_id`s now expose `list_namespaces()` and `list_namespace(prefix)`. Flat IDs continue to work unchanged.
- `register_namespace(prefix, loader=...)` enables deferred registration for large tool ecosystems; the loader runs at most once, on first access (search/list/get).
- `search()` gained an `offset` kwarg, strips a small stop-word set, and now scores with a BM25-flavoured ranker that weights matches on `capability_id` and `tags` above `description`. Determinism is preserved via stable `capability_id` tie-breaking, as required by AGENTS.md.

#52: Capability marketplace (local)
- New `CapabilityDescriptor` and `CapabilityManifest` dataclasses with `to_dict`/`from_dict` for JSON portability. Internal driver IDs and operation names are stripped on serialisation.
- New `agent_kernel.federation` module: `build_manifest()`, `import_manifest()`, `merge_sensitivity()`. Three trust policies are honoured at import time (`most_restrictive` default, `local_only`, `remote_deferred`).
- `Kernel.advertise(endpoint=...)` and `Kernel.import_remote(manifest, driver=..., trust_policy=...)` are thin wrappers; `Kernel` gained a `kernel_id` kwarg used as the publisher identity (defaults to `"agent-kernel"`).
- Imported capabilities flow through the full local pipeline (policy → token → firewall → trace). HMAC tokens remain kernel-scoped: a token from a kernel with a different secret fails the signature check.

New errors in `errors.py`: `NamespaceNotFound`, `FederationError`, `ManifestError`, `TrustPolicyError`.

Tests: +32 net (450 → 482). New `tests/test_federation.py` covers manifest round-trip, descriptor stripping, all three trust policies, end-to-end imported-capability invocation, and kernel-scoped HMAC isolation. `tests/test_registry.py` is extended with namespace ops, deferred-loader semantics, BM25 ranking, pagination, stop-word stripping, and a 500-capability scale sanity check.

Docs: new `docs/federation.md`, namespace section added to `docs/capabilities.md`, CHANGELOG `[Unreleased]` entries.

Out of scope (left for follow-ups, per the triage report):
- Discovery over a network transport (#51, marketplace part 2).
- Decomposition of `policy_dsl.py` / `kernel.py` / `models.py` beyond AGENTS.md's 300-line budget (#68).

Closes #45
Closes #52

https://claude.ai/code/session_015PFMci84T2TBNhqKsV3Hyo
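To make the three trust policies listed under #52 concrete, here is a minimal sketch of how a sensitivity merge could behave. It is not the PR's `merge_sensitivity()`: the `Sensitivity` scale, the function name `merge_sensitivity_sketch`, and the reading of `local_only` (keep the local value) and `remote_deferred` (take the remote value) are all assumptions made for illustration.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity scale; a higher value means more restricted."""

    NONE = 0
    PII = 1
    SECRETS = 2


def merge_sensitivity_sketch(
    local: Sensitivity,
    remote: Sensitivity,
    trust_policy: str = "most_restrictive",
) -> Sensitivity:
    """Combine local and remote sensitivity under an assumed policy meaning."""
    if trust_policy == "most_restrictive":
        # Never relax a classification: keep whichever side is stricter.
        return max(local, remote)
    if trust_policy == "local_only":
        # Assumed meaning: ignore what the remote publisher claims.
        return local
    if trust_policy == "remote_deferred":
        # Assumed meaning: defer to the remote publisher's classification.
        return remote
    raise ValueError(f"unknown trust policy: {trust_policy!r}")


merged = merge_sensitivity_sketch(Sensitivity.NONE, Sensitivity.PII)
print(merged.name)
```

Under these assumptions, defaulting to `most_restrictive` means an imported capability can only become more restricted than the local view, never less.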
…_dsl decomposition

Implements the five remaining triage-report issues in one PR, building on top of PR #81's registry/federation foundation (which closes #45 + #52 and is included transitively here).

Issue #38: OpenTelemetry integration
- New `agent_kernel.otel` module: `instrument_kernel(kernel)` wraps `Kernel.invoke` and `Kernel.grant_capability` with OTel spans (`agent_kernel.invoke`, `agent_kernel.grant`) and metrics (`agent_kernel.invocations`, `agent_kernel.invocation_duration`, `agent_kernel.policy_denials`). Idempotent, instance-scoped.
- No-op (zero runtime cost) when the `[otel]` extra is not installed; `OTEL_AVAILABLE` exposes the runtime status.
- Optional `tracer_provider` / `meter_provider` kwargs let tests bypass the global-singleton lock.

Issue #47: Streaming firewall
- New `Firewall.apply_stream()` async iterator processes chunks one at a time with per-chunk PII redaction.
- New `StreamingDriver` Protocol in `drivers/base.py` (runtime-checkable) adds optional `execute_stream()` to the driver surface.
- New `Kernel.invoke_stream()` yields `Frame` chunks; the last has `is_final=True`. Non-streaming drivers automatically fall back to a single-chunk stream via `execute()`.
- `Frame` gained `is_final: bool` (default `False`); single-shot `Kernel.invoke` returns `Frame(..., is_final=True)`.

Issue #51: Federated discovery (closes #49)
- New `agent_kernel.federation_discovery` module: `discover_peers(peer_urls=..., registry_url=...)` fetches manifests over HTTP, `sign_manifest` / `verify_manifest` handle HMAC-SHA256 signed envelopes, `serve_manifest_payload` is a framework-agnostic helper for manifest-serving endpoints, and `DiscoveryRateLimiter` limits discovery calls (default 10/60s).
- New `Kernel.discover_peers()` method.
- Asymmetric signing strictness: passing a `secret` *requires* signed manifests; omitting it *rejects* signed envelopes, so there is no silent downgrade attack surface.
- New errors `ManifestSignatureError` and `DiscoveryError`.
- HMAC tokens remain kernel-scoped (regression test in `test_federation_discovery.py::test_kernel_scoped_hmac_isolation_for_imported_capability`).

Issue #68: Tech debt
- D. Decompose `policy_dsl.py` (was 661 → 297 lines): parsing moved to `policy_dsl_parser.py` (277 lines), denial-explanation traversal to `policy_dsl_explain.py` (214). Public API surface unchanged: `PolicyMatch`, `PolicyRule`, `DeclarativePolicyEngine` are re-exported. `RateLimiter` extracted from `policy.py` into `rate_limit.py`; `policy.py` re-exports it backward-compatibly.
- E. Added dry-run regression tests for `HTTPDriver` and `MCPDriver` in `test_kernel.py`, which pin the driver-agnostic short-circuit.

Mode B refactor
- `kernel.py` (was 581 → 705 after PR #81) split into the `agent_kernel.kernel` sub-package to honour AGENTS.md's ≤300-line budget. The `Kernel` class lives in `kernel/__init__.py`; heavy method bodies delegate to `_invoke`, `_dry_run`, `_federation`, `_stream`. Public imports (`from agent_kernel import Kernel`, `from agent_kernel.kernel import Kernel`) are unchanged.

Tests: 482 → 510 (+28). Coverage 96%. `make ci` clean: ruff format unchanged, ruff check passes, mypy strict on 40 source files, examples all run. New test files: `test_otel.py` (4), `test_firewall_stream.py` (5), `test_federation_discovery.py` (17); extended `test_kernel.py` with two HTTP/MCP dry-run tests.

Closes #38, #47, #49, #51, #68
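The streaming behaviour described under issue #47 (per-chunk redaction, last chunk flagged as final) can be sketched as a small async generator. This is an illustration only: `Chunk`, `redact_stream`, `fake_driver`, and the email-only regex are hypothetical stand-ins, not the PR's `Frame`, `Firewall.apply_stream()`, or driver API.

```python
import asyncio
import re
from dataclasses import dataclass
from typing import AsyncIterator, List

# Assumed, deliberately simple PII pattern (email addresses only).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed frame."""

    text: str
    is_final: bool = False


async def redact_stream(chunks: AsyncIterator[str]) -> AsyncIterator[Chunk]:
    """Redact each chunk on its own and mark only the last one as final."""
    previous = None
    async for raw in chunks:
        if previous is not None:
            # A later chunk exists, so the held-back one is not the last.
            yield Chunk(previous)
        previous = EMAIL.sub("[REDACTED]", raw)
    if previous is not None:
        yield Chunk(previous, is_final=True)


async def fake_driver() -> AsyncIterator[str]:
    """Hypothetical streaming driver emitting three text chunks."""
    for part in ["contact: ", "ada@example.com", " done"]:
        yield part


async def collect() -> List[Chunk]:
    return [chunk async for chunk in redact_stream(fake_driver())]


frames = asyncio.run(collect())
for frame in frames:
    print(repr(frame.text), frame.is_final)
```

Holding back one chunk is what lets the generator set `is_final` without knowing the stream length in advance. One limitation of any purely per-chunk approach is that a value split across two chunks will not match the pattern in either of them.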
Pull request overview
This PR significantly expands agent-kernel’s discovery and federation surface area by introducing namespaced capability discovery (with deferred namespace loaders and BM25-style ranking), capability manifests + local import, HTTP-based federated discovery with signed manifests, plus new streaming invocation/firewall plumbing and optional OpenTelemetry instrumentation. It also performs a larger internal refactor to keep core modules under the AGENTS.md 300-line guideline by splitting kernel.py, policy_dsl.py, and rate limiting into smaller modules.
Changes:
- Add `CapabilityRegistry` namespace operations, deferred namespace loading, and BM25-flavoured ranked search with pagination.
- Introduce capability federation: JSON-serializable manifests/descriptors, local manifest import, and federated discovery with optional HMAC signing + rate limiting.
- Add streaming invocation (`Kernel.invoke_stream`, `StreamingDriver`, per-chunk firewalling) and optional OpenTelemetry instrumentation; refactor kernel/policy modules into submodules to meet module-size constraints.
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| tests/test_registry.py | Adds tests for namespaces, deferred loaders, BM25 search, pagination, and a performance sanity check. |
| tests/test_otel.py | Adds tests for OpenTelemetry instrumentation behavior and idempotency. |
| tests/test_kernel.py | Adds regression tests ensuring dry-run short-circuits before driver dispatch for HTTP/MCP drivers. |
| tests/test_firewall_stream.py | Adds tests for per-chunk firewall redaction and Kernel.invoke_stream semantics. |
| tests/test_federation.py | Adds manifest round-trip, import, trust policy, and kernel-scoped token isolation tests. |
| tests/test_federation_discovery.py | Adds tests for signed manifests, discovery via peer/registry URLs, and rate limiting. |
| src/agent_kernel/registry.py | Implements namespaces, deferred namespace loaders, BM25-style scoring, and offset pagination. |
| src/agent_kernel/rate_limit.py | Extracts rate limiting into a dedicated module. |
| src/agent_kernel/policy.py | Uses extracted rate limiter and preserves backwards-compatible aliases. |
| src/agent_kernel/policy_dsl.py | Refactors declarative policy engine to delegate parsing/explanation to new modules; re-exports types. |
| src/agent_kernel/policy_dsl_parser.py | New module containing policy DSL schema + YAML/TOML loaders + parsing. |
| src/agent_kernel/policy_dsl_explain.py | New module for denial explanation traversal logic. |
| src/agent_kernel/otel.py | Adds optional OpenTelemetry instrumentation wrapper for Kernel methods. |
| src/agent_kernel/models.py | Adds NamespaceMetadata, CapabilityDescriptor, CapabilityManifest, and Frame.is_final. |
| src/agent_kernel/kernel/_stream.py | Implements streaming invocation pipeline and trace recording for streams. |
| src/agent_kernel/kernel/_invoke.py | Extracts non-dry-run invoke pipeline and shared helpers (mode resolution, tracing). |
| src/agent_kernel/kernel/_federation.py | Implements Kernel federation/discovery method bodies. |
| src/agent_kernel/kernel/_dry_run.py | Extracts dry-run result construction logic. |
| src/agent_kernel/kernel/__init__.py | Converts Kernel into a package and wires new invoke/stream/federation helpers. |
| src/agent_kernel/kernel.py | Removes the monolithic kernel module in favor of the new package structure. |
| src/agent_kernel/firewall/transform.py | Adds Firewall.apply_stream() to firewall chunks incrementally. |
| src/agent_kernel/federation.py | Adds manifest building/importing and trust-policy handling. |
| src/agent_kernel/federation_discovery.py | Adds signing/verification and HTTP discovery with rate limiting. |
| src/agent_kernel/errors.py | Adds namespace and federation/discovery error types. |
| src/agent_kernel/drivers/base.py | Adds the runtime-checkable StreamingDriver protocol. |
| src/agent_kernel/__init__.py | Re-exports new federation/discovery/OTel/streaming symbols and errors. |
| pyproject.toml | Adds mypy ignore override for opentelemetry.*. |
| docs/integrations.md | Documents OpenTelemetry integration usage and emitted telemetry. |
| docs/federation.md | Adds federation docs; now includes part 2 (discovery) content. |
| docs/context_firewall.md | Documents streaming behavior and OTel observability hooks. |
| docs/capabilities.md | Documents namespaces, deferred loading, and BM25 search behavior. |
| CHANGELOG.md | Adds [Unreleased] entries describing the new features and refactors. |

            NamespaceNotFound: If no declared namespace or registered capability
                lives under *prefix*.
        """
        self._maybe_load_namespace(prefix)

Comment on lines 287 to 293

            scored.append((score, cap))

        scored.sort(key=lambda x: (-x[0], x[1].capability_id))

        if offset:
            scored = scored[offset:]
        return [

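The ranking step in this excerpt (sort by descending score, break ties on `capability_id`, then apply `offset`) can be tried in isolation with the sketch below. It assumes a simple field-weighted BM25 variant; the stop-word set, field weights, tokenizer, and `k1`/`b` values are illustrative guesses rather than the values used in `registry.py`, and capabilities are plain dicts instead of the project's models.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "an", "the", "of", "for", "to"}  # assumed stop-word set
FIELD_WEIGHTS = {"capability_id": 3.0, "tags": 2.0, "description": 1.0}  # assumed


def tokenize(text):
    """Lowercase, split dotted/underscored IDs into words, drop stop words."""
    words = text.lower().replace(".", " ").replace("_", " ").split()
    return [w for w in words if w not in STOP_WORDS]


def search(caps, query, max_results=10, offset=0, k1=1.2, b=0.75):
    """Return capability IDs ranked by a field-weighted BM25-style score."""
    docs = []
    for cap in caps:
        tf = Counter()
        for field, weight in FIELD_WEIGHTS.items():
            value = cap[field]
            text = value if isinstance(value, str) else " ".join(value)
            for tok in tokenize(text):
                tf[tok] += weight  # matches in heavier fields count for more
        docs.append((cap, tf))
    n = len(docs)
    avg_len = (sum(sum(tf.values()) for _, tf in docs) / max(n, 1)) or 1.0
    scored = []
    for cap, tf in docs:
        length = sum(tf.values())
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            df = sum(1 for _, other in docs if term in other)
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * length / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scored.append((score, cap))
    # Deterministic order: best score first, then capability_id ascending.
    scored.sort(key=lambda x: (-x[0], x[1]["capability_id"]))
    return [cap["capability_id"] for _, cap in scored[offset:offset + max_results]]


caps = [
    {"capability_id": "billing.invoices.list", "tags": ["finance"],
     "description": "List invoices"},
    {"capability_id": "crm.contacts.get", "tags": ["sales"],
     "description": "Fetch a contact used for billing"},
    {"capability_id": "hr.payroll.run", "tags": ["finance"],
     "description": "Run payroll"},
]
print(search(caps, "the billing"))
print(search(caps, "the billing", offset=1))
```

Including `capability_id` in the sort key means two capabilities with identical scores always come back in the same order, which is what keeps paginated results stable across calls.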
Comment on lines +370 to +380

    def _maybe_load_namespace(self, prefix: str) -> None:
        """Invoke the deferred loader for *prefix* if it has not run yet."""
        meta = self._namespaces.get(prefix)
        if meta is None or meta.loaded or meta.loader is None:
            return
        loader = meta.loader
        # Mark as loaded *before* calling so a recursive load doesn't re-enter.
        meta.loaded = True
        for cap in loader():
            self.register(cap)

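A self-contained sketch of the same run-at-most-once pattern may help when reasoning about this hunk. `LazyRegistry` and `NamespaceMeta` below are simplified, hypothetical stand-ins (capabilities are plain ID strings), not the project's `CapabilityRegistry` or `NamespaceMetadata`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass
class NamespaceMeta:
    """Hypothetical per-namespace bookkeeping."""

    loader: Optional[Callable[[], Iterable[str]]] = None
    loaded: bool = False


class LazyRegistry:
    """Toy registry that defers namespace population until first access."""

    def __init__(self) -> None:
        self._namespaces: Dict[str, NamespaceMeta] = {}
        self._caps: Set[str] = set()

    def register_namespace(self, prefix: str, loader=None) -> None:
        self._namespaces[prefix] = NamespaceMeta(loader=loader)

    def _maybe_load(self, prefix: str) -> None:
        meta = self._namespaces.get(prefix)
        if meta is None or meta.loaded or meta.loader is None:
            return
        # Flag first, so a loader that re-enters the registry cannot recurse.
        meta.loaded = True
        for cap_id in meta.loader():
            self._caps.add(cap_id)

    def list_namespace(self, prefix: str) -> List[str]:
        self._maybe_load(prefix)
        return sorted(c for c in self._caps if c.startswith(prefix + "."))


calls = []


def billing_loader():
    calls.append("billing")  # record each invocation of the loader
    return ["billing.invoices.list", "billing.invoices.get"]


reg = LazyRegistry()
reg.register_namespace("billing", loader=billing_loader)
first = reg.list_namespace("billing")
second = reg.list_namespace("billing")
print(first, len(calls))
```

In this sketch, setting the flag before the loader runs also means a loader that raises is never retried, so the namespace stays marked as loaded while empty or partially populated. Whether that trade-off is acceptable depends on how loaders are expected to fail.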
Comment on lines +48 to +52

    def _hash_payload(payload: bytes) -> bytes:
        """Return the SHA-256 digest of *payload* for signing."""
        return hashlib.sha256(payload).digest()

Comment on lines +106 to +117

        payload_bytes = envelope["payload"].encode("utf-8")
        expected_sig = hmac.new(secret.encode("utf-8"), payload_bytes, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected_sig, envelope["signature"]):
            raise ManifestSignatureError(
                "Manifest signature mismatch — payload may be tampered, or the "
                "verification secret does not match the publisher's signing secret."
            )

        payload_data = json.loads(envelope["payload"])
        return CapabilityManifest.from_dict(payload_data)

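For context on this hunk, the following sketch shows an HMAC-SHA256 signed envelope together with the asymmetric strictness described in the commit message (a supplied secret requires a signature; no secret rejects signed envelopes). The `sign` and `verify` helpers, the envelope keys, and the use of `ValueError` are assumptions for illustration; they are not the PR's `sign_manifest` / `verify_manifest` or its error types.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign(payload: dict, secret: str) -> dict:
    """Wrap *payload* in an envelope carrying an HMAC-SHA256 hex signature."""
    # Canonical JSON so signer and verifier hash exactly the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(secret.encode("utf-8"), body.encode("utf-8"), hashlib.sha256)
    return {"payload": body, "signature": sig.hexdigest()}


def verify(envelope: dict, secret: Optional[str]) -> dict:
    """Return the payload dict, enforcing the asymmetric signing rules."""
    signed = "signature" in envelope
    if secret is None:
        if signed:
            # Without a secret the signature cannot be checked, so refuse.
            raise ValueError("signed envelope received but no secret configured")
        return envelope
    if not signed:
        raise ValueError("unsigned manifest rejected: a secret was supplied")
    expected = hmac.new(
        secret.encode("utf-8"),
        envelope["payload"].encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Compare as bytes in constant time to avoid timing side channels.
    supplied = str(envelope["signature"]).encode("utf-8")
    if not hmac.compare_digest(expected.encode("utf-8"), supplied):
        raise ValueError("signature mismatch")
    return json.loads(envelope["payload"])


manifest = {"kernel_id": "agent-kernel", "capabilities": []}
envelope = sign(manifest, "s3cret")
print(verify(envelope, "s3cret") == manifest)
```

Comparing encoded bytes rather than strings keeps `hmac.compare_digest` from raising `TypeError` if a peer sends a signature containing non-ASCII characters.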
Comment on lines +602 to +607

            return cls(
                kernel_id=data["kernel_id"],
                version=data["version"],
                endpoint=data["endpoint"],
                trust_level=data.get("trust_level", "unverified"),
                capabilities=[CapabilityDescriptor.from_dict(c) for c in data["capabilities"]],

Comment on lines +4 to +5

    ``invoke``, ``invoke_stream``, ``grant_capability``, ``expand``,
    ``advertise``, and ``import_remote`` methods with OTel spans + metric

Comment on lines +4 to +7

    > format & local registry) is implemented here. Discovery over a network
    > (issue [#51](https://github.com/dgenio/agent-kernel/issues/51)) is **not**
    > part of this milestone — `agent-kernel` does not fetch manifests over
    > HTTP or sign them on your behalf yet. Bring your own transport for now.

Comment on lines +299 to +305

        start = time.perf_counter()
        results = reg.search("billing thing", max_results=10)
        elapsed = time.perf_counter() - start
        assert len(results) == 10
        # Generous bound: BM25 over 500 docs with ~5 tokens each should be
        # well under a second on any developer machine.
        assert elapsed < 1.0, f"search took {elapsed:.3f}s for 500 capabilities"

Comment on lines +35 to +46

    if not OTEL_AVAILABLE:  # pragma: no cover - skipped without the [otel] extra
        pytest.skip(
            "opentelemetry-api not installed; install the [otel] extra to run.",
            allow_module_level=True,
        )

    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import InMemoryMetricReader
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import SimpleSpanProcessor
    from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter