Python: Add Hyperlight CodeAct package and docs #5185
eavanvalkenburg wants to merge 39 commits into microsoft:main
Pull request overview
Adds an optional Python Hyperlight-backed CodeAct implementation plus cross-SDK design documentation, and wires the new package into the Python workspace and CI.
Changes:
- Introduces the new `agent-framework-hyperlight` alpha package (provider + `execute_code` tool), including samples and tests.
- Updates `agent-framework-core` to let context providers inspect/override per-run runtime tools via `SessionContext.options["tools"]`.
- Adds ADR/design docs for CodeAct and updates Python CI workflows to include Hyperlight integration coverage.
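As a rough illustration of the second bullet, the sketch below shows how a provider might inspect and remove a per-run tool through `options["tools"]`. Only the `SessionContext.options["tools"]` key comes from this PR; the `SessionContext` stand-in, the `before_run` hook name, `DropToolProvider`, and `resolve_tools` are hypothetical and do not reflect the actual `agent-framework-core` API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Protocol


@dataclass
class SessionContext:
    """Hypothetical stand-in; only the `options` mapping matters for this sketch."""

    options: dict[str, Any] = field(default_factory=dict)


class Provider(Protocol):
    """Assumed provider shape: one hook that runs before each agent run."""

    def before_run(self, context: SessionContext) -> None: ...


class DropToolProvider:
    """Hypothetical provider that removes one runtime tool by function name."""

    def __init__(self, blocked_name: str) -> None:
        self.blocked_name = blocked_name

    def before_run(self, context: SessionContext) -> None:
        tools = context.options.get("tools", [])
        # Replace the list so the agent resolves tools from the mutated options.
        context.options["tools"] = [t for t in tools if t.__name__ != self.blocked_name]


def resolve_tools(
    runtime_tools: list[Callable[..., Any]], providers: list[Provider]
) -> list[Callable[..., Any]]:
    """Pass runtime tools through every provider, then read back the result."""
    context = SessionContext(options={"tools": list(runtime_tools)})
    for provider in providers:
        provider.before_run(context)
    return context.options["tools"]


def send_email(to: str) -> str:
    return f"sent to {to}"


def search(query: str) -> str:
    return f"results for {query}"


resolved = resolve_tools([send_email, search], [DropToolProvider("send_email")])
print([tool.__name__ for tool in resolved])  # ['search']
```

The point of the design is that the agent reads the tool list back from the context after providers have run, so a provider can veto or replace tools without the caller changing how it invokes the agent.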
Reviewed changes
Copilot reviewed 27 out of 28 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| python/uv.lock | Adds Hyperlight package + Hyperlight sandbox deps; updates a few dependency markers. |
| python/pyproject.toml | Registers agent-framework-hyperlight in the Python workspace. |
| python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py | Adds unit coverage + guarded real-sandbox integration test. |
| python/packages/hyperlight/samples/README.md | Documents how to run the new Hyperlight samples. |
| python/packages/hyperlight/samples/codeact_tool.py | Standalone HyperlightExecuteCodeTool sample. |
| python/packages/hyperlight/samples/codeact_context_provider.py | Provider-owned CodeAct sample using HyperlightCodeActProvider. |
| python/packages/hyperlight/README.md | Package-level README for installation and public API. |
| python/packages/hyperlight/pyproject.toml | New package metadata, deps, and tooling config. |
| python/packages/hyperlight/LICENSE | Adds MIT license for the new package. |
| python/packages/hyperlight/agent_framework_hyperlight/_types.py | Adds public types (FileMount, FilesystemMode, NetworkMode). |
| python/packages/hyperlight/agent_framework_hyperlight/_provider.py | Implements HyperlightCodeActProvider context provider. |
| python/packages/hyperlight/agent_framework_hyperlight/_instructions.py | Builds dynamic CodeAct instructions and tool descriptions. |
| python/packages/hyperlight/agent_framework_hyperlight/_execute_code_tool.py | Implements sandbox execution, caching, CRUD registries for tools/files/network. |
| python/packages/hyperlight/agent_framework_hyperlight/__init__.py | Exposes public API + version metadata. |
| python/packages/core/tests/core/test_agents.py | Adds tests validating providers can inspect/remove runtime tools. |
| python/packages/core/agent_framework/_tools.py | Introduces ApprovalMode type alias and updates signatures. |
| python/packages/core/agent_framework/_sessions.py | Updates docs to reflect provider mutability of options["tools"]. |
| python/packages/core/agent_framework/_agents.py | Passes runtime tools via SessionContext.options and resolves tools from provider-mutated options. |
| python/PACKAGE_STATUS.md | Adds agent-framework-hyperlight as alpha. |
| python/.cspell.json | Adds codeact and hyperlight to dictionary. |
| docs/features/code_act/python-implementation.md | Adds Python-specific CodeAct design notes and API contract. |
| docs/features/code_act/dotnet-implementation.md | Adds placeholder for .NET CodeAct implementation notes. |
| docs/decisions/0024-codeact-integration.md | Adds ADR covering cross-SDK CodeAct integration approach and approval model. |
| .github/workflows/python-merge-tests.yml | Includes Hyperlight tests in “misc integration” selection. |
| .github/workflows/python-integration-tests.yml | Includes Hyperlight tests in “misc integration” job. |
moonbox3 left a comment
This is a nice and complete ADR - well done. A lot here to unpack so doing a first pass with some questions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Enable the sandbox filesystem by providing a `workspace_root` so `/output` is mounted. Remove the `os.path.exists` assertion (unsupported in the WASM guest) and fix the Content data assertion to use `.uri`. Skip the network integration test on Windows, where the WASM sandbox lacks the `encodings.idna` codec.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Context and Problem Statement

We need an architecture design that supports CodeAct in both Python and .NET. This is a necessary capability for the current generation of long-running agents, which need to plan, iterate, transform tool outputs, and execute bounded code inside a controlled runtime instead of pushing every intermediate step back through the model. The design should preserve the same behavioral contract across SDKs, but it does not need to use the same internal extension point in each runtime. We also want to standardize on Hyperlight as the initial backend, using the existing Python package and an anticipated .NET binding package once it is available.
> instead of pushing every intermediate step back through the model

Could we elaborate on what this means?
Also, an introduction to CodeAct would greatly help readers of this doc.
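One way to read the quoted phrase, offered as a sketch rather than as the ADR's definition: in a classic tool-calling loop, each tool result returns to the model before the next call is chosen, whereas in CodeAct the model writes one script that chains the calls and returns only the final value. Everything below is hypothetical (`get_orders`, `get_price`, `run_script`), and plain `exec` stands in for the sandbox; it provides no isolation.

```python
from typing import Any, Callable

calls: list[str] = []  # records every tool invocation made by the script


def get_orders(customer_id: str) -> list[dict[str, Any]]:
    """Hypothetical tool returning canned order lines."""
    calls.append("get_orders")
    return [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]


def get_price(sku: str) -> int:
    """Hypothetical tool returning a canned unit price."""
    calls.append("get_price")
    return {"A": 5, "B": 10}[sku]


# A script the model might author in a single turn. The intermediate
# values (orders, per-item prices) never go back through the model.
SCRIPT = """
orders = get_orders("c-1")
result = sum(get_price(o["sku"]) * o["qty"] for o in orders)
"""


def run_script(script: str, tools: dict[str, Callable[..., Any]]) -> Any:
    """Run the script with the tools in scope. NOT a sandbox: exec is a stand-in."""
    namespace: dict[str, Any] = dict(tools)
    exec(script, namespace)
    return namespace["result"]


total = run_script(SCRIPT, {"get_orders": get_orders, "get_price": get_price})
print(total, len(calls))  # 20 3
```

Here three tool calls happen inside one code execution. With direct tool calling, each of those three results would typically be sent back to the model before the next step.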
- Good, because a provider-owned CodeAct tool registry avoids mutating or inferring the agent's direct tool surface and can work consistently in both SDKs.
- Good, because the same conceptual design can remain open to `HyperlightCodeActProvider`, a future `MontyCodeActProvider`, and other backend-specific providers over time.
- Good, because `execute_code` can evolve into multiple backend-specific runtime modes rather than being hard-wired to one Python-plus-tools mode.
- Bad, because it is a bolt-on, which might make it less runtime efficient.
> make it less runtime efficient

Why would a bolt-on make it less efficient?
## What is the problem being solved?

- Today, the easiest way to prototype CodeAct is to infer or reshape the agent's direct tool surface, which is fragile and hard to reason about.
Should this be part of the ADR instead?
- snapshotting the current CodeAct-managed tool registry and capability settings for the run,
- computing the effective approval requirement for `execute_code` from the provider default and the snapshotted tool registry,
- adding a short CodeAct guidance block,
- adding `execute_code` to the run through `SessionContext.extend_tools(...)`,
- and wiring any backend-specific execution state needed for the run.
Are these required for each run? Could they be done once at construction time, injecting the available tools into the agent's tool list?
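To make the question concrete, here is a minimal sketch of the per-run steps from the excerpt, assuming the provider's registry can change between runs (the PR describes CRUD registries for tools). All class and method names are hypothetical stand-ins except `extend_tools`, whose real signature is not shown in this thread. The "strictest setting wins" approval rule and the two approval values are also assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Literal

Approval = Literal["always_require", "never_require"]  # assumed values


@dataclass(frozen=True)
class ManagedTool:
    """Hypothetical sandbox tool owned by the provider."""

    name: str
    func: Callable[..., object]
    approval: Approval = "never_require"


@dataclass(frozen=True)
class ExecuteCodeTool:
    """Hypothetical per-run `execute_code` tool bound to a registry snapshot."""

    tools: dict[str, ManagedTool]
    approval: Approval


@dataclass
class SessionContext:
    """Hypothetical stand-in for the per-run context."""

    instructions: list[str] = field(default_factory=list)
    tools: list[object] = field(default_factory=list)

    def extend_tools(self, tools: list[object]) -> None:
        self.tools.extend(tools)


class CodeActProvider:
    def __init__(self, default_approval: Approval = "never_require") -> None:
        self.default_approval = default_approval
        self._registry: dict[str, ManagedTool] = {}

    def add_tool(self, tool: ManagedTool) -> None:
        self._registry[tool.name] = tool

    def before_run(self, context: SessionContext) -> None:
        # 1. Snapshot the registry so later edits do not affect this run.
        snapshot = dict(self._registry)
        # 2. Effective approval: assumed rule is that the strictest setting wins.
        strict = self.default_approval == "always_require" or any(
            tool.approval == "always_require" for tool in snapshot.values()
        )
        approval: Approval = "always_require" if strict else "never_require"
        # 3. Add a short guidance block naming the tools visible in the sandbox.
        names = ", ".join(sorted(snapshot)) or "none"
        context.instructions.append(f"Call execute_code to run Python; sandbox tools: {names}.")
        # 4. Add execute_code to this run only.
        context.extend_tools([ExecuteCodeTool(tools=snapshot, approval=approval)])


provider = CodeActProvider()
provider.add_tool(ManagedTool("fetch", lambda url: url))
first_run = SessionContext()
provider.before_run(first_run)

# The registry changes between runs; the first run's snapshot is unaffected.
provider.add_tool(ManagedTool("delete_file", lambda path: None, approval="always_require"))
second_run = SessionContext()
provider.before_run(second_run)
```

Under these assumptions, a construction-time injection would freeze the tool list and approval requirement at the moment the agent is built, while the per-run hook lets the second run see `delete_file` and the stricter approval it implies.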
        client=client,
        name="assistant",
        tools=[send_email],  # direct-only tool
        context_providers=[codeact],
Looking at this, a question users may have is: what is the difference between tools and contexts?
Just an idea: is it possible to do the following:

    agent = Agent(
        client=client,
        name="assistant",
        tools=[send_email, *codeact.get_tools()],
    )

where the returned tools hold a reference to the provider so that they can access the file mounts, allowed domains, etc.?
    agent = Agent(
        client=client,
        name="interpreter",
        context_providers=[code_interpreter],
Adding to the previous comment, code_interpreter is reusing an existing concept whose usage is very different.
- `codeact_context_provider.py` shows the provider-owned CodeAct model where the agent only sees `execute_code` and sandbox tools are owned by `HyperlightCodeActProvider`.
- `codeact_tool.py` shows the standalone `HyperlightExecuteCodeTool` surface where `execute_code` is added directly to the agent tool list.
A short paragraph on when to use which would be helpful for customers.
Motivation and Context
Add a concrete, optional CodeAct implementation for Python and capture the cross-SDK design for CodeAct with Hyperlight. This provides a reusable path for long-running agents to execute sandboxed code with provider-owned tools, file mounts, and network allow-lists without baking CodeAct into core.
Description
`agent-framework-hyperlight` package with `HyperlightCodeActProvider` and `HyperlightExecuteCodeTool`
Closes: #5187
Contribution Checklist