Skip to content

feat: add Windows support#44

Open
agent-of-mkmeral wants to merge 2 commits into
strands-agents:mainfrom
agent-of-mkmeral:feat/windows-support
Open

feat: add Windows support#44
agent-of-mkmeral wants to merge 2 commits into
strands-agents:mainfrom
agent-of-mkmeral:feat/windows-support

Conversation

@agent-of-mkmeral

Copy link
Copy Markdown
Contributor

Summary

Makes Strands Shell build, test, and ship on Windows. Closes the gap identified in the linked research: the crate didn't compile on Windows out of the box.

The only non-portable code in the library was a single Linux-only TOCTOU check in the host-file read path. Everything else was already cleanly #[cfg]-gated.

Root cause

src/vfs_kernel.rs::open_host() (read branch) unconditionally used std::os::unix::io::AsRawFd + /proc/self/fd/<fd> to re-verify, after open(), that the opened fd still pointed inside the bind mount. This:

  • doesn't exist on Windows (std::os::unix is absent → E0433/E0599), and
  • never actually functioned on macOS either (no /proc), it just happened to compile there because macOS has the std::os::unix module.

Changes

Core fix — src/vfs_kernel.rs

  • Gate the /proc/self/fd defense-in-depth TOCTOU verification behind #[cfg(target_os = "linux")]. This is purely a Linux procfs interface, so the check only ever ran on Linux. The primary security boundary is unchanged: the canonical-path containment check performed before open_host() still runs on every platform. This narrows the platform of an extra defense layer rather than weakening Linux's guarantee.
  • canon_base is now only consumed by that Linux-only block, so it's explicitly marked unused on other targets (no warnings).

Tests — tests/shell_integration.rs

  • Gate bind_direct_symlink_escape_blocked and bind_direct_dangling_symlink_blocked behind #[cfg(unix)]. They call std::os::unix::fs::symlink, which doesn't exist on Windows (and symlink creation there needs elevated privileges). They continue to run on Linux/macOS.

CI — .github/workflows/ci.yml

  • Add windows-latest to the rust, python, and node matrices.
  • Make the python job cross-platform: build the venv and add its bin/Scripts dir to GITHUB_PATH instead of hardcoding .venv/bin.
  • On Windows, set npm script-shell to bash so package.json's $(npm run --silent host-triple) command substitution works (npm defaults to cmd.exe there).

CD — .github/workflows/release.yml, package.json

  • Add x86_64-pc-windows-msvc to: the Python wheel matrix, the Node addon matrix, napi.targets, and the publish-platform matrix (new strands-agents-shell-win32-x64-msvc package).
  • Update the inspect coverage check (win_amd64 wheel) and bump the npm pack count guard 5 → 6.

Validation

  • ✅ Cross-compiled lib + bins + all tests to x86_64-pc-windows-gnu — clean, no warnings.
  • ✅ Cross-checked the python feature for Windows via PYO3_CROSS_PYTHON_VERSION.
  • ✅ Full suite green on Linux: 1296 passed, including both #[cfg(unix)]-gated symlink escape tests.
  • ⏳ The new windows-latest CI legs (native MSVC build of Rust/Python/Node) will be the authoritative cross-platform proof once this PR's CI runs.

Note: the maintainer asked me to validate on a real Windows host via AWS EC2, but this CI role has no ec2:* permissions. GitHub's free windows-latest runners are the equivalent (and cheaper) native validation path, which is why I wired them into CI rather than provisioning an instance. Happy to do a manual EC2 run if you'd prefer — just need EC2 access.


cc @mkmeral — opened as a draft for your review.

Strands Shell did not compile on Windows because src/vfs_kernel.rs used an
unconditional Linux-only TOCTOU check (std::os::unix::io::AsRawFd plus
/proc/self/fd) in the host-file read path, and two integration tests created
symlinks via std::os::unix::fs::symlink.

Core fix (src/vfs_kernel.rs):
- Gate the /proc/self/fd defense-in-depth TOCTOU verification behind
  #[cfg(target_os = "linux")]. /proc/self/fd is a Linux-only procfs
  interface (macOS has no /proc; Windows has no procfs), so the check only
  ever functioned on Linux. The canonical-path check performed before
  open_host() remains the primary security boundary on all platforms, so
  this narrows the platform of an extra layer rather than weakening the
  guarantee on Linux.
- Silence the now-conditionally-unused canon_base arg on non-Linux targets.

Tests (tests/shell_integration.rs):
- Gate bind_direct_symlink_escape_blocked and
  bind_direct_dangling_symlink_blocked behind #[cfg(unix)] since
  std::os::unix::fs::symlink is unavailable on Windows (symlink creation
  there also requires elevated privileges).

CI (.github/workflows/ci.yml):
- Add windows-latest to the rust, python, and node test matrices.
- Make the python job cross-platform: create the venv and prepend its
  bin/Scripts dir to GITHUB_PATH instead of hardcoding .venv/bin.
- On Windows, set npm script-shell to bash so package.json's
  $(npm run --silent host-triple) command substitution works.

CD (.github/workflows/release.yml, package.json):
- Build + publish Windows artifacts: x86_64-pc-windows-msvc Python wheel,
  node addon, and the strands-agents-shell-win32-x64-msvc npm package.
- Update the inspect coverage check (win_amd64 wheel) and the npm pack
  count guard (5 -> 6 packages).

Validated by cross-compiling lib, bins and all tests to
x86_64-pc-windows-gnu (clean), cross-checking the python feature with
PYO3_CROSS_PYTHON_VERSION, and running the full 1296-test suite on Linux.
@agent-of-mkmeral agent-of-mkmeral marked this pull request as ready for review June 24, 2026 22:27
@agent-of-mkmeral

Copy link
Copy Markdown
Contributor Author

Review Loop Summary

This PR passed the pre-PR fresh-context review gate (task-pr-gate Mode A). A separate GitHub Actions process with fresh model state cloned the branch cold, diffed against origin/main, read every changed file, and re-derived the reasoning independently rather than trusting the implementor's claims.

Iteration Reviewer Result
1 fresh-context reviewer (run 28132814098) ✅ Clean — 0 🔴, 0 🟡, 2 🟢 (filed inline) → converged, marked ready

What was independently verified

1. Core fix (src/vfs_kernel.rs) — confirmed correct.
The primary security boundary is resolve_host(), which runs on every platform before open_host(): it canonicalize()s the resolved host path (fully resolving symlinks and ..) and enforces canon_host.starts_with(canon_base), returning None (access denied) on any escape — plus dangling-symlink and non-existent-parent containment. That guard is unchanged. The /proc/self/fd block is a secondary TOCTOU re-check; gating it behind #[cfg(target_os = "linux")] is sound because:

  • It is a Linux-only procfs interface. On macOS it always silently no-op'd (read_link("/proc/...")Err, skipped) — so no behavioral change there.
  • On Windows it never compiled, so this is a net-new platform with the primary guard fully intact — no path-traversal/symlink-escape regression is introduced, because the layer being skipped never ran on non-Linux anyway.
  • Linux's security posture is byte-for-byte identical (the check body is unchanged, merely scoped).

2. Test gating — confirmed correct.
Both bind_direct_symlink_escape_blocked and bind_direct_dangling_symlink_blocked use std::os::unix::fs::symlink (absent on Windows), so #[cfg(unix)] is the right gate. Windows still runs the rest of the integration suite, including the non-symlink containment / ..-traversal tests (canonicalize_dotdot, normalize_dotdot, cd_dotdot, bind_direct_*) — VFS containment remains exercised on Windows.

3. CI/CD — confirmed correct. windows-latest added to rust/python/node matrices (Rust toolchain installed in all three, required for maturin/napi); venv bin/Scripts selected via RUNNER_OS; npm script-shell bash correctly enables $(...) substitution; release wiring (win32-x64-msvc wheel + node addon + npm platform package, win_amd64 coverage check, pack-count 5→6) is internally consistent. The CI Gate aggregator keys on needs.*.result, so adding matrix legs doesn't break required-check naming.

4. Readiness re-run cold (Linux + Windows cross):

  • cargo fmt --check
  • cargo test --workspace --all-targets1749 passed, 0 failed
  • cargo doc --no-deps with -D warnings
  • cargo clippy --workspace --all-targets — no new findings on touched code ✅
  • cargo check --target x86_64-pc-windows-gnu --all-targets0 errors
  • PYO3_CROSS_PYTHON_VERSION=3.12 cargo check --features python (windows target) ✅

(Note: a full windows-gnu DLL link fails locally with mingw ld's export ordinal too large — a well-known GNU-linker cdylib limitation, not a code defect. The release/CI path uses the MSVC linker, which has no such limit; cargo check is the authoritative compile signal and is clean.)

Remaining findings

Two 🟢 nice-to-have doc-comment nits in release.yml filed as inline comments below. Both are stale prose only — the enforced values (expected=6, the error string) are already correct, so they have zero functional impact.


Automated pre-PR review gate · iteration 1 of max 3 · converged on a clean review.

run: |
set -euo pipefail
expected=5
expected=6

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 nice-to-have (no functional impact — the enforced expected=6 here and the error string are correct). Two doc-comment lines in this file are now stale after adding Windows and could be tidied:

  • Line ~354-355: static matrix of the 4 platform packages (+ the main package = 5 total) → should read 5 platform packages (6 total).
  • Line ~382: Publish the 4 per-platform packages → now 5 per-platform packages.

Purely prose; the actual matrices and the expected=6 guard already reflect the win32-x64-msvc addition. Anchored here because those comment lines are unchanged context and not directly annotatable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in f26bd58 — updated all the stale prose counts (4→5 platform packages, 5→6 total) in release.yml. The enforced expected=6 guard and the matrices were already correct, so this was prose-only.

mkmeral
mkmeral previously approved these changes Jun 25, 2026
chaynabors
chaynabors previously approved these changes Jun 25, 2026
Comment thread .github/workflows/ci.yml
matrix:
os: [ubuntu-latest, macos-latest]
os: [ubuntu-latest, macos-latest, windows-latest]
python: ["3.10", "3.11", "3.12", "3.13", "3.14"]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically speaking, since this is a rust project the whole CI/CD including maturin could be rust.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed in principle — a Rust-driven CI/CD (e.g. an xtask that wraps maturin/napi) would be cleaner and more consistent. This PR keeps the existing shell-based matrix to stay scoped to Windows enablement, but it's a good follow-up; happy to file an issue for a CI-in-Rust pass if you'd like.

Comment thread src/vfs_kernel.rs
if let Ok(real) = std::fs::read_link(&fd_path)
&& !real.starts_with(canon_base)

// Defense-in-depth TOCTOU check: after opening, verify via

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are big comments like these common across the project?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vfs_kernel.rs already leans verbose (≈130 comment lines) since AGENTS.md flags it as security-critical and asks contributors to preserve the reasoning behind each guard. This block explains why the TOCTOU re-check is Linux-only — which isn't obvious from the #[cfg(target_os = "linux")] alone and is exactly the kind of security rationale that file documents. That said, if you'd prefer it trimmed to a one-liner I'm glad to tighten it — your call as maintainer.

Comment thread src/vfs_kernel.rs
));
}
}
// `canon_base` is only consumed by the Linux-only check above; on

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is self explanatory to a rust engineer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair — the let _ = canon_base; is self-explanatory to a Rust dev. I can drop the explanatory line above it if you prefer. Will fold this into the trim if you want the comments tightened.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the same function but you used an import differently and blocked. A strategy for conditional compilation around the unix import and symlink and cargo env check around the blocking behaviour might reduce this to one function, which could then be inlined ~possibly. Only if it improves readability.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore that, meant to comment elsewhere.


// Uses std::os::unix::fs::symlink — symlink creation differs on Windows
// (requires elevated privileges), so this escape test is Unix-only.
#[cfg(unix)]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not need conditional compilation. It's often more reliable to check cargo envvars at compile time since linting works on all platforms. Worth double checking.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good instinct, but here the gate has to be a compile-time #[cfg] rather than a runtime cfg!(...): the test body calls std::os::unix::fs::symlink, which does not exist in the standard library on Windows. A runtime cfg!(unix) (or a cargo-env check) still compiles the call on every target, so the test would fail to build on windows-msvc with cannot find function symlink in os::unix. #[cfg(unix)] excludes it at compile time, which is what we need. Windows keeps the non-symlink containment / ..-traversal coverage (canonicalize_dotdot, cd_dotdot, the other bind_direct_* tests). Happy to revisit if there is a portable symlink helper you would prefer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the same function but you used an import differently and blocked. A strategy for conditional compilation around the unix import and symlink and cargo env check around the blocking behaviour might reduce this to one function, which could then be inlined ~possibly. Only if it improves readability.


// Uses std::os::unix::fs::symlink — symlink creation differs on Windows
// (requires elevated privileges), so this escape test is Unix-only.
#[cfg(unix)]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reasoning as the other symlink test — std::os::unix::fs::symlink is absent on Windows, so this needs the compile-time #[cfg(unix)] to build at all.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before.

Two native-Windows-only failures that the windows-gnu cross-compile gate
could not catch (cross-compile never runs tests nor builds the Python
module):

- tests/shell_integration.rs: the config_file_with_binds_and_creds test
  interpolated a Windows temp path (C:\Users\...) into a TOML *basic*
  string, where \U/\A are parsed as invalid escape sequences -> TOML
  parse error. Switched to a TOML *literal* string (single quotes), which
  does no escape processing; Windows paths never contain single quotes.

- .github/workflows/ci.yml: on Windows the venv Scripts dir was added to
  GITHUB_PATH using the Git Bash $PWD (an MSYS path like /d/a/shell/shell)
  that the Windows PATH cannot resolve, so the venv was silently ignored
  and pytest ran from the host interpreter -> ModuleNotFoundError:
  strands_shell. Convert with cygpath -w so .venv\Scripts is actually used.

Also refresh stale doc-comment counts in release.yml (4->5 platform
packages, 5->6 total) now that win32-x64-msvc is included; the enforced
expected=6 guard and matrices were already correct.

Verified locally: cargo test --workspace --all-targets (1296+ pass),
maturin develop + pytest tests/python (44 pass), cargo fmt/clippy/doc.
@agent-of-mkmeral agent-of-mkmeral dismissed stale reviews from chaynabors and mkmeral via f26bd58 June 25, 2026 19:09
@agent-of-mkmeral

Copy link
Copy Markdown
Contributor Author

🔴 Fixed two real native-Windows CI failures (+ addressed review nits)

TL;DR: CI was red on the native windows-latest Rust and Python legs. Both were genuine bugs that the windows-gnu cross-compile gate structurally could not catch (cross-compile never runs tests, and never builds the Python module). Root-caused both, fixed, and verified the full readiness suite locally. Pushed f26bd58. Also folded in the maintainer doc-comment nits.

Root causes & fixes

1. Rust — config_file_with_binds_and_creds (TOML parse error)

TOML parse error at line 6, column 15
source = "C:\Users\RUNNER~1\AppData\Local\Temp\lsh_config_full_test\src"
too few unicode value digits, expected unicode hexadecimal value

The test interpolated a Windows temp path into a TOML basic string, where \U, \A, \R… are parsed as escape sequences → invalid. Fix: use a TOML literal string (source = '{}'). Literal strings do no escape processing, and Windows paths never contain a single quote, so it's safe and platform-agnostic.

2. Python — ModuleNotFoundError: No module named 'strands_shell'

The pytest banner showed it running from C:\hostedtoolcache\…\python.exe — the host interpreter, not .venv\Scripts\python.exe. maturin develop installed into .venv, but pytest didn't see it. Cause: the venv step did echo "$PWD/.venv/Scripts" >> $GITHUB_PATH from Git Bash, where $PWD is an MSYS path (/d/a/shell/shell) that the Windows PATH can't resolve → the entry was silently dropped → pytest fell back to host Python. Fix: cygpath -w "$PWD/.venv/Scripts" to emit a native Windows path.

3. Doc-comment nits (from @chaynabors review thread)

Refreshed the stale release.yml counts now that win32-x64-msvc is in: 4→5 platform packages and 5→6 total. The enforced expected=6 guard and the matrices were already correct — prose-only.

Local verification (full readiness, before push)
  • cargo fmt --check
  • cargo test --workspace --all-targets1296 + aux suites passed, 0 failed ✅ (incl. the previously-failing config_file_with_binds_and_creds)
  • cargo clippy --workspace --all-targets — no new findings ✅
  • RUSTDOCFLAGS=-D warnings cargo doc --workspace --no-deps
  • maturin develop --release + pytest tests/python44 passed ✅ (reproduced the venv-on-PATH issue and confirmed the import works once the venv is correctly active)

Note: the new CI run is in action_required (fork PR) and needs a maintainer to approve the workflow before it runs — my token can't self-approve fork runs. Once approved it should go green on all legs.

On the open review threads from @chaynabors (comment verbosity in vfs_kernel.rs, cfg!(windows) vs #[cfg(unix)] for the symlink tests, maturin-in-Rust): those are style/architecture discussion points rather than blockers — happy to act on whichever you'd like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants