Converts browser-use (Python) automation to Stagehand v3 (TypeScript) on Browserbase, choosing the right level of determinism per step rather than a one-to-one agentic copy.
- SKILL.md: detect the browser-use variant, decompose the task across the determinism spectrum, emit Stagehand v3 + a migration summary
- references/: full API mapping, determinism decision framework, and an optional trace-assisted path (pairs with the browser-trace skill)
- GUIDE.md: human migration guide (philosophy, feature mapping, determinism)
- PROMPT.md: tool-agnostic docs prompt (works in any AI assistant)
- EXAMPLES.md: before/after pairs
- README.md (skill): index; root README.md: add to the skills table
Targets Stagehand v3 (verified against live docs); validated by running the skill on fresh scripts a clean agent had never seen.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Express determinism purely through Stagehand primitives — cached observe()→act() plus selfHeal/cacheDir/serverCache — instead of falling back to Playwright selectors. Drops all "Playwright" references and page.locator()/page.fill() calls across SKILL.md, GUIDE.md, PROMPT.md, README.md, and the determinism / trace-assisted references. page.goto/page.url stay (Stagehand's own page methods). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Standards pass (see #36 / the conventions all merged skills follow):
A CONTRIBUTING.md + CI validator for these checks is landing shortly.
Per @shrey150's standards pass:
- add LICENSE.txt (MIT, verbatim from skills/browser)
- move GUIDE.md -> references/guide.md and PROMPT.md -> references/prompt.md, both linked from SKILL.md so the content isn't stranded
- drop the skill-local README.md
- fix relative links in references/guide.md and EXAMPLES.md
Skill dir now follows the convention: SKILL.md + EXAMPLES.md + references/ + LICENSE.txt.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Validated by converting 9 real browser-use examples through the skill and
running the generated Stagehand on live Browserbase. Fixes 3 runtime failures
that typecheck clean but broke at runtime:
- Page settling: networkidle -> domcontentloaded. networkidle never fires on
Google/analytics/SPA pages and throws a 15s timeout (broke 2 of 9 cases).
- Agent output schema: agent().execute({ output }) throws
ExperimentalNotConfiguredError unless experimental:true, and output must be a
zod object (not a top-level array). Document both, and recommend the
agent-then-extract pattern to stay on the managed API path (broke 1 case).
- setViewportSize takes positional (width, height), not Playwright's
{ width, height } object (would not compile).
- Variant table: ChatBrowserUse is browser-use's hosted-model class and appears
in stable 0.13.x code -- it is NOT a Rust-beta tell. Only a literal
browser_use.beta import is. Prevents variant mis-detection.
- Add an "iterate an extracted list / resolve relative URLs" pattern
(new URL(href, page.url())) to avoid "Cannot navigate to invalid URL".
Doc-resilience (keep the API tables, demote them): add a version-provenance
header, state that the live docs supersede this snapshot on any conflict, and
make "verify signatures against the installed package" an explicit workflow
step -- so the skill fails safe as Stagehand / browser-use drift.
Validated against @browserbasehq/stagehand 3.6.0 / browser-use 0.13.1.
Eval: 9/9 compile; live Browserbase 3 PASS/3 PARTIAL/3 FAIL -> 6 PASS/2 PARTIAL/0 FAIL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
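The "resolve relative URLs" pattern named in the commit above can be sketched in plain TypeScript. This is a minimal illustration, not the skill's actual template: `resolveLinks` and the sample URLs are hypothetical, and in a real script the base would presumably come from `page.url()`.

```typescript
// Hypothetical helper: resolve hrefs extracted from a page against the
// page's current URL before navigating to them. Relative hrefs passed
// straight to a navigation call are what trigger "Cannot navigate to invalid URL".
function resolveLinks(hrefs: string[], baseUrl: string): string[] {
  const resolved: string[] = [];
  for (const href of hrefs) {
    try {
      // new URL(href, base) leaves absolute URLs alone and resolves
      // relative ones against the base.
      resolved.push(new URL(href, baseUrl).toString());
    } catch {
      // Skip values that cannot be parsed as a URL at all.
    }
  }
  return resolved;
}

// Assumed sample data standing in for an extracted list of links.
const links = resolveLinks(
  ["/docs/intro", "pricing?plan=pro", "https://example.org/about"],
  "https://example.com/products/list",
);
console.log(links);
```

Resolving once, right after extraction, keeps the later loop over the list free of URL handling.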
Pushed
What I did
Converted 9 real
E2E Test Matrix (before → after the fixes)
9/9 compile both times; live runs went 3 PASS / 3 PARTIAL / 3 FAIL → 6 PASS / 2 PARTIAL / 0 FAIL.
Validated against
Fixes in
Cryptic abbreviation -> descriptive, discoverable name:
- "bu-to-bb" was opaque, and "bb" collided with the (now-deprecated) bb CLI.
- "bu" is also a loaded token in browser-use land -- it's the prefix for their hosted models (bu-2-0, bu-30b), not the library -- so "bu-to-*" misreads.
- The name is loaded into the system prompt as discovery metadata, so a name carrying "browser-use" + "stagehand" aids triggering, not just human scanning.
- Matches the repo's descriptive naming convention (competitor-analysis, browser-trace, cookie-sync).
Renames the directory, frontmatter name, SKILL.md heading, README row, and all internal /bu-to-bb references. Validator passes (node scripts/validate-skills.mjs).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…er-2 eval)
Patches the holes a tier-2 eval found by converting real-world browser-use code
(custom agent-tools, MCP client/server, and BU embedded in real app code:
LangManus, BLAST) through the skill and type-checking against installed 3.6.0.
Correctness:
- Agent custom `tools` require `experimental: true` (same gate as agent `output`
and MCP `integrations`) -- the prior tools example omitted it and would throw
ExperimentalNotConfiguredError at runtime. Consolidated callout added.
- `ai` must be pinned to v5: Stagehand 3.6.x types `tools` as the v5 ToolSet
(schema field `inputSchema`); the v4 `tool()` helper emits `parameters` and
fails to type-check. Template now pins `^5.0.0` and documents the plain-object
`{ description, inputSchema, execute }` alternative.
New mappings:
- §3.7 MCP: browser-use MCPClient (stdio + remote) -> Stagehand
`agent({ integrations })` via `connectToMCPServer(...)` / URL string, with the
experimental requirement and the tool_filter/prefix + server-direction gaps.
- §3.8 Real-world patterns: embedded/wrapped code (convert the browser-use
surface, keep app glue), sync-over-async -> async, long-lived stateful
executors -> stateless `messages`, vision intent -> `mode: hybrid/cua`, legacy
`result.final_result` attribute form.
- Custom-action gaps: injected special params (browser_session etc.) -> closures,
`domains=` -> in-tool host check, `terminates_sequence` -> no equivalent.
SKILL.md: added a "when NOT to migrate" scope gate (MCP-server / non-Agent /
embedded) so the workflow doesn't assume every input is a convertible script.
Tier-2: 5/5 handled correctly (4 convert+compile clean; MCP-server correctly
flagged out-of-scope without fabricating). Validated against stagehand 3.6.0 /
browser-use 0.13.1.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pushed
Tier-2 matrix
5/5 handled correctly (4 convert+compile, 1 correctly-refused).
Bugs/holes this round caught → fixed in
Live e2e smoke of the converted MCP-client script surfaced a Stagehand bug:
passing a Client instance (local/stdio server via connectToMCPServer) into
agent({ integrations }) throws "Converting circular structure to JSON" before
the agent runs -- agent() does JSON.stringify(options.integrations) (v3.ts:1992)
and the Client object is circular. Reproduced with two different MCP servers
(filesystem, everything). URL-string integrations are unaffected. The skill's
mapping is correct (it connects to the server); this is a framework bug, so §3.7
now flags it and recommends remote/URL MCP until fixed upstream.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
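The failure mode described in the commit above is general JavaScript behavior and can be shown without Stagehand. The sketch below does not reproduce Stagehand's code: `FakeClient` is a hypothetical stand-in for a connected client object that holds a reference back to itself, and the URL is a placeholder.

```typescript
// Hypothetical stand-in for a client instance with an internal back-reference.
interface FakeClient {
  name: string;
  transport?: { owner: FakeClient };
}

const client: FakeClient = { name: "local-mcp" };
client.transport = { owner: client }; // creates the cycle

// Serialize a value, reporting the error type instead of throwing.
function trySerialize(value: unknown): string {
  try {
    return JSON.stringify(value);
  } catch (err) {
    return `failed: ${(err as Error).name}`;
  }
}

// A list of URL strings serializes cleanly.
const urlResult = trySerialize(["https://mcp.example.com/mcp"]);
// A list containing the circular object makes JSON.stringify throw a TypeError.
const clientResult = trySerialize([client]);

console.log(urlResult);
console.log(clientResult);
```

This is why URL-string integrations pass through a serialization step unharmed while an object instance with cycles does not.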
…in example
- prompt.md + determinism.md recommended agent().execute({ output }) (and prompt.md
the agent `tools` path) without the experimental:true requirement that api-mapping
already documents -> would throw ExperimentalNotConfiguredError. Added the
experimental gate (output/tools/integrations) + prefer agent-then-extract, and
pinned ai@^5 for the tools row.
- EXAMPLES.md login: the browser-use task is "log in then open the dashboard" but
the Stagehand after-script stopped at the allow-list check. Added the dashboard
step so the migration completes the same task.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
prompt.md numbered the spectrum 1=Navigation..5=Autonomous, inverted vs determinism.md/guide.md/workflow (1=Autonomous..5=Navigation), so "Level N" citations could contradict across docs. Flipped prompt.md's table + the "decomposition (levels 2-5)" rule + the example's page.goto (L5) cite to match the canonical scale. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent-tool example took `url` from the model via inputSchema, but the original browser-use action reads the current URL from browser_session. That let the model pass a guessed URL and contradicted the section's own "close over page/stagehand in execute" guidance. The tool now takes no model args and reads page.url() directly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 649ef83.
…ample The host check ran once after sign-in, but the added dashboard step navigated past it — and browser-use's allowed_domains enforces across the whole run. Extracted the check into assertAllowedHost() and call it after each navigation, with a comment that even this is best-effort (real continuous enforcement = Browserbase proxy domain rules, api-mapping §5). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
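A host check of the kind this commit describes might look roughly like the following. Only the name `assertAllowedHost` comes from the commit message; the body, the `ALLOWED_HOSTS` list, the `isAllowedHost` helper, and the choice to accept subdomains are all assumptions made for illustration, and they may differ from both the skill's example and browser-use's `allowed_domains` semantics.

```typescript
// Assumed allow-list; in a real migration this would mirror the original
// script's allowed_domains setting.
const ALLOWED_HOSTS = ["example.com"];

// Hypothetical helper: true when the URL's hostname is an allowed host
// or a subdomain of one (subdomain matching is an assumption here).
function isAllowedHost(currentUrl: string, allowed: string[]): boolean {
  const host = new URL(currentUrl).hostname;
  return allowed.some((d) => host === d || host.endsWith(`.${d}`));
}

// Throws if the given URL is outside the allow-list. Best-effort only:
// it checks where the browser ended up, not the requests made on the way.
function assertAllowedHost(
  currentUrl: string,
  allowed: string[] = ALLOWED_HOSTS,
): void {
  if (!isAllowedHost(currentUrl, allowed)) {
    const host = new URL(currentUrl).hostname;
    throw new Error(`Navigation left the allow-list: ${host}`);
  }
}

// Passes: subdomain of an allowed host.
assertAllowedHost("https://app.example.com/dashboard");
```

In a converted script the call would presumably be `assertAllowedHost(page.url())` after every navigation step, which is the change this commit makes to the login example.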

Converts browser-use (Python) automation to Stagehand v3 (TypeScript) on Browserbase, choosing the right level of determinism per step rather than a one-to-one agentic copy.
Targets Stagehand v3, validated against `@browserbasehq/stagehand` 3.6.0 / browser-use 0.13.1.

Hardening + validation during review
The original skill was validated by a live end-to-end eval and hardened across several commits. The eval converts real browser-use scripts through the skill (subagents reading only the skill), then runs the generated Stagehand on live Browserbase and compares against the originals + ground truth.
Tier-1 (9 browser-use example scripts): 9/9 compile; live runs 3 PASS / 3 PARTIAL / 3 FAIL → 6 PASS / 2 PARTIAL / 0 FAIL after fixes:
- `networkidle` → `domcontentloaded` (times out on Google/SPA pages)
- `output` needs `experimental: true` + zod object (recommend agent-then-extract)
- `setViewportSize(w, h)` is positional, not `{ width, height }`
- `ChatBrowserUse` is not a Rust-beta tell (only a `browser_use.beta` import is)

Tier-2 (real-world: agent-tools, MCP, embedded apps, namely LangManus / BLAST): 5/5 handled correctly (4 convert + tsc-clean, MCP-server correctly flagged out-of-scope). Also caught:
- `ai` must be pinned to v5: Stagehand 3.6.x types agent `tools` as the v5 `ToolSet` (`inputSchema`); v4's `tool()` emits `parameters` and won't compile
- … `integrations` via `connectToMCPServer`/URL; server-direction = out of scope)
- … `messages`, vision → `mode: hybrid/cua`, legacy `result.final_result`)

Doc-resilience (keep-but-demote): the API specifics drift every release (the variant table was already stale), so the tables are kept but demoted under a version-provenance header + "live docs supersede this snapshot" + a "verify signatures against the installed package" workflow step.
Naming: renamed `bu-to-bb` → `browser-use-to-stagehand` (verified against the Anthropic Agent Skills naming/discovery docs).

All 5 Cursor Bugbot findings addressed + threads resolved. `validate` CI green. The eval harness doubles as a drift detector to re-run on each Stagehand / browser-use release.

Known limitations (by design, not defects)
- `trace-assisted` path and `prompt.md` were not run-tested.

Note
Low Risk
Documentation and agent skill content only; no production code, auth, or runtime behavior changes in the repo.
Overview
Adds a new `browser-use-to-stagehand` agent skill and registers it in the root README skills table.

The skill guides converting browser-use (Python) scripts to Stagehand v3 (TypeScript) on Browserbase by decomposing flows across a determinism spectrum (`page.goto`, `act`/`extract`/`observe`, cached replay, `agent()` only when needed) instead of a one-to-one agent port.

`SKILL.md` defines scope gates (e.g. MCP server mode out of scope), variant detection, inventory, output templates, and a migration summary checklist.

Supporting docs include `references/api-mapping.md` (feature table, gaps like `allowed_domains`, v3 gotchas), `determinism.md`, `trace-assisted.md` (Session Logs / optional `browser-trace`), human `guide.md`, pasteable `prompt.md`, `EXAMPLES.md` before/after pairs, and MIT `LICENSE.txt`.

Reviewed by Cursor Bugbot for commit 828211d. Bugbot is set up for automated code reviews on this repo.