Add live browser action tools for direct Selenium control by meenurani1 · Pull Request #5 · sourcefuse/robotframework-mcp

meenurani1 · 2026-06-02T18:54:38Z

Summary

- Adds 8 new MCP tools that let an AI agent drive a real browser step by step, without needing to generate or execute a Robot Framework `.robot` file.
- All tools share a module-level WebDriver session and support Robot Framework-style selector prefixes (`id=`, `css=`, `xpath=`, `name=`, `class=`, `tag=`, `link=`, `partial_link=`), with plain CSS as the default fallback.
- Fixes a dead-code bug in `create_extended_selenium_keywords` where an unreachable `return template` statement followed the actual return.
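
As a rough illustration of the selector-prefix behavior described above, here is a minimal sketch of how a `prefix=value` selector could be split into a locator strategy and a value. The names `PREFIX_TO_STRATEGY` and `parse_selector` are hypothetical and not taken from this PR; the strategy strings are the values of Selenium's `By` constants, written out as literals so the snippet runs without Selenium installed.

```python
# Hypothetical sketch, not the PR's actual implementation.
# The strategy strings equal Selenium's By constants
# (By.ID == "id", By.CSS_SELECTOR == "css selector", and so on).
PREFIX_TO_STRATEGY = {
    "id": "id",
    "css": "css selector",
    "xpath": "xpath",
    "name": "name",
    "class": "class name",
    "tag": "tag name",
    "link": "link text",
    "partial_link": "partial link text",
}


def parse_selector(selector: str) -> tuple[str, str]:
    """Split 'prefix=value' into (strategy, value); fall back to plain CSS."""
    # Split on the first '=' only, so XPath or CSS values may contain '='.
    prefix, sep, value = selector.partition("=")
    strategy = PREFIX_TO_STRATEGY.get(prefix.strip().lower())
    if sep and strategy is not None:
        return strategy, value
    # Unknown or missing prefix: treat the whole string as a CSS selector.
    return "css selector", selector
```

Under this sketch, an unprefixed selector such as `button[type='submit']` falls through to CSS because `button[type` is not a known prefix, even though the string contains `=`.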

New Tools

| Tool | Description |
| --- | --- |
| `browser_launch` | Open Chrome or Firefox (headless optional), navigate to URL |
| `browser_navigate` | Go to a new URL in the active session |
| `browser_click` | Click an element, waits for it to be clickable first |
| `browser_send_keys` | Type into an input, clears existing text by default |
| `browser_get_text` | Read visible text from an element |
| `browser_wait_for_element` | Wait for `visible`, `present`, `clickable`, or `hidden` state |
| `browser_screenshot` | Save a timestamped PNG and return its absolute path |
| `browser_close` | Quit the browser and clean up the session |
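
The "timestamped PNG" behavior of `browser_screenshot` is not spelled out in this description, so the following is only a sketch of one plausible approach: append a timestamp to the requested file name and return the absolute path. The helper name `timestamped_png_path` and the timestamp format are assumptions; the real tool would presumably pass the resulting path to Selenium's `save_screenshot`.

```python
from datetime import datetime
from pathlib import Path


def timestamped_png_path(filename: str = "screenshot.png") -> str:
    """Return an absolute path with a timestamp added to the file name.

    Hypothetical helper: the naming scheme here is an assumption,
    not the scheme used by the PR.
    """
    path = Path(filename)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Keep the original directory and stem, force a .png suffix.
    stamped = path.with_name(f"{path.stem}_{stamp}.png")
    # resolve() yields an absolute path even if the file does not exist yet.
    return str(stamped.resolve())
```

A caller would still need to create the parent directory before writing the file, which this sketch leaves out.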

Example Usage

```python
browser_launch("https://example.com", browser="Chrome", headless=False)
browser_send_keys("id=username", "admin")
browser_send_keys("id=password", "secret")
browser_click("css=button[type='submit']")
browser_screenshot("results/after_login.png")
browser_close()
```

Test Plan

- `browser_launch` opens Chrome/Firefox and navigates to the URL
- `browser_send_keys` types into input fields correctly
- `browser_click` clicks buttons and links
- `browser_get_text` returns correct element text
- `browser_wait_for_element` respects all 4 states (visible, present, clickable, hidden)
- `browser_screenshot` saves PNG and returns correct path
- `browser_close` closes session cleanly
- All tools return clear error messages when no session is open
- Selector prefixes `id=`, `css=`, `xpath=`, `name=` all resolve correctly
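
To make the "no session" item concrete, here is a minimal sketch of how a tool sharing a module-level driver might return an error string instead of raising when nothing has been launched. The message text, `NO_SESSION_MESSAGE`, and `browser_get_text_sketch` are hypothetical; only the module-level `_driver` name appears in the PR's diff fragments.

```python
from typing import Any, Optional

# Shared session: set by a launch step, reset to None on close.
_driver: Optional[Any] = None

# Hypothetical wording; the PR's actual error text is not shown here.
NO_SESSION_MESSAGE = "Error: no browser session is open. Call browser_launch first."


def browser_get_text_sketch(selector: str) -> str:
    """Return element text, or a readable error string if anything fails."""
    if _driver is None:
        return NO_SESSION_MESSAGE
    try:
        # Assumes a plain CSS selector for brevity.
        element = _driver.find_element("css selector", selector)
        return element.text
    except Exception as exc:
        return f"Error reading text from {selector}: {exc}"
```

Returning a string rather than raising keeps the failure visible to the agent as ordinary tool output, which is one reasonable reading of "clear error messages" in the test plan.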

Note: ChromeDriver must match the installed Chrome version. Update via `brew upgrade chromedriver` if versions diverge.

Adds 8 new MCP tools that let an AI agent drive a real browser step by step without generating or running a Robot Framework file:

- browser_launch: open Chrome/Firefox (headless supported), navigate to URL
- browser_navigate: go to a new URL in the active session
- browser_click: click an element, waits for it to be clickable
- browser_send_keys: type into an input, clears existing text by default
- browser_get_text: read visible text from an element
- browser_wait_for_element: wait for visible/present/clickable/hidden state
- browser_screenshot: save a timestamped PNG and return its path
- browser_close: quit the browser and clean up the session

All tools share a module-level WebDriver session and support Robot Framework-style selector prefixes (id=, css=, xpath=, name=, class=, tag=, link=, partial_link=) with plain CSS as the default fallback.

Also fixes a dead-code bug in create_extended_selenium_keywords where an unreachable return template statement followed the actual return.

Copilot

Pull request overview

This PR extends the MCP server beyond Robot Framework code generation by adding “live” Selenium-driven browser action tools that operate against a shared module-level WebDriver session, enabling step-by-step browser control via MCP tool calls. It also removes a dead/unreachable return template in create_extended_selenium_keywords.

Changes:

- Added 8 new browser_* MCP tools for launching, navigating, clicking, typing, reading text, waiting for elements, taking screenshots, and closing a shared Selenium session.
- Implemented Robot Framework-style selector prefixes (id=, css=, xpath=, etc.) with CSS as the fallback.
- Removed unreachable dead code in create_extended_selenium_keywords.



+            service = webdriver.ChromeService()
+            _driver = webdriver.Chrome(options=opts, service=service)
+
+        _driver.maximize_window()


+        if clear_first:
+            element.clear()
+        element.send_keys(text)
+        return f"Typed into {selector}: '{text}'"


+        if browser_lower == "firefox":
+            opts = FirefoxOptions()
+            if headless:
+                opts.add_argument("--headless")
+            _driver = webdriver.Firefox(options=opts)
+        else:
+            opts = ChromeOptions()


Copilot AI review requested due to automatic review settings June 2, 2026 18:54

Copilot started reviewing on behalf of meenurani1 June 2, 2026 18:54

meenurani1 merged commit 17e9ef8 into main Jun 2, 2026
1 check passed

Copilot AI reviewed Jun 2, 2026


meenurani1 mentioned this pull request Jun 2, 2026

Add CI/CD pipeline generation tool and update README #6

Merged

6 tasks

meenurani1 merged 1 commit into main from feature/browser-action-tools
