# Selenium MCP Server
Selenium MCP Server is a standalone Python application that exposes a comprehensive suite of Selenium browser automation commands as a **FastMCP** server.
This allows a language model, AI agent, or any other client capable of calling tools to control a web browser, perform complex web scraping, and automate web-based tasks by issuing natural language commands.
## 🚀 Features
- **Browser Lifecycle**: Open and close Chrome, Firefox, or Edge instances.
- **Full Navigation**: `goto`, `back`, `forward`, `refresh`, and URL/title inspection.
- **Element Interaction**: `click`, `type`, `clear`, `get_text`, `get_attribute`.
- **Advanced Actions**: `hover`, `drag_and_drop`, `right_click`, `double_click`, `execute_javascript`.
- **Complex Targets**: Handles alerts, iframes, and Shadow DOM elements.
- **Data Extraction**: Scrape full tables into JSON, list all links, and get page source.
- **Window & Tab Management**: Open, close, and list all tabs.
- **Cookies & Storage**: Full control over browser cookies.
- **Explicit Waits**: Wait for elements to be visible or clickable to handle dynamic pages.
## 🔧 Dependencies and Installation
### Dependencies
This program relies on several third-party Python libraries:
- `fastmcp` – For creating the tool server
- `mcp-json` – For the JSON tool-calling schema with FastMCP
- `selenium` – For browser automation
- `webdriver-manager` – For automatically managing browser drivers (optional with Selenium 4.6+)
Create a `requirements.txt` file with the following content:
```txt
fastmcp
mcp-json
selenium
webdriver-manager
```
### Installation Steps
#### Option 1: Using uv (Recommended)
`uv` is an extremely fast Python package installer and virtual environment manager.
```bash
# 1. Create virtual environment
uv venv
# 2. Activate the environment
# macOS / Linux
source .venv/bin/activate
# Windows
.\.venv\Scripts\activate
# 3. Install dependencies
uv pip install -r requirements.txt
```
#### Option 2: Using venv + pip (Traditional)
```bash
# 1. Create virtual environment
# macOS / Linux
python3 -m venv venv
# Windows
python -m venv venv
# 2. Activate the environment
# macOS / Linux
source venv/bin/activate
# Windows
.\venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
```
### Browser Drivers
Modern versions of Selenium (4.6.0+) include Selenium Manager, which automatically downloads the correct webdriver (chromedriver, geckodriver, etc.) for your locally installed browsers the first time you run the script. No manual driver installation is required.
## ⚡️ Usage
Run the main script to start the server:
```bash
python main.py
```
By default, the server runs all browsers in headless mode (no visible UI). This is configured in the `BrowserManager` class.
Example output:
```txt
Selenium MCP Server Starting...
Selenium MCP Server Running...
Supported Tools:
- open_browser, close_browser, goto_url, click_element, type_text, etc.
```
### Running with a JSON Configuration (for mcp-json)
Add the server to your `mcp-servers.json` configuration:
```json
{
  "mcpServers": {
    "Selenium Automation Server": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "fastmcp",
        "fastmcp",
        "run",
        "path/to/your/selenium_mcp_server.py"
      ],
      "env": {},
      "transport": "stdio"
    }
  }
}
```
Replace `path/to/your/selenium_mcp_server.py` with the actual path to your server script.
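If the configuration needs to be generated rather than edited by hand, a small script can emit it. The following is a minimal sketch using only the standard library; the function name `build_server_config` is hypothetical, while the keys and arguments mirror the JSON shown above.

```python
import json


def build_server_config(script_path: str) -> dict:
    """Return an mcp-servers.json structure that launches the server via uv.

    The keys and arguments mirror the example configuration; only the
    script path varies.
    """
    return {
        "mcpServers": {
            "Selenium Automation Server": {
                "command": "uv",
                "args": ["run", "--with", "fastmcp", "fastmcp", "run", script_path],
                "env": {},
                "transport": "stdio",
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path; replace it with the real location of the server script.
    config = build_server_config("path/to/your/selenium_mcp_server.py")
    print(json.dumps(config, indent=2))
```

Redirecting the output to `mcp-servers.json` yields a file equivalent to the hand-written one.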
## 🛠 Available Tools (API Reference)
### Browser Lifecycle & Navigation
| Function | Description |
|---|---|
| `open_browser(browser_selector)` | Opens a new browser instance (`firefox`, `chrome`, or `edge`). Defaults to Firefox. |
| `close_browser()` | Closes the current browser instance and cleans up resources. |
| `is_browser_active()` | Checks if the current browser is active and responsive. |
| `goto_url(url)` | Navigates the browser to the specified URL. |
| `get_current_url()` | Returns the URL of the current page. |
| `get_title()` | Returns the title of the current page. |
| `refresh_page()` | Refreshes the current page. |
| `go_back()` | Simulates the browser Back button. |
| `go_forward()` | Simulates the browser Forward button. |
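Under MCP, a client invokes each of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for a short open, navigate, inspect, close sequence. The helper name `make_tool_call` is hypothetical, and the argument names `browser_selector` and `url` are taken from the table above; in practice an MCP client library would construct and send these messages for you.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

# JSON-RPC request ids should be unique within a session.
_request_ids = count(1)


def make_tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


if __name__ == "__main__":
    for message in (
        make_tool_call("open_browser", {"browser_selector": "firefox"}),
        make_tool_call("goto_url", {"url": "https://example.com"}),
        make_tool_call("get_title"),
        make_tool_call("close_browser"),
    ):
        print(message)
```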
### Element Interaction
| Function | Description |
|---|---|
| `click_element(strategy, selector)` | Clicks an element identified by strategy and selector. |
| `type_text(text, strategy, selector, ...)` | Types text into an input field (can clear first and press Enter). |
| `get_element_text(strategy, selector)` | Gets the visible text of an element. |
| `get_element_attribute(attribute, ...)` | Gets an attribute from an element (e.g., `href`, `src`). |
| `click_shadow_dom_element(host, element)` | Clicks an element inside Shadow DOM. |
| `double_click_element(strategy, selector)` | Performs a double-click on an element. |
| `right_click_element(strategy, selector)` | Performs a right-click (context click). |
| `hover_element(strategy, selector)` | Simulates hovering the mouse over an element. |
| `drag_and_drop(source_selector, target_selector, ...)` | Drags and drops an element. |
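Most of the tools above take a `strategy` and a `selector`. This README does not list the accepted strategy names, so the aliases on the left of the mapping below are assumptions. The strings on the right are the locator values Selenium itself uses for `By.ID`, `By.CSS_SELECTOR`, and the other `By` constants. A sketch of how such a pair might be normalized before being passed to `driver.find_element`:

```python
# Right-hand values match the constants in selenium.webdriver.common.by.By.
# Left-hand aliases are hypothetical; the server may accept different names.
LOCATORS = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "css": "css selector",
    "css_selector": "css selector",
    "class": "class name",
    "class_name": "class name",
    "tag": "tag name",
    "tag_name": "tag name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
}


def resolve_locator(strategy: str, selector: str) -> tuple:
    """Return a (by, value) pair suitable for driver.find_element(by, value)."""
    key = strategy.strip().lower().replace(" ", "_")
    try:
        return LOCATORS[key], selector
    except KeyError:
        raise ValueError(f"Unsupported strategy: {strategy!r}") from None


if __name__ == "__main__":
    print(resolve_locator("css", "#login-button"))
    print(resolve_locator("xpath", "//input[@name='q']"))
```

Rejecting unknown strategies up front gives the caller a clear error instead of a less specific failure from the driver.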
### Data & Information Extraction
| Function | Description |
|---|---|
| `get_page_source()` | Returns the full HTML source of the current page's `<body>`. |
| `extract_table_data(strategy, selector)` | Extracts table data and returns it as JSON. |
| `list_links()` | Returns a JSON list of all unique links (text + href). |
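To illustrate the idea behind turning an HTML table into JSON, here is a standard-library sketch that parses table markup into one record per row. It is not the server's implementation (which works through Selenium), and the header-keyed record layout is an assumption, since the exact JSON shape returned by `extract_table_data` is not documented here. The names `TableParser` and `table_to_records` are hypothetical.

```python
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every table row in the HTML fed to the parser."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html: str) -> list:
    """Return one dict per body row, keyed by the cells of the first row."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    sample = """
    <table>
      <tr><th>Name</th><th>Price</th></tr>
      <tr><td>Widget</td><td>9.99</td></tr>
      <tr><td>Gadget</td><td>19.50</td></tr>
    </table>
    """
    print(json.dumps(table_to_records(sample), indent=2))
```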
### Frames, Alerts, and Tabs
| Function | Description |
|---|---|
| `handle_alert(accept, input_text)` | Handles alert/prompt (accept/dismiss, send text). |
| `switch_to_frame(selector, strategy)` | Switches context to an iframe. |
| `switch_to_default_content()` | Returns context to the main document. |
| `open_new_tab(url)` | Opens a new tab and switches to it. |
| `close_current_tab()` | Closes the currently active tab. |
| `switch_tab(index)` | Switches to a tab by index. |
| `list_all_tabs()` | Returns a JSON list of all open tabs (title + URL). |
### Waits & Validation
| Function | Description |
|---|---|
| `wait_for_element_visible(strategy, selector, timeout)` | Waits for element to be present and visible. |
| `wait_for_element_clickable(strategy, selector, timeout)` | Waits for element to be clickable. |
| `is_element_visible(strategy, selector)` | Checks if element is currently visible (non-blocking). |
| `is_element_enabled(strategy, selector)` | Checks if element is enabled (non-blocking). |
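The difference between the blocking `wait_for_*` tools and the non-blocking `is_element_*` checks comes down to polling: an explicit wait re-evaluates a condition until it holds or a timeout expires, while a plain check evaluates it once. The sketch below shows that loop in isolation using only the standard library. The name `wait_until` and its default values are assumptions; the server itself presumably relies on Selenium's own wait machinery.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    condition: Callable[[], Optional[T]],
    timeout: float = 10.0,
    poll_interval: float = 0.5,
) -> T:
    """Poll `condition` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Stand-in for "element became visible": truthy once 0.2 seconds have passed.
    start = time.monotonic()
    wait_until(lambda: time.monotonic() - start > 0.2, timeout=2.0, poll_interval=0.05)
    print("condition met")
```

Using `time.monotonic()` keeps the deadline stable even if the system clock is adjusted while waiting.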
### Cookies & Storage
| Function | Description |
|---|---|
| `get_cookies()` | Returns all cookies for the current domain as JSON. |
| `add_cookie(name, value, path)` | Adds a simple cookie to the current session. |
| `delete_all_cookies()` | Deletes all cookies for the current session. |
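On the client side, the JSON from `get_cookies()` usually needs to be turned into a lookup table. The sketch below assumes the payload is a JSON array of objects with at least `name` and `value` keys, which is the shape Selenium's own `get_cookies()` produces; the server's exact output format is not documented here. The helper name and the sample payload are made up for illustration.

```python
import json


def cookies_by_name(cookies_json: str) -> dict:
    """Map each cookie name to its value from a JSON array of cookie objects."""
    return {cookie["name"]: cookie["value"] for cookie in json.loads(cookies_json)}


if __name__ == "__main__":
    # Hypothetical payload for illustration only.
    payload = json.dumps(
        [
            {"name": "session", "value": "abc123", "path": "/"},
            {"name": "theme", "value": "dark", "path": "/"},
        ]
    )
    print(cookies_by_name(payload))
```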
### Advanced & Visual
| Function | Description |
|---|---|
| `execute_javascript(script)` | Executes custom JavaScript and returns the result. |
| `scroll_to(x, y)` | Scrolls the window to absolute coordinates. |
| `scroll_by(x, y)` | Scrolls the window by a specific amount. |
| `take_screenshot(filename)` | Takes a viewport screenshot (returns base64 if no filename). |
| `save_full_page_screenshot(filename)` | Attempts to capture the entire page. |
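When `take_screenshot` is called without a filename, the client receives base64 text and has to decode it itself. The following sketch assumes the data is a PNG image, as Selenium screenshots are, and rejects anything else. The helper name `save_base64_png` is hypothetical.

```python
import base64
import os
import tempfile
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_base64_png(data: str, destination: str) -> Path:
    """Decode a base64 string and write the resulting PNG bytes to `destination`."""
    raw = base64.b64decode(data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("Decoded data does not look like a PNG image")
    path = Path(destination)
    path.write_bytes(raw)
    return path


if __name__ == "__main__":
    # Stand-in bytes with a valid PNG signature; a real call would pass the
    # string returned by take_screenshot.
    fake_screenshot = base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
    target = os.path.join(tempfile.mkdtemp(), "screenshot.png")
    print(save_base64_png(fake_screenshot, target))
```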