fu-tran-tpv-clv/browser-use-mcp
MCP Browser Use Server
MCP Browser Use Server converts natural-language manual test steps into fully-qualified browser automation actions (returning XPath/CSS selectors and element indices). The server follows the Model Context Protocol 2024-11-05 and exposes a single /message endpoint (HTTP POST or SSE) ready for integration with GitHub Copilot, Cursor and other AI-powered IDEs.
Table of Contents
- Features
- High-Level Architecture
- Requirements
- Quick Start
- Environment Variables
- Usage
- Project Structure
- Contribution Guidelines
- License
Features
• Natural Language → Browser Automation – returns a model_actions array containing every step with precise selectors.
• MCP-compliant API – /message supports JSON-RPC 2.0 over HTTP or SSE; root / is a health check.
• Real-time Streaming – watch task progress live via Server-Sent Events.
• Session Management – automatic creation, monitoring, and clean-up of sessions.
• Multi-LLM Support – Mistral by default; switch to OpenAI, Claude, or Gemini via environment variables.
• Developer-friendly Makefile – install, run, lint, test, CI in one place.
• Ant Design Prompt – built-in advanced prompt for React + AntD apps (browser_agent.py).
High-Level Architecture
graph TD;
IDE["IDE / Copilot"];
MCP["MCP Server<br/>FastAPI + asyncio"];
IDE -- "JSON-RPC 2.0<br/>HTTP & SSE" --> MCP;
MCP --> BA["Browser Agent"];
MCP --> SM["Session Manager"];
BA --> BU["browser-use"];
BU --> CH["Chrome"];
SM --> RD["Redis*"];
RD --> MT["Metrics"];
* Redis persistence is planned for a future release.
Requirements
- Python 3.11+
- Google Chrome / Chromium
- macOS / Linux / Windows
Quick Start
# 1. Create virtual-env
make venv
source venv/bin/activate # Windows: .\venv\Scripts\activate
# 2. Install dependencies
make install
# 3. Set API keys
export MISTRAL_API_KEY="sk-..." # required
# optional: switch LLM provider
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."
# 4. Launch the server
make start # fastest way
# or with banner & health-monitor
make start-enhanced
After startup the service listens on http://localhost:8002.
Cursor / GitHub Copilot MCP configuration
Add the following snippet to your IDE's MCP config file (.cursor/mcp.json, .copilot/mcp.json, etc.) so that Cursor or GitHub Copilot can route tools/call requests to this server:
{
"browser-automation": {
"url": "http://localhost:8002/message"
}
}
Environment Variables
| Variable | Required | Description |
|---|---|---|
| LLM_PROVIDER | ❌ | LLM provider: openai, claude, gemini, mistral, etc. Default: openai. |
| LLM_API_URL | ❌ | Base URL of the provider API, e.g. https://api.openai.com/v1/ for OpenAI, https://api.mistral.ai/v1/ for Mistral, https://api.deepseek.com/v1 for DeepSeek. |
| MISTRAL_API_KEY | ✅ | Mistral model API key (default provider). |
| OPENAI_API_KEY | ✅ | GPT-4/3.5 key if you prefer OpenAI. |
| ANTHROPIC_API_KEY | ✅ | Claude key. |
| GOOGLE_API_KEY | ✅ | Gemini key. |
| LLM_API_KEY | ✅ | Other LLM provider key. |
| MODEL_ID | ✅ | Override default model id. |
| CHROME_PATH | ❌ | Explicit Chrome path (auto-detected otherwise). |
| HOST / PORT | ❌ | Override host/port (default localhost:8002). |
| LOG_LEVEL | ❌ | DEBUG, INFO, WARNING, … |
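As a rough illustration of how an environment-driven loader could consume the variables in the table, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from config.py, and the INFO fallback for LOG_LEVEL is an assumption because the table does not state a default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; fields mirror the variables in the table."""

    llm_provider: str
    model_id: Optional[str]
    chrome_path: Optional[str]
    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, falling back to documented defaults."""
    return Settings(
        llm_provider=env.get("LLM_PROVIDER", "openai"),  # default as listed in the table
        model_id=env.get("MODEL_ID"),                    # None means "use the provider default"
        chrome_path=env.get("CHROME_PATH"),              # None means "auto-detect"
        host=env.get("HOST", "localhost"),
        port=int(env.get("PORT", "8002")),
        log_level=env.get("LOG_LEVEL", "INFO"),          # assumed default, not documented
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in tests, for example `load_settings({"PORT": "9000"})`.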
Usage
1. Health Check
curl http://localhost:8002/
# → {"status":"ok","active_sessions":0}
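The same check can be scripted from Python with only the standard library. This is a sketch that assumes the response body has the shape shown in the curl output; `parse_health` and `check_health` are illustrative helper names, not part of the project.

```python
import json
from urllib.request import urlopen


def parse_health(body: str) -> tuple[bool, int]:
    """Return (is_ok, active_sessions) from the health-check JSON body."""
    payload = json.loads(body)
    return payload.get("status") == "ok", int(payload.get("active_sessions", 0))


def check_health(base_url: str = "http://localhost:8002", timeout: float = 5.0) -> tuple[bool, int]:
    """GET the root endpoint of a running server and parse its reply."""
    with urlopen(f"{base_url}/", timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

`check_health()` needs a running server; `parse_health` can be used on its own against a captured response.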
2. Initialise an MCP session
curl -X POST http://localhost:8002/message \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc": "2.0",
"id": "init-1",
"method": "initialize",
"params": {
"protocolVersion": "2024-11-05",
"capabilities": {"tools": {}},
"clientInfo": {"name": "demo", "version": "1.0"}
}
}' -i
# The response header "Mcp-Session-Id: <uuid>" contains the session id.
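For clients written in Python, the initialise step might look like the sketch below. It builds the same JSON-RPC payload as the curl command and reads the Mcp-Session-Id response header; the function names are illustrative, and the header lookup is case-insensitive because HTTP header names are not case-sensitive.

```python
import json
from typing import Mapping, Optional
from urllib.request import Request, urlopen

MCP_URL = "http://localhost:8002/message"


def build_initialize_payload(request_id: str = "init-1") -> dict:
    """JSON-RPC 2.0 initialize request, mirroring the curl example."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {"tools": {}},
            "clientInfo": {"name": "demo", "version": "1.0"},
        },
    }


def extract_session_id(headers: Mapping[str, str]) -> Optional[str]:
    """Find the Mcp-Session-Id header regardless of letter case."""
    for name, value in headers.items():
        if name.lower() == "mcp-session-id":
            return value
    return None


def initialize(url: str = MCP_URL) -> Optional[str]:
    """POST the initialize request to a running server and return the session id, if any."""
    body = json.dumps(build_initialize_payload()).encode("utf-8")
    req = Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
    with urlopen(req, timeout=10) as resp:
        return extract_session_id(dict(resp.headers.items()))
```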
3. Call execute_task_steps_by_browser
curl -X POST http://localhost:8002/message \
-H 'Content-Type: application/json' \
-H "Mcp-Session-Id: <uuid>" \
-d '{
"jsonrpc": "2.0",
"id": "task-1",
"method": "tools/call",
"params": {
"name": "execute_task_steps_by_browser",
"arguments": {
"task_message": "<task-step>Go to example.com and click Login</task-step>"
}
}
}'
See for an end-to-end demonstration.
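The tools/call request body can also be assembled programmatically. The sketch below wraps each manual test step in a `<task-step>` element as in the curl example. Concatenating several `<task-step>` elements into one task_message is an assumption about what the server accepts, and both helper names are hypothetical.

```python
from typing import Iterable


def format_task_message(steps: Iterable[str]) -> str:
    """Wrap each step in <task-step> tags (multi-step concatenation is an assumption)."""
    return "".join(f"<task-step>{step.strip()}</task-step>" for step in steps)


def build_tool_call(steps: Iterable[str], request_id: str = "task-1") -> dict:
    """JSON-RPC 2.0 tools/call request for execute_task_steps_by_browser."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "execute_task_steps_by_browser",
            "arguments": {"task_message": format_task_message(steps)},
        },
    }
```

Serialise the result with `json.dumps` and send it with the Mcp-Session-Id header obtained during initialisation.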
4. Real-time SSE Stream
curl -N -H "Accept: text/event-stream" http://localhost:8002/message
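To consume the stream from code rather than curl, a client needs to group the incoming lines into events. Below is a minimal parser sketch for the Server-Sent Events wire format; it handles only the `event:` and `data:` fields and comment lines, and the event names the server actually emits are not documented here, so `progress` in the usage note is purely illustrative.

```python
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"event", "data"} dicts from an iterable of SSE text lines."""
    event_type = "message"          # SSE default event type
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue                # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]       # strip the single optional space after the colon
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
```

For example, feeding it `["event: progress\n", "data: step 1\n", "\n"]` yields one event with type `progress` and data `step 1`. With a live server, the lines could come from iterating over a `urllib.request.urlopen` response and decoding each line as UTF-8.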
Project Structure
├── browser_agent.py # Browser agent & AntD prompt
├── config.py # Environment-driven configuration loader
├── streamable-browser-use-mcp-server.py # FastAPI server (single endpoint)
├── start_server.py # Advanced launcher (banner, health-check)
├── example_usage.py # API demo
├── Makefile # Developer commands
└── requirements.txt # Dependencies
Contribution Guidelines
Coding Rules
- PEP 8, full type hints, f-strings, public docstrings.
- Prefer async/await for I/O-bound code.
- Run make format lint type-check before pushing.
Conventional Commits
<type>(<scope>): <short summary>
<body>
- type: feat, fix, docs, refactor, test, chore …
- scope: module or file name.
- Example: feat(agent): support AntD Select component.
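A commit header in this format can be checked mechanically. The following is a small sketch of such a check using a regular expression. It is not part of the project's tooling, the function name is hypothetical, and it deliberately does not restrict the type to a fixed list because the list above is open-ended.

```python
import re
from typing import Optional

# <type>(<scope>): <short summary>
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<summary>\S.*)$")


def parse_commit_header(header: str) -> Optional[dict]:
    """Return the type, scope and summary of a commit header, or None if it does not match."""
    match = HEADER_RE.match(header)
    return match.groupdict() if match else None
```

A header such as `feat(agent): support AntD Select component` parses into its three parts, while a free-form message like `update stuff` returns None.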
Workflow
- Fork → create feature branch from main.
- Ensure make validate passes (config, lint, type, tests).
- Open Pull Request with detailed description.
- Maintainers will review & merge.
GitHub Copilot & Cursor
Copilot/Cursor work best when the commit scope is clear. Add a short comment at the top of new files explaining the intent so Copilot has context. Keep prompts and tests near the code they exercise.
License
Released under the MIT License. See for full text.