hubertusgbecker/mcp-browser-use-server
mcp-browser-use-server
An MCP server that enables AI agents to control web browsers using browser-use.
Overview
The MCP Browser Use Server provides a robust, production-ready interface for AI agents to interact with web browsers through the Model Context Protocol (MCP). It supports both Server-Sent Events (SSE) and stdio transports, enabling seamless integration with various AI assistants and tools.
Key Features
- Browser Automation: Full browser control through AI agents
- Dual Transport: SSE and stdio protocol support
- VNC Streaming: Real-time browser visualization
- Async Tasks: Non-blocking browser operations
- Docker Support: Containerized deployment with docker-compose
- Persistent Sessions: Long-lived browser sessions with live inspection, tab control and content extraction (see `server/session.py`)
- Comprehensive Tests: Industry-ready test suite with 95%+ coverage
- Production Ready: Robust error handling and resource management
Quick Start
Prerequisites
Ensure you have the following installed:
- uv - Fast Python package manager
- Playwright - Browser automation library
- mcp-proxy - Required for stdio mode
uv (Astral) is required for Python workflows. For CI (GitHub Actions), prefer the `astral-sh/setup-uv@v5` action, which installs and configures `uv` on the runner with caching support. Example:
```yaml
- name: Set up Python and uv
  uses: astral-sh/setup-uv@v5
  with:
    enable-cache: true
    python-version: "3.11"
```
# Install mcp-proxy globally
uv tool install mcp-proxy
# Update shell to recognize new tools
uv tool update-shell
### Environment Configuration
Create a `.env` file in the project root:
```bash
# Required: OpenAI API key for LLM
# Recommended: set `LLM_MODEL` environment variable to your preferred model
# Default: `gpt-5-mini` (you can override with models like `gpt-4-turbo` or `gpt-4o` if available)
OPENAI_API_KEY=your-openai-api-key-here
LLM_MODEL=gpt-5-mini
# Optional: Custom Chrome/Chromium path
CHROME_PATH=/path/to/chrome
# Optional: Patient mode (wait for task completion)
PATIENT=false
# Optional: Logging level
LOG_LEVEL=INFO
# Control headless browser mode used by the server (true = headless, false = visible)
# Default: true
BROWSER_HEADLESS=true
```
Security Note: Never commit your `.env` file to version control. It's already included in `.gitignore`.
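As a rough illustration of how these variables might be consumed, the sketch below reads them with the defaults documented above. The `Settings` dataclass and `load_settings` function are hypothetical names introduced here, not part of the project; the actual server may parse its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the .env example."""

    openai_api_key: str
    llm_model: str
    chrome_path: Optional[str]
    patient: bool
    log_level: str
    browser_headless: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # The README marks this variable as required.
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        llm_model=env.get("LLM_MODEL", "gpt-5-mini"),
        chrome_path=env.get("CHROME_PATH") or None,
        patient=_as_bool(env.get("PATIENT", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        browser_headless=_as_bool(env.get("BROWSER_HEADLESS", "true")),
    )
```

Passing an explicit mapping, for example `load_settings({"OPENAI_API_KEY": "k"})`, keeps the function easy to test without touching the real environment.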
Installation
From Source (Development)
# Clone the repository
git clone https://github.com/hubertusgbecker/mcp-browser-use-server.git
cd mcp-browser-use-server
# Install dependencies with uv
uv sync
# Install Playwright browsers
uv pip install playwright
uv run playwright install --with-deps --no-shell chromium
As a Package (Production)
# Build and install from source
uv build
uv tool install dist/mcp_browser_use_server-*.whl
Usage
SSE Mode (Recommended for Web Interfaces)
Server-Sent Events mode is ideal for web-based integrations and remote access.
# Run from source
uv run server --port ${HOST_PORT:-8081}
# Or if installed as tool
mcp-browser-use-server run server --port ${HOST_PORT:-8081}
The server will be available at http://localhost:${HOST_PORT:-8081}/sse
stdio Mode (For Local AI Assistants)
Standard I/O mode is perfect for local AI assistants like Claude Desktop, Cursor, or Windsurf.
# Build and install as global tool
uv build
uv tool uninstall mcp-browser-use-server 2>/dev/null || true
uv tool install dist/mcp_browser_use_server-*.whl
# Run with stdio transport
mcp-browser-use-server run server --port ${HOST_PORT:-8081} --stdio --proxy-port ${PROXY_PORT:-9000}
Note: this package also exposes a CLI alias mcp-browser-use-cli that points to the same entrypoint. You can use either mcp-browser-use-server or mcp-browser-use-cli after installation.
Docker Deployment
Docker provides an isolated, reproducible environment with VNC support for visualization.
# Using docker-compose (recommended)
docker-compose up -d
# View logs
docker-compose logs -f
# Stop services
docker-compose down
# Or use Docker directly
docker build -t mcp-browser-use-server .
docker run --rm -p ${HOST_PORT:-8081}:8081 -p 5900:5900 \
-e OPENAI_API_KEY=your-key \
mcp-browser-use-server
Docker Compose Profiles
The docker-compose.yaml supports multiple profiles:
# Development mode with hot reload
docker-compose --profile dev up
# With stdio proxy
docker-compose --profile stdio up
# With monitoring (Prometheus + Grafana)
docker-compose --profile monitoring up
Client Configuration
SSE Mode Configuration
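For web-based clients and remote connections. Client config files are typically read as literal JSON with no shell expansion, so use the actual port number (8081 is the default):
{
  "mcpServers": {
    "mcp-browser-use-server": {
      "url": "http://localhost:8081/sse"
    }
  }
}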
stdio Mode Configuration
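For local AI assistants. As with SSE mode, client config files are typically read as literal JSON with no shell expansion, so use the actual port numbers (8081 and 9000 are the defaults):
{
  "mcpServers": {
    "mcp-browser-use-server": {
      "command": "mcp-browser-use-server",
      "args": [
        "run",
        "server",
        "--port",
        "8081",
        "--stdio",
        "--proxy-port",
        "9000"
      ],
      "env": {
        "OPENAI_API_KEY": "your-api-key"
      }
    }
  }
}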
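A sketch of generating the stdio entry programmatically, with the ports resolved from the environment before the JSON is written. `build_stdio_entry` is a hypothetical helper; the command and argument list simply mirror the configuration above.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def build_stdio_entry(
    api_key: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Return an mcpServers entry with concrete port values filled in."""
    env = os.environ if env is None else env
    port = env.get("HOST_PORT", "8081")          # default from this README
    proxy_port = env.get("PROXY_PORT", "9000")   # default from this README
    return {
        "mcpServers": {
            "mcp-browser-use-server": {
                "command": "mcp-browser-use-server",
                "args": [
                    "run", "server",
                    "--port", port,
                    "--stdio",
                    "--proxy-port", proxy_port,
                ],
                "env": {"OPENAI_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Print the entry so it can be pasted into a client config file.
    print(json.dumps(build_stdio_entry("your-api-key"), indent=2))
```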
Configuration Paths by Client
| Client | Configuration File Path |
|---|---|
| Cursor | ./.cursor/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| Claude (Mac) | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Claude (Win) | %APPDATA%\Claude\claude_desktop_config.json |
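The table can also be expressed as a small lookup, which may be handy in setup scripts. This is only a sketch: `config_path` is a hypothetical function, and it covers just the clients and platforms listed above.

```python
import os
import sys
from pathlib import Path


def config_path(client: str, platform: str = sys.platform, project_dir: str = ".") -> Path:
    """Return the MCP config file path for a client, following the table above."""
    home = Path.home()
    if client == "cursor":
        # Cursor's path is relative to the project directory.
        return Path(project_dir) / ".cursor" / "mcp.json"
    if client == "windsurf":
        return home / ".codeium" / "windsurf" / "mcp_config.json"
    if client == "claude":
        if platform == "darwin":
            return (home / "Library" / "Application Support"
                    / "Claude" / "claude_desktop_config.json")
        if platform.startswith("win"):
            appdata = os.environ.get("APPDATA")
            if not appdata:
                raise RuntimeError("APPDATA is not set")
            return Path(appdata) / "Claude" / "claude_desktop_config.json"
        raise ValueError(f"no Claude path listed for platform {platform!r}")
    raise ValueError(f"unknown client {client!r}")
```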
Testing
The project includes a comprehensive test suite for ensuring reliability and correctness.
Running Tests
# Run all tests
./run_tests.sh all
# Run only unit tests
./run_tests.sh unit
# Run integration tests
./run_tests.sh integration
# Run with coverage report
./run_tests.sh coverage
# Run fast tests only (skip slow tests)
./run_tests.sh fast
# Or use pytest directly
uv run pytest tests/
uv run pytest tests/ -v --cov=src --cov=server
CI
--
The repository includes a GitHub Actions workflow that runs ruff and the fast test
suite on pushes and pull requests to `main`. The workflow file is at
`.github/workflows/ci-lint-test.yml`.
Run the same checks locally before pushing:
```bash
# Format and lint with ruff (via the project's uv wrapper)
uv run ruff format .
uv run ruff check .
# Run the fast test suite (the same target used by CI)
./run_tests.sh fast
```
### Test Categories
- **Unit Tests**: Fast, isolated tests for individual functions
- **Integration Tests**: Test component interactions
- **E2E Tests**: Full system tests (require `RUN_E2E_TESTS=true`)
- **Performance Tests**: Load and performance validation
- **MAGG Integration Tests**: Shell script for testing Magg aggregator integration
### MAGG Integration Testing
The project includes a shell script for validating integration with the [Magg](https://github.com/sparfenyuk/magg) aggregator:
```bash
# Run MAGG integration test
./tests/test_magg_integration.sh
# The script automatically:
# - Reads HOST_PORT from .env file if present
# - Checks Docker container status and health
# - Verifies Magg configuration
# - Tests browser tools availability through Magg
# - Validates end-to-end browser automation (hubertusbecker.com summary)
```
Requirements:
- Docker container running on the configured port (default: 8081)
- Magg running and configured (see `.magg/config.json`)
- The `mbro` command-line tool available
Port Configuration: The script automatically reads HOST_PORT from .env:
# .env
HOST_PORT=8083 # Match your docker-compose port mapping
If the health check fails, the script provides troubleshooting steps.
CI/CD Integration
Tests are designed to run in CI/CD pipelines. See .github/workflows/ for GitHub Actions examples.
VNC Browser Visualization
Watch browser automation in real-time using VNC:
# Start server with VNC (Docker)
docker-compose up -d
# Connect using noVNC (browser-based)
git clone https://github.com/novnc/noVNC
cd noVNC
./utils/novnc_proxy --vnc localhost:5900
# Or use any VNC client
# Default password: browser-use
Then open http://localhost:6080/vnc.html in your browser to watch the automation.
Development
Local Development Workflow
# Install development dependencies
uv sync --all-extras
# Run linters and formatters
uv run ruff check .
uv run ruff format .
uv run black .
uv run isort .
# Type checking
uvx ty check .
# Run tests during development
uv run pytest tests/ -v
# Build package
uv build
Project Structure
mcp-browser-use-server/
├── src/
│ └── mcp_browser_use_server/ # Package source
│ ├── __init__.py
│ ├── cli.py # CLI interface
│ └── server.py # Server re-exports
├── server/
│ ├── __init__.py
│ ├── __main__.py
│ └── server.py # Core server implementation
├── tests/ # Comprehensive test suite
│ ├── conftest.py # Pytest fixtures
│ ├── test_config.py # Configuration tests
│ ├── test_browser_tasks.py # Browser task tests
│ ├── test_mcp_server.py # MCP server tests
│ ├── test_integration.py # Integration tests
│ ├── test_e2e.py # End-to-end tests
│ └── test_performance.py # Performance tests
├── docker-compose.yaml # Docker orchestration
├── Dockerfile # Container definition
├── pyproject.toml # Project configuration
├── pytest.ini # Test configuration
└── README.md # This file
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes and add tests
- Run the test suite (`./run_tests.sh all`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Code Quality Standards
- Test Coverage: Maintain >95% coverage
- Type Hints: All functions must have type annotations
- Documentation: Docstrings for all public APIs
- Linting: Code must pass ruff and ty checks
- Formatting: Use black and isort for consistent formatting
Example Usage
Basic Browser Task
Ask your AI assistant:
Navigate to https://news.ycombinator.com and return the top ranked article's title and URL
Advanced Examples
# Search and extract
Go to Google, search for "MCP protocol", and summarize the first 3 results
# Form interaction
Navigate to the contact form at example.com/contact, fill in the fields with test data, and submit
# Data extraction
Visit GitHub's trending page and list the top 5 trending repositories today
# Screenshot capture
Navigate to example.com and take a screenshot of the homepage
API Reference
Available MCP Tools
The server exposes the following tools via MCP:
- `run_browser_task`: Execute a browser automation task
  - Parameters: `instruction` (string), `task_id` (optional)
  - Returns: Task ID for async tracking
- `get_task_status`: Check the status of a running task
  - Parameters: `task_id` (string)
  - Returns: Task status, progress, and results
- `cancel_task`: Cancel a running browser task
  - Parameters: `task_id` (string)
  - Returns: Cancellation confirmation
- `list_all_tasks`: List all tasks
  - Returns: Array of all tasks with their statuses
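Because `run_browser_task` returns a task ID rather than a result, a client typically polls `get_task_status` until the task finishes. The sketch below shows that loop against any async `call_tool(name, arguments)` callable. The `task_id` and `status` response fields and the terminal status names are assumptions made for illustration; check the server's actual responses before relying on them.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical signature for "invoke an MCP tool and return its parsed result".
ToolCaller = Callable[[str, Dict[str, Any]], Awaitable[Dict[str, Any]]]

# Assumed terminal status names; the server's actual values may differ.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


async def run_and_wait(
    call_tool: ToolCaller,
    instruction: str,
    poll_interval: float = 1.0,
    timeout: float = 120.0,
) -> Dict[str, Any]:
    """Start a browser task, then poll its status until it ends or times out."""
    started = await call_tool("run_browser_task", {"instruction": instruction})
    task_id = started["task_id"]  # assumed response field
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await call_tool("get_task_status", {"task_id": task_id})
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if loop.time() >= deadline:
            # Give up and ask the server to stop the task.
            await call_tool("cancel_task", {"task_id": task_id})
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        await asyncio.sleep(poll_interval)
```

Taking `call_tool` as a parameter keeps the loop independent of any particular MCP client library.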
New session and browser tools
- `browser_get_state`: Get current browser state for a task or live session
  - Parameters: `task_id` (optional) or `session_id` (optional), `screenshot` (bool)
  - Returns: JSON state summary; `screenshot=true` returns base64 PNG
- `browser_navigate`: Navigate an existing or new live session to a URL
  - Parameters: `url` (string), optional `session_id`
  - Returns: Confirmation and updated session state
- `browser_click`: Click an element in a live session (by index)
  - Parameters: `session_id`, `element_index`
  - Returns: Action result
- `browser_extract_content`: Use session extraction (LLM-assisted) to extract or summarize content
  - Parameters: `session_id`, `instruction` (string)
  - Returns: Extracted text or structured result
- Session management: `browser_list_sessions`, `browser_close_session` to list and close long-lived sessions
- Tabs API: `browser_list_tabs`, `browser_switch_tab`, `browser_close_tab` for tab control
These tools are implemented in server/server.py and rely on the session manager server/session.py.
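On the wire, MCP clients invoke a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `browser_click` so the parameter names above can be seen in context. `tool_call_request` is a hypothetical helper and the session ID is a placeholder; in practice an MCP client library constructs these messages for you.

```python
import itertools
import json
from typing import Any, Dict

# Simple incrementing request IDs, as JSON-RPC requires a unique id per request.
_request_ids = itertools.count(1)


def tool_call_request(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that calls an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "example-session" is a placeholder, not a real session ID.
    request = tool_call_request(
        "browser_click",
        {"session_id": "example-session", "element_index": 4},
    )
    print(json.dumps(request, indent=2))
```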
Available MCP Resources
- Task Results: Access completed task results via `task://{task_id}`
- Task History: View task execution history
Troubleshooting
Common Issues
Issue: OPENAI_API_KEY not set
# Solution: Set the API key in .env or environment
export OPENAI_API_KEY=your-key-here
Issue: Chrome/Chromium not found
# Solution: Install Chromium via Playwright
uv run playwright install chromium --with-deps
# Or specify custom path in .env
CHROME_PATH=/usr/bin/chromium-browser
Issue: Tests failing
# Solution: Ensure all dependencies are installed
uv sync --all-extras
uv run playwright install chromium
# Run tests with verbose output
uv run pytest tests/ -vv
Issue: Port already in use
# Solution: Use a different port
uv run server --port 8001
# Or find and kill the process using the port
lsof -ti:8081 | xargs kill -9
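If you would rather pick a free port than kill the process holding the default one, a short standard-library check can find one. This is a sketch with hypothetical helper names; a port reported as free can still be taken by another process before the server starts.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 8081, attempts: int = 20) -> int:
    """Scan upward from `start` and return the first port that binds."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")


if __name__ == "__main__":
    # The result can be passed to the server, e.g. `uv run server --port <port>`.
    print(first_free_port())
```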
Debug Mode
Enable detailed logging for troubleshooting:
# Set log level in .env
LOG_LEVEL=DEBUG
# Or via environment variable
LOG_LEVEL=DEBUG uv run server --port 8081
Architecture
The server follows a modular architecture:
- MCP Server: Handles protocol communication (SSE/stdio)
- Browser Manager: Manages browser instances and contexts
- Task Queue: Async task execution and tracking
- Agent System: AI-powered browser automation via browser-use
- Resource Manager: Memory and cleanup management
See for detailed architecture documentation.
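To make the Task Queue component more concrete, here is a minimal sketch of non-blocking task tracking with `asyncio`. It is not the project's implementation: `TaskRegistry`, its methods, and the status names are hypothetical and only illustrate the submit, status, and cancel pattern described above.

```python
import asyncio
import uuid
from typing import Any, Callable, Coroutine, Dict, Optional


class TaskRegistry:
    """Track background asyncio tasks by ID (illustrative only)."""

    def __init__(self) -> None:
        self._tasks: Dict[str, "asyncio.Task[Any]"] = {}

    def submit(
        self,
        job: Callable[[], Coroutine[Any, Any, Any]],
        task_id: Optional[str] = None,
    ) -> str:
        """Schedule job() in the background and return its ID immediately."""
        task_id = task_id or uuid.uuid4().hex
        self._tasks[task_id] = asyncio.create_task(job())
        return task_id

    def status(self, task_id: str) -> Dict[str, Any]:
        """Report whether a task is running, cancelled, failed, or completed."""
        task = self._tasks[task_id]
        if not task.done():
            return {"status": "running"}
        if task.cancelled():
            return {"status": "cancelled"}
        error = task.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": task.result()}

    def cancel(self, task_id: str) -> bool:
        """Request cancellation; returns False if the task already finished."""
        return self._tasks[task_id].cancel()
```

`submit` must be called from inside a running event loop, since `asyncio.create_task` schedules the coroutine on the current loop.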
Performance
- Concurrent Tasks: Supports multiple simultaneous browser tasks
- Memory Management: Automatic cleanup of old tasks and browser contexts
- Resource Limits: Configurable via Docker compose
- Async Operations: Non-blocking task execution
Typical performance metrics:
- Task startup: <2 seconds
- Simple navigation: 3-5 seconds
- Complex automation: 10-30 seconds
- Memory per browser: 200-500 MB
Security
Best Practices
- Never commit `.env` files or API keys to version control
- Use Docker secrets for production deployments
- Limit exposed ports in production
- Set resource limits via Docker to prevent DoS
- Validate all user inputs in browser tasks
- Use HTTPS for SSE mode in production
Environment Isolation
Docker provides strong isolation. For additional security:
# In docker-compose.yaml
security_opt:
  - no-new-privileges:true
read_only: true
tmpfs:
  - /tmp
  - /var/tmp
License
This project is licensed under the MIT License. See for details.
Copyright (c) 2025 Dr. Hubertus Becker
Acknowledgments
- Built on browser-use for browser automation
- Uses MCP for AI agent communication
- Powered by Playwright for browser control
- Package management via uv
Support & Contact
- GitHub Issues: Report bugs or request features
- Discussions: Join the conversation
- Email: For private inquiries, contact via GitHub
Changelog
Version 0.9.5 (2025-11-11)
- Complete rebranding and optimization
- Comprehensive test suite added
- Docker Compose support
- Enhanced documentation
- Production-ready deployment configuration