browser-use-mcp by Matt-MFG - MCP Server

Browser-Use MCP Server

A Model Context Protocol (MCP) server for browser automation using Playwright, deployed on Google Cloud Run. It exposes the same browser automation tools through both REST API endpoints and MCP endpoints.

Features

🌐 Browser Automation Tools:
- Navigate to URLs
- Take screenshots
- Click elements
- Fill form fields
- Read page content
🔧 Dual Protocol Support:
- REST API endpoints for direct HTTP access
- MCP protocol for AI agent integration
☁️ Cloud-Native Design:
- Deployed on Google Cloud Run
- Docker containerized
- Auto-scaling and serverless
🔒 Security Features:
- Domain allowlist
- Google Cloud identity token authentication
- Audit logging

Quick Start

Local Development

Clone the repository:

git clone https://github.com/yourusername/browser-use-mcp.git
cd browser-use-mcp

Set up Python environment:

python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
playwright install chromium

Run the MCP server:

uvicorn app:app --host 0.0.0.0 --port 8080

Test with the chat interface:

# In another terminal
python3 -m http.server 8090
# Open http://localhost:8090/chat_interface_v2.html

Using the MCP Server

REST API Examples

# Take a screenshot
curl -X POST http://localhost:8080/mcp/tools/screenshot \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "full_page": false}'

# Navigate to a URL
curl -X POST http://localhost:8080/mcp/tools/navigate \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com"}'

# Read page content
curl -X POST http://localhost:8080/mcp/tools/read \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "selector": "h1"}'

Python Client Example

import asyncio
import aiohttp

async def take_screenshot():
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:8080/mcp/tools/screenshot",
            json={"url": "https://github.com", "full_page": False}
        ) as resp:
            resp.raise_for_status()  # fail loudly on HTTP errors
            # The JSON body contains the base64-encoded screenshot
            return await resp.json()

result = asyncio.run(take_screenshot())
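
The client example notes that the JSON result carries a base64-encoded screenshot. A minimal sketch of turning that into an image file follows. The response field name `screenshot` is an assumption (the README does not document the response schema), so inspect a real response and adjust the `key` argument as needed. `save_screenshot` is a hypothetical helper, not part of this project.

```python
import base64
from pathlib import Path


def save_screenshot(result: dict, path: str, key: str = "screenshot") -> int:
    """Decode the base64 string stored under `key` and write it to `path`.

    `key` is an assumed field name; returns the number of bytes written.
    """
    image_bytes = base64.b64decode(result[key])
    Path(path).write_bytes(image_bytes)
    return len(image_bytes)
```

Calling `save_screenshot(result, "page.png")` on the parsed JSON response would write the decoded image to `page.png`, provided the field name matches.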

Deployment

Google Cloud Run

Build and deploy:

gcloud builds submit --config cloudbuild.yaml
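
Since the feature list mentions Google Cloud identity token authentication, calls to the deployed service will likely need a token in the `Authorization` header. The sketch below assumes the `gcloud` CLI is installed and logged in, and that the service accepts a standard Bearer token. Both helper names are hypothetical, and the exact authentication setup of this project may differ.

```python
import subprocess


def fetch_identity_token() -> str:
    """Ask the gcloud CLI for an identity token for the active account."""
    completed = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,           # raise if gcloud exits with an error
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()


def auth_headers(token: str) -> dict[str, str]:
    """Build request headers that carry the token as a Bearer credential."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Pass `auth_headers(fetch_identity_token())` as the `headers` argument of an HTTP client call when targeting the Cloud Run URL instead of `localhost`.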

Environment variables:
- ALLOWED_DOMAINS: Comma-separated list of allowed domains
- LOG_LEVEL: Logging level (default: INFO)
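
To illustrate how a comma-separated `ALLOWED_DOMAINS` value could be interpreted, here is a minimal sketch of an allowlist check. It is not taken from the project's code. The matching rule (exact host or any subdomain of a listed domain) is an assumption, and the function names are hypothetical.

```python
import os
from urllib.parse import urlparse


def parse_allowed_domains(raw: str) -> set[str]:
    """Split a comma-separated list into a set of lowercase domains."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def load_allowed_domains() -> set[str]:
    """Read ALLOWED_DOMAINS from the environment (empty set if unset)."""
    return parse_allowed_domains(os.environ.get("ALLOWED_DOMAINS", ""))


def is_allowed(url: str, allowed: set[str]) -> bool:
    """Return True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    # Assumption: subdomains of a listed domain are allowed as well.
    return any(host == d or host.endswith("." + d) for d in allowed)


allowed = parse_allowed_domains("example.com, GitHub.com")
```

Requiring a leading dot for subdomain matches keeps look-alike hosts such as `evil-example.com` from passing a check against `example.com`.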

Project Structure

browser-use-mcp/
├── app.py                    # Main FastAPI application
├── mcp_tools.py             # Browser automation tool implementations
├── browser_manager.py       # Centralized browser lifecycle management
├── security_middleware.py   # Google Cloud logging integration
├── chat_interface_v2.html   # Web-based chat interface
├── requirements.txt         # Python dependencies
├── Dockerfile              # Container configuration
├── cloudbuild.yaml         # CI/CD pipeline
└── agents/                 # ADK agent configurations

Available Tools

Screenshot

Capture a screenshot of a webpage.

{
  "url": "https://example.com",
  "full_page": false
}

Navigate

Navigate to a URL and return page information.

{
  "url": "https://example.com"
}

Click

Click an element on the page.

{
  "url": "https://example.com",
  "selector": "button#submit",
  "return_screenshot": false
}

Fill

Fill a form field with text.

{
  "url": "https://example.com",
  "selector": "input[name='email']",
  "text": "user@example.com",
  "return_screenshot": false
}

Read

Extract text content from the page.

{
  "url": "https://example.com",
  "selector": "article"
}
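
The payloads above can be assembled programmatically. The sketch below builds the endpoint URL and JSON body for a tool call and rejects calls with missing fields. Two things are assumptions: that `click` and `fill` live under the same `/mcp/tools/<name>` path as the documented `screenshot`, `navigate`, and `read` endpoints, and which fields are mandatory (inferred from the examples, not from a schema). `build_request` is a hypothetical helper.

```python
import json

BASE_URL = "http://localhost:8080"

# Assumed required fields per tool, inferred from the example payloads.
REQUIRED_FIELDS = {
    "screenshot": {"url"},
    "navigate": {"url"},
    "click": {"url", "selector"},
    "fill": {"url", "selector", "text"},
    "read": {"url"},
}


def build_request(tool: str, **params) -> tuple[str, str]:
    """Return (endpoint URL, JSON body) for a tool call after basic validation."""
    if tool not in REQUIRED_FIELDS:
        raise ValueError(f"unknown tool: {tool}")
    missing = REQUIRED_FIELDS[tool] - set(params)
    if missing:
        raise ValueError(f"{tool} is missing fields: {sorted(missing)}")
    return f"{BASE_URL}/mcp/tools/{tool}", json.dumps(params)
```

For example, `build_request("fill", url="https://example.com", selector="input[name='email']", text="user@example.com")` yields the URL and body to send with any HTTP client.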

Development

Running Tests

python test_all_tools.py

Building Docker Image

docker build -t browser-use-mcp .
docker run -p 8080:8080 browser-use-mcp

Troubleshooting

Common Issues

CORS errors in browser: The server includes CORS middleware. Make sure the page sending requests is served from an origin the middleware allows.
Playwright browser issues: Ensure Chromium is installed:
```
playwright install chromium
```
Port already in use: Check for running processes:
```
lsof -i :8080
```
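
On systems without `lsof`, the same check can be approximated from Python. This is a minimal sketch using only the standard library. It reports whether something accepts TCP connections on the port, but unlike `lsof` it does not identify the process. `port_in_use` is a hypothetical helper.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Running `port_in_use(8080)` before starting `uvicorn` tells you whether another process is already listening on the default port.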

Contributing

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

- Built with FastAPI
- Browser automation powered by Playwright
- MCP implementation using FastMCP
- Deployed on Google Cloud Run