browser-use-mcp

Matt-MFG/browser-use-mcp


A Model Context Protocol (MCP) server for browser automation using Playwright, deployed on Google Cloud Run.


Browser-Use MCP Server

A Model Context Protocol (MCP) server for browser automation using Playwright, deployed on Google Cloud Run. This project provides browser automation capabilities through both REST API and MCP protocol endpoints.

Features

  • 🌐 Browser Automation Tools:

    • Navigate to URLs
    • Take screenshots
    • Click elements
    • Fill form fields
    • Read page content
  • 🔧 Dual Protocol Support:

    • REST API endpoints for direct HTTP access
    • MCP protocol for AI agent integration
  • ☁️ Cloud-Native Design:

    • Deployed on Google Cloud Run
    • Docker containerized
    • Auto-scaling and serverless
  • 🔒 Security Features:

    • Domain allowlist
    • Google Cloud identity token authentication
    • Audit logging
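
For the identity token item above, a client would typically send the token as a bearer credential, which is how Cloud Run accepts authenticated invocations. The sketch below only builds the request and does not send it. The service URL and token are placeholders, `build_authed_request` is a hypothetical helper that is not part of this project, and the `/mcp/tools/<tool>` path is assumed to match the local REST endpoints.

```python
import json
import urllib.request


def build_authed_request(base_url: str, tool: str, payload: dict, id_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to a tool endpoint with a bearer token."""
    # Assumed endpoint layout: /mcp/tools/<tool>
    url = f"{base_url.rstrip('/')}/mcp/tools/{tool}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the deployed service expects a Google-signed identity token here.
            "Authorization": f"Bearer {id_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and token; a real token can come from `gcloud auth print-identity-token`.
    req = build_authed_request(
        "https://your-service.a.run.app", "navigate", {"url": "https://example.com"}, "TOKEN"
    )
    print(req.full_url, req.get_method())
```

Passing the returned object to `urllib.request.urlopen` would send the call; keeping construction separate makes the headers easy to inspect first.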

Quick Start

Local Development

  1. Clone the repository:

    git clone https://github.com/yourusername/browser-use-mcp.git
    cd browser-use-mcp

  2. Set up Python environment:

    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
    playwright install chromium

  3. Run the MCP server:

    uvicorn app:app --host 0.0.0.0 --port 8080

  4. Test with the chat interface:

    # In another terminal
    python3 -m http.server 8090
    # Open http://localhost:8090/chat_interface_v2.html

Using the MCP Server

REST API Examples

# Take a screenshot
curl -X POST http://localhost:8080/mcp/tools/screenshot \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "full_page": false}'

# Navigate to a URL
curl -X POST http://localhost:8080/mcp/tools/navigate \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com"}'

# Read page content
curl -X POST http://localhost:8080/mcp/tools/read \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "selector": "h1"}'

Python Client Example

import asyncio
import aiohttp

async def take_screenshot():
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:8080/mcp/tools/screenshot",
            json={"url": "https://github.com", "full_page": False},
        ) as resp:
            resp.raise_for_status()  # fail loudly on 4xx/5xx responses
            # The JSON result contains the base64-encoded screenshot
            return await resp.json()

result = asyncio.run(take_screenshot())
print(list(result))  # show the top-level fields of the response
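
Since the screenshot comes back base64 encoded, a caller usually wants to decode it and write it to a file. The following is a minimal sketch of that step. The field name `"screenshot"` is an assumption (the README does not document the response shape), and `save_screenshot` is a hypothetical helper.

```python
import base64
from pathlib import Path


def save_screenshot(result: dict, path: str, key: str = "screenshot") -> int:
    """Decode a base64 image from a tool response and write it to disk.

    `key` is an assumed field name; inspect the real response to find the right one.
    Returns the number of bytes written.
    """
    encoded = result.get(key)
    if not encoded:
        raise KeyError(f"response has no {key!r} field; got keys: {sorted(result)}")
    # Some APIs wrap the payload in a data URL; drop that prefix if present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    data = base64.b64decode(encoded)
    Path(path).write_bytes(data)
    return len(data)
```

If the key is missing, the error message lists the fields the response actually has, which is a quick way to find the correct name.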

Deployment

Google Cloud Run

  1. Build and deploy:

    gcloud builds submit --config cloudbuild.yaml

  2. Environment variables:
    • ALLOWED_DOMAINS: Comma-separated list of allowed domains
    • LOG_LEVEL: Logging level (default: INFO)
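
To illustrate how a comma-separated `ALLOWED_DOMAINS` value could drive the domain allowlist, here is a small sketch. The matching rule (exact host or any subdomain) is an assumption, not necessarily what the server does, and both function names are hypothetical.

```python
import os
from urllib.parse import urlparse


def parse_allowed_domains(raw: str) -> set[str]:
    """Split a comma-separated value into a set of lowercase domains."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_allowed(url: str, allowed: set[str]) -> bool:
    """Return True if the URL's host equals an allowed domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    # Assumed rule: "github.com" also admits "docs.github.com", but not "evilgithub.com".
    return any(host == d or host.endswith("." + d) for d in allowed)


if __name__ == "__main__":
    # The fallback list is only an example value for local experiments.
    allowed = parse_allowed_domains(os.environ.get("ALLOWED_DOMAINS", "example.com,github.com"))
    print(is_allowed("https://docs.github.com/en", allowed))
```

Matching on the parsed hostname rather than on the raw URL string avoids accepting lookalike hosts that merely contain an allowed domain as a substring.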

Project Structure

browser-use-mcp/
├── app.py                   # Main FastAPI application
├── mcp_tools.py             # Browser automation tool implementations
├── browser_manager.py       # Centralized browser lifecycle management
├── security_middleware.py   # Google Cloud logging integration
├── chat_interface_v2.html   # Web-based chat interface
├── requirements.txt         # Python dependencies
├── Dockerfile               # Container configuration
├── cloudbuild.yaml          # CI/CD pipeline
└── agents/                  # ADK agent configurations

Available Tools

Screenshot

Capture a screenshot of a webpage.

{
  "url": "https://example.com",
  "full_page": false
}

Navigate

Navigate to a URL and return page information.

{
  "url": "https://example.com"
}

Click

Click an element on the page.

{
  "url": "https://example.com",
  "selector": "button#submit",
  "return_screenshot": false
}

Fill

Fill a form field with text.

{
  "url": "https://example.com",
  "selector": "input[name='email']",
  "text": "user@example.com",
  "return_screenshot": false
}

Read

Extract text content from the page.

{
  "url": "https://example.com",
  "selector": "article"
}
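
The five payloads above can be collected into one small client-side helper that checks fields before a request is sent. This is a sketch under two assumptions: that Click and Fill follow the same `/mcp/tools/<name>` path as the endpoints shown in the REST examples, and that the boolean flags are optional while every other field is required. `TOOL_FIELDS` and `build_tool_call` are hypothetical names.

```python
# (required, optional) fields per tool, transcribed from the payloads above.
# Assumption: boolean flags are optional and all other fields are required.
TOOL_FIELDS = {
    "screenshot": ({"url"}, {"full_page"}),
    "navigate": ({"url"}, set()),
    "click": ({"url", "selector"}, {"return_screenshot"}),
    "fill": ({"url", "selector", "text"}, {"return_screenshot"}),
    "read": ({"url", "selector"}, set()),
}


def build_tool_call(tool: str, **params) -> tuple[str, dict]:
    """Return (endpoint path, JSON body) for a tool, rejecting missing or unknown fields."""
    if tool not in TOOL_FIELDS:
        raise ValueError(f"unknown tool: {tool}")
    required, optional = TOOL_FIELDS[tool]
    given = set(params)
    missing = required - given
    unknown = given - required - optional
    if missing or unknown:
        raise ValueError(f"{tool}: missing={sorted(missing)} unknown={sorted(unknown)}")
    # Assumed path pattern; only screenshot, navigate, and read appear in the REST examples.
    return f"/mcp/tools/{tool}", params


if __name__ == "__main__":
    path, body = build_tool_call("fill", url="https://example.com",
                                 selector="input[name='email']", text="user@example.com")
    print(path, body)
```

Validating locally turns a typo in a field name into an immediate error instead of a round trip to the server.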

Development

Running Tests

python test_all_tools.py

Building Docker Image

docker build -t browser-use-mcp .
docker run -p 8080:8080 browser-use-mcp

Troubleshooting

Common Issues

  1. CORS errors in the browser: The server includes CORS middleware, so make sure the page making the request is served from an origin the server allows.

  2. Playwright browser issues: Ensure Chromium is installed:

    playwright install chromium
    
  3. Port already in use: Check for running processes:

    lsof -i :8080
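
Where `lsof` is not available (for example on Windows), the same check can be approximated from Python. This is a generic sketch rather than project code, and `port_in_use` is a hypothetical helper; it reports whether anything accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print(port_in_use(8080))
```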
    

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments