Browser Use MCP by Cam10001110101 - MCP Server

MCP Browser Automation with Ollama

A powerful browser automation system that enables AI agents to control web browsers through the Model Context Protocol (MCP). This implementation is specifically designed to work with Ollama local models, providing a secure and efficient way to automate browser interactions using locally-hosted AI models.

Features

MCP Integration: Full support for Model Context Protocol for structured AI-browser communication
Ollama Model Support: Optimized for local AI models running through Ollama
Browser Control: Complete browser automation with Playwright (Chrome, Firefox, Safari)
AI-Driven Automation: Natural language browser control via local LLMs
Screenshot Capabilities: Visual feedback and debugging support
Session Management: Multiple browser sessions with automatic cleanup
Interactive Mode: Continuous feedback loop between AI and browser state
Optimized Display: Browser launches maximized (1920x1080) to minimize scrolling

Quick Start

Prerequisites

Python 3.8+
Ollama installed and running
uv package manager (recommended)

Installation

# Clone the repository
git clone https://github.com/Cam10001110101/mcp-server-browser-use-ollama
cd mcp-server-browser-use-ollama

# Install with uv (recommended)
uv pip install -e .
playwright install

# Start Ollama and pull a model
ollama serve  # In one terminal
ollama pull qwen3  # In another terminal

Usage

The system can be used in two modes:

Option 1: Direct MCP Integration (with Claude Desktop)

Configure in claude_desktop_config.json:

{
  "mcpServers": {
    "browser-use-ollama": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/src/server.py"]
    }
  }
}

Option 2: Ollama-Driven Automation

# Interactive automation with conversation history
python src/client.py src/server.py

# Custom task via command line
python src/client.py src/server.py "Navigate to Google and search for 'Ollama models'"

# Complex task from file
python src/client.py src/server.py task_description.txt --file

# With custom model
python src/client.py src/server.py "Your task" --model llama3.2:latest

Available Tools

The MCP server provides 10 browser automation tools:

launch_browser(url) - Launch browser and navigate to URL
click_element(session_id, x, y) - Click at coordinates
click_selector(session_id, selector) - Click element by CSS selector
type_text(session_id, text) - Type text at current position
scroll_page(session_id, direction) - Scroll page up/down
get_page_content(session_id) - Extract page text content
get_dom_structure(session_id, max_depth) - Get DOM tree
extract_data(session_id, pattern) - Extract structured data
take_screenshot(session_id) - Capture screenshot
close_browser(session_id) - Close browser session

Examples

Basic Web Search

python src/client.py src/server.py "Search for 'Ollama models' on Google and summarize the top 3 results"

E-commerce Analysis

python src/client.py src/server.py "Compare wireless headphones on Amazon - create a table with prices, ratings, and features"

Research Workflow

python src/client.py src/server.py "Research transformer architecture improvements in 2024, visit 5 sources, and compile a summary"

File-based Complex Tasks

# Create a task file
echo "Navigate to GitHub, search for MCP repositories, and analyze the top 5 results" > my_task.txt

# Run the task
python src/client.py src/server.py my_task.txt --file

Environment Variables

OLLAMA_MODEL: Specify Ollama model (default: qwen3)
OLLAMA_HOST: Ollama API endpoint (default: http://localhost:11434)

Testing

# Run pure MCP tests (recommended)
pytest tests/test_server_mcp.py -v

# Run all tests
pytest

# Run specific test categories
pytest tests/test_server_mcp.py    # Pure MCP implementation tests
pytest tests/test_integration.py   # Integration tests

Project Structure

mcp-server-browser-use-ollama/
├── src/                    # Core source code
│   ├── server.py          # MCP server implementation
│   └── client.py          # Interactive client with full automation capabilities
├── tests/                  # Test suite
├── docs/                   # Additional documentation
├── pyproject.toml         # Project configuration
└── README.md              # This file

Architecture

The system uses a client-server architecture with MCP protocol:

User → Client → MCP Protocol → Server → Playwright Browser

Server: Pure MCP SDK server providing browser automation tools
Client: Langchain-Ollama integration for natural language processing
Transport: stdio-based MCP communication
Browser: Playwright automation for cross-browser support

Key Features

Interactive Feedback Loop

The client maintains a continuous dialogue with Ollama for dynamic automation:

Ollama receives results after each action
Can adjust strategy based on browser state
Maintains full conversation history for context
Supports both command-line and file-based task input

Advanced Capabilities

Conversation History: 32k token context window for complex multi-step tasks
Action Parsing: JSON and heuristic parsing of LLM responses
File Input: Support for complex task descriptions from files
Model Selection: Easy switching between Ollama models
Debug Mode: Comprehensive logging for troubleshooting

Flexible Model Support

Works with any Ollama-compatible model
Optimized for coding models (qwen3, qwen2.5-coder:7b)
Configurable context windows and parameters
Temperature=0 for deterministic outputs

Robust Error Handling

Automatic browser session cleanup
Graceful recovery from parsing errors
Comprehensive logging for debugging

Cam10001110101/mcp-server-browser-use-ollama

MCP Browser Automation with Ollama

Features

Quick Start

Prerequisites

Installation

Usage

Option 1: Direct MCP Integration (with Claude Desktop)

Option 2: Ollama-Driven Automation

Available Tools

Examples

Basic Web Search

E-commerce Analysis

Research Workflow

File-based Complex Tasks

Environment Variables

Testing

Project Structure

Architecture

Key Features

Interactive Feedback Loop

Advanced Capabilities

Flexible Model Support

Robust Error Handling

Related MCP Servers

mcp-playwright

deepwiki-mcp

tavily-mcp

brightdata-mcp

Fetch

mcp-local-rag

duckduckgo-mcp-server

mcp-server-serper

ai-agent-marketplace-index-mcp

mcp-browser-use

mcp-server

mcp-selenium

web-eval-agent

browser-tools-mcp

Notte Browser

cursor-talk-to-figma-mcp

puppeteer-mcp-server

Redbook-Search-Comment-MCP2.0

g-search-mcp

playwright-plus-python-mcp

crawl4ai-mcp-server

xhs-mcp-server

mcp-tool-kit

Sketch-Context-MCP

mcp-ui

chrome-tools-MCP

visual-ui-debug-agent-mcp