MCP-Server-for-Browser-Automation by nandanadileep - MCP Server

Browser Automation MCP Server

This project implements a Model Context Protocol (MCP) server that exposes Playwright-based browser automation as callable tools.

Project Structure

MCP-Server/
├── browser_automation_server.py    # Browser automation MCP server
├── browser_automation_config.json  # Configuration for the server
├── test_browser_automation.py      # Basic test client
├── browser_automation_agent.py     # Advanced client with LLM agent
├── mcp-env/                        # Virtual environment with dependencies
└── README.md                       # This file

Features

The browser automation server provides the following tools:

start_browser() - Start a new browser instance
navigate_to_url(url) - Navigate to a specific URL
click_element(selector) - Click on an element using CSS selector
type_text(selector, text) - Type text into an input field
get_page_content() - Get the current page content
take_screenshot(filename) - Take a screenshot of the current page
close_browser() - Close the browser
browser_automation_task(task_description, website_url) - Perform a complete automation task
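
The tool list above can double as a client-side checklist. The following is a minimal sketch, not part of the project: `TOOL_ARGUMENTS` and `check_tool_call` are hypothetical names, and the sketch assumes every listed parameter is required, which the server itself may not enforce (some parameters could have defaults).

```python
# Argument names per tool, copied from the signatures listed in this README.
# Assumption: all of them are required; the real server may define defaults.
TOOL_ARGUMENTS = {
    "start_browser": (),
    "navigate_to_url": ("url",),
    "click_element": ("selector",),
    "type_text": ("selector", "text"),
    "get_page_content": (),
    "take_screenshot": ("filename",),
    "close_browser": (),
    "browser_automation_task": ("task_description", "website_url"),
}


def check_tool_call(name: str, arguments: dict) -> list[str]:
    """Return a list of problems with a planned tool call (empty if none)."""
    if name not in TOOL_ARGUMENTS:
        return [f"unknown tool: {name}"]
    expected = set(TOOL_ARGUMENTS[name])
    given = set(arguments)
    problems = [f"missing argument: {arg}" for arg in sorted(expected - given)]
    problems += [f"unexpected argument: {arg}" for arg in sorted(given - expected)]
    return problems
```

Running such a check before `session.call_tool(...)` catches typos in tool or argument names without starting a browser.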

Setup

Virtual Environment: The project uses a virtual environment located in mcp-env/
Dependencies: The following packages are installed:
- fastmcp - For creating MCP servers
- mcp_use - For creating MCP clients and agents
- playwright - For browser automation
- langchain-ollama - For Ollama integration (optional)

Install Playwright Browsers:

source mcp-env/bin/activate
playwright install

Usage

1. Basic Test

Run the basic test to verify that browser automation works end to end:

./mcp-env/bin/python test_browser_automation.py

This will:

Start a browser
Navigate to Google
Take a screenshot
Close the browser

2. Advanced Agent Mode

For LLM-powered browser automation, use the agent client:

./mcp-env/bin/python browser_automation_agent.py

This requires Ollama to be installed and running with the llama3.1:8b model.
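
Before starting the agent, it can help to confirm that Ollama is reachable and that the model has been pulled. The sketch below is not part of the project; it assumes Ollama listens on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally available models. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; adjust if you changed it


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [entry.get("name", "") for entry in tags_payload.get("models", [])]


def ollama_has_model(model: str, base_url: str = OLLAMA_URL, timeout: float = 3.0) -> bool:
    """Return True if Ollama answers and lists `model`, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return False  # not running, unreachable, or an unexpected response body
    return model in model_names(payload)
```

Calling `ollama_has_model("llama3.1:8b")` returns `False` both when Ollama is not running and when the model is missing, so a `False` result only says that one of the two prerequisites is not met.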

3. Manual Tool Calls

You can also create your own client to make specific tool calls:

import asyncio
import json
from mcp_use import MCPClient

async def main():
    with open("browser_automation_config.json", "r") as f:
        config = json.load(f)

    client = MCPClient.from_dict(config)
    await client.create_all_sessions()
    
    session = client.get_session("browser_automation")
    
    # Start browser
    await session.call_tool("start_browser", {})
    
    # Navigate to a website
    await session.call_tool("navigate_to_url", {"url": "https://example.com"})
    
    # Take a screenshot
    await session.call_tool("take_screenshot", {"filename": "my_screenshot.png"})
    
    # Close browser
    await session.call_tool("close_browser", {})
    
    await client.close_all_sessions()

asyncio.run(main())
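
If a tool call raises in the middle of a sequence like the one above, the browser is never closed. One possible way to guard against that is a small helper that wraps the steps in `try`/`finally`. This is a sketch only: `run_steps` is a hypothetical name, and it assumes nothing about the session beyond the async `call_tool(name, arguments)` method already used in this README.

```python
async def run_steps(session, steps):
    """Call each (tool_name, arguments) step in order and collect the results.

    `session` is any object with an async `call_tool(name, arguments)` method,
    such as the one returned by `client.get_session("browser_automation")`.
    """
    results = []
    await session.call_tool("start_browser", {})
    try:
        for tool_name, arguments in steps:
            results.append(await session.call_tool(tool_name, arguments))
    finally:
        # Runs even if a step raises, so the browser is not left open.
        await session.call_tool("close_browser", {})
    return results
```

For example, `await run_steps(session, [("navigate_to_url", {"url": "https://example.com"}), ("take_screenshot", {"filename": "my_screenshot.png"})])` performs the same navigation and screenshot, and still closes the browser if one of the steps fails.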

Configuration

The browser_automation_config.json file tells MCP clients how to launch the server (the command to run and its arguments):

{
  "mcpServers": {
    "browser_automation": {
      "command": "python",
      "args": ["browser_automation_server.py"]
    }
  }
}
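
Because the command in this config is the bare name `python`, it is resolved from `PATH`. If the virtual environment is not activated, the server may start under an interpreter that lacks the project's dependencies. One possible workaround, sketched here with the hypothetical helper `pin_interpreter`, is to rewrite the command to the interpreter running the client before passing the config on:

```python
import json
import sys


def pin_interpreter(config: dict, python_path: str = sys.executable) -> dict:
    """Return a copy of the config whose bare Python commands use `python_path`."""
    pinned = json.loads(json.dumps(config))  # deep copy; the config is plain JSON data
    for server in pinned.get("mcpServers", {}).values():
        if server.get("command") in ("python", "python3"):
            server["command"] = python_path
    return pinned
```

With this in place, `MCPClient.from_dict(pin_interpreter(config))` would launch the server with the same interpreter as the client, for example `./mcp-env/bin/python` when the client was started that way. Commands other than `python` or `python3` are left untouched.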

Examples

Example 1: Simple Web Scraping

# Navigate to a website and get content
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com"})
content = await session.call_tool("get_page_content", {})
await session.call_tool("close_browser", {})

Example 2: Form Filling

# Fill out a form
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com/form"})
await session.call_tool("type_text", {"selector": "#name", "text": "John Doe"})
await session.call_tool("type_text", {"selector": "#email", "text": "john@example.com"})
await session.call_tool("click_element", {"selector": "#submit"})
await session.call_tool("close_browser", {})

Example 3: Screenshot Automation

# Take screenshots of multiple pages
urls = ["https://google.com", "https://github.com", "https://stackoverflow.com"]
for i, url in enumerate(urls):
    await session.call_tool("start_browser", {})
    await session.call_tool("navigate_to_url", {"url": url})
    await session.call_tool("take_screenshot", {"filename": f"screenshot_{i}.png"})
    await session.call_tool("close_browser", {})
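
When URLs come from user input they often lack a scheme, and Playwright's navigation expects a full URL such as `https://example.com`. Whether `navigate_to_url` adds a scheme itself is not documented here, so the sketch below normalizes URLs on the client side. It also derives a readable screenshot filename from the host instead of a numeric index. Both helper names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def with_scheme(url: str, default_scheme: str = "https") -> str:
    """Prefix `url` with a scheme if it does not already have one."""
    if "://" in url:
        return url
    return f"{default_scheme}://{url}"


def screenshot_name(url: str) -> str:
    """Build a filesystem-friendly PNG filename from the URL's host."""
    host = urlsplit(with_scheme(url)).netloc
    safe = re.sub(r"[^A-Za-z0-9.-]", "_", host)  # e.g. ":" in "localhost:8000"
    return f"{safe}.png"
```

In a loop like the one above, `with_scheme(url)` would be passed as the `url` argument and `screenshot_name(url)` as the `filename` argument.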

Troubleshooting

Common Issues

Browser not starting: Make sure Playwright browsers are installed:
```
playwright install
```
Import errors: Run your script with the virtual environment's Python interpreter so the installed packages are found:
```
./mcp-env/bin/python your_script.py
```
Permission errors: Make sure the virtual environment is activated:
```
source mcp-env/bin/activate
```
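
To narrow down which of these issues applies, a short diagnostic script can report the interpreter in use and any packages it cannot find. This is a sketch using only the standard library; `REQUIRED_MODULES` lists the import names of the packages from the Setup section, and the function names are hypothetical.

```python
import importlib.util
import sys

# Import names for the required packages listed under Setup.
REQUIRED_MODULES = ["fastmcp", "mcp_use", "playwright"]


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_modules(names=REQUIRED_MODULES) -> list[str]:
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment:", in_virtualenv())
    print("missing modules:", missing_modules() or "none")
```

Saved under a name of your choice and run with `./mcp-env/bin/python`, it should report a virtual environment and no missing modules. Note that it checks only whether the `playwright` package is importable, not whether the browser binaries have been installed.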

Security Notes

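The browser can navigate to any URL it is given, including local files through file:// URLs.
Be careful when exposing this server to untrusted users, since every tool call runs with the privileges of the server process.
Consider running the server in a sandboxed environment for production use.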
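
One way to reduce the exposure described above is to validate URLs before they reach `navigate_to_url`. The following is a minimal sketch of a scheme and host allowlist, not a feature of this server; `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and the allowlist contents are placeholders.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"example.com"}  # hypothetical allowlist; replace with your own


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        return False  # rejects file://, data:, javascript: and scheme-less input
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

A check like this only covers the URL passed to the tool. Redirects and links followed inside the page are not filtered, so it complements sandboxing rather than replacing it.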

Development

To modify the browser automation server:

Edit browser_automation_server.py
Add new tools using the @mcp.tool() decorator
Test with test_browser_automation.py
Update the README with new features

License

This project is open source and available under the MIT License.