MCP-Server-for-Browser-Automation

nandanadileep/MCP-Server-for-Browser-Automation

This project demonstrates a simple Model Context Protocol (MCP) server and client implementation using `fastmcp` and `mcp_use`.

Browser Automation MCP Server

This project implements a browser automation server using the Model Context Protocol (MCP) with Playwright for web automation.

Project Structure

MCP-Server/
├── browser_automation_server.py    # Browser automation MCP server
├── browser_automation_config.json  # Configuration for the server
├── test_browser_automation.py      # Basic test client
├── browser_automation_agent.py     # Advanced client with LLM agent
├── mcp-env/                        # Virtual environment with dependencies
└── README.md                       # This file

Features

The browser automation server provides the following tools:

  • start_browser() - Start a new browser instance
  • navigate_to_url(url) - Navigate to a specific URL
  • click_element(selector) - Click on an element using CSS selector
  • type_text(selector, text) - Type text into an input field
  • get_page_content() - Get the current page content
  • take_screenshot(filename) - Take a screenshot of the current page
  • close_browser() - Close the browser
  • browser_automation_task(task_description, website_url) - Perform a complete automation task
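
The tool list above can double as a small client-side lookup table, which helps catch a misspelled tool name or a missing argument before the call reaches the server. The following is a minimal sketch, assuming the argument names match the signatures listed above (the server's actual schemas are not shown in this README). `TOOL_ARGS` and `check_call` are hypothetical helpers, not part of the project.

```python
# Required arguments per tool, transcribed from the feature list above.
# Assumption: the server's real schemas use exactly these names.
TOOL_ARGS = {
    "start_browser": (),
    "navigate_to_url": ("url",),
    "click_element": ("selector",),
    "type_text": ("selector", "text"),
    "get_page_content": (),
    "take_screenshot": ("filename",),
    "close_browser": (),
    "browser_automation_task": ("task_description", "website_url"),
}


def check_call(tool: str, args: dict) -> list[str]:
    """Return a list of problems with a planned tool call (empty if none)."""
    if tool not in TOOL_ARGS:
        return [f"unknown tool: {tool}"]
    expected = TOOL_ARGS[tool]
    problems = [f"missing argument: {name}" for name in expected if name not in args]
    extra = sorted(set(args) - set(expected))
    problems += [f"unexpected argument: {name}" for name in extra]
    return problems
```

Calling `check_call(name, args)` just before `session.call_tool(name, args)` keeps typos from turning into server-side errors.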

Setup

  1. Virtual Environment: The project uses a virtual environment located in mcp-env/

  2. Dependencies: The following packages are installed:

    • fastmcp - For creating MCP servers
    • mcp_use - For creating MCP clients and agents
    • playwright - For browser automation
    • langchain-ollama - For Ollama integration (optional)

  3. Install Playwright Browsers:

    source mcp-env/bin/activate
    playwright install
    

Usage

1. Basic Test

Run the basic test to verify the browser automation works:

./mcp-env/bin/python test_browser_automation.py

This will:

  • Start a browser
  • Navigate to Google
  • Take a screenshot
  • Close the browser

2. Advanced Agent Mode

For LLM-powered browser automation, use the agent client:

./mcp-env/bin/python browser_automation_agent.py

This requires a running Ollama installation with the llama3.1:8b model available.

3. Manual Tool Calls

You can also create your own client to make specific tool calls:

import asyncio
import json
from mcp_use import MCPClient

async def main():
    with open("browser_automation_config.json", "r") as f:
        config = json.load(f)

    client = MCPClient.from_dict(config)
    await client.create_all_sessions()
    
    session = client.get_session("browser_automation")
    
    # Start browser
    await session.call_tool("start_browser", {})
    
    # Navigate to a website (use a full URL, including the scheme)
    await session.call_tool("navigate_to_url", {"url": "https://example.com"})
    
    # Take a screenshot
    await session.call_tool("take_screenshot", {"filename": "my_screenshot.png"})
    
    # Close browser
    await session.call_tool("close_browser", {})
    
    await client.close_all_sessions()

asyncio.run(main())
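
If a tool call fails partway through a script like the one above, `close_browser` is never reached and the browser may be left running. One way to guard against that is an async context manager. This is a sketch only: `browser` and `FakeSession` are hypothetical names, and `FakeSession` merely stands in for a real `mcp_use` session so the example runs without a server.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def browser(session):
    """Start the browser on entry and always close it on exit."""
    await session.call_tool("start_browser", {})
    try:
        yield session
    finally:
        # Runs even if a tool call inside the `async with` block raises.
        await session.call_tool("close_browser", {})


class FakeSession:
    """Stand-in for an mcp_use session; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    async def call_tool(self, name, args):
        self.calls.append(name)
        if name == "navigate_to_url" and not args.get("url"):
            raise ValueError("url is required")


async def demo():
    session = FakeSession()
    try:
        async with browser(session) as s:
            await s.call_tool("navigate_to_url", {})  # fails on purpose
    except ValueError:
        pass
    return session.calls
```

With a real session, the body of `main()` would shrink to an `async with browser(session):` block containing only the navigation and screenshot calls.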

Configuration

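The browser_automation_config.json file tells the client how to launch the MCP server. The python command is resolved from PATH, so activate the virtual environment first (or set command to ./mcp-env/bin/python) to make sure the server starts with the installed dependencies: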

{
  "mcpServers": {
    "browser_automation": {
      "command": "python",
      "args": ["browser_automation_server.py"]
    }
  }
}
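
As an alternative to a static file, the same configuration can be built in code so that the server is launched with whichever interpreter runs the client. This is a sketch under that assumption; `build_config` is a hypothetical helper, and the dictionary simply mirrors the JSON above.

```python
import json
import sys


def build_config(server_script: str = "browser_automation_server.py") -> dict:
    """Build a config that launches the server with the current interpreter."""
    return {
        "mcpServers": {
            "browser_automation": {
                # sys.executable is the interpreter running this script,
                # e.g. mcp-env/bin/python when the virtual environment is in use.
                "command": sys.executable,
                "args": [server_script],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_config(), indent=2))
```

The resulting dictionary can be passed to `MCPClient.from_dict(...)` in place of the loaded JSON file.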

Examples

Example 1: Simple Web Scraping

# Navigate to a website and get content
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com"})
content = await session.call_tool("get_page_content", {})
await session.call_tool("close_browser", {})
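
The value returned by `call_tool` is a result object rather than a plain string, so the page text usually has to be unpacked first. The helper below is a sketch that assumes the result follows the usual MCP shape: a `content` list whose text items expose a `.text` attribute. That shape is an assumption about the client library, not something this README confirms. `result_text` is a hypothetical name, and `fake` is a stand-in built only for illustration.

```python
from types import SimpleNamespace


def result_text(result) -> str:
    """Join the text parts of a tool result into one string.

    Assumption: `result.content` is a list of items, and text items carry
    a `.text` attribute. Items without one (e.g. images) are skipped.
    """
    parts = getattr(result, "content", None) or []
    return "\n".join(p.text for p in parts if getattr(p, "text", None))


# Stand-in result for illustration; a real one would come from call_tool.
fake = SimpleNamespace(
    content=[SimpleNamespace(text="<html>"), SimpleNamespace(data=b"")]
)
```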

Example 2: Form Filling

# Fill out a form
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com/form"})
await session.call_tool("type_text", {"selector": "#name", "text": "John Doe"})
await session.call_tool("type_text", {"selector": "#email", "text": "john@example.com"})
await session.call_tool("click_element", {"selector": "#submit"})
await session.call_tool("close_browser", {})
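
When several forms need filling, the repeated `type_text` calls can be driven from a mapping of selectors to values. This is a minimal sketch using only the tool names shown above. `fill_form` and `RecordingSession` are hypothetical, and the latter is a stand-in that records calls so the example runs without a server.

```python
import asyncio


async def fill_form(session, fields: dict, submit_selector: str) -> None:
    """Type each value into its selector, then click the submit element."""
    for selector, text in fields.items():
        await session.call_tool("type_text", {"selector": selector, "text": text})
    await session.call_tool("click_element", {"selector": submit_selector})


class RecordingSession:
    """Stand-in for an mcp_use session; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    async def call_tool(self, name, args):
        self.calls.append((name, args))
```

Because dictionaries preserve insertion order, fields are typed in the order they are written.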

Example 3: Screenshot Automation

# Take screenshots of multiple pages (full URLs, including the scheme)
urls = ["https://google.com", "https://github.com", "https://stackoverflow.com"]
for i, url in enumerate(urls):
    await session.call_tool("start_browser", {})
    await session.call_tool("navigate_to_url", {"url": url})
    await session.call_tool("take_screenshot", {"filename": f"screenshot_{i}.png"})
    await session.call_tool("close_browser", {})
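
Index-based filenames such as `screenshot_0.png` are hard to match back to their pages later. One option is to derive the name from the URL itself. `screenshot_name` is a hypothetical helper and the naming scheme is only a suggestion.

```python
import re
from urllib.parse import urlparse


def screenshot_name(url: str) -> str:
    """Turn a URL into a filesystem-friendly PNG filename."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")
    return f"{slug or 'page'}.png"
```

In the loop above, `{"filename": screenshot_name(url)}` would replace the f-string.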

Troubleshooting

Common Issues

  1. Browser not starting: Make sure Playwright browsers are installed:

    playwright install
    
  2. Import errors: Run scripts with the virtual environment's interpreter so the installed packages are found:

    ./mcp-env/bin/python your_script.py
    
  3. Permission errors: Make sure the virtual environment is activated:

    source mcp-env/bin/activate
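
A quick way to confirm which interpreter is active and whether it can see the dependencies is a short check script. This is a sketch: the import names are assumed from the package list under Setup (in particular, that `langchain-ollama` is imported as `langchain_ollama`), and `missing_packages` is a hypothetical helper.

```python
import importlib.util
import sys

# Import names for the dependencies listed under Setup.
# Assumption: langchain-ollama is imported as `langchain_ollama`.
REQUIRED = ["fastmcp", "mcp_use", "playwright", "langchain_ollama"]


def missing_packages(names=REQUIRED) -> list[str]:
    """Return the names that the current interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("missing:", missing_packages() or "none")
```

If the printed interpreter path is not inside mcp-env/, the script is running outside the virtual environment.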
    

Security Notes

  • The server will open any URL it is given, including file:// URLs that read local files
  • Be careful when exposing this server to untrusted users
  • Consider running in a sandboxed environment for production use
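
One way to reduce the exposure described above is to validate URLs before they are passed to `navigate_to_url`. The check below is a sketch, not a complete defense: `is_allowed_url` and the allowlist are hypothetical, and it does not cover redirects or links followed after the initial navigation.

```python
from urllib.parse import urlparse


def is_allowed_url(url: str, allowed_hosts: set[str]) -> bool:
    """Accept only http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # rejects file://, data:, javascript:, and bare hosts
    return (parsed.hostname or "").lower() in allowed_hosts
```

The same check could run on the server side, inside the navigation tool, so that it applies to every client.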

Development

To modify the browser automation server:

  1. Edit browser_automation_server.py
  2. Add new tools using the @mcp.tool() decorator
  3. Test with test_browser_automation.py
  4. Update the README with new features
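
Step 2 might look like the following sketch. `extract_links` is a hypothetical tool that is not part of the project, and the server name passed to `FastMCP` is an assumption. In the real server the existing `mcp` instance would be reused. The `fastmcp` import is guarded so that the snippet also runs where the package is not installed.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list[str]:
    """Hypothetical tool: return every href found in <a> tags."""
    collector = _LinkCollector()
    collector.feed(html)
    return collector.links


try:
    from fastmcp import FastMCP

    mcp = FastMCP("browser_automation")  # assumed name, for illustration
    # Equivalent to placing @mcp.tool() above the function definition.
    mcp.tool()(extract_links)
except ImportError:
    mcp = None  # fastmcp not installed; the plain function still works
```

Keeping the tool body as a plain function makes it easy to unit-test without starting the server.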

License

This project is open source and available under the MIT License.