nandanadileep/MCP-Server-for-Browser-Automation
This project demonstrates a simple Model Context Protocol (MCP) server and client implementation using `fastmcp` and `mcp_use`.
Browser Automation MCP Server
This project implements a browser automation server using the Model Context Protocol (MCP) with Playwright for web automation.
Project Structure
MCP-Server/
├── browser_automation_server.py # Browser automation MCP server
├── browser_automation_config.json # Configuration for the server
├── test_browser_automation.py # Basic test client
├── browser_automation_agent.py # Advanced client with LLM agent
├── mcp-env/ # Virtual environment with dependencies
└── README.md # This file
Features
The browser automation server provides the following tools:
- start_browser() - Start a new browser instance
- navigate_to_url(url) - Navigate to a specific URL
- click_element(selector) - Click on an element using CSS selector
- type_text(selector, text) - Type text into an input field
- get_page_content() - Get the current page content
- take_screenshot(filename) - Take a screenshot of the current page
- close_browser() - Close the browser
- browser_automation_task(task_description, website_url) - Perform a complete automation task
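Because each tool takes a small, fixed set of arguments, a client can check a call before sending it. The sketch below is a hypothetical client-side helper, not part of this project. The `TOOLS` table is transcribed from the list above, and it assumes every listed parameter is required, which the server may not actually enforce.

```python
# Hypothetical client-side helper: validate a tool call against the
# tool list documented above before sending it to the server.
# Assumption: every parameter shown in the list is required.
TOOLS = {
    "start_browser": (),
    "navigate_to_url": ("url",),
    "click_element": ("selector",),
    "type_text": ("selector", "text"),
    "get_page_content": (),
    "take_screenshot": ("filename",),
    "close_browser": (),
    "browser_automation_task": ("task_description", "website_url"),
}


def validate_call(tool: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    if tool not in TOOLS:
        return [f"unknown tool: {tool}"]
    expected = set(TOOLS[tool])
    given = set(args)
    problems = [f"missing argument: {name}" for name in sorted(expected - given)]
    problems += [f"unexpected argument: {name}" for name in sorted(given - expected)]
    return problems
```

Calling `validate_call("type_text", {"selector": "#name"})` reports the missing `text` argument without a round trip to the server.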
Setup
- Virtual Environment: The project uses a virtual environment located in mcp-env/
- Dependencies: The following packages are installed:
  - fastmcp - For creating MCP servers
  - mcp_use - For creating MCP clients and agents
  - playwright - For browser automation
  - langchain-ollama - For Ollama integration (optional)
- Install Playwright Browsers:
  source mcp-env/bin/activate
  playwright install
Usage
1. Basic Test
Run the basic test to verify the browser automation works:
./mcp-env/bin/python test_browser_automation.py
This will:
- Start a browser
- Navigate to Google
- Take a screenshot
- Close the browser
2. Advanced Agent Mode
For LLM-powered browser automation, use the agent client:
./mcp-env/bin/python browser_automation_agent.py
This requires Ollama to be installed and running with the llama3.1:8b model.
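Before starting the agent, it can help to confirm that the model is actually available. The following is a minimal sketch, assuming Ollama is listening on its default address (`http://localhost:11434`) and using its `/api/tags` endpoint, which lists locally pulled models. The function names are hypothetical and not part of this project.

```python
import json
import urllib.request

# Assumption: Ollama is running on its default host and port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_available(tags: dict, model: str) -> bool:
    """Check a parsed /api/tags response for a model name."""
    return any(entry.get("name") == model for entry in tags.get("models", []))


def ollama_has_model(model: str, url: str = OLLAMA_TAGS_URL) -> bool:
    """Return True if Ollama answers and lists the model, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return model_available(json.load(response), model)
    except (OSError, ValueError):
        # OSError covers connection failures (URLError subclasses it);
        # ValueError covers a response body that is not valid JSON.
        return False
```

If `ollama_has_model("llama3.1:8b")` returns False, either Ollama is not running or the model has not been pulled yet (`ollama pull llama3.1:8b`).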
3. Manual Tool Calls
You can also create your own client to make specific tool calls:
import asyncio
import json

from mcp_use import MCPClient


async def main():
    # Load the server configuration
    with open("browser_automation_config.json", "r") as f:
        config = json.load(f)

    # Connect to the server defined in the config
    client = MCPClient.from_dict(config)
    await client.create_all_sessions()
    session = client.get_session("browser_automation")

    # Start browser
    await session.call_tool("start_browser", {})
    # Navigate to a website (include the scheme in the URL)
    await session.call_tool("navigate_to_url", {"url": "https://example.com"})
    # Take a screenshot
    await session.call_tool("take_screenshot", {"filename": "my_screenshot.png"})
    # Close browser
    await session.call_tool("close_browser", {})

    await client.close_all_sessions()


asyncio.run(main())
Configuration
The browser_automation_config.json file configures the MCP server:
{
  "mcpServers": {
    "browser_automation": {
      "command": "python",
      "args": ["browser_automation_server.py"]
    }
  }
}
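Note that `"command": "python"` is resolved through `PATH`, so it only picks up the interpreter in `mcp-env/` when the virtual environment is activated. The sketch below shows one possible way to check the file's structure and, optionally, point the server at the interpreter running the client. Both function names are hypothetical helpers, not part of this project or of `mcp_use`.

```python
import json
import sys


def load_server_config(path: str, server: str = "browser_automation") -> dict:
    """Load the config file and check the entry for one server."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    try:
        entry = config["mcpServers"][server]
    except (KeyError, TypeError):
        entry = None
    if not isinstance(entry, dict):
        raise ValueError(f"no mcpServers entry named {server!r} in {path}")
    if not isinstance(entry.get("command"), str):
        raise ValueError("'command' must be a string")
    if not isinstance(entry.get("args", []), list):
        raise ValueError("'args' must be a list")
    return config


def use_current_interpreter(config: dict, server: str = "browser_automation") -> dict:
    """Return a copy of the config whose command is the running Python."""
    updated = json.loads(json.dumps(config))  # simple deep copy for JSON data
    updated["mcpServers"][server]["command"] = sys.executable
    return updated
```

The returned dictionary can be passed to `MCPClient.from_dict` in place of the raw file contents.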
Examples
Example 1: Simple Web Scraping
# Navigate to a website and get content
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com"})
content = await session.call_tool("get_page_content", {})
await session.call_tool("close_browser", {})
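The value returned by `call_tool` is a result object rather than the page text itself. The helper below is a sketch that assumes the result follows the MCP Python SDK's `CallToolResult` shape: a `content` list whose text items carry a `text` attribute. The exact payload this server returns is not documented here, so treat `result_text` as a hypothetical convenience function.

```python
def result_text(result) -> str:
    """Join the text parts of a tool result into one string.

    Assumption: `result.content` is a list of items, and text items
    expose a string `text` attribute. Anything else is skipped.
    """
    parts = []
    for item in getattr(result, "content", None) or []:
        text = getattr(item, "text", None)
        if isinstance(text, str):
            parts.append(text)
    return "\n".join(parts)
```

With this in place, `html = result_text(content)` would yield a plain string to parse or save.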
Example 2: Form Filling
# Fill out a form
await session.call_tool("start_browser", {})
await session.call_tool("navigate_to_url", {"url": "https://example.com/form"})
await session.call_tool("type_text", {"selector": "#name", "text": "John Doe"})
await session.call_tool("type_text", {"selector": "#email", "text": "john@example.com"})
await session.call_tool("click_element", {"selector": "#submit"})
await session.call_tool("close_browser", {})
Example 3: Screenshot Automation
# Take screenshots of multiple pages
urls = ["https://google.com", "https://github.com", "https://stackoverflow.com"]
for i, url in enumerate(urls):
    await session.call_tool("start_browser", {})
    await session.call_tool("navigate_to_url", {"url": url})
    await session.call_tool("take_screenshot", {"filename": f"screenshot_{i}.png"})
    await session.call_tool("close_browser", {})
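When URLs come from user input or a file, they may lack a scheme, and browser navigation generally expects an absolute URL such as `https://github.com`. Whether this server adds a scheme itself is not stated, so the sketch below normalizes URLs on the client side and derives a readable screenshot filename from the host. Both helpers are hypothetical and not part of this project.

```python
from urllib.parse import urlparse


def ensure_scheme(url: str, default: str = "https") -> str:
    """Prefix a bare host such as "github.com" with a scheme."""
    return url if "://" in url else f"{default}://{url}"


def screenshot_name(url: str) -> str:
    """Build a filename from the URL's host, replacing dots with underscores."""
    host = urlparse(ensure_scheme(url)).hostname or "page"
    return host.replace(".", "_") + ".png"
```

In the loop above, the arguments would then become `{"url": ensure_scheme(url)}` and `{"filename": screenshot_name(url)}`.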
Troubleshooting
Common Issues
- Browser not starting: Make sure Playwright browsers are installed:
  playwright install
- Import errors: Use the correct Python path:
  ./mcp-env/bin/python your_script.py
- Permission errors: Make sure the virtual environment is activated:
  source mcp-env/bin/activate
Security Notes
- The browser automation can access any URL, including local files
- Be careful when exposing this server to untrusted users
- Consider running in a sandboxed environment for production use
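One way to reduce the exposure described above is to check each URL before it reaches `navigate_to_url`. The following is a minimal sketch of a scheme and host allowlist. The names and the allowlist contents are hypothetical, and a check like this complements sandboxing rather than replacing it.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # rejects file://, javascript:, etc.
ALLOWED_HOSTS = {"example.com"}      # hypothetical allowlist; adjust per deployment


def is_allowed(url: str) -> bool:
    """Accept only http(s) URLs whose host is allowlisted (or a subdomain of one)."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    host = (parsed.hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in ALLOWED_HOSTS)
```

The subdomain test compares against `"." + h` so that a host like `example.com.evil.net` is not mistaken for `example.com`.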
Development
To modify the browser automation server:
- Edit browser_automation_server.py
- Add new tools using the @mcp.tool() decorator
- Test with test_browser_automation.py
- Update the README with new features
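When adding a tool, one option is to keep its logic in a plain function so it can be tested without launching a browser, and to make the decorated tool a thin wrapper. The sketch below illustrates this with a hypothetical `extract_links` helper. The commented registration assumes that `mcp` is the server's FastMCP instance and `page` is its current Playwright page, neither of which is confirmed by this README.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect href values from <a> tags while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list[str]:
    """Return the href of every link in an HTML string, in document order."""
    collector = _LinkCollector()
    collector.feed(html)
    return collector.links


# Possible registration in browser_automation_server.py (assumed names):
#
# @mcp.tool()
# async def get_page_links() -> list[str]:
#     return extract_links(await page.content())
```

The pure function can then be covered by ordinary unit tests, while test_browser_automation.py exercises the tool end to end.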
License
This project is open source and available under the MIT License.