botdojo-ai/local-browser-mcp-server
The Local Browser MCP Server is a versatile tool for browser automation and AI image generation, leveraging Puppeteer and Google Gemini 2.0 Flash.
Local Browser MCP Server
A Model Context Protocol (MCP) server that provides browser automation capabilities using Puppeteer, plus AI image generation using Google Gemini 2.0 Flash. This server allows you to control a local Chrome browser instance, take screenshots, perform interactions like clicking and typing, and generate custom images - perfect for testing websites and creating content with Cursor.
Features
- 🌐 Browser Navigation: Navigate to URLs, go back/forward, refresh pages
- 📸 Screenshots: Capture full page, viewport, or specific element screenshots
- 🖱️ Click Actions: Click on elements using CSS selectors
- ⌨️ Text Input: Type text into input fields and forms
- 📜 Scrolling: Scroll pages in any direction
- ⏳ Wait Operations: Wait for elements to appear
- 📊 Page Info: Get current page title, URL, and viewport information
- 🎨 AI Image Generation: Generate custom images using Google Gemini 2.0 Flash Preview
Installation
- Clone or download this repository to your local machine.
- Install dependencies:
npm install
- Build the project:
npm run build
- Set up environment variables (for image generation). Create a .env file in the project root:
GOOGLE_AI_KEY=your_google_ai_api_key_here
Get your API key from Google AI Studio.
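To confirm the key is in place before starting the server, a small check like the following can help. It is a minimal sketch, not the server's own loading code: `parseEnv` and `hasGoogleKey` are hypothetical helper names, and the parser only handles simple KEY=value lines.

```javascript
// Parse KEY=value lines from .env text, skipping blank lines and # comments.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    vars[key] = value;
  }
  return vars;
}

// True when GOOGLE_AI_KEY is non-empty and is not the placeholder from the example above.
function hasGoogleKey(vars) {
  const key = vars.GOOGLE_AI_KEY;
  return Boolean(key) && key !== "your_google_ai_api_key_here";
}

console.log(hasGoogleKey(parseEnv("GOOGLE_AI_KEY=your_google_ai_api_key_here")));
// false
```

To check a real file, read it first, for example with `fs.readFileSync(".env", "utf8")`, and pass the text to `parseEnv`.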
Usage with Cursor
Step 1: Start the HTTP Server
The recommended way to use this MCP server is through HTTP mode to avoid ES module compatibility issues:
npm run start:http
This starts the server on http://localhost:3045 and keeps it running.
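If it is unclear whether the server actually came up, a plain TCP probe of the port is one way to find out. The sketch below assumes nothing about the server's HTTP routes and only tests whether something is listening on port 3045; `isPortOpen` is a hypothetical helper name.

```javascript
const net = require("node:net");

// Resolve to true if a TCP connection to host:port succeeds within timeoutMs.
function isPortOpen(port, host = "localhost", timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };
    socket.setTimeout(timeoutMs, () => finish(false)); // no answer in time
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false)); // e.g. connection refused
  });
}

// 3045 is the port named above.
isPortOpen(3045).then((open) => {
  console.log(open
    ? "Something is listening on port 3045"
    : "Nothing is listening on port 3045; try npm run start:http");
});
```

A successful connection only shows that some process holds the port, not that it is this server.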
Step 2: Configure MCP in Cursor
Create or update your cursor-mcp-config.json file with the following configuration:
{
"mcpServers": {
"local-browser": {
"command": "node",
"args": ["/absolute/path/to/your/project/mcp-http-bridge.js"],
"env": {
"NODE_ENV": "production"
}
}
}
}
Important:
- Replace /absolute/path/to/your/project/ with the actual absolute path to your project directory
- The HTTP bridge (mcp-http-bridge.js) routes MCP requests to the HTTP server
- Make sure the HTTP server is running before using the tools in Cursor
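Because the path must be absolute, it can be convenient to generate the configuration rather than type it by hand. The following is a minimal sketch that reproduces the configuration shown above; `buildCursorConfig` is a hypothetical helper, and it assumes the script is run from the project root.

```javascript
const path = require("node:path");

// Build the Cursor MCP config object for a given project directory.
// The "local-browser" name and NODE_ENV value mirror the configuration above.
function buildCursorConfig(projectDir) {
  if (!path.isAbsolute(projectDir)) {
    throw new Error(`Expected an absolute path, got: ${projectDir}`);
  }
  return {
    mcpServers: {
      "local-browser": {
        command: "node",
        args: [path.join(projectDir, "mcp-http-bridge.js")],
        env: { NODE_ENV: "production" },
      },
    },
  };
}

// Print JSON suitable for cursor-mcp-config.json.
// process.cwd() is absolute, assuming the script runs from the project root.
console.log(JSON.stringify(buildCursorConfig(process.cwd()), null, 2));
```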
Step 3: Restart Cursor
After updating the configuration, restart Cursor to load the new MCP server.
Step 4: Start Using the Tools
Once configured, you can use the following tools in Cursor:
Browser Navigation
- navigate_to_url - Navigate to any URL
- go_back - Go back in browser history
- go_forward - Go forward in browser history
- refresh_page - Refresh the current page
Screenshots & Visual Capture
- take_screenshot - Capture screenshots (full page, viewport, or specific elements)
Page Interactions
- click_element - Click on elements using CSS selectors
- type_text - Type text into input fields
- scroll_page - Scroll the page in any direction
- wait_for_element - Wait for elements to appear
Page Information
- get_page_info - Get current page title, URL, and viewport info
AI Image Generation
- generate_image - Generate custom images using Google Gemini 2.0 Flash Preview
Quick Start Example
Here's how to get started quickly:
- Start the HTTP server:
npm run start:http
- In Cursor, try these commands:
- "Navigate to google.com and take a screenshot"
- "Generate an image of a sunset over mountains"
- "Click on the search button and type 'hello world'"
Example Workflow
Here's a typical workflow when testing a website you've created:
- Navigate to your local development server:
Use navigate_to_url with "http://localhost:3000"
- Take a screenshot to see the current state:
Use take_screenshot to capture the full page
- Interact with your website:
Use click_element to click buttons
Use type_text to fill out forms
Use scroll_page to test scrolling behavior
- Capture results:
Use take_screenshot again to see changes
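Behind the scenes, an MCP client invokes each tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below writes the workflow above as a sequence of such requests for illustration only. Cursor builds these messages itself, the selectors and form text are hypothetical placeholders, and how the bridge forwards the requests is not covered here.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request, the MCP message used to invoke a tool.
let nextId = 1;
function toolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The workflow above, step by step. Selectors and text are made-up examples.
const workflow = [
  toolCall("navigate_to_url", { url: "http://localhost:3000" }),
  toolCall("take_screenshot", { fullPage: true }),
  toolCall("click_element", { selector: "#submit" }),
  toolCall("type_text", { selector: "input[name='email']", text: "test@example.com" }),
  toolCall("scroll_page", { direction: "down" }),
  toolCall("take_screenshot", { fullPage: true }),
];

for (const request of workflow) {
  console.log(JSON.stringify(request));
}
```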
Tool Reference
navigate_to_url
Navigate the browser to a specific URL.
- url (required): The URL to navigate to
take_screenshot
Take a screenshot of the current page.
- fullPage (optional): Capture full page vs viewport only
- selector (optional): CSS selector to screenshot specific element
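The Browser Behavior section notes that screenshots come back as base64-encoded PNG images. If you receive such a string and want a file on disk, something like the following should work. It is a sketch under that assumption: `decodePng` and `savePng` are hypothetical helpers, and the exact shape of the tool's response is not described in this README.

```javascript
const fs = require("node:fs");

// The first eight bytes of every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decode a base64 string and confirm the result starts with the PNG signature.
function decodePng(base64) {
  const bytes = Buffer.from(base64, "base64");
  if (!bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error("Decoded data is not a PNG");
  }
  return bytes;
}

// Write a decoded screenshot to disk; the destination path is up to the caller.
function savePng(base64, filePath) {
  fs.writeFileSync(filePath, decodePng(base64));
}
```

Checking the signature first catches the common mistake of passing a data URL prefix or a JSON wrapper instead of the raw base64 payload.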
click_element
Click on an element specified by CSS selector.
- selector (required): CSS selector of element to click
- waitFor (optional): Milliseconds to wait after clicking (default: 1000)
type_text
Type text into an input field.
- selector (required): CSS selector of input element
- text (required): Text to type
- clear (optional): Clear field before typing (default: true)
wait_for_element
Wait for an element to appear on the page.
- selector (required): CSS selector to wait for
- timeout (optional): Timeout in milliseconds (default: 5000)
scroll_page
Scroll the page.
- direction (required): 'up', 'down', 'top', or 'bottom'
- amount (optional): Pixels to scroll for up/down (default: 500)
get_page_info
Get information about the current page (title, URL, viewport size).
refresh_page
Refresh the current page.
go_back
Navigate back in browser history.
go_forward
Navigate forward in browser history.
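The required parameters and defaults for the browser tools above can be summarized as a small table in code. This is an illustrative sketch transcribed from the reference, not the server's actual validation logic; `TOOL_SPECS` and `prepareArgs` are hypothetical names.

```javascript
// required: must be supplied; defaults: filled in when an optional argument is omitted.
const TOOL_SPECS = {
  navigate_to_url: { required: ["url"], defaults: {} },
  take_screenshot: { required: [], defaults: {} },
  click_element: { required: ["selector"], defaults: { waitFor: 1000 } },
  type_text: { required: ["selector", "text"], defaults: { clear: true } },
  wait_for_element: { required: ["selector"], defaults: { timeout: 5000 } },
  scroll_page: { required: ["direction"], defaults: { amount: 500 } },
  get_page_info: { required: [], defaults: {} },
  refresh_page: { required: [], defaults: {} },
  go_back: { required: [], defaults: {} },
  go_forward: { required: [], defaults: {} },
};

const SCROLL_DIRECTIONS = ["up", "down", "top", "bottom"];

// Validate arguments for a tool and merge in the documented defaults.
function prepareArgs(tool, args = {}) {
  const spec = TOOL_SPECS[tool];
  if (!spec) throw new Error(`Unknown tool: ${tool}`);
  for (const name of spec.required) {
    if (args[name] === undefined) {
      throw new Error(`${tool}: missing required "${name}"`);
    }
  }
  if (tool === "scroll_page" && !SCROLL_DIRECTIONS.includes(args.direction)) {
    throw new Error(`scroll_page: direction must be one of ${SCROLL_DIRECTIONS.join(", ")}`);
  }
  // Caller-supplied values override the defaults.
  return { ...spec.defaults, ...args };
}

console.log(prepareArgs("click_element", { selector: "#save" }));
// { waitFor: 1000, selector: '#save' }
```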
generate_image
Generate custom AI images using Google Gemini 2.0 Flash Preview.
- description (required): Text description of the image to generate
Generated images are automatically saved to the generated-images/ directory and can be downloaded via HTTP at http://localhost:3045/download/{filename}.
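As a rough sketch of how the download endpoint above might be used from a script: `downloadUrl` and `saveImage` are hypothetical helpers, `sunset.png` is a made-up filename, and `saveImage` assumes Node 18 or later for the built-in `fetch`.

```javascript
const fs = require("node:fs/promises");

// Build the download URL for a generated image, following the pattern above.
function downloadUrl(filename, base = "http://localhost:3045") {
  // Encode the filename so spaces and other special characters survive in the URL path.
  return `${base}/download/${encodeURIComponent(filename)}`;
}

// Fetch a generated image from the running HTTP server and write it to destPath.
async function saveImage(filename, destPath) {
  const response = await fetch(downloadUrl(filename));
  if (!response.ok) throw new Error(`Download failed: HTTP ${response.status}`);
  await fs.writeFile(destPath, Buffer.from(await response.arrayBuffer()));
}

console.log(downloadUrl("sunset.png"));
// http://localhost:3045/download/sunset.png
```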
Development
- Build: npm run build
- Development mode: npm run dev (watches for changes)
- Start: npm start (visible browser) or npm run start:headless (background)
- HTTP Test Server: npm run start:http (visible) or npm run start:http:headless (background)
Browser Behavior
- Visible Mode: Browser window opens so you can see what's happening
- Headless Mode: Browser runs in background (set MCP_HEADLESS=true)
- Separate Profile: Uses /tmp/chrome-mcp-data to avoid conflicts with your main Chrome
- Default viewport: 1280x720 pixels
- Screenshots: Returned as base64-encoded PNG images
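In Puppeteer terms, the settings above correspond to the `headless`, `userDataDir`, and `defaultViewport` launch options. The sketch below shows one plausible mapping; it is an assumption about how the server might configure Puppeteer, not a copy of its code, and `buildLaunchOptions` is a hypothetical helper.

```javascript
// Translate the settings listed above into a Puppeteer launch-options object.
function buildLaunchOptions(env = process.env) {
  return {
    // Headless only when MCP_HEADLESS is exactly "true"; otherwise a visible window.
    headless: env.MCP_HEADLESS === "true",
    // Separate profile directory so the main Chrome profile is untouched.
    userDataDir: "/tmp/chrome-mcp-data",
    defaultViewport: { width: 1280, height: 720 },
  };
}

// With Puppeteer installed, the options would be passed like this:
//   const puppeteer = require("puppeteer");
//   const browser = await puppeteer.launch(buildLaunchOptions());
console.log(buildLaunchOptions({ MCP_HEADLESS: "true" }));
```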
Troubleshooting
Browser doesn't launch
- Ensure Chrome is installed on your system
- Check that no other processes are blocking Chrome
- Try restarting the HTTP server: npm run start:http
- Clear Chrome data directory: rm -rf /tmp/chrome-mcp-data (on macOS, /tmp resolves to /private/tmp)
Elements not found
- Verify CSS selectors are correct
- Use browser dev tools to test selectors
- Try waiting for elements to load with wait_for_element
MCP Tools not available in Cursor
- Ensure the HTTP server is running: npm run start:http
- Check that cursor-mcp-config.json has the correct absolute path
- Restart Cursor after configuration changes
- Verify the HTTP bridge file exists: mcp-http-bridge.js
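Two of the checks above, the absolute path and the existence of the bridge file, can be automated. This is a minimal sketch with a hypothetical `checkConfig` helper; it assumes the server entry is named "local-browser" as in the configuration shown earlier.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Return a list of problems found in a parsed cursor-mcp-config.json object.
function checkConfig(config) {
  const entry = config?.mcpServers?.["local-browser"];
  if (!entry) return ['No "local-browser" entry under mcpServers'];

  const problems = [];
  const bridge = entry.args?.[0];
  if (!bridge || !path.isAbsolute(bridge)) {
    problems.push("Bridge path is missing or not absolute");
  } else if (!fs.existsSync(bridge)) {
    problems.push(`Bridge file not found: ${bridge}`);
  }
  return problems;
}

// Example: a relative path is flagged.
console.log(checkConfig({
  mcpServers: { "local-browser": { command: "node", args: ["mcp-http-bridge.js"] } },
}));
// [ 'Bridge path is missing or not absolute' ]
```

For a real check, parse the file first with `JSON.parse(fs.readFileSync("cursor-mcp-config.json", "utf8"))` and pass the result in; an empty array means neither problem was found.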
Image generation not working
- Ensure GOOGLE_AI_KEY is set in your .env file
- Get your API key from Google AI Studio
- Check that the HTTP server is running (image generation requires HTTP mode)
Permission issues
- Ensure the MCP server has permission to launch Chrome
- Check file permissions on the built JavaScript files
- On macOS, you may need to allow Chrome in System Preferences > Security & Privacy
Security Notes
- This server launches a real browser with full system access
- Only use with trusted websites and content
- The browser runs with some security features disabled for automation
- Always run in a controlled environment
License
MIT License - see LICENSE file for details.