botdojo-ai/local-browser-mcp-server
The Local Browser MCP Server is a versatile tool for browser automation and AI image generation, leveraging Puppeteer and Google Gemini 2.0 Flash.
Local Browser MCP Server
A Model Context Protocol (MCP) server that provides browser automation capabilities using Puppeteer, plus AI image generation using Google Gemini 2.0 Flash. This server allows you to control a local Chrome browser instance, take screenshots, perform interactions like clicking and typing, and generate custom images - perfect for testing websites and creating content with Cursor.
Features
- 🌐 Browser Navigation: Navigate to URLs, go back/forward, refresh pages
- 📸 Screenshots: Capture full page, viewport, or specific element screenshots
- 🖱️ Click Actions: Click on elements using CSS selectors
- ⌨️ Text Input: Type text into input fields and forms
- 📜 Scrolling: Scroll pages in any direction
- ⏳ Wait Operations: Wait for elements to appear
- 📊 Page Info: Get current page title, URL, and viewport information
- 🎨 AI Image Generation: Generate custom images using Google Gemini 2.0 Flash Preview
Installation
- Clone or download this repository to your local machine.
- Install dependencies:
  npm install
- Build the project:
  npm run build
- Set up environment variables (for image generation). Create a .env file in the project root:
  GOOGLE_AI_KEY=your_google_ai_api_key_here
  Get your API key from Google AI Studio.
Usage with Cursor
Step 1: Start the HTTP Server
The recommended way to use this MCP server is through HTTP mode to avoid ES module compatibility issues:
npm run start:http
This starts the server on http://localhost:3045 and keeps it running.
Step 2: Configure MCP in Cursor
Create or update your cursor-mcp-config.json file with the following configuration:
{
  "mcpServers": {
    "local-browser": {
      "command": "node",
      "args": ["/absolute/path/to/your/project/mcp-http-bridge.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}
Important:
- Replace /absolute/path/to/your/project/ with the actual absolute path to your project directory.
- The HTTP bridge (mcp-http-bridge.js) routes MCP requests to the HTTP server.
- Make sure the HTTP server is running before using the tools in Cursor.
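The configuration above can also be generated rather than edited by hand, which avoids typos in the absolute path. The following is a minimal TypeScript sketch, not part of the project: buildCursorConfig is a hypothetical helper, and the example directory is made up. Only the JSON shape, the local-browser server name, and the mcp-http-bridge.js file name come from this README.

```typescript
// Shape of the Cursor MCP config shown in this README.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface CursorMcpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper (not part of the project): builds the config object
// for a given absolute project directory.
function buildCursorConfig(projectDir: string): CursorMcpConfig {
  // Accept POSIX absolute paths ("/...") and Windows drive paths ("C:\...").
  const isAbsolute = /^\//.test(projectDir) || /^[A-Za-z]:[\\/]/.test(projectDir);
  if (!isAbsolute) {
    throw new Error("Expected an absolute path, got: " + projectDir);
  }
  // Strip trailing separators so the bridge path has no doubled slash.
  const base = projectDir.replace(/[\\/]+$/, "");
  return {
    mcpServers: {
      "local-browser": {
        command: "node",
        args: [base + "/mcp-http-bridge.js"],
        env: { NODE_ENV: "production" },
      },
    },
  };
}

// Example directory is a placeholder; substitute your own checkout location.
console.log(JSON.stringify(buildCursorConfig("/home/me/local-browser-mcp-server"), null, 2));
```

The printed JSON can be pasted into cursor-mcp-config.json as-is.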
Step 3: Restart Cursor
After updating the configuration, restart Cursor to load the new MCP server.
Step 4: Start Using the Tools
Once configured, you can use the following tools in Cursor:
Browser Navigation
- navigate_to_url - Navigate to any URL
- go_back - Go back in browser history
- go_forward - Go forward in browser history
- refresh_page - Refresh the current page
Screenshots & Visual Capture
- take_screenshot - Capture screenshots (full page, viewport, or specific elements)
Page Interactions
- click_element - Click on elements using CSS selectors
- type_text - Type text into input fields
- scroll_page - Scroll the page in any direction
- wait_for_element - Wait for elements to appear
Page Information
- get_page_info - Get current page title, URL, and viewport info
AI Image Generation
- generate_image - Generate custom images using Google Gemini 2.0 Flash Preview
Quick Start Example
Here's how to get started quickly:
- Start the HTTP server:
  npm run start:http
- In Cursor, try these commands:
  - "Navigate to google.com and take a screenshot"
  - "Generate an image of a sunset over mountains"
  - "Click on the search button and type 'hello world'"
Example Workflow
Here's a typical workflow when testing a website you've created:
- Navigate to your local development server:
  Use navigate_to_url with "http://localhost:3000"
- Take a screenshot to see the current state:
  Use take_screenshot to capture the full page
- Interact with your website:
  Use click_element to click buttons
  Use type_text to fill out forms
  Use scroll_page to test scrolling behavior
- Capture results:
  Use take_screenshot again to see changes
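The workflow above can be written down as an ordered list of tool invocations, which is roughly what Cursor ends up issuing on your behalf. This is an illustrative TypeScript sketch only: the tool and parameter names are taken from the Tool Reference in this README, while the ToolStep type, the selectors, and the typed text are made-up placeholders.

```typescript
// One step in a scripted session: which tool to call and with what arguments.
interface ToolStep {
  tool: string;
  args: Record<string, unknown>;
}

// The example workflow expressed as data. Selectors and text are placeholders
// for whatever your own page contains.
const workflow: ToolStep[] = [
  { tool: "navigate_to_url", args: { url: "http://localhost:3000" } },
  { tool: "take_screenshot", args: { fullPage: true } },
  { tool: "click_element", args: { selector: "#submit" } },
  { tool: "type_text", args: { selector: "input[name='email']", text: "test@example.com" } },
  { tool: "scroll_page", args: { direction: "down", amount: 500 } },
  { tool: "take_screenshot", args: { fullPage: true } },
];

// Print the plan in order, one line per step.
workflow.forEach((step, i) => {
  console.log((i + 1) + ". " + step.tool + " " + JSON.stringify(step.args));
});
```

Keeping the before and after screenshots as the first and last capture steps makes it easy to compare the page state around the interactions.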
Tool Reference
navigate_to_url
Navigate the browser to a specific URL.
- url (required): The URL to navigate to
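For readers curious about what a tool invocation looks like on the wire: MCP clients call tools with a JSON-RPC 2.0 request whose method is tools/call. The TypeScript sketch below builds such a request for navigate_to_url. The makeToolCall helper is hypothetical, and how mcp-http-bridge.js forwards the request to the HTTP server is not documented in this README, so treat this as an illustration of the message shape only.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function makeToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

console.log(JSON.stringify(makeToolCall(1, "navigate_to_url", { url: "http://localhost:3000" }), null, 2));
```

In normal use Cursor constructs these messages itself; the sketch is mainly useful when debugging the bridge.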
take_screenshot
Take a screenshot of the current page.
- fullPage (optional): Capture full page vs viewport only
- selector (optional): CSS selector to screenshot specific element
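This README states that screenshots are returned as base64-encoded PNG images. When saving or forwarding one, a quick sanity check is to decode the payload and compare its first bytes with the fixed 8-byte PNG signature. The sketch below is an assumption-laden illustration: looksLikePng is a hypothetical helper, and where exactly the base64 string sits inside the tool result is not documented here.

```typescript
// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true when the base64 payload decodes to data that starts with the
// PNG signature. Uses the global atob(), available in browsers and Node 16+.
function looksLikePng(base64: string): boolean {
  let binary = "";
  try {
    binary = atob(base64);
  } catch (e) {
    return false; // not valid base64
  }
  if (binary.length < PNG_SIGNATURE.length) {
    return false;
  }
  return PNG_SIGNATURE.every((byte, i) => binary.charCodeAt(i) === byte);
}

// "iVBORw0KGgo=" is the PNG signature alone, base64-encoded.
console.log(looksLikePng("iVBORw0KGgo="));
```

A payload that fails this check is usually an error message or a truncated response rather than an image.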
click_element
Click on an element specified by CSS selector.
- selector (required): CSS selector of element to click
- waitFor (optional): Milliseconds to wait after clicking (default: 1000)
type_text
Type text into an input field.
- selector (required): CSS selector of input element
- text (required): Text to type
- clear (optional): Clear field before typing (default: true)
wait_for_element
Wait for an element to appear on the page.
- selector (required): CSS selector to wait for
- timeout (optional): Timeout in milliseconds (default: 5000)
scroll_page
Scroll the page.
- direction (required): 'up', 'down', 'top', or 'bottom'
- amount (optional): Pixels to scroll for up/down (default: 500)
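The optional parameters documented so far each have a default. The TypeScript sketch below collects those documented defaults in one table and shows a hypothetical client-side helper, withDefaults, that fills them in and rejects an invalid scroll direction. The server presumably applies its own defaults; this helper is not part of the project and only mirrors the values listed in this reference.

```typescript
// Default values as documented in the Tool Reference.
const TOOL_DEFAULTS: Record<string, Record<string, unknown>> = {
  click_element: { waitFor: 1000 },    // ms to wait after clicking
  type_text: { clear: true },          // clear the field before typing
  wait_for_element: { timeout: 5000 }, // ms before giving up
  scroll_page: { amount: 500 },        // pixels, used for up/down
};

const SCROLL_DIRECTIONS = ["up", "down", "top", "bottom"];

// Hypothetical helper: merges documented defaults under the caller's
// arguments, so explicitly passed values always win.
function withDefaults(tool: string, args: Record<string, unknown>): Record<string, unknown> {
  if (tool === "scroll_page" && SCROLL_DIRECTIONS.indexOf(String(args["direction"])) === -1) {
    throw new Error("direction must be one of: " + SCROLL_DIRECTIONS.join(", "));
  }
  const defaults = TOOL_DEFAULTS[tool] || {};
  return { ...defaults, ...args };
}

console.log(withDefaults("scroll_page", { direction: "down" }));
console.log(withDefaults("type_text", { selector: "#q", text: "hello", clear: false }));
```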
get_page_info
Get information about the current page (title, URL, viewport size).
refresh_page
Refresh the current page.
go_back
Navigate back in browser history.
go_forward
Navigate forward in browser history.
generate_image
Generate custom AI images using Google Gemini 2.0 Flash Preview.
- description (required): Text description of the image to generate
Generated images are automatically saved to the generated-images/ directory and can be downloaded via HTTP at http://localhost:3045/download/{filename}.
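As a small illustration of the download URL pattern above, the following TypeScript sketch builds the URL for a given file name. The downloadUrl helper and the example file names are hypothetical; the naming scheme the server uses for generated images is not documented in this README.

```typescript
// Base of the download endpoint stated in this README.
const DOWNLOAD_BASE = "http://localhost:3045/download/";

// Hypothetical helper: builds the download URL for a generated image.
// The file name is percent-encoded so spaces and similar characters are safe.
function downloadUrl(filename: string): string {
  if (filename.length === 0) {
    throw new Error("filename must not be empty");
  }
  return DOWNLOAD_BASE + encodeURIComponent(filename);
}

console.log(downloadUrl("sunset.png"));
console.log(downloadUrl("mountain sunset.png"));
```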
Development
- Build: npm run build
- Development mode: npm run dev (watches for changes)
- Start: npm start (visible browser) or npm run start:headless (background)
- HTTP Test Server: npm run start:http (visible) or npm run start:http:headless (background)
Browser Behavior
- Visible Mode: Browser window opens so you can see what's happening
- Headless Mode: Browser runs in background (set MCP_HEADLESS=true)
- Separate Profile: Uses /tmp/chrome-mcp-data to avoid conflicts with your main Chrome
- Default viewport: 1280x720 pixels
- Screenshots: Returned as base64-encoded PNG images
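The behavior listed above corresponds to three standard Puppeteer launch options: headless, userDataDir, and defaultViewport. The TypeScript sketch below shows one way the settings could be derived from the environment. It is an assumption about the mapping, not the server's actual launch code, which this README does not show; launchSettings is a hypothetical function.

```typescript
// Subset of Puppeteer launch options relevant to the behavior listed above.
interface LaunchSettings {
  headless: boolean;
  userDataDir: string;
  defaultViewport: { width: number; height: number };
}

// Hypothetical mapping from environment variables to launch settings.
// Values come from this README: MCP_HEADLESS, the /tmp profile directory,
// and the 1280x720 default viewport.
function launchSettings(env: Record<string, string | undefined>): LaunchSettings {
  return {
    headless: env["MCP_HEADLESS"] === "true",
    userDataDir: "/tmp/chrome-mcp-data",
    defaultViewport: { width: 1280, height: 720 },
  };
}

console.log(launchSettings({ MCP_HEADLESS: "true" }));
console.log(launchSettings({}));
```

An object of this shape is what would typically be passed to puppeteer.launch(); using a dedicated userDataDir is what keeps the automation profile separate from your everyday Chrome profile.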
Troubleshooting
Browser doesn't launch
- Ensure Chrome is installed on your system
- Check that no other processes are blocking Chrome
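- Try restarting the HTTP server: npm run start:http
- Clear Chrome data directory: rm -rf /private/tmp/chrome-mcp-data (on macOS, /tmp is a symlink to /private/tmp, so this is the same profile directory as /tmp/chrome-mcp-data)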
Elements not found
- Verify CSS selectors are correct
- Use browser dev tools to test selectors
- Try waiting for elements to load with wait_for_element
MCP Tools not available in Cursor
- Ensure the HTTP server is running: npm run start:http
- Check that cursor-mcp-config.json has the correct absolute path
- Restart Cursor after configuration changes
- Verify the HTTP bridge file exists: mcp-http-bridge.js
Image generation not working
- Ensure GOOGLE_AI_KEY is set in your .env file
- Get your API key from Google AI Studio
- Check that the HTTP server is running (image generation requires HTTP mode)
Permission issues
- Ensure the MCP server has permission to launch Chrome
- Check file permissions on the built JavaScript files
- On macOS, you may need to allow Chrome in System Preferences > Security & Privacy
Security Notes
- This server launches a real browser with full system access
- Only use with trusted websites and content
- The browser runs with some security features disabled for automation
- Always run in a controlled environment
License
MIT License - see LICENSE file for details.