local-browser-mcp-server by botdojo-ai - MCP Server

Local Browser MCP Server

A Model Context Protocol (MCP) server that provides browser automation through Puppeteer and AI image generation through Google Gemini 2.0 Flash. It lets you control a local Chrome browser instance, take screenshots, click and type on pages, and generate custom images, which makes it useful for testing websites and creating content from Cursor.

Features

🌐 Browser Navigation: Navigate to URLs, go back/forward, refresh pages
📸 Screenshots: Capture full page, viewport, or specific element screenshots
🖱️ Click Actions: Click on elements using CSS selectors
⌨️ Text Input: Type text into input fields and forms
📜 Scrolling: Scroll pages in any direction
⏳ Wait Operations: Wait for elements to appear
📊 Page Info: Get current page title, URL, and viewport information
🎨 AI Image Generation: Generate custom images using Google Gemini 2.0 Flash Preview

Installation

Clone or download this repository to your local machine
Install dependencies:
```
npm install
```
Build the project:
```
npm run build
```
Set up environment variables (for image generation): Create a .env file in the project root:
```
GOOGLE_AI_KEY=your_google_ai_api_key_here
```
Get your API key from Google AI Studio.
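Before starting the server, it can help to confirm that the .env file actually defines the key. The following is a minimal sketch, not the server's own loader (this README does not document how the server reads .env). The helper names `parseEnv` and `checkGoogleAiKey` are hypothetical, and the parser handles only simple `KEY=value` lines.

```typescript
// Hypothetical helper: parses simple KEY=value lines, skipping blanks and # comments.
// This is an illustration only; the server's real .env handling may differ.
function parseEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no KEY= prefix on this line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    if (
      value.length >= 2 &&
      (first === '"' || first === "'") &&
      value.charAt(value.length - 1) === first
    ) {
      value = value.slice(1, -1);
    }
    vars[key] = value;
  }
  return vars;
}

// Returns a description of the problem, or null if the key looks usable.
function checkGoogleAiKey(envText: string): string | null {
  const key = parseEnv(envText)["GOOGLE_AI_KEY"];
  if (key === undefined || key === "") {
    return "GOOGLE_AI_KEY is missing or empty";
  }
  if (key === "your_google_ai_api_key_here") {
    return "GOOGLE_AI_KEY still holds the placeholder value";
  }
  return null;
}
```

Passing the contents of your .env file to `checkGoogleAiKey` returns `null` when a non-placeholder key is present. It does not verify that Google accepts the key.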

Usage with Cursor

Step 1: Start the HTTP Server

The recommended way to use this MCP server is through HTTP mode to avoid ES module compatibility issues:

```
npm run start:http
```

This starts the server on http://localhost:3045 and keeps it running.

Step 2: Configure MCP in Cursor

Create or update your cursor-mcp-config.json file with the following configuration:

```
{
  "mcpServers": {
    "local-browser": {
      "command": "node",
      "args": ["/absolute/path/to/your/project/mcp-http-bridge.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}
```

Important:

Replace /absolute/path/to/your/project/ with the actual absolute path to your project directory
The HTTP bridge (mcp-http-bridge.js) routes MCP requests to the HTTP server
Make sure the HTTP server is running before using the tools in Cursor
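If you prefer to generate the configuration rather than edit it by hand, the sketch below builds the same JSON structure from a project directory and rejects relative paths, which are the most common mistake here. The function name `buildCursorConfig` and the sample directory are hypothetical. The keys and values mirror the configuration shown above.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface CursorMcpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the cursor-mcp-config.json content for a project directory.
function buildCursorConfig(projectDir: string): CursorMcpConfig {
  // POSIX absolute paths start with "/"; Windows ones with a drive letter such as "C:\".
  const isAbsolute = /^(\/|[A-Za-z]:[\\/])/.test(projectDir);
  if (!isAbsolute) {
    throw new Error("projectDir must be an absolute path: " + projectDir);
  }
  const sep = projectDir.charAt(0) === "/" ? "/" : "\\";
  const dir = projectDir.replace(/[\\/]+$/, ""); // drop trailing separators
  return {
    mcpServers: {
      "local-browser": {
        command: "node",
        args: [dir + sep + "mcp-http-bridge.js"],
        env: { NODE_ENV: "production" },
      },
    },
  };
}

// Example with a made-up directory; substitute your own absolute path.
console.log(JSON.stringify(buildCursorConfig("/home/me/local-browser-mcp-server"), null, 2));
```

The printed JSON can be saved as cursor-mcp-config.json.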

Step 3: Restart Cursor

After updating the configuration, restart Cursor to load the new MCP server.

Step 4: Start Using the Tools

Once configured, you can use the following tools in Cursor:

Browser Navigation

navigate_to_url - Navigate to any URL
go_back - Go back in browser history
go_forward - Go forward in browser history
refresh_page - Refresh the current page

Screenshots & Visual Capture

take_screenshot - Capture screenshots (full page, viewport, or specific elements)

Page Interactions

click_element - Click on elements using CSS selectors
type_text - Type text into input fields
scroll_page - Scroll the page in any direction
wait_for_element - Wait for elements to appear

Page Information

get_page_info - Get current page title, URL, and viewport info

AI Image Generation

generate_image - Generate custom images using Google Gemini 2.0 Flash Preview

Quick Start Example

Here's how to get started quickly:

Start the HTTP server:
```
npm run start:http
```
In Cursor, try these commands:
- "Navigate to google.com and take a screenshot"
- "Generate an image of a sunset over mountains"
- "Click on the search button and type 'hello world'"

Example Workflow

Here's a typical workflow when testing a website you've created:

Navigate to your local development server:

Use navigate_to_url with "http://localhost:3000"

Take a screenshot to see the current state:

Use take_screenshot to capture the full page

Interact with your website:

Use click_element to click buttons
Use type_text to fill out forms
Use scroll_page to test scrolling behavior

Capture results:

Use take_screenshot again to see changes
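In Cursor you describe these steps in plain language and the tool calls are issued for you. For reference, the sketch below expresses the same workflow as a sequence of MCP `tools/call` requests (JSON-RPC 2.0, as defined by the MCP specification). The tool and parameter names come from this README. The selectors, the typed text, the boolean `fullPage` value, and the function name `buildWorkflow` are assumptions for illustration. How the bridge transports these requests to the HTTP server is not documented here, so the sketch only builds the payloads.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: returns the example workflow as ordered MCP tools/call requests.
function buildWorkflow(baseUrl: string): ToolCallRequest[] {
  const steps: Array<[string, Record<string, unknown>]> = [
    ["navigate_to_url", { url: baseUrl }],
    ["take_screenshot", { fullPage: true }], // assumes fullPage is a boolean
    ["type_text", { selector: "#email", text: "user@example.com" }], // made-up selector
    ["click_element", { selector: "button[type=submit]" }], // made-up selector
    ["scroll_page", { direction: "down" }],
    ["take_screenshot", { fullPage: true }],
  ];
  return steps.map(
    (step, i): ToolCallRequest => ({
      jsonrpc: "2.0",
      id: i + 1,
      method: "tools/call",
      params: { name: step[0], arguments: step[1] },
    })
  );
}

console.log(JSON.stringify(buildWorkflow("http://localhost:3000"), null, 2));
```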

Tool Reference

navigate_to_url

Navigate the browser to a specific URL.

url (required): The URL to navigate to

take_screenshot

Take a screenshot of the current page.

fullPage (optional): Capture the full page instead of only the viewport
selector (optional): CSS selector of a specific element to capture

click_element

Click on an element specified by CSS selector.

selector (required): CSS selector of element to click
waitFor (optional): Milliseconds to wait after clicking (default: 1000)

type_text

Type text into an input field.

selector (required): CSS selector of input element
text (required): Text to type
clear (optional): Clear field before typing (default: true)

wait_for_element

Wait for an element to appear on the page.

selector (required): CSS selector to wait for
timeout (optional): Timeout in milliseconds (default: 5000)
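The optional parameters of click_element, type_text, and wait_for_element each have a documented default. The sketch below shows what the effective arguments look like once those defaults are filled in. The interface and function names are hypothetical, and the server applies its own defaults, so this is only a client-side illustration of the documented behavior.

```typescript
interface ClickArgs { selector: string; waitFor?: number }
interface TypeArgs { selector: string; text: string; clear?: boolean }
interface WaitArgs { selector: string; timeout?: number }

// Each resolver fills in the default stated in the tool reference above.
function resolveClick(a: ClickArgs): Required<ClickArgs> {
  return { selector: a.selector, waitFor: a.waitFor ?? 1000 }; // ms to wait after the click
}

function resolveType(a: TypeArgs): Required<TypeArgs> {
  return { selector: a.selector, text: a.text, clear: a.clear ?? true }; // clear field first
}

function resolveWait(a: WaitArgs): Required<WaitArgs> {
  return { selector: a.selector, timeout: a.timeout ?? 5000 }; // ms before giving up
}

// "#search" is a made-up selector.
console.log(resolveClick({ selector: "#search" }));
console.log(resolveType({ selector: "#search", text: "hello world", clear: false }));
console.log(resolveWait({ selector: "#search" }));
```

Using `??` rather than `||` keeps explicit values such as `waitFor: 0` or `clear: false` instead of replacing them with the default.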

scroll_page

Scroll the page.

direction (required): 'up', 'down', 'top', or 'bottom'
amount (optional): Pixels to scroll for up/down (default: 500)
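Because direction accepts only four values and amount applies only to 'up' and 'down', arguments can be checked before they are sent. The following is a sketch under stated assumptions: the name `parseScrollArgs` is hypothetical, and dropping amount for 'top' and 'bottom' and rejecting negative or non-finite amounts are choices made for this example, not documented server behavior.

```typescript
type ScrollDirection = "up" | "down" | "top" | "bottom";

interface ScrollArgs {
  direction: ScrollDirection;
  amount?: number;
}

// Hypothetical validator for scroll_page arguments.
function parseScrollArgs(input: { direction: string; amount?: number }): ScrollArgs {
  const allowed: ScrollDirection[] = ["up", "down", "top", "bottom"];
  const match = allowed.filter((d) => d === input.direction)[0];
  if (match === undefined) {
    throw new Error("direction must be one of: " + allowed.join(", "));
  }
  if (match === "top" || match === "bottom") {
    // amount is documented for up/down only, so this sketch omits it here.
    return { direction: match };
  }
  const amount = input.amount ?? 500; // documented default in pixels
  if (!isFinite(amount) || amount < 0) {
    // Assumption of this sketch: negative or non-finite amounts are rejected.
    throw new Error("amount must be a non-negative number of pixels");
  }
  return { direction: match, amount: amount };
}

console.log(parseScrollArgs({ direction: "down" }));
console.log(parseScrollArgs({ direction: "top", amount: 200 }));
```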

get_page_info

Get information about the current page (title, URL, viewport size).

refresh_page

Refresh the current page.

go_back

Navigate back in browser history.

go_forward

Navigate forward in browser history.

generate_image

Generate custom AI images using Google Gemini 2.0 Flash Preview.

description (required): Text description of the image to generate

Generated images are automatically saved to the generated-images/ directory and can be downloaded via HTTP at http://localhost:3045/download/{filename}.
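To turn a generated filename into its download link, a small helper along these lines may be useful. The name `downloadUrl` and the sample filenames are hypothetical. Percent-encoding the filename and rejecting path separators are standard URL hygiene assumed for this sketch, not behavior confirmed by the server.

```typescript
// Hypothetical helper: builds the download URL for a file in generated-images/.
function downloadUrl(filename: string, baseUrl: string = "http://localhost:3045"): string {
  if (filename === "" || filename.indexOf("/") !== -1 || filename.indexOf("\\") !== -1) {
    throw new Error("filename must be a plain file name without path separators");
  }
  // Trim trailing slashes from the base, then percent-encode the file name.
  return baseUrl.replace(/\/+$/, "") + "/download/" + encodeURIComponent(filename);
}

console.log(downloadUrl("sunset.png")); // made-up filename
```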

Development

Build: npm run build
Development mode: npm run dev (watches for changes)
Start: npm start (visible browser) or npm run start:headless (background)
HTTP Test Server: npm run start:http (visible) or npm run start:http:headless (background)

Browser Behavior

Visible Mode: Browser window opens so you can see what's happening
Headless Mode: Browser runs in background (set MCP_HEADLESS=true)
Separate Profile: Uses /tmp/chrome-mcp-data to avoid conflicts with your main Chrome
Default viewport: 1280x720 pixels
Screenshots: Returned as base64-encoded PNG images
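Since screenshots come back as base64-encoded PNG data, a client can sanity-check a result before saving or displaying it. The sketch below decodes the header, verifies the PNG signature, and reads the image dimensions from the IHDR chunk, following the PNG file layout. The function name `inspectPngBase64` is hypothetical, and the exact shape of the tool response that carries the base64 string is not documented here.

```typescript
interface PngInfo {
  width: number;
  height: number;
}

// Hypothetical helper: validates base64 PNG data and returns its dimensions.
function inspectPngBase64(data: string): PngInfo {
  // Accept an optional data-URL prefix in addition to raw base64.
  const b64 = data.replace(/^data:image\/png;base64,/, "");
  const binary = atob(b64); // binary string, one char per byte
  if (binary.length < 24) {
    throw new Error("too short to be a PNG");
  }
  // The 8-byte PNG signature: 0x89 "PNG" CR LF 0x1A LF.
  if (binary.slice(0, 8) !== "\x89PNG\r\n\x1a\n") {
    throw new Error("not a PNG: bad signature");
  }
  // The first chunk must be IHDR: bytes 8-11 length, 12-15 type.
  if (binary.slice(12, 16) !== "IHDR") {
    throw new Error("not a PNG: missing IHDR chunk");
  }
  // Width and height are big-endian 32-bit integers at offsets 16 and 20.
  const readU32 = (o: number): number =>
    ((binary.charCodeAt(o) << 24) |
      (binary.charCodeAt(o + 1) << 16) |
      (binary.charCodeAt(o + 2) << 8) |
      binary.charCodeAt(o + 3)) >>> 0;
  return { width: readU32(16), height: readU32(20) };
}
```

For a viewport screenshot taken with the default settings, the reported size should normally match the 1280x720 viewport unless a device scale factor is in effect.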

Troubleshooting

Browser doesn't launch

Ensure Chrome is installed on your system
Check that no other processes are blocking Chrome
Try restarting the HTTP server: npm run start:http
Clear the Chrome data directory: rm -rf /tmp/chrome-mcp-data (on macOS, /tmp is a symlink to /private/tmp, so rm -rf /private/tmp/chrome-mcp-data is equivalent)

Elements not found

Verify CSS selectors are correct
Use browser dev tools to test selectors
Try waiting for elements to load with wait_for_element

MCP Tools not available in Cursor

Ensure the HTTP server is running: npm run start:http
Check that cursor-mcp-config.json has the correct absolute path
Restart Cursor after configuration changes
Verify the HTTP bridge file exists: mcp-http-bridge.js

Image generation not working

Ensure GOOGLE_AI_KEY is set in your .env file
Get your API key from Google AI Studio
Check that the HTTP server is running (image generation requires HTTP mode)

Permission issues

Ensure the MCP server has permission to launch Chrome
Check file permissions on the built JavaScript files
On macOS, you may need to allow Chrome in System Preferences > Security & Privacy (System Settings > Privacy & Security on macOS Ventura and later)

Security Notes

This server launches a real browser with full system access
Only use with trusted websites and content
The browser runs with some security features disabled for automation
Always run in a controlled environment

License

MIT License - see LICENSE file for details.