local-browser-mcp-server

botdojo-ai/local-browser-mcp-server

Local Browser MCP Server

A Model Context Protocol (MCP) server that provides browser automation through Puppeteer and AI image generation through Google Gemini 2.0 Flash. It lets you control a local Chrome browser instance, take screenshots, click and type on pages, and generate custom images, which makes it well suited to testing websites and creating content from Cursor.

Features

  • 🌐 Browser Navigation: Navigate to URLs, go back/forward, refresh pages
  • 📸 Screenshots: Capture full page, viewport, or specific element screenshots
  • 🖱️ Click Actions: Click on elements using CSS selectors
  • ⌨️ Text Input: Type text into input fields and forms
  • 📜 Scrolling: Scroll pages in any direction
  • Wait Operations: Wait for elements to appear
  • 📊 Page Info: Get current page title, URL, and viewport information
  • 🎨 AI Image Generation: Generate custom images using Google Gemini 2.0 Flash Preview

Installation

  1. Clone or download this repository to your local machine

  2. Install dependencies:

    npm install
    
  3. Build the project:

    npm run build
    
  4. Set up environment variables (for image generation): Create a .env file in the project root:

    GOOGLE_AI_KEY=your_google_ai_api_key_here
    

    Get your API key from Google AI Studio.
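Before starting the server, it can be useful to confirm that the key is actually present in the file. The following is a minimal sketch, not part of this project: `parseDotEnv` and `hasGoogleKey` are hypothetical helper names, and the parser only understands simple KEY=value lines.

```typescript
// Hypothetical helper: parse simple KEY=value lines from .env text.
// Blank lines and lines starting with "#" are skipped; a matching pair of
// surrounding quotes is stripped from the value.
function parseDotEnv(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key or no "=": ignore the line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const first = value.charAt(0);
    const last = value.charAt(value.length - 1);
    if (value.length >= 2 && first === last && (first === '"' || first === "'")) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

// True when GOOGLE_AI_KEY is set and is not the placeholder from the example above.
function hasGoogleKey(env: Record<string, string>): boolean {
  const key = env["GOOGLE_AI_KEY"];
  return key !== undefined && key !== "" && key !== "your_google_ai_api_key_here";
}
```

Feeding the contents of your .env file to `parseDotEnv` and passing the result to `hasGoogleKey` tells you whether the placeholder was replaced with a real value.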

Usage with Cursor

Step 1: Start the HTTP Server

Running this MCP server in HTTP mode is recommended because it avoids ES module compatibility issues:

npm run start:http

This starts the server on http://localhost:3045 and keeps it running.

Step 2: Configure MCP in Cursor

Create or update your cursor-mcp-config.json file with the following configuration:

{
  "mcpServers": {
    "local-browser": {
      "command": "node",
      "args": ["/absolute/path/to/your/project/mcp-http-bridge.js"],
      "env": {
        "NODE_ENV": "production"
      }
    }
  }
}

Important:

  • Replace /absolute/path/to/your/project/ with the actual absolute path to your project directory
  • The HTTP bridge (mcp-http-bridge.js) routes MCP requests to the HTTP server
  • Make sure the HTTP server is running before using the tools in Cursor
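If you prefer to generate the configuration rather than edit it by hand, a sketch along these lines produces the same JSON and rejects relative paths. The helper name `buildCursorConfig` and the example directory are assumptions for illustration; only the structure comes from the configuration shown above.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface CursorMcpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: build the configuration above for a given project directory.
function buildCursorConfig(projectDir: string): CursorMcpConfig {
  // Accept POSIX absolute paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
  const isAbsolute = projectDir.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(projectDir);
  if (!isAbsolute) {
    throw new Error("projectDir must be an absolute path: " + projectDir);
  }
  const base = projectDir.replace(/[\\/]+$/, ""); // drop trailing separators
  return {
    mcpServers: {
      "local-browser": {
        command: "node",
        args: [base + "/mcp-http-bridge.js"],
        env: { NODE_ENV: "production" },
      },
    },
  };
}

// Placeholder directory: substitute the location of your own checkout.
const configJson: string = JSON.stringify(
  buildCursorConfig("/Users/me/local-browser-mcp-server"),
  null,
  2
);
```

The resulting `configJson` string can be written to cursor-mcp-config.json as-is.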

Step 3: Restart Cursor

After updating the configuration, restart Cursor to load the new MCP server.

Step 4: Start Using the Tools

Once configured, you can use the following tools in Cursor:

Browser Navigation

  • navigate_to_url - Navigate to any URL
  • go_back - Go back in browser history
  • go_forward - Go forward in browser history
  • refresh_page - Refresh the current page

Screenshots & Visual Capture

  • take_screenshot - Capture screenshots (full page, viewport, or specific elements)

Page Interactions

  • click_element - Click on elements using CSS selectors
  • type_text - Type text into input fields
  • scroll_page - Scroll the page in any direction
  • wait_for_element - Wait for elements to appear

Page Information

  • get_page_info - Get current page title, URL, and viewport info

AI Image Generation

  • generate_image - Generate custom images using Google Gemini 2.0 Flash Preview

Quick Start Example

Here's how to get started quickly:

  1. Start the HTTP server:

    npm run start:http
    
  2. In Cursor, try prompts such as:

    • "Navigate to google.com and take a screenshot"
    • "Generate an image of a sunset over mountains"
    • "Click on the search button and type 'hello world'"

Example Workflow

Here's a typical workflow when testing a website you've created:

  1. Navigate to your local development server:

    Use navigate_to_url with "http://localhost:3000"
    
  2. Take a screenshot to see the current state:

    Use take_screenshot to capture the full page
    
  3. Interact with your website:

    Use click_element to click buttons
    Use type_text to fill out forms
    Use scroll_page to test scrolling behavior
    
  4. Capture results:

    Use take_screenshot again to see changes
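Expressed as data, the workflow above is an ordered list of tool calls. The sketch below uses the parameter names from the Tool Reference; the selectors and the typed text are made-up placeholders for your own page, and `describeWorkflow` is a hypothetical helper.

```typescript
// One step of a workflow: a tool name plus the arguments passed to it.
interface WorkflowStep {
  tool: string;
  args: Record<string, string | number | boolean>;
}

// Placeholder selectors and text: replace them with values from your own site.
const exampleWorkflow: WorkflowStep[] = [
  { tool: "navigate_to_url", args: { url: "http://localhost:3000" } },
  { tool: "take_screenshot", args: { fullPage: true } },
  { tool: "click_element", args: { selector: "#open-form" } },
  { tool: "type_text", args: { selector: "#email", text: "user@example.com" } },
  { tool: "scroll_page", args: { direction: "down", amount: 500 } },
  { tool: "take_screenshot", args: { fullPage: true } },
];

// Render the steps as a readable, numbered checklist.
function describeWorkflow(steps: WorkflowStep[]): string[] {
  return steps.map((step, i) => (i + 1) + ". " + step.tool + " " + JSON.stringify(step.args));
}
```

Writing the steps down this way makes it easy to reorder them or to paste the checklist into a Cursor prompt.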
    

Tool Reference

navigate_to_url

Navigate the browser to a specific URL.

  • url (required): The URL to navigate to
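Cursor issues tool calls for you, but it can help to see what one looks like on the wire. MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call`, passing the tool name and its arguments. The sketch below builds such a request for navigate_to_url; the helper name `buildToolCall` and the request id are illustrative and not part of this server.

```typescript
// JSON-RPC 2.0 envelope that MCP clients use to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

const navigateRequest = buildToolCall(1, "navigate_to_url", {
  url: "http://localhost:3000",
});
```

The same envelope applies to every tool in this reference; only `name` and `arguments` change.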

take_screenshot

Take a screenshot of the current page.

  • fullPage (optional): Capture the full page instead of only the viewport
  • selector (optional): CSS selector of a specific element to capture
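How the two options combine is not spelled out here. One plausible reading, shown below as an assumption rather than the server's confirmed behavior, is that selector takes precedence over fullPage and that omitting both captures the viewport. `resolveScreenshotMode` is a hypothetical name.

```typescript
type ScreenshotMode =
  | { kind: "element"; selector: string }
  | { kind: "fullPage" }
  | { kind: "viewport" };

interface ScreenshotArgs {
  fullPage?: boolean;
  selector?: string;
}

// Assumed precedence: selector first, then fullPage, otherwise the visible viewport.
function resolveScreenshotMode(args: ScreenshotArgs): ScreenshotMode {
  if (args.selector !== undefined && args.selector.trim() !== "") {
    return { kind: "element", selector: args.selector };
  }
  if (args.fullPage === true) {
    return { kind: "fullPage" };
  }
  return { kind: "viewport" };
}
```

If the server behaves differently when both options are given, passing only one of them per call avoids the ambiguity.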

click_element

Click on an element specified by CSS selector.

  • selector (required): CSS selector of element to click
  • waitFor (optional): Milliseconds to wait after clicking (default: 1000)

type_text

Type text into an input field.

  • selector (required): CSS selector of input element
  • text (required): Text to type
  • clear (optional): Clear field before typing (default: true)

wait_for_element

Wait for an element to appear on the page.

  • selector (required): CSS selector to wait for
  • timeout (optional): Timeout in milliseconds (default: 5000)
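The documented defaults for click_element, type_text, and wait_for_element can be collected in one place. The sketch below fills in omitted optional parameters using the values listed above; the function name `withDefaults` and the table layout are illustrative and not taken from the server's source.

```typescript
type ToolArgs = Record<string, string | number | boolean>;

// Default values as documented in this tool reference.
const DOCUMENTED_DEFAULTS: Record<string, ToolArgs> = {
  click_element: { waitFor: 1000 },
  type_text: { clear: true },
  wait_for_element: { timeout: 5000 },
};

// Hypothetical helper: return a copy of args with omitted optional values filled in.
function withDefaults(tool: string, args: ToolArgs): ToolArgs {
  const merged: ToolArgs = {};
  const defaults: ToolArgs = DOCUMENTED_DEFAULTS[tool] || {};
  for (const key of Object.keys(defaults)) {
    merged[key] = defaults[key];
  }
  for (const key of Object.keys(args)) {
    merged[key] = args[key]; // explicitly passed values win over defaults
  }
  return merged;
}
```

For example, `withDefaults("click_element", { selector: "#save" })` yields the selector plus a `waitFor` of 1000 milliseconds.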

scroll_page

Scroll the page.

  • direction (required): 'up', 'down', 'top', or 'bottom'
  • amount (optional): Pixels to scroll for up/down (default: 500)
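The four directions can be thought of as computing a new vertical scroll offset. The following sketch models that under the assumption that up and down move by `amount` pixels (default 500) while top and bottom jump to the page extremes. The function name and the clamping are illustrative and may differ from the server's implementation.

```typescript
type ScrollDirection = "up" | "down" | "top" | "bottom";

// currentY: current vertical offset in pixels.
// maxY: largest reachable offset (page height minus viewport height).
function nextScrollY(
  direction: ScrollDirection,
  currentY: number,
  maxY: number,
  amount: number = 500
): number {
  let target: number;
  switch (direction) {
    case "top":
      target = 0;
      break;
    case "bottom":
      target = maxY;
      break;
    case "up":
      target = currentY - amount;
      break;
    default: // "down"
      target = currentY + amount;
  }
  // Keep the result inside the scrollable range.
  return Math.min(Math.max(target, 0), Math.max(maxY, 0));
}
```

Clamping reflects how browsers treat out-of-range offsets: scrolling up from near the top simply stops at 0.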

get_page_info

Get information about the current page (title, URL, viewport size).

refresh_page

Refresh the current page.

go_back

Navigate back in browser history.

go_forward

Navigate forward in browser history.

generate_image

Generate custom AI images using Google Gemini 2.0 Flash Preview.

  • description (required): Text description of the image to generate

Generated images are automatically saved to the generated-images/ directory and can be downloaded via HTTP at http://localhost:3045/download/{filename}.
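A download link can be assembled from the file name returned by the tool. This is a small sketch assuming the server runs on the default address given above; `downloadUrl` is a hypothetical helper, and the file names in the usage note are invented because the naming scheme for generated images is not documented here.

```typescript
// Base URL of the HTTP server as documented above.
const SERVER_BASE_URL = "http://localhost:3045";

// Hypothetical helper: build the download URL for a generated image.
// The file name is percent-encoded, and path separators are rejected so the
// result always points directly under /download/.
function downloadUrl(filename: string): string {
  if (filename === "" || filename.indexOf("/") !== -1 || filename.indexOf("\\") !== -1) {
    throw new Error("filename must be a plain file name: " + filename);
  }
  return SERVER_BASE_URL + "/download/" + encodeURIComponent(filename);
}
```

For a file named sunset.png this gives http://localhost:3045/download/sunset.png, and spaces in a name are encoded as %20.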

Development

  • Build: npm run build
  • Development mode: npm run dev (watches for changes)
  • Start: npm start (visible browser) or npm run start:headless (background)
  • HTTP Test Server: npm run start:http (visible) or npm run start:http:headless (background)

Browser Behavior

  • Visible Mode: Browser window opens so you can see what's happening
  • Headless Mode: Browser runs in background (set MCP_HEADLESS=true)
  • Separate Profile: Uses /tmp/chrome-mcp-data to avoid conflicts with your main Chrome
  • Default viewport: 1280x720 pixels
  • Screenshots: Returned as base64-encoded PNG images
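These settings map naturally onto Puppeteer launch options. The sketch below shows one way the documented behavior could be derived from the environment. `headless`, `userDataDir`, and `defaultViewport` are real Puppeteer launch option names, but the function itself and the exact way this server reads MCP_HEADLESS are assumptions.

```typescript
interface BrowserLaunchOptions {
  headless: boolean;
  userDataDir: string;
  defaultViewport: { width: number; height: number };
}

// env is passed in explicitly (for example process.env in Node) so the
// function stays pure and easy to test.
function launchOptionsFromEnv(
  env: Record<string, string | undefined>
): BrowserLaunchOptions {
  return {
    // Assumed rule: headless only when MCP_HEADLESS is exactly "true".
    headless: env["MCP_HEADLESS"] === "true",
    // Separate profile directory, as documented above.
    userDataDir: "/tmp/chrome-mcp-data",
    // Default viewport, as documented above.
    defaultViewport: { width: 1280, height: 720 },
  };
}
```

An object of this shape could be handed to `puppeteer.launch(...)` to start Chrome in visible or headless mode.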

Troubleshooting

Browser doesn't launch

  • Ensure Chrome is installed on your system
  • Check that no other processes are blocking Chrome
  • Try restarting the HTTP server: npm run start:http
  • Clear the Chrome data directory: rm -rf /tmp/chrome-mcp-data (on macOS, /tmp resolves to /private/tmp)

Elements not found

  • Verify CSS selectors are correct
  • Use browser dev tools to test selectors
  • Try waiting for elements to load with wait_for_element

MCP Tools not available in Cursor

  • Ensure the HTTP server is running: npm run start:http
  • Check that cursor-mcp-config.json has the correct absolute path
  • Restart Cursor after configuration changes
  • Verify the HTTP bridge file exists: mcp-http-bridge.js

Image generation not working

  • Ensure GOOGLE_AI_KEY is set in your .env file
  • Get your API key from Google AI Studio
  • Check that the HTTP server is running (image generation requires HTTP mode)

Permission issues

  • Ensure the MCP server has permission to launch Chrome
  • Check file permissions on the built JavaScript files
  • On macOS, you may need to allow Chrome in System Preferences > Security & Privacy (System Settings > Privacy & Security on macOS 13 and later)

Security Notes

  • This server launches a real browser with full system access
  • Only use with trusted websites and content
  • The browser runs with some security features disabled for automation
  • Always run in a controlled environment

License

MIT License - see LICENSE file for details.