Puppeteer MCP Server

A headless Chrome automation server built with Next.js and Puppeteer, designed to run on Vercel with multiple cloud browser options. This project provides RESTful API endpoints for web scraping, screenshot capture, PDF generation, and performance monitoring.

🔗 GitHub Repository: https://github.com/axo-mvo/puppeteer-mcp-server
🌐 Live Demo: https://puppeteer-mcp-server.vercel.app
🎮 Interactive Demo: https://puppeteer-mcp-server.vercel.app/demo

Quick Test

Try the live API right now:

# Take a screenshot
curl "https://puppeteer-mcp-server.vercel.app/api/screenshot-chromium?url=https://example.com" -o screenshot.png

# Extract text content
curl -X POST "https://puppeteer-mcp-server.vercel.app/api/scrape" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","action":"text"}'
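
The same screenshot call can be prepared from TypeScript. The sketch below only builds the request URL; `buildScreenshotUrl` is a hypothetical helper, not part of this project. It percent-encodes the target so that a page URL carrying its own query string is not misread as extra parameters of the API call.

```typescript
// Hypothetical helper (not part of this repository): builds the GET URL for
// the screenshot endpoint shown above, percent-encoding the target page URL.
const BASE_URL = "https://puppeteer-mcp-server.vercel.app";

function buildScreenshotUrl(targetUrl: string, baseUrl: string = BASE_URL): string {
  // Drop any trailing slash on the base so the path is not doubled up.
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/api/screenshot-chromium?url=${encodeURIComponent(targetUrl)}`;
}

// A target with its own query string survives intact after encoding.
const requestUrl = buildScreenshotUrl("https://example.com/search?q=mcp&page=2");
```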

Project Structure

puppeteer-vercel/
├── app/
│   ├── api/
│   │   ├── screenshot-browserless/     # Browserless.io API endpoint
│   │   │   └── route.ts
│   │   ├── screenshot-chromium/        # @sparticuz/chromium API endpoint
│   │   │   └── route.ts
│   │   ├── scrape/                     # Advanced scraping API endpoint
│   │   │   └── route.ts
│   │   └── mcp/                        # Model Context Protocol endpoint
│   │       └── route.ts
│   ├── demo/                           # Interactive demo page
│   │   └── page.tsx
│   ├── page.tsx                        # Main landing page
│   ├── layout.tsx                      # Root layout
│   └── globals.css                     # Global styles
├── package.json                        # Dependencies and scripts
├── next.config.ts                      # Next.js configuration
├── tailwind.config.ts                  # Tailwind CSS configuration
└── README.md                           # Project documentation

Architectural Decisions

Browser Execution Options

The project implements two approaches for running headless Chrome in serverless environments:

  1. @sparticuz/chromium (/api/screenshot-chromium)

    • Self-contained Chromium binary packaged with the function
    • No external dependencies required
    • Larger bundle size but more reliable
    • Works within Vercel's function size limits
  2. Browserless.io (/api/screenshot-browserless)

    • Connects to managed headless Chrome instances in the cloud
    • Requires BLESS_TOKEN environment variable
    • Smaller function size
    • Better for high-volume usage
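
One way to picture the trade-off is a small resolver that maps a backend choice to its route and checks the one precondition named above, the BLESS_TOKEN variable. This is an illustrative sketch only: `resolveScreenshotRoute`, `pickBackend`, and the `Backend` type are hypothetical names, and the project itself simply exposes the two routes side by side.

```typescript
// Hypothetical sketch: the names below are illustrative, not taken from the repo.
type Backend = "chromium" | "browserless";

// The environment is passed in explicitly (e.g. process.env) to keep this pure.
type Env = Record<string, string | undefined>;

const ROUTES: Record<Backend, string> = {
  chromium: "/api/screenshot-chromium",       // bundled @sparticuz/chromium binary
  browserless: "/api/screenshot-browserless", // managed Browserless.io instance
};

function resolveScreenshotRoute(backend: Backend, env: Env): string {
  // Only the Browserless.io route needs a credential.
  if (backend === "browserless" && !env["BLESS_TOKEN"]) {
    throw new Error("BLESS_TOKEN must be set to use the Browserless.io endpoint");
  }
  return ROUTES[backend];
}

// Fall back to the self-contained backend when no token is configured.
function pickBackend(env: Env): Backend {
  return env["BLESS_TOKEN"] ? "browserless" : "chromium";
}
```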

API Design

  • RESTful endpoints for different functionality
  • TypeScript interfaces for request/response validation
  • Comprehensive error handling with proper HTTP status codes
  • Caching headers for improved performance
  • Flexible options for customization
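
As a rough illustration of the error-handling and caching bullets, a route handler could funnel every outcome through a pair of small mapping functions. The sketch below is an assumption about how such a mapping might look; `ApiResult`, `Failure`, `errorResult`, `imageResult`, the chosen status codes, and the Cache-Control values are all hypothetical and not taken from the project's source.

```typescript
// Hypothetical sketch: shapes, status codes, and cache lifetime are
// assumptions for illustration, not the project's actual behaviour.
interface ApiResult {
  status: number;
  headers: Record<string, string>;
  body: string | Uint8Array;
}

type FailureKind = "bad_request" | "timeout" | "internal";

interface Failure {
  kind: FailureKind;
  message: string;
}

// One place that decides which HTTP status each failure kind maps to.
const STATUS_BY_KIND: Record<FailureKind, number> = {
  bad_request: 400, // e.g. missing or malformed url parameter
  timeout: 504,     // the page did not load in time
  internal: 500,    // anything unexpected
};

function errorResult(failure: Failure): ApiResult {
  return {
    status: STATUS_BY_KIND[failure.kind],
    // Error responses should never be cached.
    headers: { "Content-Type": "application/json", "Cache-Control": "no-store" },
    body: JSON.stringify({ error: failure.message }),
  };
}

// Successful screenshots can be cached briefly by clients and CDNs.
function imageResult(png: Uint8Array, maxAgeSeconds: number = 60): ApiResult {
  return {
    status: 200,
    headers: {
      "Content-Type": "image/png",
      "Cache-Control": `public, max-age=${maxAgeSeconds}`,
    },
    body: png,
  };
}
```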

Setup Instructions

Prerequisites

  • Node.js 18+
  • npm, yarn, or pnpm

Local Development

  1. Clone and install dependencies:

    git clone https://github.com/axo-mvo/puppeteer-mcp-server.git
    cd puppeteer-mcp-server
    npm install
    
  2. Set up environment variables (for Browserless.io): Create a .env.local file:

    BLESS_TOKEN=your_browserless_api_token_here
    
  3. Run the development server:

    npm run dev
    
  4. Access the application:

Deployment to Vercel

  1. Push to GitHub:

    git init
    git add .
    git commit -m "Initial commit"
    git remote add origin <your-github-repo-url>
    git push -u origin main
    
  2. Deploy to Vercel:

    • Import your GitHub repository in the Vercel dashboard
    • Add environment variables if using Browserless.io:
      • BLESS_TOKEN: Your Browserless.io API token
  3. Configure function timeout (optional): Add to vercel.json:

    {
      "functions": {
        "app/api/**/route.ts": {
          "maxDuration": 30
        }
      }
    }
    

MCP (Model Context Protocol) Support

This server supports the Model Context Protocol (MCP), so it can act as a tool server for Cursor, Claude Desktop, and other MCP-compatible clients.

MCP Endpoints

URL: /api/mcp (Full-featured for Cursor, Claude Desktop, etc.)
URL: /api/mcp-compliant (ChatGPT-compliant version with restrictions)

The MCP endpoint provides the following tools:

| Tool Name | Description | Parameters |
| --- | --- | --- |
| take_screenshot | Take a screenshot of a web page | url, fullPage?, viewport?, waitFor? |
| generate_pdf | Generate a PDF from a web page | url, format?, landscape?, waitFor? |
| extract_text | Extract text content from a web page | url, selector?, waitFor? |
| extract_html | Extract HTML content from a web page | url, selector?, waitFor? |
| get_performance_metrics | Get performance metrics for a web page | url, waitFor? |
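
MCP is built on JSON-RPC 2.0, and clients invoke a tool with the `tools/call` method. Cursor and Claude Desktop build these messages themselves, so the sketch below is only meant to show what a call to `take_screenshot` looks like on the wire. `buildToolCall` is a hypothetical helper, and the argument values are examples.

```typescript
// Hypothetical helper: builds an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: a full-page screenshot. Optional parameters (those marked `?`
// in the tool list) are simply left out when not needed.
const screenshotCall = buildToolCall(1, "take_screenshot", {
  url: "https://example.com",
  fullPage: true,
});

// The serialized message a client would send to the MCP endpoint.
const payload = JSON.stringify(screenshotCall);
```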

Adding to AI Assistants

For Cursor/Claude Desktop (Full Version)

Add the following to ~/.cursor/mcp.json (Cursor's MCP configuration file):

{
  "mcpServers": {
    "puppeteer-mcp": {
      "url": "https://puppeteer-mcp-server.vercel.app/api/mcp",
      "transport": "sse"
    }
  }
}

For ChatGPT (Compliant Version)

Use the compliant endpoint with restricted functionality:

  1. Go to ChatGPT Settings → Connectors → Create
  2. Add MCP Server URL: https://puppeteer-mcp-server.vercel.app/api/mcp-compliant
  3. Only tools named search and retrieve are available (OpenAI requirement)
  4. Access is limited to approved domains for security

Using MCP Tools

Once configured, you can use natural language commands in Cursor:

  • "Take a screenshot of example.com"
  • "Generate a PDF of the homepage"
  • "Extract the title text from this website"
  • "Get performance metrics for this URL"

API Endpoints

Screenshot APIs

GET /api/screenshot-chromium

Takes screenshots using @sparticuz/chromium.

Parameters:

  • url (required): URL to capture

Example:

curl "https://your-app.vercel.app/api/screenshot-chromium?url=https://example.com"

GET /api/screenshot-browserless

Takes screenshots using Browserless.io.

Parameters:

  • url (required): URL to capture

Requires: BLESS_TOKEN environment variable

Example:

curl "https://your-app.vercel.app/api/screenshot-browserless?url=https://example.com"

Advanced Scraping API

POST /api/scrape

Comprehensive scraping endpoint with multiple actions.

Request Body:

{
  url: string;
  action: 'screenshot' | 'pdf' | 'text' | 'html' | 'performance';
  options?: {
    fullPage?: boolean;
    selector?: string;
    format?: 'A4' | 'Letter';
    landscape?: boolean;
    waitFor?: number;
    viewport?: {
      width: number;
      height: number;
    };
  };
}
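
A type like this exists only at compile time, so a handler still has to check incoming JSON at runtime. The following is a minimal sketch of such a check, assuming the field names above; `parseScrapeRequest`, `ParseResult`, and the error messages are hypothetical and are not the project's own validation code.

```typescript
// Hypothetical runtime check mirroring the request body described above.
type Action = "screenshot" | "pdf" | "text" | "html" | "performance";
const ACTIONS: readonly Action[] = ["screenshot", "pdf", "text", "html", "performance"];

interface ScrapeRequest {
  url: string;
  action: Action;
  options?: Record<string, unknown>;
}

type ParseResult =
  | { ok: true; value: ScrapeRequest }
  | { ok: false; error: string };

function parseScrapeRequest(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const { url, action, options } = input as Record<string, unknown>;

  // Only http(s) targets make sense for a headless browser.
  if (typeof url !== "string" || !/^https?:\/\/\S+$/i.test(url)) {
    return { ok: false, error: "url must be an http(s) URL" };
  }
  if (typeof action !== "string" || ACTIONS.indexOf(action as Action) === -1) {
    return { ok: false, error: `action must be one of: ${ACTIONS.join(", ")}` };
  }
  if (options !== undefined && (typeof options !== "object" || options === null || Array.isArray(options))) {
    return { ok: false, error: "options must be an object when present" };
  }

  const value: ScrapeRequest = { url, action: action as Action };
  if (options !== undefined) {
    value.options = options as Record<string, unknown>;
  }
  return { ok: true, value };
}
```

A handler built this way could answer with HTTP 400 whenever `ok` is false, before any browser is launched.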

Examples:

Screenshot with custom viewport:

curl -X POST "https://your-app.vercel.app/api/scrape" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "action": "screenshot",
    "options": {
      "fullPage": true,
      "viewport": {"width": 1200, "height": 800}
    }
  }'

Generate PDF:

curl -X POST "https://your-app.vercel.app/api/scrape" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "action": "pdf",
    "options": {
      "format": "A4",
      "landscape": false
    }
  }'

Extract text content:

curl -X POST "https://your-app.vercel.app/api/scrape" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "action": "text",
    "options": {
      "selector": "h1"
    }
  }'
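
For callers working in TypeScript rather than the shell, the same POST can be wrapped in a small client. This is a sketch under assumptions: `scrape`, `buildScrapeInit`, and `FetchLike` are hypothetical names, and the error handling assumes the endpoint reports failures through the HTTP status. Passing the fetch implementation in as an argument keeps the helper easy to test; in Node.js 18+ the global `fetch` can be supplied.

```typescript
// Hypothetical client sketch for POST /api/scrape; not part of the repository.
interface ScrapeBody {
  url: string;
  action: "screenshot" | "pdf" | "text" | "html" | "performance";
  options?: Record<string, unknown>;
}

interface RequestInitLike {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Only the parts of a fetch Response that this sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

type FetchLike = (url: string, init: RequestInitLike) => Promise<ResponseLike>;

// Mirrors the curl flags: -X POST, the JSON Content-Type header, and -d.
function buildScrapeInit(body: ScrapeBody): RequestInitLike {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Returns the raw response bytes (image, PDF, or encoded text); the caller
// decides how to decode or store them.
async function scrape(
  fetchImpl: FetchLike,
  baseUrl: string,
  body: ScrapeBody,
): Promise<Uint8Array> {
  const res = await fetchImpl(`${baseUrl}/api/scrape`, buildScrapeInit(body));
  if (!res.ok) {
    throw new Error(`scrape failed with HTTP ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}
```

A call such as `scrape(fetch, "https://your-app.vercel.app", { url: "https://example.com", action: "pdf", options: { format: "A4" } })` would then resolve to the PDF bytes, assuming the deployment behaves like the curl examples.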

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| BLESS_TOKEN | No* | Browserless.io API token (*required for browserless endpoints) |

Features

  • ✅ Screenshot capture (full page or viewport)
  • ✅ PDF generation from web pages
  • ✅ Text and HTML content extraction
  • ✅ Performance metrics collection
  • ✅ Custom viewport and wait options
  • ✅ RESTful API endpoints
  • ✅ MCP (Model Context Protocol) support
  • ✅ TypeScript support
  • ✅ Interactive demo interface
  • ✅ Multiple browser execution options
  • ✅ Comprehensive error handling
  • ✅ AI assistant integration (Cursor, Claude Desktop)

Limitations

  • Function timeout: Vercel's free tier has a 10-second timeout
  • Function size: @sparticuz/chromium adds ~50MB to the function
  • Memory usage: Chrome requires a significant memory allocation
  • Concurrent executions: Consider rate limiting for production use
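
The last point can be made concrete with a small sliding-window limiter. This is a sketch, not a production recommendation: `SlidingWindowLimiter` is a hypothetical class, and because its state lives in memory it only limits requests handled by a single function instance. On a serverless platform, where instances come and go, a shared store would be needed for a global limit.

```typescript
// Hypothetical in-memory sliding-window rate limiter (per function instance).
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if `key` (e.g. a client IP) is
  // still under its limit at time `now`; returns false otherwise.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.maxRequests;
    if (allowed) {
      recent.push(now);
    }
    this.hits.set(key, recent);
    return allowed;
  }
}

// Example: at most 5 browser launches per client in any 10-second window.
const limiter = new SlidingWindowLimiter(5, 10000);
```

A route handler could call `limiter.allow(clientKey)` before launching Chrome and respond with HTTP 429 when it returns false.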

Troubleshooting

Common Issues

  1. Function timeout errors (FUNCTION_INVOCATION_TIMEOUT):

    • Fixed in latest version: Optimized Chrome args and faster page loading
    • Uses domcontentloaded instead of networkidle0 for faster loading
    • Increased function timeout to 60 seconds in vercel.json
    • Added proper browser cleanup to prevent memory leaks
  2. Memory errors:

    • Close browser instances properly (handled automatically)
    • Avoid concurrent requests to the same function
    • Function memory is set to 1024MB in the configuration
  3. Browserless.io connection errors:

    • Verify BLESS_TOKEN is correctly set
    • Check Browserless.io service status

Performance Optimizations

The latest version includes several Vercel-specific optimizations:

  • Faster Chrome startup with optimized arguments
  • Reduced page load time using domcontentloaded
  • Better error handling with automatic browser cleanup
  • Increased timeouts for complex pages (60s function timeout)
  • Memory optimization with single-process Chrome mode
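
Two of these points, loading with `domcontentloaded` and always closing the browser, follow a pattern that can be sketched independently of Puppeteer. The code below is a minimal illustration, not the project's implementation: `withBrowser`, `BrowserLike`, and `PageLike` are hypothetical names that model only the small part of the Puppeteer API used here (`newPage`, `goto` with a `waitUntil` option, and `close`). The `try`/`finally` block is what guarantees cleanup even when navigation or the task throws.

```typescript
// Minimal structural types covering only the calls used below. A Puppeteer
// Browser/Page should satisfy them, and so does a test double.
interface PageLike {
  goto(
    url: string,
    options: { waitUntil: "domcontentloaded"; timeout: number },
  ): Promise<unknown>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Hypothetical helper: opens a page, runs `task`, and always closes the
// browser, whether `task` resolves or throws.
async function withBrowser<T>(
  launch: () => Promise<BrowserLike>,
  url: string,
  task: (page: PageLike) => Promise<T>,
  timeoutMs: number = 15000,
): Promise<T> {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    // domcontentloaded resolves before images and late requests finish,
    // which is faster than networkidle0 but may capture a page mid-load.
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: timeoutMs });
    return await task(page);
  } finally {
    // Runs on success and on failure, so no Chrome process is left behind.
    await browser.close();
  }
}
```

The trade-off is worth stating: pages that render their content through client-side requests may need an explicit wait (the API's waitFor option) after `domcontentloaded` fires.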

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes and test locally
  4. Submit a pull request

Change Log

  • 2025-06-05: Initial project setup with Next.js 15 and Puppeteer integration
  • 2025-06-05: Added @sparticuz/chromium support for self-contained browser execution
  • 2025-06-05: Implemented Browserless.io integration for cloud browser instances
  • 2025-06-05: Created comprehensive scraping API with multiple action types
  • 2025-06-05: Built interactive demo interface with real-time testing
  • 2025-06-05: Added TypeScript interfaces and comprehensive error handling
  • 2025-06-05: Updated main landing page with feature overview and setup instructions
  • 2025-06-05: Fixed Vercel timeout issues with optimized Chrome args and faster loading
  • 2025-06-05: Increased function timeout to 60s and improved error handling
  • 2025-06-05: Added production URL storage and comprehensive testing suite
  • 2025-06-05: Fixed file download issue - PDFs and images now download with proper extensions (.pdf, .png) instead of generic .file extension
  • 2025-06-05: Added Model Context Protocol (MCP) support with comprehensive tool definitions for AI assistant integration
  • 2025-06-05: Created ChatGPT-compliant MCP endpoint with security restrictions and approved domain whitelist

Testing

Production Testing

Run the comprehensive test suite:

./test-production.sh

Or test individual endpoints:

# Health check
npm run test:prod

# Screenshot test
npm run test:screenshot

# Open demo page
npm run test:demo
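
When scripting checks like these, it helps to verify that a response really is an image or a PDF rather than an error page saved under the wrong extension. The sketch below inspects the leading bytes of a downloaded body; `isPng`, `isPdf`, and `hasSignature` are hypothetical helpers and are not part of the project's test scripts. The eight-byte PNG signature and the `%PDF-` header come from the respective file formats.

```typescript
// Hypothetical helpers for smoke tests: check file signatures ("magic bytes").
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"

function hasSignature(bytes: Uint8Array, signature: number[]): boolean {
  // A body shorter than the signature cannot match it.
  if (bytes.length < signature.length) return false;
  return signature.every((value, index) => bytes[index] === value);
}

function isPng(bytes: Uint8Array): boolean {
  return hasSignature(bytes, PNG_SIGNATURE);
}

function isPdf(bytes: Uint8Array): boolean {
  return hasSignature(bytes, PDF_SIGNATURE);
}
```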

Manual Testing

Live Endpoints: