webcrawlerapi-mcp

WebcrawlerAPI MCP Server integrates WebcrawlerAPI's web scraping capabilities into MCP-compatible applications, enabling seamless data extraction within AI workflows.

WebcrawlerAPI MCP Server

WebcrawlerAPI is a web scraping and website crawling service for extracting data from websites.

This MCP (Model Context Protocol) server exposes WebcrawlerAPI's scraping capabilities to MCP-compatible applications such as Claude Code, so pages can be scraped from inside your AI workflows. Get started with a free API key from the WebcrawlerAPI Dashboard.

Setup

  1. Get your API key from WebcrawlerAPI Dashboard
  2. Set the environment variable:
    export WEBCRAWLER_API_KEY="your-api-key-here"
    

Using with Claude Code

Claude Code can launch this server through npx:

npx webcrawler-mcp

Or add to your MCP settings configuration:

{
  "mcpServers": {
    "webcrawler": {
      "command": "npx",
      "args": ["-y", "webcrawler-mcp"],
      "env": {
        "WEBCRAWLER_API_KEY": "your-api-key-here"
      }
    }
  }
}
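
If you generate the settings file from a script instead of editing it by hand, the entry above can be built programmatically. The following is a minimal sketch; `buildWebcrawlerSettings` and the interface names are hypothetical helpers introduced here for illustration and are not part of the webcrawler-mcp package.

```typescript
// Shape of one server entry in the MCP settings shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpSettings {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper (not part of webcrawler-mcp): builds the settings
// object for a given API key, mirroring the JSON from this README.
function buildWebcrawlerSettings(apiKey: string): McpSettings {
  if (apiKey.trim() === "") {
    throw new Error("WEBCRAWLER_API_KEY must not be empty");
  }
  return {
    mcpServers: {
      webcrawler: {
        command: "npx",
        args: ["-y", "webcrawler-mcp"],
        env: { WEBCRAWLER_API_KEY: apiKey },
      },
    },
  };
}

// Serialize with two-space indentation, ready to write to a settings file.
const settingsJson: string = JSON.stringify(
  buildWebcrawlerSettings("your-api-key-here"),
  null,
  2
);
console.log(settingsJson);
```

Passing the key in as an argument keeps it out of the source, so it can come from the environment or a secrets manager.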

Available Tools

webcrawler-scrape

Scrapes content from any webpage and returns it as markdown.

Parameters:

  • url (required): The URL of the webpage to scrape
  • prompt (optional): A prompt describing the specific information to extract from the page

Example usage:

Use the webcrawler-scrape tool to get the content from https://example.com

With a prompt for targeted extraction:

Use the webcrawler-scrape tool to scrape https://news.ycombinator.com with the prompt "Extract all article titles and their corresponding URLs"

The tool returns the scraped content as markdown, along with the page title and HTTP status code. When using a prompt, the API will focus on extracting the specific information you requested.
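
Behind a natural-language request like the ones above, an MCP client sends the server a JSON-RPC `tools/call` message. The sketch below shows roughly what that message looks like for this tool. The `tools/call` method and the `name`/`arguments` layout come from the MCP specification. The `buildScrapeRequest` helper and the http(s) check are assumptions made for illustration and are not taken from this server's code.

```typescript
// Arguments accepted by webcrawler-scrape, per the parameter list above.
interface ScrapeArguments {
  url: string;
  prompt?: string;
}

// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ScrapeArguments };
}

// Hypothetical helper: assembles the request an MCP client would send.
function buildScrapeRequest(id: number, url: string, prompt?: string): ToolsCallRequest {
  // Assumption: only http(s) pages are meaningful targets for a web scraper.
  if (!/^https?:\/\//i.test(url)) {
    throw new Error(`Expected an http(s) URL, got: ${url}`);
  }
  const args: ScrapeArguments = { url };
  // Leave the optional prompt out entirely when it is missing or blank.
  if (prompt !== undefined && prompt.trim() !== "") {
    args.prompt = prompt;
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "webcrawler-scrape", arguments: args },
  };
}

const exampleRequest = buildScrapeRequest(
  1,
  "https://news.ycombinator.com",
  "Extract all article titles and their corresponding URLs"
);
console.log(JSON.stringify(exampleRequest, null, 2));
```

In normal use the MCP client builds this message for you. Writing it out by hand is mainly useful when debugging the server directly.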

Running as Standalone App

You can also run the server as a standalone application:

Via npm

npm install
npm run build
npm start

Via npx

npx webcrawler-mcp

HTTP Mode

To run in HTTP mode instead of stdio:

USE_HTTP=true npx webcrawler-mcp
# or
npm run start:http

The HTTP server will start on port 8080 by default, with endpoints:

  • Health check: http://localhost:8080/health
  • MCP endpoint: http://localhost:8080/mcp
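
To check from a script that the HTTP server is up, you can derive both endpoint URLs from the port and probe the health check. This is a sketch under two assumptions: that a healthy server answers `/health` with a 2xx status (the README does not document the response format), and that the caller supplies a fetch-compatible function. `serverEndpoints`, `isServerHealthy`, and `FetchLike` are hypothetical names, not part of webcrawler-mcp.

```typescript
interface ServerEndpoints {
  health: string;
  mcp: string;
}

// Builds the two endpoint URLs listed above for a given port and host.
function serverEndpoints(port: number = 8080, host: string = "localhost"): ServerEndpoints {
  const base = `http://${host}:${port}`;
  return { health: `${base}/health`, mcp: `${base}/mcp` };
}

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Assumption: a healthy server replies to /health with a 2xx status.
async function isServerHealthy(fetchFn: FetchLike, port: number = 8080): Promise<boolean> {
  try {
    const response = await fetchFn(serverEndpoints(port).health);
    return response.ok;
  } catch {
    // Network-level failure (for example, connection refused): not reachable.
    return false;
  }
}
```

On Node 18 or later the global `fetch` can be passed in directly, for example `isServerHealthy(fetch, 8080)`. Taking the function as a parameter also makes the check easy to test without a running server.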

Environment Variables

  • WEBCRAWLER_API_KEY: Your WebcrawlerAPI.com API key (required)
  • USE_HTTP: Set to "true" to use HTTP transport instead of stdio
  • PORT: HTTP server port (default: 8080)
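
The three variables above could be turned into a startup configuration as in the sketch below. It illustrates the documented defaults and is not the server's actual parsing code. In particular, it assumes that only the exact string "true" enables HTTP mode and that an out-of-range PORT should be rejected. `resolveServerConfig` and `ServerConfig` are hypothetical names.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  apiKey: string;
  transport: "stdio" | "http";
  port: number;
}

// Hypothetical helper: applies the documented defaults to an environment map.
function resolveServerConfig(env: Env): ServerConfig {
  const apiKey = env["WEBCRAWLER_API_KEY"];
  if (apiKey === undefined || apiKey.trim() === "") {
    throw new Error("WEBCRAWLER_API_KEY is required");
  }

  // Assumption: only the exact string "true" switches to HTTP transport.
  const transport = env["USE_HTTP"] === "true" ? "http" : "stdio";

  // Fall back to the documented default port when PORT is unset or empty.
  const rawPort = env["PORT"];
  const port = rawPort === undefined || rawPort === "" ? 8080 : Number.parseInt(rawPort, 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }

  return { apiKey, transport, port };
}
```

In a Node process this would be called as `resolveServerConfig(process.env)`. Failing at startup on a missing key gives a clearer error than a rejected API call later.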

Development

npm install
npm run dev  # Watch mode
npm run build