WebCrawlerAPI/webcrawlerapi-mcp
WebcrawlerAPI MCP Server integrates WebcrawlerAPI's web scraping capabilities into MCP-compatible applications, enabling seamless data extraction within AI workflows.
WebcrawlerAPI MCP Server
WebcrawlerAPI is a web scraping and website crawling service for extracting data from websites.
This MCP (Model Context Protocol) server exposes WebcrawlerAPI's scraping capabilities to MCP-compatible applications such as Claude Code, so you can scrape pages from within your AI workflows. Get a free API key from the WebcrawlerAPI Dashboard.
Setup
- Get your API key from the WebcrawlerAPI Dashboard
- Set the environment variable:
export WEBCRAWLER_API_KEY="your-api-key-here"
Using with Claude Code
Add this server to your Claude Code configuration using npx:
npx webcrawler-mcp
Or add to your MCP settings configuration:
{
"mcpServers": {
"webcrawler": {
"command": "npx",
"args": ["-y", "webcrawler-mcp"],
"env": {
"WEBCRAWLER_API_KEY": "your-api-key-here"
}
}
}
}
Available Tools
webcrawler-scrape
Scrapes content from any webpage and returns it as markdown.
Parameters:
- url (required): The URL of the webpage to scrape
- prompt (optional): Optional prompt to extract specific information from the page
Example usage:
Use the webcrawler-scrape tool to get the content from https://example.com
With a prompt for targeted extraction:
Use the webcrawler-scrape tool to scrape https://news.ycombinator.com with the prompt "Extract all article titles and their corresponding URLs"
The tool returns the scraped content as markdown, along with the page title and HTTP status code. When you supply a prompt, the API focuses on extracting the specific information you requested.
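For clients that talk to the server directly rather than through natural-language instructions, a tool invocation is a JSON-RPC 2.0 `tools/call` message, as defined by the MCP specification. The following is a minimal sketch of how such a message could be built for `webcrawler-scrape`. The tool name and the `url` and `prompt` arguments come from this README; the helper `buildScrapeCall` and the interface names are hypothetical and exist only for illustration.

```typescript
// Sketch: build the JSON-RPC 2.0 message an MCP client sends to call a tool.
// buildScrapeCall, ScrapeArguments and ToolCallRequest are hypothetical names.

interface ScrapeArguments {
  url: string;     // required
  prompt?: string; // optional
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ScrapeArguments };
}

function buildScrapeCall(id: number, url: string, prompt?: string): ToolCallRequest {
  // Validate early: the URL constructor throws a TypeError on malformed input.
  new URL(url);

  const args: ScrapeArguments = { url };
  // Only include the prompt when the caller supplied a non-blank one.
  if (prompt !== undefined && prompt.trim() !== "") {
    args.prompt = prompt;
  }

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "webcrawler-scrape", arguments: args },
  };
}

const exampleCall = buildScrapeCall(
  1,
  "https://news.ycombinator.com",
  "Extract all article titles and their corresponding URLs",
);
console.log(JSON.stringify(exampleCall, null, 2));
```

Most users will not need to construct this message by hand, since an MCP client such as Claude Code does it when you ask it to use the tool.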
Running as Standalone App
You can also run the server as a standalone application:
Via npm
npm install
npm run build
npm start
Via npx
npx webcrawler-mcp
HTTP Mode
To run in HTTP mode instead of stdio:
USE_HTTP=true npx webcrawler-mcp
# or
npm run start:http
The HTTP server will start on port 8080 by default, with endpoints:
- Health check: http://localhost:8080/health
- MCP endpoint: http://localhost:8080/mcp
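To confirm that the HTTP server is up before pointing a client at it, you can probe the health check endpoint. The sketch below assumes Node 18 or later (for the global `fetch`) and a server running on localhost. The helpers `endpointsFor` and `isServerUp` are hypothetical names. The body of the health response is not documented here, so the sketch looks only at the HTTP status.

```typescript
// Sketch: derive the two endpoint URLs and probe the health check.
// endpointsFor and isServerUp are hypothetical helper names.

interface Endpoints {
  health: string;
  mcp: string;
}

function endpointsFor(port: number = 8080, host: string = "localhost"): Endpoints {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  const base = `http://${host}:${port}`;
  return { health: `${base}/health`, mcp: `${base}/mcp` };
}

async function isServerUp(port: number = 8080): Promise<boolean> {
  const { health } = endpointsFor(port);
  try {
    const response = await fetch(health);
    return response.ok; // true for any 2xx status
  } catch {
    return false; // connection refused, DNS failure, and similar
  }
}

isServerUp().then((up) => {
  console.log(up ? "server is reachable" : "server is not reachable");
});
```

If you start the server on a different port, pass the same value to `isServerUp`.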
Environment Variables
- WEBCRAWLER_API_KEY: Your WebcrawlerAPI.com API key (required)
- USE_HTTP: Set to "true" to use HTTP transport instead of stdio
- PORT: HTTP server port (default: 8080)
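The variables above can be read into a single typed object, which makes the defaults explicit. This is a minimal sketch of one way to do that, not the server's actual implementation: `resolveConfig`, `ServerConfig`, and `Env` are hypothetical names, and the real server may treat edge cases (for example `USE_HTTP=TRUE` or a non-numeric `PORT`) differently.

```typescript
// Sketch: resolve the three documented variables into a typed config object.
// resolveConfig, ServerConfig and Env are hypothetical names.

interface ServerConfig {
  apiKey: string;
  useHttp: boolean;
  port: number;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): ServerConfig {
  const apiKey = env["WEBCRAWLER_API_KEY"];
  if (!apiKey) {
    throw new Error("WEBCRAWLER_API_KEY is required");
  }

  // PORT falls back to the documented default of 8080 when unset.
  const rawPort = env["PORT"];
  const port = rawPort === undefined ? 8080 : Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${rawPort}"`);
  }

  // Assumption: only the exact string "true" enables HTTP transport.
  return { apiKey, useHttp: env["USE_HTTP"] === "true", port };
}

const config = resolveConfig({
  WEBCRAWLER_API_KEY: "your-api-key-here",
  USE_HTTP: "true",
});
console.log(config);
```

In a Node program you would typically pass `process.env` instead of the literal object used here.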
Development
npm install
npm run dev # Watch mode
npm run build