davinoishi/mcp-broken-link-checker
Broken Link Checker MCP Server
A Model Context Protocol (MCP) server that provides broken link checking capabilities with streamable HTTP transport. This server enables AI assistants like Claude to check URLs, scan web pages, and crawl entire websites for broken links.
Features
- Four powerful tools for link checking:
  - check_url - Verify a single URL
  - check_html - Check all links in HTML content
  - check_page - Fetch and check all links on a webpage
  - check_site - Recursively crawl and check an entire website
- Streamable HTTP transport - Serverless-ready, no persistent connections required
- JSON-RPC 2.0 - Standard MCP protocol support
- Comprehensive results - Status codes, redirects, response times, and detailed error messages
- Configurable options - Timeouts, redirect handling, custom user agents
Installation
npm install
npm run build
Usage
Running the Server
Development mode:
npm run dev
Production mode:
npm run build
npm start
The server will start on port 3000 by default (configurable via the PORT environment variable).
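The port selection described above could look roughly like the following. This is a minimal sketch, not the server's actual startup code: `resolvePort` and `DEFAULT_PORT` are hypothetical names, and the fallback to 3000 for a missing or invalid value is an assumption.

```typescript
// Hypothetical helper: choose the listening port from an environment map.
// Assumption: a missing or invalid PORT falls back to the documented default of 3000.
const DEFAULT_PORT = 3000;

function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env["PORT"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_PORT;
  }
  const port = Number(raw);
  // Accept only whole numbers in the valid TCP port range.
  const valid = Number.isInteger(port) && port > 0 && port <= 65535;
  return valid ? port : DEFAULT_PORT;
}
```

In a Node.js process this would typically be called as `resolvePort(process.env)`.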
Endpoints
- Health check: GET /health
- MCP endpoint: POST /mcp
Using with Claude Code
Add to your MCP settings:
claude mcp add --transport http broken-link-checker http://localhost:3000/mcp
Or manually add to your MCP configuration:
{
"mcpServers": {
"broken-link-checker": {
"transport": "http",
"url": "http://localhost:3000/mcp"
}
}
}
Tools
1. check_url
Check if a single URL is working or broken.
Parameters:
- url (string, required) - The URL to check
- timeout (number, optional) - Request timeout in milliseconds (default: 30000)
- followRedirects (boolean, optional) - Whether to follow redirects (default: true)
- maxRedirects (number, optional) - Maximum number of redirects (default: 5)
- userAgent (string, optional) - Custom User-Agent string
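The parameter list above can be read as an options object with defaults. The sketch below shows one way to model that in TypeScript; the interface and function names are hypothetical, and the server itself may define these through Zod schemas instead. No default is documented for `userAgent`, so it is left undefined here.

```typescript
// Optional parameters of check_url as documented above (names taken from the parameter list).
interface CheckUrlOptions {
  timeout?: number;
  followRedirects?: boolean;
  maxRedirects?: number;
  userAgent?: string;
}

// Same options after the documented defaults have been applied.
interface ResolvedCheckUrlOptions {
  timeout: number;
  followRedirects: boolean;
  maxRedirects: number;
  userAgent: string | undefined; // no documented default
}

// Hypothetical helper: fill in the documented defaults for any omitted option.
function withDefaults(options: CheckUrlOptions = {}): ResolvedCheckUrlOptions {
  return {
    timeout: options.timeout ?? 30000,
    followRedirects: options.followRedirects ?? true,
    maxRedirects: options.maxRedirects ?? 5,
    userAgent: options.userAgent,
  };
}
```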
Example:
{
"tool": "check_url",
"arguments": {
"url": "https://example.com",
"timeout": 10000
}
}
Response:
{
"url": "https://example.com",
"status": 200,
"statusText": "OK",
"broken": false,
"responseTime": 123
}
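A client that consumes this response may want to verify its shape before relying on it. The following is a minimal sketch of a hand-written type guard. `CheckUrlResult` and `isCheckUrlResult` are hypothetical names, and the guard assumes all five fields shown above are always present, which may not hold for error responses.

```typescript
// Shape of the check_url response as shown in the example above.
interface CheckUrlResult {
  url: string;
  status: number;
  statusText: string;
  broken: boolean;
  responseTime: number;
}

// Hypothetical guard: true only when every field from the example is present with the expected type.
function isCheckUrlResult(value: unknown): value is CheckUrlResult {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v["url"] === "string" &&
    typeof v["status"] === "number" &&
    typeof v["statusText"] === "string" &&
    typeof v["broken"] === "boolean" &&
    typeof v["responseTime"] === "number"
  );
}
```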
2. check_html
Check all links within HTML content.
Parameters:
- html (string, required) - HTML content to check
- baseUrl (string, required) - Base URL for resolving relative links
- All options from check_url
Example:
{
"tool": "check_html",
"arguments": {
"html": "<a href='/about'>About</a>",
"baseUrl": "https://example.com"
}
}
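The `baseUrl` parameter exists because a relative link such as `/about` cannot be checked until it is resolved to an absolute URL. The sketch below illustrates that step with the standard `URL` constructor; `resolveLink` is a hypothetical helper, and the server's own resolution logic may differ.

```typescript
// Hypothetical helper: resolve an href against a base URL.
// Returns null when the combination cannot be parsed as a URL.
function resolveLink(href: string, baseUrl: string): string | null {
  try {
    return new URL(href, baseUrl).href;
  } catch {
    return null;
  }
}

// resolveLink("/about", "https://example.com") returns "https://example.com/about"
```

Absolute links pass through unchanged apart from normalization, so the same helper can be applied to every link found in the HTML.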
Response:
{
"totalLinks": 5,
"brokenLinks": 1,
"workingLinks": 4,
"results": [...]
}
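The three counters in this response are related: `workingLinks` is `totalLinks` minus `brokenLinks`. A minimal sketch of how such a summary could be derived from per-link results follows. The contents of `results` are elided in the source, so the `LinkResult` shape here is an assumption modeled on the `check_url` response, and `summarize` is a hypothetical name.

```typescript
// Assumed per-link result shape (the source elides the contents of "results").
interface LinkResult {
  url: string;
  broken: boolean;
}

interface LinkSummary {
  totalLinks: number;
  brokenLinks: number;
  workingLinks: number;
}

// Hypothetical helper: count broken and working links in a result list.
function summarize(results: LinkResult[]): LinkSummary {
  const brokenLinks = results.filter((r) => r.broken).length;
  return {
    totalLinks: results.length,
    brokenLinks,
    workingLinks: results.length - brokenLinks,
  };
}
```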
3. check_page
Fetch a webpage and check all links on it.
Parameters:
- url (string, required) - The URL of the page to check
- All options from check_url
Example:
{
"tool": "check_page",
"arguments": {
"url": "https://example.com"
}
}
Response:
{
"page": {
"url": "https://example.com",
"status": 200,
"broken": false
},
"totalLinks": 25,
"brokenLinks": 2,
"workingLinks": 23,
"links": [...]
}
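Checking a page requires extracting its links from the fetched HTML first. The server lists parse5 for HTML parsing; to stay self-contained, the sketch below swaps in a simple regular expression instead, which handles only quoted `href` attributes on `<a>` tags and will miss edge cases a real parser covers. `extractHrefs` is a hypothetical name.

```typescript
// Hypothetical helper: collect quoted href values from <a> tags.
// This is a regex approximation, not the parse5-based approach the project lists.
function extractHrefs(html: string): string[] {
  const pattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  const hrefs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const href = match[2];
    if (href !== undefined) {
      hrefs.push(href);
    }
  }
  return hrefs;
}
```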
4. check_site
Recursively crawl and check all links on a website.
Parameters:
- url (string, required) - The starting URL of the site
- maxPages (number, optional) - Maximum pages to check (default: 50)
- All options from check_url
Example:
{
"tool": "check_site",
"arguments": {
"url": "https://example.com",
"maxPages": 10
}
}
Response:
{
"totalPages": 10,
"totalLinks": 150,
"brokenLinks": 5,
"workingLinks": 145,
"pages": [...]
}
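A recursive crawl bounded by `maxPages` can be sketched as a breadth-first traversal. The code below is an illustration under stated assumptions, not the server's implementation: it assumes the crawl stays on the starting URL's origin, it ignores URL fragments, and link discovery is injected as a synchronous callback (`linksOf`, a hypothetical parameter) so the example runs without network access. A real crawler would fetch pages asynchronously.

```typescript
// Hypothetical sketch: breadth-first crawl order, limited to maxPages pages on one origin.
function planCrawl(
  startUrl: string,
  maxPages: number,
  linksOf: (pageUrl: string) => string[],
): string[] {
  const start = new URL(startUrl);
  const origin = start.origin;
  const visited = new Set<string>();
  const queue: string[] = [start.href];
  const pages: string[] = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const current = queue.shift();
    if (current === undefined || visited.has(current)) {
      continue;
    }
    visited.add(current);
    pages.push(current);

    for (const link of linksOf(current)) {
      let absolute: URL;
      try {
        absolute = new URL(link, current); // resolve relative links against the current page
      } catch {
        continue; // skip hrefs that cannot be parsed
      }
      absolute.hash = ""; // treat #fragment variants as the same page
      // Assumption: only same-origin pages are crawled further.
      if (absolute.origin === origin && !visited.has(absolute.href)) {
        queue.push(absolute.href);
      }
    }
  }
  return pages;
}
```

The `visited` set prevents loops between pages that link to each other, and the `maxPages` check stops the traversal even when the queue still holds undiscovered pages.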
Conversational Usage Examples
Once connected to Claude Code, you can use natural language:
"Check if https://example.com is working"
"Scan https://mysite.com and find all broken links"
"Crawl https://blog.example.com and report any 404 errors"
"Check all the links on this page: https://docs.example.com/api"
API Examples
Direct HTTP Request
curl -X POST http://localhost:3000/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "check_url",
"arguments": {
"url": "https://example.com"
}
}
}'
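The same request body can be built programmatically. The sketch below constructs the JSON-RPC 2.0 envelope used in the curl example; `buildToolCall` and `JsonRpcRequest` are hypothetical names introduced for illustration.

```typescript
// JSON-RPC 2.0 request envelope, matching the body of the curl example above.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Usage (not executed here; assumes a server on localhost:3000 and a runtime with global fetch):
//   await fetch("http://localhost:3000/mcp", {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(buildToolCall(1, "check_url", { url: "https://example.com" })),
//   });
```

Some Streamable HTTP servers also expect an `Accept` header listing `application/json` and `text/event-stream`; whether this server requires it is not stated here.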
List Available Tools
curl -X POST http://localhost:3000/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/list"
}'
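A `tools/list` response in MCP carries the tool definitions under `result.tools`. The sketch below extracts just the tool names from such a response. `toolNames` is a hypothetical helper, and it assumes the response has already been parsed from JSON; the exact descriptions and schemas this server returns are not shown in this document.

```typescript
// Hypothetical helper: pull tool names out of a parsed tools/list response.
// Returns an empty list when the expected result.tools array is missing.
function toolNames(response: unknown): string[] {
  if (typeof response !== "object" || response === null) {
    return [];
  }
  const result = (response as Record<string, unknown>)["result"];
  if (typeof result !== "object" || result === null) {
    return [];
  }
  const tools = (result as Record<string, unknown>)["tools"];
  if (!Array.isArray(tools)) {
    return [];
  }
  const names: string[] = [];
  for (const tool of tools) {
    if (typeof tool === "object" && tool !== null) {
      const name = (tool as Record<string, unknown>)["name"];
      if (typeof name === "string") {
        names.push(name);
      }
    }
  }
  return names;
}
```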
Deployment
Docker
Create a Dockerfile:
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
Build and run:
npm run build
docker build -t broken-link-checker-mcp .
docker run -p 3000:3000 broken-link-checker-mcp
Cloud Deployment
This server uses streamable HTTP transport and is serverless-ready. Deploy to:
- Google Cloud Run - Scales to zero when idle
- AWS Elastic Beanstalk - Managed Node.js hosting
- Azure App Service - Integrated deployment
- Render - One-click deployment
- Railway - Git-based deployment
Architecture
This implementation uses:
- Express.js - HTTP server
- MCP SDK - Model Context Protocol implementation
- got - HTTP client for link checking
- parse5 - HTML parsing
- TypeScript - Type-safe implementation
- Zod - Runtime schema validation
Comparison with Original broken-link-checker
This MCP server provides:
- ✅ MCP protocol support for AI integration
- ✅ HTTP transport (stateless, serverless-ready)
- ✅ Simplified, focused API
- ✅ Modern TypeScript implementation
Original library offers:
- Advanced HTML parsing options
- Robot exclusion protocol support
- CLI interface
- EventEmitter-based streaming
- More granular configuration
License
MIT
Contributing
Contributions welcome! This is a reference implementation that can be extended with:
- Additional link checking options
- Better error handling
- Caching layer
- Rate limiting
- Authentication support
- Webhook notifications
Credits
Inspired by broken-link-checker by Steven Vachon.