mcp-broken-link-checker

davinoishi/mcp-broken-link-checker



Broken Link Checker MCP Server

A Model Context Protocol (MCP) server that provides broken link checking capabilities with streamable HTTP transport. This server enables AI assistants like Claude to check URLs, scan web pages, and crawl entire websites for broken links.

Features

  • Four powerful tools for link checking:
    • check_url - Verify a single URL
    • check_html - Check all links in HTML content
    • check_page - Fetch and check all links on a webpage
    • check_site - Recursively crawl and check an entire website
  • Streamable HTTP transport - Serverless-ready, no persistent connections required
  • JSON-RPC 2.0 - Standard MCP protocol support
  • Comprehensive results - Status codes, redirects, response times, and detailed error messages
  • Configurable options - Timeouts, redirect handling, custom user agents

Installation

npm install
npm run build

Usage

Running the Server

Development mode:

npm run dev

Production mode:

npm run build
npm start

The server listens on port 3000 by default. Set the PORT environment variable to use a different port.

Endpoints

  • Health check: GET /health
  • MCP endpoint: POST /mcp
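
A client can probe the health endpoint before sending MCP requests. The following is a minimal sketch, assuming Node 18+ (for the global fetch) and that any 2xx reply from /health means the server is up. The helper names serverBaseUrl and isHealthy are hypothetical, and since the body returned by /health is not documented here, the sketch ignores it.

```typescript
// Hypothetical helper: build the server's base URL from a PORT value,
// falling back to 3000, the default stated above.
function serverBaseUrl(port?: string, host: string = "localhost"): string {
  const parsed = Number(port);
  const effective = Number.isInteger(parsed) && parsed > 0 ? parsed : 3000;
  return `http://${host}:${effective}`;
}

// Hypothetical helper: treat any 2xx response from GET /health as "up".
async function isHealthy(base: string): Promise<boolean> {
  try {
    const res = await fetch(`${base}/health`);
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar errors
  }
}
```

For a local server, calling isHealthy(serverBaseUrl(process.env.PORT)) would resolve to true once the server accepts requests.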

Using with Claude Code

Register the server with the Claude Code CLI:

claude mcp add --transport http broken-link-checker http://localhost:3000/mcp

Or add it manually to your MCP configuration:

{
  "mcpServers": {
    "broken-link-checker": {
      "transport": "http",
      "url": "http://localhost:3000/mcp"
    }
  }
}

Tools

1. check_url

Check if a single URL is working or broken.

Parameters:

  • url (string, required) - The URL to check
  • timeout (number, optional) - Request timeout in milliseconds (default: 30000)
  • followRedirects (boolean, optional) - Whether to follow redirects (default: true)
  • maxRedirects (number, optional) - Maximum number of redirects (default: 5)
  • userAgent (string, optional) - Custom User-Agent string

Example:

{
  "tool": "check_url",
  "arguments": {
    "url": "https://example.com",
    "timeout": 10000
  }
}

Response:

{
  "url": "https://example.com",
  "status": 200,
  "statusText": "OK",
  "broken": false,
  "responseTime": 123
}
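
When consuming this response from TypeScript, a type guard can validate the parsed JSON before use. The sketch below covers only the fields shown in the example response. The names CheckUrlResult and isCheckUrlResult are hypothetical, and a failed request may return a different set of fields, in which case the guard would need loosening.

```typescript
// Result shape taken from the example response; other fields (for instance
// redirect or error details) may exist but are not shown in the example.
interface CheckUrlResult {
  url: string;
  status: number;
  statusText: string;
  broken: boolean;
  responseTime: number; // unit is not stated in the example
}

// Hypothetical type guard for untrusted JSON.
function isCheckUrlResult(value: unknown): value is CheckUrlResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["url"] === "string" &&
    typeof v["status"] === "number" &&
    typeof v["statusText"] === "string" &&
    typeof v["broken"] === "boolean" &&
    typeof v["responseTime"] === "number"
  );
}
```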

2. check_html

Check all links within HTML content.

Parameters:

  • html (string, required) - HTML content to check
  • baseUrl (string, required) - Base URL for resolving relative links
  • All options from check_url

Example:

{
  "tool": "check_html",
  "arguments": {
    "html": "<a href='/about'>About</a>",
    "baseUrl": "https://example.com"
  }
}

Response:

{
  "totalLinks": 5,
  "brokenLinks": 1,
  "workingLinks": 4,
  "results": [...]
}
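
The baseUrl parameter matters because relative links such as /about cannot be requested on their own. The sketch below shows one way to resolve hrefs against a base with the standard URL class. It illustrates the idea and is not the server's actual code: the function name resolveLinks is hypothetical, and skipping non-HTTP schemes such as mailto: is an assumption.

```typescript
// Resolve hrefs against a base URL, keeping only http(s) targets.
function resolveLinks(hrefs: string[], baseUrl: string): string[] {
  const resolved: string[] = [];
  for (const href of hrefs) {
    try {
      const target = new URL(href, baseUrl);
      // Assumption: only http(s) links can be checked over HTTP.
      if (target.protocol === "http:" || target.protocol === "https:") {
        resolved.push(target.href);
      }
    } catch {
      // Unparsable href: skipped in this sketch.
    }
  }
  return resolved;
}
```

With the base https://example.com/guide/, the href /about resolves to https://example.com/about, while docs/intro resolves to https://example.com/guide/docs/intro.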

3. check_page

Fetch a webpage and check all links on it.

Parameters:

  • url (string, required) - The URL of the page to check
  • All options from check_url

Example:

{
  "tool": "check_page",
  "arguments": {
    "url": "https://example.com"
  }
}

Response:

{
  "page": {
    "url": "https://example.com",
    "status": 200,
    "broken": false
  },
  "totalLinks": 25,
  "brokenLinks": 2,
  "workingLinks": 23,
  "links": [...]
}
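
The totalLinks, brokenLinks, and workingLinks counters appear in the check_html, check_page, and check_site examples, so one small helper can report on any of them. A minimal sketch follows. The names LinkCounts and summarize are hypothetical.

```typescript
// Counters that appear in the example responses.
interface LinkCounts {
  totalLinks: number;
  brokenLinks: number;
  workingLinks: number;
}

// Hypothetical helper: one-line summary with the broken-link percentage.
function summarize(counts: LinkCounts): string {
  const pct =
    counts.totalLinks === 0 ? 0 : (counts.brokenLinks * 100) / counts.totalLinks;
  return `${counts.brokenLinks} of ${counts.totalLinks} links broken (${pct.toFixed(1)}%)`;
}
```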

4. check_site

Recursively crawl and check all links on a website.

Parameters:

  • url (string, required) - The starting URL of the site
  • maxPages (number, optional) - Maximum pages to check (default: 50)
  • All options from check_url

Example:

{
  "tool": "check_site",
  "arguments": {
    "url": "https://example.com",
    "maxPages": 10
  }
}

Response:

{
  "totalPages": 10,
  "totalLinks": 150,
  "brokenLinks": 5,
  "workingLinks": 145,
  "pages": [...]
}
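
To show how a page limit such as maxPages bounds a crawl, here is a sketch of a breadth-first traversal. It is an illustration under stated assumptions, not the server's implementation: staying on the start URL's origin, ignoring URL fragments, and the names crawlSite, LinkSource, and tryResolve are all assumptions. Link extraction is passed in as a callback so the traversal logic stands alone.

```typescript
// Hypothetical callback: returns the hrefs found on a page.
type LinkSource = (pageUrl: string) => Promise<string[]>;

// Resolve an href against its page, or return null if it cannot be parsed.
function tryResolve(href: string, base: string): URL | null {
  try {
    return new URL(href, base);
  } catch {
    return null;
  }
}

// Visit pages breadth-first, stopping after maxPages pages.
async function crawlSite(
  startUrl: string,
  maxPages: number,
  getLinks: LinkSource
): Promise<string[]> {
  const start = new URL(startUrl);
  start.hash = "";
  const queue: string[] = [start.href];
  const seen = new Set<string>(queue);
  const visited: string[] = [];

  while (queue.length > 0 && visited.length < maxPages) {
    const page = queue.shift() as string;
    visited.push(page);
    for (const href of await getLinks(page)) {
      const target = tryResolve(href, page);
      if (target === null) continue;
      target.hash = ""; // treat /a and /a#top as the same page
      // Assumption: only same-origin pages are queued for crawling.
      if (target.origin !== start.origin || seen.has(target.href)) continue;
      seen.add(target.href);
      queue.push(target.href);
    }
  }
  return visited;
}
```

The seen set prevents a page from being queued twice, and the limit is checked before each visit, so the crawl never processes more than maxPages pages even on a site with cyclic links.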

Conversational Usage Examples

Once the server is connected to Claude Code, you can ask in natural language:

"Check if https://example.com is working"
"Scan https://mysite.com and find all broken links"
"Crawl https://blog.example.com and report any 404 errors"
"Check all the links on this page: https://docs.example.com/api"

API Examples

Direct HTTP Request

curl -X POST http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "check_url",
      "arguments": {
        "url": "https://example.com"
      }
    }
  }'

List Available Tools

curl -X POST http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
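
The same two requests can be sent from TypeScript. The sketch below assumes Node 18+ for the global fetch. The names JsonRpcRequest, toolsList, toolsCall, and postMcp are hypothetical helpers, not part of this server or the MCP SDK. The MCP streamable HTTP specification asks clients to send an Accept header that lists both application/json and text/event-stream. Whether this server enforces that header is not stated here, so the sketch sends it as a precaution.

```typescript
// JSON-RPC 2.0 request envelope, as used in the curl examples.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a tools/list request.
function toolsList(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Build a tools/call request for the named tool.
function toolsCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// POST a request to the MCP endpoint and return the raw response body.
async function postMcp(endpoint: string, request: JsonRpcRequest): Promise<string> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Sent as a precaution; see the note on the Accept header above.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(request),
  });
  if (!res.ok) throw new Error(`MCP request failed: HTTP ${res.status}`);
  return res.text(); // may be plain JSON or an SSE stream, depending on the server
}
```

For example, postMcp("http://localhost:3000/mcp", toolsCall(1, "check_url", { url: "https://example.com" })) sends the same body as the first curl command.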

Deployment

Docker

Create a Dockerfile:

FROM node:22-alpine

WORKDIR /app

COPY package*.json ./
# Install runtime dependencies only (skip devDependencies)
RUN npm ci --omit=dev

COPY dist ./dist

EXPOSE 3000

CMD ["node", "dist/index.js"]

Build and run:

npm run build
docker build -t broken-link-checker-mcp .
docker run -p 3000:3000 broken-link-checker-mcp

Cloud Deployment

This server uses streamable HTTP transport and holds no persistent connections, so it runs on serverless and managed platforms such as:

  • Google Cloud Run - Scales to zero when idle
  • AWS Elastic Beanstalk - Managed Node.js hosting
  • Azure App Service - Integrated deployment
  • Render - One-click deployment
  • Railway - Git-based deployment

Architecture

This implementation uses:

  • Express.js - HTTP server
  • MCP SDK - Model Context Protocol implementation
  • got - HTTP client for link checking
  • parse5 - HTML parsing
  • TypeScript - Type-safe implementation
  • Zod - Runtime schema validation

Comparison with Original broken-link-checker

This MCP server provides:

  • ✅ MCP protocol support for AI integration
  • ✅ HTTP transport (stateless, serverless-ready)
  • ✅ Simplified, focused API
  • ✅ Modern TypeScript implementation

The original library offers:

  • Advanced HTML parsing options
  • Robots Exclusion Protocol (robots.txt) support
  • Command-line interface
  • EventEmitter-based streaming
  • More granular configuration

License

MIT

Contributing

Contributions are welcome. This is a reference implementation that can be extended with:

  • Additional link checking options
  • Better error handling
  • Caching layer
  • Rate limiting
  • Authentication support
  • Webhook notifications

Credits

Inspired by broken-link-checker by Steven Vachon.