web-research-mcp-server

CamC9/web-research-mcp-server

The Web Research MCP Server is a comprehensive tool designed to enhance AI assistants' internet research capabilities by enabling web searches, website scraping, and URL content fetching.

Web Research MCP Server

A comprehensive Model Context Protocol (MCP) server that provides internet research capabilities for AI assistants. This server enables AI agents to search the web, scrape websites, and fetch content from URLs.

Features

🔍 Web Search

  • Search the web using DuckDuckGo
  • Customizable result limits (1-20 results)
  • Returns structured results with titles, URLs, and snippets
  • Fallback search mechanisms for better reliability

🌐 Website Scraping

  • Extract content from any website
  • Smart content extraction using multiple selectors
  • Optional link and image extraction
  • Converts relative URLs to absolute URLs
  • Content length limits for optimal performance

📡 URL Fetching

  • Fetch raw content from any URL
  • Support for text, JSON, and HTML formats
  • Useful for APIs, RSS feeds, and structured data
  • Returns status codes and headers

Installation

Prerequisites

  • Node.js 16 or higher
  • npm or yarn

Setup

  1. Clone or download this repository
  2. Install dependencies:
    npm install
    
  3. Build the server:
    npm run build
    

Configuration

Add the server to your MCP client configuration:

For Cline/Claude Dev

Add to your MCP settings file:

{
  "mcpServers": {
    "web-research-server": {
      "disabled": false,
      "autoApprove": [],
      "command": "node",
      "args": ["/path/to/web-research-server/build/index.js"]
    }
  }
}

For Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "web-research-server": {
      "command": "node",
      "args": ["/path/to/web-research-server/build/index.js"]
    }
  }
}
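
Both sample configurations point args at the built build/index.js. As a minimal sketch, assuming an absolute path is the safer choice because the client may start the server from a different working directory, the entry can be generated from a local checkout. The buildServerEntry helper is hypothetical and is not part of this repository.

```typescript
import * as path from "node:path";

interface McpServerEntry {
  command: string;
  args: string[];
}

// Hypothetical helper: builds the config entry for a checkout at `repoDir`.
function buildServerEntry(repoDir: string): McpServerEntry {
  return {
    command: "node",
    // path.resolve turns a relative checkout directory into an absolute path.
    args: [path.resolve(repoDir, "build", "index.js")],
  };
}

const config = {
  mcpServers: {
    "web-research-server": buildServerEntry("."),
  },
};

// Paste the printed JSON into the MCP settings file of your client.
console.log(JSON.stringify(config, null, 2));
```

Run it from the repository root after npm run build so the resolved path matches the compiled entry point.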

Usage

Once configured, you can use the following tools:

Web Search

Search for "latest AI developments 2024" with 5 results

Website Scraping

Scrape the content from https://example.com and extract all links

URL Fetching

Fetch the JSON data from https://api.example.com/data

Tools Reference

web_search

Search the web using DuckDuckGo.

Parameters:

  • query (string, required): Search query
  • limit (number, optional): Maximum results (1-20, default: 10)
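
MCP clients invoke tools with a JSON-RPC tools/call request. The sketch below shows what such a request for web_search could look like. The buildWebSearchRequest and normalizeLimit helpers are hypothetical, and clamping an out-of-range limit is an assumption: the server may reject such values instead.

```typescript
interface WebSearchArgs {
  query: string;
  limit?: number;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Assumption: clamp into the documented 1-20 range and default to 10.
function normalizeLimit(limit?: number): number {
  if (limit === undefined || !Number.isFinite(limit)) return 10;
  return Math.min(20, Math.max(1, Math.trunc(limit)));
}

// Hypothetical helper: builds the JSON-RPC message an MCP client would send.
function buildWebSearchRequest(id: number, args: WebSearchArgs): ToolCallRequest {
  if (args.query.trim() === "") {
    throw new Error("query is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "web_search",
      arguments: { query: args.query, limit: normalizeLimit(args.limit) },
    },
  };
}

const request = buildWebSearchRequest(1, {
  query: "latest AI developments 2024",
  limit: 5,
});
console.log(JSON.stringify(request, null, 2));
```

In practice the MCP client builds this message for you. The sketch is only meant to show how the documented parameters map onto the request.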

scrape_website

Extract content from a website.

Parameters:

  • url (string, required): Website URL to scrape
  • extract_links (boolean, optional): Extract all links (default: false)
  • extract_images (boolean, optional): Extract all images (default: false)
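
When extract_links or extract_images is enabled, relative references found in the page have to be resolved against the page URL. A minimal sketch of that conversion using the standard URL constructor follows. The toAbsoluteUrls helper is hypothetical and is not taken from the server's source.

```typescript
// Resolve each href against the page URL; skip values that cannot be parsed.
function toAbsoluteUrls(pageUrl: string, hrefs: string[]): string[] {
  const resolved: string[] = [];
  for (const href of hrefs) {
    try {
      resolved.push(new URL(href, pageUrl).href);
    } catch (_err) {
      // Ignore hrefs that are not valid URLs even relative to the base.
    }
  }
  return resolved;
}

const links = toAbsoluteUrls("https://example.com/docs/intro.html", [
  "/about", // root-relative
  "guide.html", // relative to the current directory
  "../images/logo.png", // relative to the parent directory
  "https://other.example.org/page", // already absolute, kept as-is
]);
console.log(links);
```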

fetch_url

Fetch raw content from a URL.

Parameters:

  • url (string, required): URL to fetch
  • format (string, optional): Expected format - "text", "json", or "html" (default: "text")
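
The format parameter decides how the response body is interpreted. One plausible way to handle it is sketched below, assuming that "json" parses the body while "text" and "html" return it unchanged. The parseBody helper is hypothetical, and the server's actual behavior may differ.

```typescript
type FetchFormat = "text" | "json" | "html";

// Hypothetical helper: interpret a response body according to `format`.
function parseBody(body: string, format: FetchFormat = "text"): unknown {
  if (format === "json") {
    try {
      return JSON.parse(body);
    } catch (err) {
      // Surface a readable error instead of a raw SyntaxError.
      throw new Error(`Response is not valid JSON: ${(err as Error).message}`);
    }
  }
  // "text" and "html" are returned as-is in this sketch.
  return body;
}

const parsed = parseBody('{"status": "ok", "items": [1, 2, 3]}', "json");
console.log(parsed);
console.log(parseBody("<p>Hello</p>", "html"));
```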

Development

Building

npm run build

Watching for changes

npm run watch

Technical Details

  • Built with TypeScript and the MCP SDK
  • Uses axios for HTTP requests
  • Uses cheerio for HTML parsing
  • Implements proper error handling and timeouts
  • Includes content length limits for performance
  • Supports graceful shutdown

Security Features

  • Request timeouts (30 seconds)
  • Content length limits
  • URL validation
  • Error handling for malformed requests
  • No tools are auto-approved in the sample configuration (autoApprove is empty)
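
URL validation and content length limits from the list above could be implemented along the following lines. This is a sketch under stated assumptions, not the server's code: the helper names, the restriction to http and https, and the 10000-character limit are all hypothetical.

```typescript
// Hypothetical limit; the README does not state the actual value.
const MAX_CONTENT_CHARS = 10000;

// Accept only well-formed http(s) URLs (assumed policy).
function validateHttpUrl(input: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch (_err) {
    throw new Error(`Invalid URL: ${input}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}

// Cut overly long content down to the configured maximum.
function truncateContent(content: string, maxChars: number = MAX_CONTENT_CHARS): string {
  return content.length > maxChars ? content.slice(0, maxChars) : content;
}

console.log(validateHttpUrl("https://example.com/page").hostname);
console.log(truncateContent("x".repeat(20000)).length);
```

Rejecting non-HTTP schemes such as file: up front keeps a research tool from being pointed at local resources, which is the main reason to validate before fetching.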

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

If you encounter any issues or have questions, please open an issue on the GitHub repository.