# FetchV2 MCP Server
Model Context Protocol (MCP) server for web content fetching and extraction.
This MCP server provides tools to fetch webpages, extract clean content using Trafilatura, and discover links for batch processing.
## Features
- Fetch Webpages: Extract clean markdown content from any URL
- Batch Fetching: Fetch up to 10 URLs in a single request
- Link Discovery: Find and filter links on any webpage
- llms.txt Support: Parse and fetch LLM-friendly documentation indexes
- Smart Extraction: Trafilatura removes boilerplate (navbars, ads, footers)
- Robots.txt Compliance: Respects robots.txt with graceful timeout handling
- Pagination Support: Handle large pages with the `start_index` parameter
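The `start_index` pagination can be pictured as slicing the extracted text. The sketch below is illustrative only (the `paginate` function is hypothetical, not part of the server's API): the server extracts the full page once, then returns `max_length` characters starting at `start_index`, so a client pages through by advancing the offset.

```python
def paginate(text: str, max_length: int = 5000, start_index: int = 0) -> str:
    """Return one window of extracted content (illustrative sketch)."""
    return text[start_index:start_index + max_length]

# A client advances start_index by max_length until the result is empty:
content = "x" * 12000
pages = []
offset = 0
while True:
    chunk = paginate(content, max_length=5000, start_index=offset)
    if not chunk:
        break
    pages.append(chunk)
    offset += 5000
# 12000 characters split into windows of 5000, 5000, and 2000
```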
## Prerequisites

- Install `uv` from Astral
- Install Python 3.10 or newer: `uv python install 3.10`
## Installation
Configure the server manually in your MCP client:
```json
{
  "mcpServers": {
    "fetchv2": {
      "command": "uvx",
      "args": ["fetchv2-mcp-server@latest"],
      "disabled": false,
      "autoApprove": []
    }
  }
}
```
Config file locations:

- Claude Desktop (macOS): `~/Library/Application Support/Claude/claude_desktop_config.json`
- Claude Desktop (Windows): `%APPDATA%\Claude\claude_desktop_config.json`
- Windsurf: `~/.codeium/windsurf/mcp_config.json`
- Kiro: `.kiro/settings/mcp.json` in your project
### Install from PyPI

```shell
# Using uv
uv add fetchv2-mcp-server

# Using pip
pip install fetchv2-mcp-server
```
## Basic Usage

Example prompts to try:

- "Fetch the documentation from `<URL>`"
- "Find all links on `<docs URL>` that contain 'tutorial'"
- "Read these three pages and summarize the differences: [url1, url2, url3]"
## Available Tools
### fetch

Fetches a webpage and extracts its main content as clean markdown.

```
fetch(url: str, max_length: int = 5000, start_index: int = 0) -> str
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | required | The webpage URL to fetch |
| `max_length` | `int` | `5000` | Maximum characters to return |
| `start_index` | `int` | `0` | Character offset for pagination |
| `get_raw_html` | `bool` | `false` | Skip extraction, return raw HTML |
| `include_metadata` | `bool` | `true` | Include title, author, date |
| `include_tables` | `bool` | `true` | Preserve tables in markdown |
| `include_links` | `bool` | `false` | Preserve hyperlinks |
| `bypass_robots_txt` | `bool` | `false` | Skip robots.txt check |
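To make the `bypass_robots_txt` option concrete, here is a sketch of the kind of check it skips, using Python's stdlib `urllib.robotparser`. This is an assumption about how such a check is typically implemented, not the server's actual code; the real server also allows fetching when robots.txt times out, per the feature list.

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_lines: list[str], url: str, agent: str = "*") -> bool:
    # Parse an already-fetched robots.txt body and ask whether `agent`
    # may fetch `url`. (Illustrative helper, not part of the server's API.)
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return rp.can_fetch(agent, url)

rules = ["User-agent: *", "Disallow: /private/"]
print(allowed_by_robots(rules, "https://example.com/docs/intro"))    # True
print(allowed_by_robots(rules, "https://example.com/private/keys"))  # False
```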
### fetch_batch

Fetches multiple webpages in a single request.

```
fetch_batch(urls: list[str], max_length_per_url: int = 2000) -> str
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `urls` | `list[str]` | required | List of URLs (max 10) |
| `max_length_per_url` | `int` | `2000` | Character limit per URL |
| `get_raw_html` | `bool` | `false` | Skip extraction for all URLs |
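The batch contract (at most 10 URLs, each result clipped to `max_length_per_url` characters) can be sketched as below. This is an illustrative re-statement of the documented limits, not the server's implementation; `fetch_one` is a hypothetical stand-in for the real fetcher.

```python
def fetch_batch(urls: list[str], max_length_per_url: int = 2000,
                fetch_one=lambda u: f"content of {u}") -> dict[str, str]:
    # Enforce the documented cap of 10 URLs per request.
    if len(urls) > 10:
        raise ValueError("fetch_batch accepts at most 10 URLs per request")
    # Truncate each page's content to the per-URL character budget.
    return {u: fetch_one(u)[:max_length_per_url] for u in urls}

results = fetch_batch(["https://a.example", "https://b.example"],
                      max_length_per_url=12)
# each value is clipped to 12 characters
```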
### discover_links

Discovers all links on a webpage with optional filtering.

```
discover_links(url: str, filter_pattern: str = "") -> str
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | required | The webpage URL to scan |
| `filter_pattern` | `str` | `""` | Regex to filter links (e.g., `/docs/`) |
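The filtering behaviour can be pictured as: collect every `href` on the page, then keep only those matching the regex. The sketch below uses Python's stdlib `html.parser` and `re`; it is an assumption about the general technique, not the tool's actual implementation, and it takes an HTML string rather than a URL to stay self-contained.

```python
import re
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    # Collects href values from <a> tags.
    def __init__(self):
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def discover_links(html: str, filter_pattern: str = "") -> list[str]:
    # Gather all links, then apply the optional regex filter.
    parser = LinkCollector()
    parser.feed(html)
    if filter_pattern:
        rx = re.compile(filter_pattern)
        return [link for link in parser.links if rx.search(link)]
    return parser.links

page = '<a href="/docs/intro">Intro</a><a href="/blog/news">News</a>'
print(discover_links(page, filter_pattern="/docs/"))  # ['/docs/intro']
```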
### fetch_llms_txt

Fetch and parse an llms.txt file to discover LLM-friendly documentation.

```
fetch_llms_txt(url: str, include_content: bool = False) -> str
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | required | URL to an llms.txt file |
| `include_content` | `bool` | `false` | Also fetch content of all linked pages |
| `max_length_per_url` | `int` | `2000` | When `include_content=True`, max chars per page |
⚠️ Important: By default, only the llms.txt index is fetched; the linked markdown files are NOT downloaded into context. Set `include_content=True` to explicitly fetch all linked pages.
Example:

```python
# DEFAULT: Only fetches the index (lightweight, ~1KB)
fetch_llms_txt(url="https://docs.example.com/llms.txt")
# Returns: title + list of links with descriptions

# EXPLICIT: Fetches index + all linked .md files (can be large)
fetch_llms_txt(url="https://docs.example.com/llms.txt", include_content=True)
# Returns: structure + content of all linked pages
```
Note: Relative URLs (e.g., `/docs/guide.md`) are automatically resolved to absolute URLs.
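The index-parsing and URL-resolution steps above can be sketched with the stdlib: llms.txt files commonly list links as `- [Name](url): description`, and relative URLs resolve against the llms.txt location via `urllib.parse.urljoin`. The parser below is a hypothetical illustration of that format, not the tool's actual code.

```python
import re
from urllib.parse import urljoin

# Matches lines like "- [Name](url): optional description"
LINK = re.compile(r"-\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?")

def parse_llms_txt(text: str, base_url: str) -> list[dict]:
    entries = []
    for line in text.splitlines():
        m = LINK.match(line.strip())
        if m:
            name, url, desc = m.groups()
            entries.append({
                "name": name,
                # Relative URLs become absolute against the llms.txt location.
                "url": urljoin(base_url, url),
                "description": (desc or "").strip(),
            })
    return entries

sample = (
    "# Example Docs\n"
    "- [Guide](/docs/guide.md): Getting started\n"
    "- [API](https://docs.example.com/api.md)"
)
links = parse_llms_txt(sample, "https://docs.example.com/llms.txt")
print(links[0]["url"])  # https://docs.example.com/docs/guide.md
```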
## Workflow Example

Step 1: Discover relevant documentation pages

```python
discover_links(url="https://docs.example.com/", filter_pattern="/guide/")
```

Step 2: Batch fetch the pages you need

```python
fetch_batch(urls=["https://docs.example.com/guide/intro", "https://docs.example.com/guide/setup"])
```
## Prompts

- `fetch_manual` - User-initiated fetch that bypasses robots.txt
- `research_topic` - Research a topic by fetching multiple relevant URLs
## Development

```shell
# Clone and install
git clone https://github.com/praveenc/fetchv2-mcp-server.git
cd fetchv2-mcp-server
uv sync --dev
source .venv/bin/activate

# Run tests
uv run pytest

# Run with MCP Inspector
mcp dev src/fetchv2_mcp_server/server.py

# Linting and type checking
uv run ruff check .
uv run pyright
```
## License

MIT - see the `LICENSE` file for details.
## Contributing

Contributions welcome! Please see the repository's contributing guidelines.
## Support
For issues and questions, use the GitHub issue tracker.