webcrawl-mcp
The Webcrawl MCP Server is a production-ready implementation of the Model Context Protocol (MCP) designed for comprehensive web crawling and intelligent content extraction.
Beyond basic page crawling, the server provides intelligent content extraction, link analysis, and sitemap generation, together with tools for searching within a page and searching the web. It is built for reliable, efficient, large-scale web data extraction and supports modern transport protocols. An abort mechanism lets clients gracefully cancel long-running operations, which makes the server suitable for developers and businesses that want to automate web data collection and analysis.
Features
- 100% MCP Compliant: Ensures full compliance with the latest MCP specification, providing a reliable and standardized protocol implementation.
- Abort Functionality: Allows for the graceful cancellation of long-running operations, enhancing control and flexibility.
- Smart Crawling: Features intelligent content extraction with relevance scoring to prioritize important information.
- Link Analysis: Offers advanced link extraction and categorization for comprehensive web data analysis.
- Sitemap Generation: Automatically generates detailed sitemaps to map out website structures efficiently.
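At the protocol level, MCP expresses cancellation of an in-flight request as a `notifications/cancelled` JSON-RPC notification that names the request's id. This page does not say whether the server's abort feature is driven by that notification or by a separate mechanism, so the following is only a sketch of the generic message. The helper name `make_cancel_notification` and the example request id are hypothetical.

```python
import json


def make_cancel_notification(request_id, reason=None):
    """Build an MCP cancellation notification for an in-flight request.

    Notifications carry no "id" of their own, so no response is expected.
    """
    params = {"requestId": request_id}
    if reason is not None:
        params["reason"] = reason
    return {
        "jsonrpc": "2.0",
        "method": "notifications/cancelled",
        "params": params,
    }


if __name__ == "__main__":
    # Hypothetical: request 7 was a long-running crawl the user gave up on.
    print(json.dumps(make_cancel_notification(7, "user aborted crawl"), indent=2))
```

The client would write this message to whatever transport the session already uses; the sketch only builds the payload.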
Tools
crawl
Basic web crawling tool for extracting page content, metadata, and links.
smartCrawl
Intelligent crawling tool with relevance scoring and smart navigation.
extractLinks
Tool for extracting and categorizing links from a web page.
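As an illustration of what extracting and categorizing links can involve, here is a minimal sketch using only the Python standard library. It is not the server's implementation: the internal/external split by host name, and the names `LinkCollector` and `categorize_links`, are assumptions chosen for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the raw href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def categorize_links(html, base_url):
    """Resolve links against base_url and split them by host."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    result = {"internal": [], "external": []}
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        key = "internal" if parsed.netloc == base_host else "external"
        result[key].append(absolute)
    return result
```

A real categorizer would likely distinguish more classes (for example subdomains, anchors, or file downloads); the two buckets here just show the basic shape.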
searchInPage
Tool for searching specific content within a web page.
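To make the idea concrete, an in-page search can be sketched as a case-insensitive scan over the page's extracted text that returns each hit with some surrounding context. The function name `search_in_text`, the `context` parameter, and the shape of the returned records are assumptions for illustration, not the tool's actual interface.

```python
import re


def search_in_text(text, query, context=30):
    """Return every case-insensitive occurrence of query with a context snippet."""
    matches = []
    # re.escape treats the query as a literal string, not a regex pattern.
    for m in re.finditer(re.escape(query), text, flags=re.IGNORECASE):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        matches.append({"offset": m.start(), "snippet": text[start:end]})
    return matches
```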
generateSitemap
Tool for creating a sitemap by crawling website pages.
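Once a crawl has produced a set of page URLs, turning them into a sitemap is mostly a serialization step. The sketch below emits the standard sitemaps.org XML format; whether `generateSitemap` returns XML or some other structure is not stated here, and `build_sitemap` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Serialize crawled URLs as a sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # De-duplicate and sort so the output is stable between runs.
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["https://example.com/", "https://example.com/docs"]))
```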
webSearch
Tool for performing web searches with multi-engine support.
getDateTime
Utility tool for date/time functions with timezone support.
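Because these are MCP tools, a client invokes any of them with a JSON-RPC `tools/call` request that carries the tool name and an arguments object. The following sketch builds such a request. The tool names come from the list above, but the argument names (such as `url`) are assumptions; the real input schema should be read from the server's `tools/list` response.

```python
import json
from itertools import count

# Monotonic request ids so responses can be matched to requests.
_next_id = count(1)


def make_tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "url" is an assumed argument name, used here only for illustration.
    request = make_tool_call("crawl", {"url": "https://example.com"})
    print(json.dumps(request, indent=2))
```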