MCPDocSearch

The Documentation Crawler & MCP Server project is a toolset for crawling websites, converting the crawled pages to Markdown, and making that documentation searchable through a Model Context Protocol (MCP) server. The server is designed to integrate with tools such as Cursor, so users can search and navigate the documentation from their MCP client. The project has two parts: a web crawler that can be configured for depth, URL patterns, and content types, and an MCP server that processes and caches the Markdown files and generates vector embeddings for semantic search. The server exposes tools for listing documents, retrieving document headings, and performing semantic searches.
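
To make the depth and URL-pattern settings concrete, here is a minimal sketch of a breadth-first crawl frontier. It is not the project's crawler: the function name `crawl_order`, its parameters, and the glob-style patterns are assumptions chosen for illustration, and link extraction is passed in as a callable so the sketch runs without network access.

```python
import fnmatch
from collections import deque
from typing import Callable, Iterable, List, Sequence


def crawl_order(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],  # hypothetical: returns links found on a page
    max_depth: int = 2,
    include_patterns: Sequence[str] = ("*",),
    exclude_patterns: Sequence[str] = (),
) -> List[str]:
    """Return URLs in breadth-first visit order, honoring depth and patterns."""

    def allowed(url: str) -> bool:
        # Exclusions win over inclusions; patterns are shell-style globs.
        if any(fnmatch.fnmatchcase(url, p) for p in exclude_patterns):
            return False
        return any(fnmatch.fnmatchcase(url, p) for p in include_patterns)

    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are kept, but their links are not followed
        for link in fetch_links(url):
            if link not in seen and allowed(link):
                seen.add(link)
                visited.append(link)
                queue.append((link, depth + 1))
    return visited
```

With an include pattern such as `https://example.com/docs/*`, links outside the documentation tree are skipped, and `max_depth` bounds how far the crawl moves from the start URL. A real crawler would also fetch each page, filter by content type, and write the Markdown output.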

Features

  • Web Crawler: Crawls websites starting from a given URL, configurable for depth, URL patterns, and content types, and generates Markdown files.
  • MCP Server: Loads and parses Markdown files, generates vector embeddings, and caches processed data for efficient semantic search.
  • Caching: Stores processed chunks and embeddings in a cache file, so the server starts faster after the initial processing run.
  • Cursor Integration: Designed to run the MCP server via stdio transport for use within Cursor.
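
The parse-then-cache flow described in the features above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: splitting on Markdown headings, the JSON cache format, and invalidating the cache by content hash are all choices made for this example, and `chunk_markdown` and `load_or_build` are hypothetical names. Embeddings are left out to keep the sketch dependency-free; in practice they would be stored next to each chunk.

```python
import hashlib
import json
import re
from pathlib import Path
from typing import Dict, List

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def chunk_markdown(text: str) -> List[Dict[str, object]]:
    """Split Markdown into one chunk per heading-delimited section."""
    chunks = []
    current = {"heading": "", "level": 0, "lines": []}
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # '#' lines inside code fences are not headings
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            # Close the previous section unless it is an empty preamble.
            if current["heading"] or any(l.strip() for l in current["lines"]):
                chunks.append(current)
            current = {"heading": match.group(2), "level": len(match.group(1)), "lines": []}
        else:
            current["lines"].append(line)
    if current["heading"] or any(l.strip() for l in current["lines"]):
        chunks.append(current)
    return [
        {"heading": c["heading"], "level": c["level"], "content": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]


def load_or_build(md_path: Path, cache_path: Path) -> List[Dict[str, object]]:
    """Return cached chunks when the file is unchanged, else re-chunk and rewrite the cache."""
    text = md_path.read_text(encoding="utf-8")
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if cache_path.exists():
        cached = json.loads(cache_path.read_text(encoding="utf-8"))
        if cached.get("digest") == digest:
            return cached["chunks"]
    chunks = chunk_markdown(text)
    cache_path.write_text(json.dumps({"digest": digest, "chunks": chunks}), encoding="utf-8")
    return chunks
```

The first call pays the parsing cost and writes the cache; later calls return the stored chunks as long as the file's hash matches. The `heading` and `level` fields also give a heading outline for each document.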

Tools

  1. list_documents

    Lists available crawled documents.

  2. get_document_headings

    Retrieves the heading structure for a document.

  3. search_documentation

    Performs semantic search over document chunks using vector similarity.
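
As a rough illustration of what search by vector similarity involves, the sketch below ranks text chunks by cosine similarity to a query. The `embed` function is a toy stand-in (normalized word counts) for a real embedding model, which this listing does not name. The function names and the `top_k` parameter are also assumptions and do not reflect the tool's actual signature.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for an embedding model: L2-normalized term counts."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {term: v / norm for term, v in counts.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity; both vectors are unit length, so this is a dot product."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def search(query: str, chunks: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Return the top_k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For example, searching three chunks about installation, crawl configuration, and running the server for "crawl depth" ranks the configuration chunk first. With a real embedding model, the chunk vectors would be computed once and cached rather than re-embedded on every query, which is the role of the cache file described under Features.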