MCP-DOC-Server-OpenRouter

An MCP server for fetching and searching third-party package documentation.

This project provides a Model Context Protocol (MCP) server that scrapes, processes, indexes, and searches documentation for software libraries and packages. It fetches content from specified URLs, splits it into semantically meaningful chunks, generates vector embeddings (using OpenAI or another supported embedding provider), and stores the results in an SQLite database. The server uses `sqlite-vec` for vector similarity search and FTS5 for full-text search, and combines the two to produce hybrid search results. Documentation is versioned, so content for different library versions (including unversioned content) is stored and queried separately.
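The description above does not say how the vector and full-text results are merged. As a minimal sketch of one common approach, the snippet below fuses two ranked lists with reciprocal rank fusion (RRF). RRF, the function name `reciprocal_rank_fusion`, the constant `k=60`, and the chunk IDs are all assumptions made for illustration, not the server's confirmed method.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranked list.

    Each list contributes 1 / (k + rank) to a chunk's score, so chunks
    that rank well in more than one list rise to the top.
    NOTE: RRF is an assumed fusion strategy, not taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical result lists, best match first.
vector_hits = ["chunk-3", "chunk-1", "chunk-7"]  # e.g. from a vector similarity query
fts_hits = ["chunk-1", "chunk-9", "chunk-3"]     # e.g. from a full-text query

fused = reciprocal_rank_fusion([vector_hits, fts_hits])
print(fused)
```

Chunks that appear in both lists (`chunk-1` and `chunk-3` here) outrank chunks found by only one search method, which is the behavior a hybrid search is generally after.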

Features

  • Versatile Scraping: Fetch documentation from diverse sources like websites, GitHub, npm, PyPI, or local files.
  • Intelligent Processing: Automatically split content semantically and generate embeddings using your choice of models (OpenAI, Google Gemini, Azure OpenAI, AWS Bedrock, Ollama, and more).
  • Optimized Storage: Leverage SQLite with `sqlite-vec` for efficient vector storage and FTS5 for robust full-text search.
  • Powerful Hybrid Search: Combine vector similarity and full-text search across different library versions for highly relevant results.
  • Asynchronous Job Handling: Manage scraping and indexing tasks efficiently with a background job queue and MCP/CLI tools.

Tools

  1. scrape_docs

    Starts a scraping job and returns a jobId immediately.

  2. get_job_status

    Retrieves the current status and progress of a specific job.

  3. list_jobs

    Shows recent and ongoing jobs.

  4. cancel_job

    Attempts to stop a running or queued job.

  5. search_docs

    Searches the indexed documentation.

  6. list_libraries

    Lists indexed libraries.

  7. find_version

    Finds appropriate versions.

  8. remove_docs

    Removes indexed documents.

  9. fetch_url

    Fetches a URL and returns its content as Markdown.
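The job-related tools above (`scrape_docs`, `get_job_status`, `list_jobs`, `cancel_job`) suggest a simple lifecycle: a job is created and queued, its status can be polled, and it can be cancelled while still queued or running. The following in-memory sketch illustrates that lifecycle only. The class names, status values, and response fields (`jobId`, `status`, `pagesDone`) are hypothetical and do not reflect the server's actual schema or implementation.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class JobStatus(Enum):
    # Hypothetical status names chosen for this sketch.
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


@dataclass
class Job:
    library: str
    url: str
    job_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    status: JobStatus = JobStatus.QUEUED
    pages_done: int = 0


class JobQueue:
    """In-memory stand-in for the background job queue (illustrative only)."""

    def __init__(self):
        self._jobs = {}

    def scrape_docs(self, library, url):
        # Register the job and return its ID immediately, without scraping here.
        job = Job(library, url)
        self._jobs[job.job_id] = job
        return job.job_id

    def get_job_status(self, job_id):
        job = self._jobs[job_id]
        return {"jobId": job.job_id, "status": job.status.value,
                "pagesDone": job.pages_done}

    def list_jobs(self):
        return [self.get_job_status(job_id) for job_id in self._jobs]

    def cancel_job(self, job_id):
        # Only queued or running jobs can be stopped; finished jobs are left alone.
        job = self._jobs[job_id]
        if job.status in (JobStatus.QUEUED, JobStatus.RUNNING):
            job.status = JobStatus.CANCELLED
            return True
        return False


queue = JobQueue()
job_id = queue.scrape_docs("react", "https://react.dev/reference/react")
status = queue.get_job_status(job_id)
cancelled = queue.cancel_job(job_id)
```

Returning the job ID before any work happens is what lets a client start a long scrape and then poll or cancel it from separate tool calls.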