autodocs-mcp

ziyacivan/autodocs-mcp

autodocs-mcp is a CLI tool designed to generate Model Context Protocol (MCP) servers from ReadTheDocs documentation, enabling seamless integration with tools like VSCode.

Generate Model Context Protocol (MCP) servers from ReadTheDocs documentation.

Overview

autodocs-mcp is a CLI tool that automatically scrapes ReadTheDocs documentation, generates embeddings for semantic search, and creates a ready-to-use MCP server that can be integrated with VSCode and other MCP-compatible tools.

Features

  • 🔍 Format Detection: Automatically detects documentation format (Sphinx, MkDocs, or generic)
  • 📚 Smart Scraping: Uses objects.inv for Sphinx docs, sitemap.xml for MkDocs, with fallback to HTML crawling
  • 🧠 Semantic Search: Generates embeddings and creates a vector store for semantic search
  • ⚙️ MCP Server Generation: Creates a fully functional MCP server with tools and resources
  • 🔌 VSCode Integration: Generates VSCode configuration for easy integration

Installation

Install from PyPI:

pip install autodocs-mcp

We recommend using uv for faster and more reliable package management:

uv pip install autodocs-mcp

Or install from source:

git clone https://github.com/ziyacivan/autodocs-mcp.git
cd autodocs-mcp
uv sync

Usage

After installation, you can use autodocs-mcp directly from the terminal:

Basic Usage

autodocs-mcp generate https://docs.example.com/

Alternatively, you can run it as a Python module:

python -m autodocs_mcp generate https://docs.example.com/

Options

autodocs-mcp generate <readthedocs_url> \
  --output-dir ./mcp-server \
  --embedding-model all-MiniLM-L6-v2 \
  --python-path python

Options:

  • --output-dir: Output directory for generated files (default: ./mcp-server)
  • --embedding-model: Embedding model to use (default: all-MiniLM-L6-v2)
  • --cache-dir: Cache directory (default: <output-dir>/cache)
  • --python-path: Path to Python interpreter (default: python)

Example

# Generate MCP server for a documentation site
autodocs-mcp generate https://docs.readthedocs.io/en/stable/

# This will:
# 1. Detect the documentation format
# 2. Scrape all pages
# 3. Generate embeddings
# 4. Create vector store
# 5. Generate MCP server code
# 6. Create VSCode configuration

Output Structure

After running the tool, you'll get:

mcp-server/
├── mcp_server.py          # Generated MCP server
├── vector_store/          # ChromaDB vector store
├── vscode_config.json     # VSCode configuration
└── cache/                 # Cached content (optional)

VSCode Integration

  1. The tool generates a vscode_config.json file with the MCP server configuration.

  2. Add the configuration to your VSCode settings.json:

{
  "mcp.servers": {
    "docs-example-com": {
      "command": "python",
      "args": ["/path/to/mcp-server/mcp_server.py"]
    }
  }
}

  3. Restart VSCode to load the MCP server.
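
The manual copy step can also be scripted. The sketch below merges one server entry into an already-loaded settings dictionary, assuming the "mcp.servers" shape shown above; merge_mcp_server is a hypothetical helper, not part of autodocs-mcp.

```python
def merge_mcp_server(
    settings: dict,
    name: str,
    server_script: str,
    python_path: str = "python",
) -> dict:
    """Return a copy of `settings` with one MCP server entry added or replaced."""
    merged = dict(settings)
    # Copy the existing server map so the caller's dictionary is not mutated.
    servers = dict(merged.get("mcp.servers", {}))
    servers[name] = {"command": python_path, "args": [server_script]}
    merged["mcp.servers"] = servers
    return merged
```

Note that a real settings.json may contain comments, which Python's json module rejects, so loading and saving the file needs extra care.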

How It Works

Format Detection

The tool automatically detects the documentation format:

  1. Sphinx: Checks for objects.inv file
  2. MkDocs: Checks for sitemap.xml file
  3. Generic: Falls back to HTML crawling
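
The detection order above can be sketched as follows. This is an illustration rather than the tool's actual code: detect_format and the url_exists probe are hypothetical names, and the probe is injected so the logic can be tried without network access.

```python
from typing import Callable
from urllib.parse import urljoin


def detect_format(base_url: str, url_exists: Callable[[str], bool]) -> str:
    """Return "sphinx", "mkdocs", or "generic" for a documentation site.

    `url_exists` is a caller-supplied probe (for example, an HTTP HEAD
    request) that reports whether a URL can be fetched.
    """
    # Normalise so urljoin treats the base URL as a directory.
    root = base_url if base_url.endswith("/") else base_url + "/"

    # 1. Sphinx builds publish an objects.inv inventory next to the index page.
    if url_exists(urljoin(root, "objects.inv")):
        return "sphinx"
    # 2. Otherwise, a sitemap.xml is taken as the MkDocs signal.
    if url_exists(urljoin(root, "sitemap.xml")):
        return "mkdocs"
    # 3. Anything else is crawled as plain HTML.
    return "generic"
```

Checking objects.inv first matters because a site can expose both files, and the inventory is the more specific signal.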

Scraping Process

  • Sphinx: Uses sphobjinv to parse objects.inv and extract all documentation objects
  • MkDocs: Parses sitemap.xml or analyzes HTML navigation structure
  • Generic: Crawls HTML pages starting from the index page
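
As an illustration of the sitemap.xml path, the sketch below extracts page URLs using only the standard library. It assumes a standard sitemaps.org document; page_urls_from_sitemap is a hypothetical name and does not reflect how autodocs-mcp implements this step.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def page_urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document, in file order."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements instead of returning blank URLs.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls
```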

Embedding Generation

  • Splits content into chunks (configurable size and overlap)
  • Generates embeddings using sentence transformers
  • Stores embeddings in ChromaDB vector store
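
To make "size and overlap" concrete, here is a minimal character-based chunker. The function name, the default values, and the choice of characters as the unit are assumptions for illustration; the tool may split on tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `overlap` characters so that a sentence cut at
    a boundary still appears whole in one of the two neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and stored separately, so a larger overlap improves recall at boundaries at the cost of more vectors.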

MCP Server Features

The generated MCP server provides:

  • Resources: List of all documentation pages
  • Tools:
    • search_documentation: Semantic search across documentation
    • get_page_content: Get full content of a specific page
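
Conceptually, a semantic search tool ranks stored chunk embeddings by their similarity to the query embedding. The sketch below shows that idea with plain lists and cosine similarity. In the generated server this lookup is delegated to the ChromaDB vector store, and the similarity metric it uses is not stated here, so treat the function names and the metric as assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(
    query_vec: list[float],
    indexed: dict[str, list[float]],
    k: int = 3,
) -> list[str]:
    """Return the ids of the `k` stored vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in indexed.items()
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```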

Requirements

  • Python 3.10+
  • See pyproject.toml for full dependency list

Development

Local Development Setup

For local development, install the package in editable mode:

# Clone the repository
git clone https://github.com/ziyacivan/autodocs-mcp.git
cd autodocs-mcp

# Install in editable mode with development dependencies
pip install -e ".[dev]"

# Or using uv
uv sync --extra dev

After installation, you can use the CLI tool:

# Using the CLI command (after editable install)
autodocs-mcp --help

# Or as a Python module
python -m autodocs_mcp --help

Testing

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/autodocs_mcp --cov-report=html

# Format code
black src/ tests/

# Lint code
ruff check src/ tests/

# Fix auto-fixable linting issues
ruff check --fix src/ tests/

# Install pre-commit hooks (optional but recommended)
pre-commit install

License

MIT License - see LICENSE file for details.

Troubleshooting

Common Issues

Issue: "No pages found"

  • Ensure the URL is correct and accessible
  • Check if the documentation site requires authentication
  • Verify the site is using a supported format (Sphinx, MkDocs, or generic HTML)

Issue: "Could not find Python executable"

  • Specify the Python path explicitly using --python-path
  • Ensure Python 3.10+ is installed and in your PATH
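
To find a value for --python-path, you can look the interpreter up on PATH and fall back to the interpreter running the script. This is a small sketch using the standard library; resolve_python is a hypothetical helper name.

```python
import shutil
import sys


def resolve_python(candidates: tuple[str, ...] = ("python", "python3")) -> str:
    """Return the absolute path of the first candidate found on PATH.

    Falls back to the interpreter running this script when none is found.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return sys.executable
```

The returned path can be passed directly as the --python-path value.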

Issue: Embedding model download fails

  • Check your internet connection
  • The model will be downloaded on first use from Hugging Face
  • Ensure you have sufficient disk space (~100MB per model)

Issue: MCP server not working in VSCode

  • Verify the Python path in vscode_config.json is correct
  • Ensure all dependencies are installed: uv pip install chromadb sentence-transformers mcp (or pip install chromadb sentence-transformers mcp)
  • Check that the VSCode MCP extension is installed and enabled
  • Restart VSCode after configuration changes
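
A quick way to confirm the dependencies are importable is to run a check like the one below with the same interpreter that vscode_config.json points at. The helper name missing_modules is hypothetical; the module names are the import names of the three packages listed above.

```python
import importlib.util


def missing_modules(
    names: tuple[str, ...] = ("chromadb", "sentence_transformers", "mcp"),
) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list means all three packages are visible to that interpreter.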

Performance Tips

  • Use a smaller embedding model (e.g., all-MiniLM-L6-v2) for faster processing
  • Enable caching to avoid re-scraping documentation
  • For large documentation sites, consider processing in batches
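
The caching tip boils down to storing each fetched page on disk and reusing it on later runs. The sketch below shows one way to do that, keyed by a hash of the URL. It is not the tool's cache format: cached_fetch, the file naming scheme, and the injected fetch function are all assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, fetch: Callable[[str], str]) -> str:
    """Return the page body for `url`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any page maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.html"
    if path.exists():
        return path.read_text(encoding="utf-8")
    content = fetch(url)
    path.write_text(content, encoding="utf-8")
    return content
```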

Roadmap

  • Support for additional documentation formats
  • Incremental updates (only scrape changed pages)
  • Custom chunking strategies
  • Multiple embedding model support
  • Docker containerization
  • Pre-built MCP servers for popular documentation sites

Contributing

Contributions are welcome! Please read the contributing guidelines in the repository for details on our code of conduct and the process for submitting pull requests.

Acknowledgments