mcp-crawl4ai-rag

Crawl4AI RAG MCP Server is an implementation of the Model Context Protocol (MCP) integrated with Crawl4AI and Supabase. It gives AI agents and AI coding assistants web crawling and RAG capabilities.

The Crawl4AI RAG MCP Server enables AI agents to crawl websites, store content in a vector database, and perform RAG over the crawled content. It includes advanced RAG strategies: contextual embeddings, hybrid search, agentic RAG, reranking, and a knowledge graph for detecting AI hallucinations. The server is designed to be integrated into Archon to create a knowledge engine for AI coding assistants. Planned improvements include support for multiple embedding models and enhanced chunking strategies.
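To make the retrieval step concrete, here is a minimal sketch of similarity search with an optional source filter. It is not the server's code: the real server stores embeddings in Supabase, whereas this sketch keeps a few hand-written rows in memory. The names `cosine`, `search`, and `ROWS`, the two-dimensional vectors, and the example domains are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, rows, source=None, top_k=2):
    """Return the top_k rows most similar to query_vec, optionally limited to one source."""
    # Hypothetical filter: keep every row when no source is given.
    candidates = [r for r in rows if source is None or r["source"] == source]
    ranked = sorted(candidates,
                    key=lambda r: cosine(query_vec, r["embedding"]),
                    reverse=True)
    return ranked[:top_k]

# Toy rows standing in for stored chunks; the embeddings are made up.
ROWS = [
    {"source": "docs.example.com", "text": "Install guide", "embedding": [1.0, 0.0]},
    {"source": "docs.example.com", "text": "API reference", "embedding": [0.6, 0.8]},
    {"source": "blog.example.com", "text": "Release notes", "embedding": [0.0, 1.0]},
]
```

Calling `search([0.0, 1.0], ROWS, source="docs.example.com", top_k=1)` restricts the ranking to rows from that one domain before scoring, which is the idea behind optional source filtering.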

Features

  • Smart URL Detection: Automatically detects and handles different URL types.
  • Recursive Crawling: Follows internal links to discover content.
  • Parallel Processing: Crawls multiple pages concurrently.
  • Content Chunking: Splits content by headers and size for better processing.
  • Vector Search: Performs RAG over crawled content with optional source filtering.
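The content chunking feature can be illustrated with a short sketch. It assumes the crawled content is Markdown and that "by headers and size" means splitting at header lines first and then capping each piece at a character limit. The function name `chunk_markdown` and the `max_chars` parameter are hypothetical, and the server's actual chunking rules may differ.

```python
import re

def chunk_markdown(text, max_chars=1000):
    """Split Markdown at header lines, then cap each piece at max_chars characters."""
    # Zero-width split just before any line that starts with 1-6 '#' and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue  # skip the empty piece produced before the first header
        # Oversized sections are cut into fixed-size slices (may cut mid-word).
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks
```

Splitting on headers first keeps each chunk tied to a single topic, and the size cap bounds the text sent to the embedding model. A fuller version would likely cut at paragraph or sentence boundaries instead of at a fixed offset.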

Tools

  1. crawl_single_page

    Quickly crawl a single web page and store its content in the vector database.

  2. smart_crawl_url

    Intelligently crawl a full website based on the type of URL provided.

  3. get_available_sources

    Get a list of all available sources (domains) in the database.

  4. perform_rag_query

    Search for relevant content using semantic search with optional source filtering.

  5. search_code_examples

    Search specifically for code examples and their summaries from crawled documentation.

  6. parse_github_repository

    Parse a GitHub repository into a Neo4j knowledge graph.

  7. check_ai_script_hallucinations

    Analyze Python scripts for AI hallucinations by validating imports, method calls, and class usage.

  8. query_knowledge_graph

    Explore and query the Neo4j knowledge graph.
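As a rough illustration of the import-validation part of check_ai_script_hallucinations, the sketch below parses a script and reports imported modules that cannot be found. This is a swapped-in technique, not the tool's method: the server is described as using a Neo4j knowledge graph, while this sketch only asks the local Python environment whether each top-level module resolves. The function name `find_unresolved_imports` is hypothetical.

```python
import ast
import importlib.util

def find_unresolved_imports(source):
    """Return top-level module names imported by `source` that are not installed locally."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            # find_spec returns None when no installed module has this name.
            if importlib.util.find_spec(top_level) is None and top_level not in missing:
                missing.append(top_level)
    return missing
```

A check like this only catches invented packages. Validating method calls and class usage, as the tool description mentions, needs knowledge of each library's actual API, which is the role the knowledge graph would play.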