calibre-rag-mcp-nodejs

ispyridis/calibre-rag-mcp-nodejs

Calibre RAG MCP Server is an enhanced server designed for project-based vector search and contextual conversations, utilizing Retrieval-Augmented Generation (RAG) capabilities.

Calibre RAG MCP Server

Enhanced Calibre MCP server with RAG (Retrieval-Augmented Generation) capabilities for project-based vector search and contextual conversations.

Features

  • RAG-Enhanced Search: Vector-based semantic search using FAISS and Transformers
  • Project-Based Organization: Create isolated vector search projects for different contexts
  • Multi-Format Support: Process books in various formats (EPUB, PDF, MOBI, etc.)
  • OCR Capabilities: Extract text from images and scanned PDFs using Tesseract
  • Advanced Text Processing: Natural language processing for better content understanding
  • Windows Compatible: Designed specifically for Windows environments

Technologies Used

  • Vector Search: FAISS for efficient similarity search
  • Embeddings: Xenova Transformers for local embedding generation
  • OCR: Tesseract for optical character recognition
  • PDF Processing: Multiple PDF parsing libraries (pdf-parse, pdf-poppler, pdf2pic)
  • Image Processing: Sharp for image manipulation
  • NLP: Natural language processing with multiple libraries
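
To make the vector search idea concrete, here is a minimal sketch of semantic ranking over embeddings. It assumes each text chunk has already been turned into a numeric vector by an embedding model. The chunk ids, the 3-dimensional vectors, and the `topK` helper are hypothetical and are not taken from server.js. The sketch uses a brute-force cosine-similarity scan rather than FAISS, so it only illustrates what a similarity index computes.

```javascript
// Brute-force semantic ranking over in-memory embeddings (illustrative only).
// A FAISS index would replace the linear scan in topK() for large libraries.

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks most similar to the query, best match first.
function topK(queryVector, chunks, k) {
  return chunks
    .map((chunk) => ({
      id: chunk.id,
      score: cosineSimilarity(queryVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical embeddings; real ones have hundreds of dimensions.
const chunks = [
  { id: "book-1:chunk-0", vector: [1, 0, 0] },
  { id: "book-1:chunk-1", vector: [0, 1, 0] },
  { id: "book-2:chunk-0", vector: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((r) => r.id));
```

The linear scan is fine for a handful of chunks. An index such as FAISS exists to keep this lookup fast once a library holds many thousands of chunks.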

Prerequisites

  • Node.js >= 16.0.0
  • Calibre installed on Windows
  • ImageMagick (optional, for enhanced image processing)
  • Tesseract OCR (optional, for text extraction from images and scanned PDFs)
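
The prerequisites above can be checked before installation with a short script. This is a sketch and not part of the repository. It assumes the external tools are reachable under their usual command names (`tesseract`, `magick` for ImageMagick 7, and `calibredb` for Calibre's command-line tool) and that each accepts `--version`. The helper names `meetsMinimumNode` and `isOnPath` are made up for this example.

```javascript
const { spawnSync } = require("node:child_process");

// Returns true when `version` (e.g. "18.17.1") has a major part >= minMajor.
function meetsMinimumNode(version, minMajor) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Returns true when `command --version` can be launched and exits with code 0.
// A command that is not on PATH makes spawnSync report an error instead.
function isOnPath(command) {
  const result = spawnSync(command, ["--version"], { stdio: "ignore" });
  return !result.error && result.status === 0;
}

console.log("Node.js >= 16:", meetsMinimumNode(process.versions.node, 16));

// Assumed command names; adjust them if your installation differs.
for (const tool of ["calibredb", "tesseract", "magick"]) {
  console.log(`${tool} on PATH:`, isOnPath(tool));
}
```

A `false` for `tesseract` or `magick` only affects OCR and image processing. A `false` for `calibredb` suggests that Calibre is either not installed or not on PATH.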

Installation

  1. Clone this repository:
git clone https://github.com/yourusername/calibre-rag-mcp-nodejs.git
cd calibre-rag-mcp-nodejs
  2. Install dependencies:
npm install
  3. Run setup (Windows):
setup.bat

Configuration

The server automatically detects your Calibre library location. For custom configurations, modify the settings in server.js.
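
The README does not describe how detection works, so the following is only a sketch of one plausible approach. It relies on the fact that a Calibre library folder contains a `metadata.db` file. The environment variable `CALIBRE_LIBRARY_PATH` and the candidate folders are assumptions made for illustration and may not match what server.js actually checks.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// A Calibre library folder is recognisable by its metadata.db file.
function isCalibreLibrary(dir) {
  return fs.existsSync(path.join(dir, "metadata.db"));
}

// Returns the first candidate that looks like a Calibre library, or null.
function findCalibreLibrary(candidates) {
  return candidates.find(isCalibreLibrary) ?? null;
}

// Hypothetical search order: an override variable first, then common folders.
// CALIBRE_LIBRARY_PATH is a made-up name for this sketch.
const candidates = [
  process.env.CALIBRE_LIBRARY_PATH,
  path.join(os.homedir(), "Calibre Library"),
  path.join(os.homedir(), "Documents", "Calibre Library"),
].filter(Boolean);

console.log("Detected library:", findCalibreLibrary(candidates));
```

If this prints `null` on your machine, the library probably lives in a non-default folder, which is the case where editing the settings in server.js becomes necessary.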

Usage

Starting the Server

npm start

Available Tools

  • search: Semantic search across your ebook library
  • fetch: Retrieve specific content from books
  • list_projects: List all RAG projects
  • create_project: Create a new RAG project
  • add_books_to_project: Add books to a project for vectorization
  • search_project_context: Search within specific projects
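
An MCP client invokes these tools with JSON-RPC 2.0 requests that use the `tools/call` method. The sketch below builds such requests for a typical project workflow. The tool names come from the list above, but every argument name (`name`, `project`, `book_ids`, `query`) is a guess made for illustration. The real input schemas are whatever the server reports through `tools/list`.

```javascript
// Builds a JSON-RPC 2.0 request for the MCP "tools/call" method.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Argument names below are hypothetical; check the schemas from tools/list.
const requests = [
  buildToolCall(1, "create_project", { name: "stoicism-notes" }),
  buildToolCall(2, "add_books_to_project", {
    project: "stoicism-notes",
    book_ids: [12, 47],
  }),
  buildToolCall(3, "search_project_context", {
    project: "stoicism-notes",
    query: "dichotomy of control",
  }),
];

// Over the stdio transport, each message is sent as one line of JSON.
for (const request of requests) {
  console.log(JSON.stringify(request));
}
```

In practice an MCP client library or a desktop client builds these messages for you. The sketch is mainly useful for reading server logs or for writing a quick manual test.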

Example MCP Configuration

Add to your MCP client configuration:

{
  "mcpServers": {
    "calibre-rag": {
      "command": "node",
      "args": ["path/to/calibre-rag-mcp-nodejs/server.js"]
    }
  }
}
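
On Windows, a hand-written path in this JSON needs every backslash doubled, which is easy to get wrong. One way around that is to let Node.js generate the fragment. The sketch below assumes it is run from the cloned repository folder. `buildMcpConfig` is a hypothetical helper and not part of the project.

```javascript
const path = require("node:path");

// Builds the MCP client configuration with an absolute path to server.js.
function buildMcpConfig(repoDir) {
  return {
    mcpServers: {
      "calibre-rag": {
        command: "node",
        args: [path.resolve(repoDir, "server.js")],
      },
    },
  };
}

// JSON.stringify escapes backslashes in Windows paths automatically.
console.log(JSON.stringify(buildMcpConfig(process.cwd()), null, 2));
```

Copy the printed JSON into your MCP client configuration. An absolute path avoids depending on the working directory the client uses when it launches the server.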

Project Structure

calibre-rag-mcp-nodejs/
├── server.js              # Main MCP server
├── package.json           # Dependencies and scripts
├── setup.bat              # Windows setup script
├── test-*.js              # Various test files
├── projects/              # RAG projects storage
├── CONFIG.md              # Configuration documentation
├── USAGE_EXAMPLES.md      # Usage examples
└── QUICK_TEST.md          # Quick testing guide

Testing

Run the test suite:

npm test

Individual test files:

  • test-enhanced-server.js - Enhanced server functionality
  • test-ocr-full.js - OCR capabilities
  • test-pdf-approaches.js - PDF processing
  • test-enhanced-auto.js - Automated testing

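Documentation

  • CONFIG.md - Configuration documentation
  • USAGE_EXAMPLES.md - Usage examples
  • QUICK_TEST.md - Quick testing guide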

Requirements

System Requirements

  • Windows 10/11
  • Node.js 16+
  • Calibre installed
  • At least 4GB RAM (8GB+ recommended for large libraries)

Optional Dependencies

  • ImageMagick (for enhanced image processing)
  • Tesseract OCR (for text extraction from scanned documents)

Troubleshooting

Common Issues

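  1. FAISS Installation: If FAISS fails to install, make sure a native build toolchain is available (on Windows, typically Python and the Visual Studio Build Tools with the C++ workload), then run npm install again
  2. Tesseract Not Found: Install Tesseract and add its install directory to PATH, then open a new terminal
  3. Memory Issues: Reduce batch sizes when processing large documents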
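
For the memory issue, the general idea is to embed text in small groups instead of all at once, so that only one group's intermediate data is held at a time. The sketch below shows that pattern. `batches` and `embedBatch` are hypothetical names, and `embedBatch` is a stand-in that returns string lengths rather than real embeddings. Where the batch size is actually configured in server.js is not covered by this README.

```javascript
// Yields consecutive slices of at most `batchSize` items.
function* batches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  for (let start = 0; start < items.length; start += batchSize) {
    yield items.slice(start, start + batchSize);
  }
}

// Hypothetical stand-in for an embedding call: one number per chunk.
function embedBatch(chunkTexts) {
  return chunkTexts.map((text) => text.length);
}

const chunkTexts = ["alpha", "beta", "gamma", "delta", "epsilon"];
const vectors = [];

// A smaller batch size lowers peak memory at the cost of more calls.
for (const batch of batches(chunkTexts, 2)) {
  vectors.push(...embedBatch(batch));
}

console.log(`Embedded ${vectors.length} chunks`);
```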

Debug Mode

Enable verbose logging by setting the DEBUG environment variable before starting the server (Command Prompt syntax):

set DEBUG=calibre-rag:*
npm start
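
The value `calibre-rag:*` follows the namespace-with-wildcard convention popularised by the `debug` package on npm, although this README does not say which logging library the server uses. As a rough sketch of how such a pattern selects log channels, the code below matches namespaces against a comma-separated DEBUG value. The namespace `calibre-rag:ocr` is invented for the example, and exclusion patterns are not handled.

```javascript
// Converts one pattern such as "calibre-rag:*" into an anchored RegExp.
function patternToRegExp(pattern) {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except *
    .replace(/\*/g, ".*"); // * matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Returns true when `namespace` matches any pattern in a DEBUG-style value.
function isEnabled(namespace, debugValue) {
  return (debugValue || "")
    .split(",")
    .map((pattern) => pattern.trim())
    .filter(Boolean)
    .some((pattern) => patternToRegExp(pattern).test(namespace));
}

// "calibre-rag:ocr" is a hypothetical namespace used only for illustration.
console.log(isEnabled("calibre-rag:ocr", process.env.DEBUG));
```

In PowerShell the variable is set with `$env:DEBUG = "calibre-rag:*"` instead of `set`.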

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

License

Licensed under the Apache License 2.0. See the LICENSE file for details.

Support

For issues and questions, please open an issue on GitHub.

Changelog

v1.0.0

  • Initial release with RAG capabilities
  • Project-based vector search
  • Multi-format document support
  • OCR integration
  • Windows optimization