ispyridis/calibre-rag-mcp-nodejs
Calibre RAG MCP Server
Enhanced Calibre MCP server with RAG (Retrieval-Augmented Generation) capabilities for project-based vector search and contextual conversations.
Features
- RAG-Enhanced Search: Vector-based semantic search using FAISS and Transformers
- Project-Based Organization: Create isolated vector search projects for different contexts
- Multi-Format Support: Process books in various formats (EPUB, PDF, MOBI, etc.)
- OCR Capabilities: Extract text from images and scanned PDFs using Tesseract
- Advanced Text Processing: Natural language processing for better content understanding
- Windows Compatible: Designed specifically for Windows environments
Technologies Used
- Vector Search: FAISS for efficient similarity search
- Embeddings: Xenova Transformers for local embedding generation
- OCR: Tesseract for optical character recognition
- PDF Processing: Multiple PDF parsing libraries (pdf-parse, pdf-poppler, pdf2pic)
- Image Processing: Sharp for image manipulation
- NLP: Natural language processing with multiple libraries
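To make the vector search idea concrete, here is a minimal sketch of similarity search written in plain JavaScript. It is not the server's implementation: the server relies on FAISS for indexing and Xenova Transformers for embeddings, while this sketch swaps in a brute-force cosine-similarity scan over tiny hand-written vectors. The names cosineSimilarity, topK, and the chunk ids are hypothetical.

```javascript
// Brute-force cosine-similarity search over in-memory vectors.
// Illustrative stand-in only; a FAISS index does this far more efficiently.

function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every entry against the query and keep the k best matches.
function topK(queryVector, entries, k) {
  return entries
    .map((entry) => ({
      id: entry.id,
      score: cosineSimilarity(queryVector, entry.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical 3-dimensional "embeddings"; real embeddings have hundreds of dimensions.
const chunks = [
  { id: 'book1-chunk0', vector: [1, 0, 0] },
  { id: 'book1-chunk1', vector: [0, 1, 0] },
  { id: 'book2-chunk0', vector: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((r) => r.id));
```

A per-project index works the same way in principle: each project keeps its own set of vectors, so a query only ever scores chunks from the books added to that project.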
Prerequisites
- Node.js >= 16.0.0
- Calibre installed on Windows
- ImageMagick (optional, for enhanced image processing)
- Tesseract OCR (optional, for text extraction from images)
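Before installing, the prerequisites above can be checked with a short script. This is a sketch under assumptions, not part of the repository: the helper names meetsMinimumVersion and isOnPath are hypothetical, and the command names tesseract and magick are the usual ones for Tesseract and ImageMagick 7 but may differ on your system.

```javascript
const { spawnSync } = require('child_process');

// Compare dotted version strings such as "v18.2.1" and "16.0.0".
function meetsMinimumVersion(actual, minimum) {
  const a = actual.replace(/^v/, '').split('.').map(Number);
  const m = minimum.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = m[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

// Try to run "<command> --version"; a spawn error (for example ENOENT)
// or a non-zero exit code is treated as "not available".
function isOnPath(command, args = ['--version']) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  return !result.error && result.status === 0;
}

console.log('Node.js >= 16.0.0:', meetsMinimumVersion(process.version, '16.0.0'));
console.log('Tesseract on PATH:', isOnPath('tesseract'));   // assumed command name
console.log('ImageMagick on PATH:', isOnPath('magick'));    // assumed command name (ImageMagick 7)
```

If either optional tool is reported as missing, the rest of the server may still work, but OCR or enhanced image processing would likely be unavailable.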
Installation
- Clone this repository:
git clone https://github.com/ispyridis/calibre-rag-mcp-nodejs.git
cd calibre-rag-mcp-nodejs
- Install dependencies:
npm install
- Run setup (Windows):
setup.bat
Configuration
The server automatically detects your Calibre library location. For custom configurations, modify the settings in server.js.
Usage
Starting the Server
npm start
Available Tools
- search: Semantic search across your ebook library
- fetch: Retrieve specific content from books
- list_projects: List all RAG projects
- create_project: Create a new RAG project
- add_books_to_project: Add books to a project for vectorization
- search_project_context: Search within specific projects
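MCP clients invoke these tools with JSON-RPC 2.0 requests using the tools/call method. The sketch below builds such a request for the search tool. The helper buildToolCall is hypothetical, and the argument name query is an assumption: the actual input schema of each tool is defined in server.js and may differ.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request as used by the Model Context Protocol.
let nextId = 1;

function buildToolCall(toolName, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: {
      name: toolName,   // one of the tool names listed above
      arguments: args,  // must match the tool's input schema
    },
  };
}

// "query" is an assumed parameter name, used here for illustration only.
const request = buildToolCall('search', { query: 'stoic philosophy' });

// Over the stdio transport, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice an MCP client (such as a desktop assistant) builds these messages for you; the sketch is mainly useful when debugging the server by hand.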
Example MCP Configuration
Add to your MCP client configuration:
{
  "mcpServers": {
    "calibre-rag": {
      "command": "node",
      "args": ["path/to/calibre-rag-mcp-nodejs/server.js"]
    }
  }
}
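The path/to/ placeholder in the configuration above should usually be an absolute path, and on Windows the backslashes in that path must be escaped inside JSON. One way to avoid mistakes is to generate the entry with a small script. This is only a sketch: buildMcpConfig is a hypothetical helper, and it assumes you run it from the directory where you cloned the repository.

```javascript
const path = require('path');

// Build the "mcpServers" entry with an absolute path to server.js.
function buildMcpConfig(repoDir) {
  const serverPath = path.resolve(repoDir, 'server.js');
  return {
    mcpServers: {
      'calibre-rag': {
        command: 'node',
        args: [serverPath],
      },
    },
  };
}

// JSON.stringify escapes Windows backslashes, so the output can be pasted as-is.
const config = buildMcpConfig('.');
console.log(JSON.stringify(config, null, 2));
```

Copy the printed object into your MCP client configuration, merging it with any existing mcpServers entries.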
Project Structure
calibre-rag-mcp-nodejs/
├── server.js # Main MCP server
├── package.json # Dependencies and scripts
├── setup.bat # Windows setup script
├── test-*.js # Various test files
├── projects/ # RAG projects storage
├── CONFIG.md # Configuration documentation
├── USAGE_EXAMPLES.md # Usage examples
└── QUICK_TEST.md # Quick testing guide
Testing
Run the test suite:
npm test
Individual test files:
- test-enhanced-server.js: Enhanced server functionality
- test-ocr-full.js: OCR capabilities
- test-pdf-approaches.js: PDF processing
- test-enhanced-auto.js: Automated testing
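Documentation
- CONFIG.md: Configuration documentation
- USAGE_EXAMPLES.md: Usage examples
- QUICK_TEST.md: Quick testing guide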
Requirements
System Requirements
- Windows 10/11
- Node.js 16+
- Calibre installed
- At least 4GB RAM (8GB+ recommended for large libraries)
Optional Dependencies
- ImageMagick (for enhanced image processing)
- Tesseract OCR (for text extraction from scanned documents)
Troubleshooting
Common Issues
- FAISS Installation: If FAISS fails to install, ensure you have proper build tools
- Tesseract Not Found: Install Tesseract and add its installation directory to PATH
- Memory Issues: Reduce batch sizes for large document processing
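The advice on batch sizes can be illustrated with a generic batching helper. This is a sketch of the technique, not the server's code: inBatches, processInBatches, and the placeholder handler are hypothetical, and where the batch size is actually configured in server.js is not covered here.

```javascript
// Yield consecutive slices of at most batchSize items.
function* inBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  for (let i = 0; i < items.length; i += batchSize) {
    yield items.slice(i, i + batchSize);
  }
}

// Process one batch at a time so only batchSize items are in flight at once.
async function processInBatches(items, batchSize, handler) {
  const results = [];
  for (const batch of inBatches(items, batchSize)) {
    results.push(...(await handler(batch)));
  }
  return results;
}

// Placeholder handler standing in for an expensive step such as embedding text chunks.
async function measureChunks(batch) {
  return batch.map((text) => text.length);
}

processInBatches(['alpha', 'beta', 'gamma', 'delta', 'epsilon'], 2, measureChunks)
  .then((lengths) => console.log(lengths));
```

A smaller batch size lowers peak memory use at the cost of more round trips through the handler, which is the trade-off to tune on machines near the 4GB minimum.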
Debug Mode
Enable verbose logging by setting the DEBUG environment variable before starting the server:
set DEBUG=calibre-rag:*
npm start
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
License
Licensed under the Apache License 2.0. See the LICENSE file for details.
Support
For issues and questions, please open an issue on GitHub.
Changelog
v1.0.0
- Initial release with RAG capabilities
- Project-based vector search
- Multi-format document support
- OCR integration
- Windows optimization