Calibre RAG MCP Server

by ispyridis
# Calibre RAG MCP Server

Enhanced Calibre MCP server with RAG (Retrieval-Augmented Generation) capabilities for project-based vector search and contextual conversations.

## Features

- **RAG-Enhanced Search**: Vector-based semantic search using FAISS and Transformers
- **Project-Based Organization**: Create isolated vector search projects for different contexts
- **Multi-Format Support**: Process books in various formats (EPUB, PDF, MOBI, etc.)
- **OCR Capabilities**: Extract text from images and scanned PDFs using Tesseract
- **Advanced Text Processing**: Natural language processing for better content understanding
- **Windows Compatible**: Designed specifically for Windows environments

## Technologies Used

- **Vector Search**: FAISS for efficient similarity search
- **Embeddings**: Xenova Transformers for local embedding generation
- **OCR**: Tesseract for optical character recognition
- **PDF Processing**: Multiple PDF parsing libraries (pdf-parse, pdf-poppler, pdf2pic)
- **Image Processing**: Sharp for image manipulation
- **NLP**: Natural language processing with multiple libraries

## Prerequisites

- Node.js >= 16.0.0
- Calibre installed on Windows
- ImageMagick (for enhanced image processing)
- Tesseract OCR (for text extraction from images)

## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/yourusername/calibre-rag-mcp-nodejs.git
   cd calibre-rag-mcp-nodejs
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Run setup (Windows):

   ```bash
   setup.bat
   ```

## Configuration

The server automatically detects your Calibre library location. For custom configurations, modify the settings in `server.js`.

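The README does not describe how the automatic detection works. As a rough sketch of one way such a lookup could be done (this is not the logic in `server.js`), the snippet below checks an override variable, then Calibre's own preferences file, then the default folder name. The function name `detectCalibreLibrary`, the `CALIBRE_LIBRARY_PATH` variable, and the injectable `options` argument are all hypothetical. The `global.py.json` location reflects where Calibre typically keeps its preferences on Windows and may differ on your installation.

```javascript
// Hypothetical sketch: illustrates a possible lookup order, not the code in server.js.
const fs = require("fs");
const os = require("os");
const path = require("path");

/**
 * Guess the Calibre library folder.
 * Lookup order: explicit override, Calibre's preferences file, default folder name.
 * `options` lets callers inject env, home, and readFile (useful for testing).
 */
function detectCalibreLibrary(options = {}) {
  const env = options.env || process.env;
  const home = options.home || os.homedir();
  const readFile = options.readFile || ((p) => fs.readFileSync(p, "utf8"));

  // 1. Explicit override. CALIBRE_LIBRARY_PATH is an assumed name for this sketch.
  if (env.CALIBRE_LIBRARY_PATH) {
    return env.CALIBRE_LIBRARY_PATH;
  }

  // 2. Calibre's preferences file, usually found under %APPDATA%\calibre on Windows.
  if (env.APPDATA) {
    const prefsPath = path.join(env.APPDATA, "calibre", "global.py.json");
    try {
      const prefs = JSON.parse(readFile(prefsPath));
      if (typeof prefs.library_path === "string" && prefs.library_path) {
        return prefs.library_path;
      }
    } catch (err) {
      // Missing or unreadable preferences: fall through to the default.
    }
  }

  // 3. Calibre's default library folder name inside the user's home directory.
  return path.join(home, "Calibre Library");
}

console.log(detectCalibreLibrary());
```

Passing `env`, `home`, and `readFile` explicitly keeps the function easy to exercise without touching the real file system.
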
## Usage

### Starting the Server

```bash
npm start
```

### Available Tools

- `search`: Semantic search across your ebook library
- `fetch`: Retrieve specific content from books
- `list_projects`: List all RAG projects
- `create_project`: Create a new RAG project
- `add_books_to_project`: Add books to a project for vectorization
- `search_project_context`: Search within specific projects

### Example MCP Configuration

Add to your MCP client configuration:

```json
{
  "mcpServers": {
    "calibre-rag": {
      "command": "node",
      "args": ["path/to/calibre-rag-mcp-nodejs/server.js"]
    }
  }
}
```

## Project Structure

```
calibre-rag-mcp-nodejs/
├── server.js             # Main MCP server
├── package.json          # Dependencies and scripts
├── setup.bat             # Windows setup script
├── test-*.js             # Various test files
├── projects/             # RAG projects storage
├── CONFIG.md             # Configuration documentation
├── USAGE_EXAMPLES.md     # Usage examples
└── QUICK_TEST.md         # Quick testing guide
```

## Testing

Run the test suite:

```bash
npm test
```

Individual test files:

- `test-enhanced-server.js` - Enhanced server functionality
- `test-ocr-full.js` - OCR capabilities
- `test-pdf-approaches.js` - PDF processing
- `test-enhanced-auto.js` - Automated testing

## Documentation

- [Configuration Guide](CONFIG.md)
- [Usage Examples](USAGE_EXAMPLES.md)
- [Quick Test Guide](QUICK_TEST.md)

## Requirements

### System Requirements

- Windows 10/11
- Node.js 16+
- Calibre installed
- At least 4GB RAM (8GB+ recommended for large libraries)

### Optional Dependencies

- ImageMagick (for enhanced image processing)
- Tesseract OCR (for text extraction from scanned documents)

## Troubleshooting

### Common Issues

1. **FAISS Installation**: If FAISS fails to install, make sure native build tools are installed
2. **Tesseract Not Found**: Install Tesseract and add it to your PATH
3. **Memory Issues**: Reduce batch sizes when processing large documents

### Debug Mode

Enable verbose logging by setting the `DEBUG` environment variable before starting the server:

```bash
set DEBUG=calibre-rag:*
npm start
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Submit a pull request

## License

Licensed under the Apache License 2.0. See LICENSE file for details.

## Support

For issues and questions, please open an issue on GitHub.

## Changelog

### v1.0.0

- Initial release with RAG capabilities
- Project-based vector search
- Multi-format document support
- OCR integration
- Windows optimization
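
To make the "project-based vector search" entry above more concrete, here is a minimal sketch of the idea: each project keeps its own set of embedded text chunks, and a query is ranked only against the chunks of one project. This is an illustration under stated assumptions, not the server's implementation. Brute-force cosine similarity stands in for a FAISS index, the three-dimensional vectors are hand-written stand-ins for model embeddings, and `ProjectStore` with its methods is a hypothetical name.

```javascript
// Illustrative only: brute-force cosine similarity stands in for a FAISS index,
// and the hand-written vectors stand in for real model embeddings.
class ProjectStore {
  constructor() {
    this.projects = new Map(); // project name -> array of { text, vector }
  }

  createProject(name) {
    if (!this.projects.has(name)) this.projects.set(name, []);
  }

  addChunk(name, text, vector) {
    const chunks = this.projects.get(name);
    if (!chunks) throw new Error(`Unknown project: ${name}`);
    chunks.push({ text, vector });
  }

  // Return the k chunks of ONE project that are most similar to the query vector.
  search(name, queryVector, k = 3) {
    const chunks = this.projects.get(name);
    if (!chunks) throw new Error(`Unknown project: ${name}`);
    return chunks
      .map((c) => ({ text: c.text, score: cosine(c.vector, queryVector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}

// Cosine similarity of two equal-length numeric arrays (0 if either is all zeros).
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

const store = new ProjectStore();
store.createProject("cooking");
store.createProject("astronomy");
store.addChunk("cooking", "Simmer the sauce gently.", [1, 0, 0]);
store.addChunk("cooking", "Knead the dough.", [0.8, 0.6, 0]);
store.addChunk("astronomy", "Jupiter has many moons.", [0, 0, 1]);

const hits = store.search("cooking", [1, 0.1, 0], 1);
console.log(hits[0].text);
```

Because each search touches only one project's chunks, a query against `cooking` can never return text that was added to `astronomy`, which is the isolation the Features list describes.
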
