# Calibre RAG MCP Server
Enhanced Calibre MCP server with RAG (Retrieval-Augmented Generation) capabilities for project-based vector search and contextual conversations.
## Features
- **RAG-Enhanced Search**: Vector-based semantic search using FAISS and Transformers
- **Project-Based Organization**: Create isolated vector search projects for different contexts
- **Multi-Format Support**: Process books in various formats (EPUB, PDF, MOBI, etc.)
- **OCR Capabilities**: Extract text from images and scanned PDFs using Tesseract
- **Advanced Text Processing**: Natural language processing for better content understanding
- **Windows Compatible**: Designed specifically for Windows environments
## Technologies Used
- **Vector Search**: FAISS for efficient similarity search
- **Embeddings**: Transformers.js (`@xenova/transformers`) for local embedding generation
- **OCR**: Tesseract for optical character recognition
- **PDF Processing**: Multiple PDF parsing libraries (pdf-parse, pdf-poppler, pdf2pic)
- **Image Processing**: Sharp for image manipulation
- **NLP**: Natural language processing with multiple libraries
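To make the vector-search idea above concrete, here is a minimal sketch of similarity search over text chunks. It is not this server's code and does not use the FAISS or Transformers.js APIs: `toyEmbed` is a hypothetical word-hashing stand-in for a real embedding model, and the search is a plain exhaustive scan. FAISS answers the same kind of nearest-neighbour query, only far more efficiently on large collections.

```javascript
// Toy stand-in for an embedding model (an assumption for illustration):
// hashes each word into one slot of a small fixed-size vector.
function toyEmbed(text, dims = 16) {
  const vec = new Array(dims).fill(0);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let hash = 0;
    for (const ch of word) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
    vec[hash % dims] += 1;
  }
  return vec;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Exhaustive nearest-neighbour search: score every chunk, keep the top k.
function searchChunks(query, chunks, k = 3) {
  const queryVec = toyEmbed(query);
  return chunks
    .map((text) => ({ text, score: cosineSimilarity(queryVec, toyEmbed(text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = searchChunks('calibre library', [
  'how to manage a calibre library',
  'recipes for sourdough bread',
], 1);
console.log(hits);
```

With real embeddings the ranking reflects meaning rather than shared words, but the flow is the same: embed the query, compare it against stored vectors, and return the closest chunks.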
## Prerequisites
- Node.js >= 16.0.0
- Calibre installed on Windows
- ImageMagick (optional, for enhanced image processing)
- Tesseract OCR (optional, for text extraction from images and scanned documents)
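A quick way to check the prerequisites above before installing is a small script like the following. This is a sketch, not part of the repository: `isToolAvailable` and `hasMinimumNode` are hypothetical helpers, and it assumes the tools are on your `PATH` under the names `tesseract`, `calibredb`, and `magick` (the command used by ImageMagick 7).

```javascript
const { spawnSync } = require('node:child_process');

// Returns true when `command` can be launched and exits with status 0.
function isToolAvailable(command, args = ['--version']) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  return !result.error && result.status === 0;
}

// Returns true when the running Node.js major version is at least `minMajor`.
function hasMinimumNode(minMajor = 16) {
  const major = Number(process.versions.node.split('.')[0]);
  return major >= minMajor;
}

console.log('Node.js >= 16:     ', hasMinimumNode(16));
console.log('calibredb on PATH: ', isToolAvailable('calibredb'));
console.log('tesseract on PATH: ', isToolAvailable('tesseract'));
// ImageMagick prints its version with a single-dash flag.
console.log('magick on PATH:    ', isToolAvailable('magick', ['-version']));
```

A `false` result only means the command could not be started from the current shell; the tool may still be installed but missing from `PATH`.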
## Installation
1. Clone this repository:
```bash
git clone https://github.com/yourusername/calibre-rag-mcp-nodejs.git
cd calibre-rag-mcp-nodejs
```
2. Install dependencies:
```bash
npm install
```
3. Run setup (Windows):
```bat
setup.bat
```
## Configuration
The server automatically detects your Calibre library location. To use a different location or change other settings, edit the settings in `server.js`; see [CONFIG.md](CONFIG.md) for details.
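How the detection works is not described here, so the following is only a sketch of one plausible approach. It relies on the fact that a Calibre library folder contains a `metadata.db` file. The candidate folders and the `CALIBRE_LIBRARY_PATH` override variable are assumptions made for illustration, not settings this server is known to read.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// A folder is treated as a Calibre library when it contains metadata.db.
function isCalibreLibrary(dir, exists = fs.existsSync) {
  return exists(path.join(dir, 'metadata.db'));
}

// Returns the first candidate that looks like a library, or null if none do.
function findCalibreLibrary(candidates, exists = fs.existsSync) {
  return candidates.find((dir) => isCalibreLibrary(dir, exists)) ?? null;
}

// Candidate locations (assumed): an explicit override first, then common
// places under the user's home directory.
const candidates = [
  process.env.CALIBRE_LIBRARY_PATH, // hypothetical override variable
  path.join(os.homedir(), 'Calibre Library'),
  path.join(os.homedir(), 'Documents', 'Calibre Library'),
].filter(Boolean);

console.log(findCalibreLibrary(candidates) ?? 'No Calibre library found');
```

Checking for `metadata.db` rather than just the folder name avoids mistaking an empty or unrelated directory for a library.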
## Usage
### Starting the Server
```bash
npm start
```
### Available Tools
- `search`: Semantic search across your ebook library
- `fetch`: Retrieve specific content from books
- `list_projects`: List all RAG projects
- `create_project`: Create a new RAG project
- `add_books_to_project`: Add books to a project for vectorization
- `search_project_context`: Search within specific projects
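MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for the `search` tool. The request envelope follows the MCP specification, but the argument names `query` and `limit` are assumptions; check the tool's published input schema (returned by `tools/list`) for the real ones.

```javascript
let nextId = 1;

// Builds a JSON-RPC 2.0 request for MCP's `tools/call` method.
function buildToolCall(name, args = {}) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// `query` and `limit` are assumed argument names, not confirmed by this README.
const request = buildToolCall('search', { query: 'stoic philosophy', limit: 5 });
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds and sends this message for you; the sketch only shows what travels over the wire.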
### Example MCP Configuration
Add to your MCP client configuration:
```json
{
  "mcpServers": {
    "calibre-rag": {
      "command": "node",
      "args": ["path/to/calibre-rag-mcp-nodejs/server.js"]
    }
  }
}
```
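A relative path in `args` is resolved against whatever working directory the MCP client starts the server in, so an absolute path is usually the safer choice. The sketch below prints the same entry with an absolute path to `server.js`. `buildMcpConfig` is a hypothetical helper, and the script assumes it is run from the cloned repository folder.

```javascript
const path = require('node:path');

// Builds the MCP client entry shown above, with an absolute path to server.js.
function buildMcpConfig(repoDir) {
  return {
    mcpServers: {
      'calibre-rag': {
        command: 'node',
        args: [path.resolve(repoDir, 'server.js')],
      },
    },
  };
}

// JSON.stringify escapes backslashes, so Windows paths come out as valid JSON.
console.log(JSON.stringify(buildMcpConfig(process.cwd()), null, 2));
```

Copy the printed block into your MCP client configuration.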
## Project Structure
```
calibre-rag-mcp-nodejs/
├── server.js            # Main MCP server
├── package.json         # Dependencies and scripts
├── setup.bat            # Windows setup script
├── test-*.js            # Various test files
├── projects/            # RAG projects storage
├── CONFIG.md            # Configuration documentation
├── USAGE_EXAMPLES.md    # Usage examples
└── QUICK_TEST.md        # Quick testing guide
```
## Testing
Run the test suite:
```bash
npm test
```
Individual test files:
- `test-enhanced-server.js` - Enhanced server functionality
- `test-ocr-full.js` - OCR capabilities
- `test-pdf-approaches.js` - PDF processing
- `test-enhanced-auto.js` - Automated testing
## Documentation
- [Configuration Guide](CONFIG.md)
- [Usage Examples](USAGE_EXAMPLES.md)
- [Quick Test Guide](QUICK_TEST.md)
## Requirements
### System Requirements
- Windows 10/11
- Node.js 16+
- Calibre installed
- At least 4GB RAM (8GB+ recommended for large libraries)
### Optional Dependencies
- ImageMagick (for enhanced image processing)
- Tesseract OCR (for text extraction from scanned documents)
## Troubleshooting
### Common Issues
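1. **FAISS Installation**: If FAISS fails to install, make sure the build tools that native Node.js modules need are present (on Windows, typically the Visual Studio Build Tools with the C++ workload, plus Python), then run `npm install` again
2. **Tesseract Not Found**: Install Tesseract and add its installation directory to your `PATH`
3. **Memory Issues**: Reduce batch sizes when processing large documents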
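The batching advice in the last item can be sketched as follows. This is an illustration of the general technique, not the server's implementation: `toBatches` and `processInBatches` are hypothetical helpers, and `embedBatch` stands for whatever async function embeds a group of text chunks.

```javascript
// Splits `items` into consecutive batches of at most `batchSize` elements.
function* toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  for (let i = 0; i < items.length; i += batchSize) {
    yield items.slice(i, i + batchSize);
  }
}

// Processes batches one at a time, so only one batch is in flight at once.
// `embedBatch` is a hypothetical async function supplied by the caller.
async function processInBatches(chunks, batchSize, embedBatch) {
  let processed = 0;
  for (const batch of toBatches(chunks, batchSize)) {
    await embedBatch(batch);
    processed += batch.length;
  }
  return processed;
}

// Example: 5 chunks in batches of 2 gives batches of sizes 2, 2, and 1.
processInBatches(['a', 'b', 'c', 'd', 'e'], 2, async (batch) => {
  console.log('embedding batch of', batch.length);
}).then((count) => console.log('processed', count, 'chunks'));
```

A smaller batch size lowers peak memory use at the cost of more round trips through the embedding step.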
### Debug Mode
Enable verbose logging by setting the `DEBUG` environment variable in the Windows Command Prompt before starting the server:
```bat
set DEBUG=calibre-rag:*
npm start
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Submit a pull request
## License
Licensed under the Apache License 2.0. See the LICENSE file for details.
## Support
For issues and questions, please open an issue on GitHub.
## Changelog
### v1.0.0
- Initial release with RAG capabilities
- Project-based vector search
- Multi-format document support
- OCR integration
- Windows optimization