
# Enhanced Calibre RAG MCP Server - Configuration Guide

## Directory Structure

```
calibre-rag-mcp-nodejs/
├── package.json              # Dependencies and project info
├── server.js                 # Main server implementation
├── test.js                   # Test suite
├── setup.bat                 # Windows setup script
├── README.md                 # Main documentation
├── USAGE_EXAMPLES.md         # Usage examples
├── CONFIG.md                 # This configuration guide
├── node_modules/             # Installed dependencies (after npm install)
└── projects/                 # RAG projects directory (created automatically)
    ├── project_name_1/
    │   ├── project.json      # Project configuration
    │   ├── vectors.bin       # Binary vector storage
    │   ├── metadata.json     # Chunk metadata
    │   └── chunks/           # Individual text chunks
    │       ├── chunk_0.json
    │       ├── chunk_1.json
    │       └── ...
    └── project_name_2/
        └── ...
```

## Server Configuration

### Primary Settings (edit in server.js)

```javascript
const CONFIG = {
  // REQUIRED: Update this to your Calibre library path
  CALIBRE_LIBRARY: 'D:\\e-library', // ← CHANGE THIS

  // Calibre executable paths (automatic detection)
  CALIBRE_PATHS: [
    'calibredb', // If in system PATH
    'C:\\Program Files\\Calibre2\\calibredb.exe',
    'C:\\Program Files (x86)\\Calibre2\\calibredb.exe',
    path.join(os.homedir(), 'AppData', 'Local', 'calibre-ebook', 'calibredb.exe')
  ],

  // RAG Configuration
  RAG: {
    PROJECTS_DIR: path.join(__dirname, 'projects'),
    EMBEDDINGS_MODEL: 'Xenova/all-MiniLM-L6-v2', // Lightweight, good quality
    CHUNK_SIZE: 1000,       // Characters per chunk
    CHUNK_OVERLAP: 200,     // Overlap between chunks
    VECTOR_DIMENSION: 384,  // Embedding dimension
    MAX_CONTEXT_CHUNKS: 5   // Max chunks returned per query
  }
};
```

### Finding Your Calibre Library Path

1. **Open Calibre**
2. **Go to:** Preferences → Interface → Set Library Location
3. **Copy the path** and update `CALIBRE_LIBRARY` in server.js

Common paths:

- `C:\\Users\\YourName\\Documents\\Calibre Library`
- `D:\\Books\\Calibre Library`
- `E:\\Library\\Calibre`

### Alternative Embedding Models

For different use cases, you can change the `EMBEDDINGS_MODEL`. Use only one `EMBEDDINGS_MODEL` entry at a time, and set `VECTOR_DIMENSION` to match the model's dimension (384 or 768 for the models below):

```javascript
// Lightweight (384 dim) - Good for general use
EMBEDDINGS_MODEL: 'Xenova/all-MiniLM-L6-v2',

// Better quality (768 dim) - Requires more memory
EMBEDDINGS_MODEL: 'Xenova/all-mpnet-base-v2',

// Multilingual (384 dim) - For non-English content
EMBEDDINGS_MODEL: 'Xenova/multilingual-e5-small',

// Technical/Scientific (768 dim) - Better for technical content
EMBEDDINGS_MODEL: 'Xenova/scibert_scivocab_uncased',
```

### Performance Tuning

#### For Large Libraries (10,000+ books):

```javascript
RAG: {
  CHUNK_SIZE: 800,        // Smaller chunks
  CHUNK_OVERLAP: 150,     // Less overlap
  MAX_CONTEXT_CHUNKS: 3   // Fewer chunks per query
}
```

#### For Technical/Engineering Content:

```javascript
RAG: {
  CHUNK_SIZE: 1200,       // Larger chunks for complete formulas
  CHUNK_OVERLAP: 250,     // More overlap for context
  MAX_CONTEXT_CHUNKS: 5   // More chunks for complex topics
}
```

#### For Low-Memory Systems:

```javascript
RAG: {
  EMBEDDINGS_MODEL: 'Xenova/all-MiniLM-L6-v2', // Lightest model
  CHUNK_SIZE: 600,        // Smaller chunks
  MAX_CONTEXT_CHUNKS: 3   // Fewer chunks
}
```

## Project Configuration Example

Each project has a `project.json` file:

```json
{
  "name": "steel_building_design",
  "description": "Steel building design references including AISC standards",
  "created_at": "2024-09-08T10:30:00.000Z",
  "last_updated": "2024-09-08T11:45:00.000Z",
  "books": [123, 456, 789],
  "chunk_count": 1547,
  "vector_dimension": 384,
  "settings": {
    "chunk_size": 1000,
    "chunk_overlap": 200,
    "embedding_model": "Xenova/all-MiniLM-L6-v2"
  }
}
```

## Environment Variables (Optional)

Create a `.env` file for sensitive settings:

```bash
# .env file (optional)
CALIBRE_LIBRARY_PATH=D:\e-library
EMBEDDINGS_CACHE_DIR=C:\Users\YourName\AppData\Local\calibre-rag-cache
LOG_LEVEL=info
```

Then load it in server.js:

```javascript
require('dotenv').config();

const CALIBRE_LIBRARY = process.env.CALIBRE_LIBRARY_PATH || 'D:\\e-library';
```

## Claude Integration Configuration

### MCP Client Setup

Add to your Claude configuration:

```json
{
  "mcpServers": {
    "calibre-rag": {
      "command": "node",
      "args": ["F:\\MCP servers\\calibre-rag-mcp-nodejs\\server.js"],
      "cwd": "F:\\MCP servers\\calibre-rag-mcp-nodejs"
    }
  }
}
```

### Usage Patterns

1. **Project-First Approach:**

   ```
   Human: Create a RAG project for structural steel design
   Claude: [Creates project, searches books, adds to project]
   Human: What are the AISC requirements for beam design?
   Claude: [Uses project context to provide detailed answer]
   ```

2. **Search-First Approach:**

   ```
   Human: Search for books about concrete bridge design
   Claude: [Searches library, shows results]
   Human: Add books 123, 456 to a new bridge design project
   Claude: [Creates project and processes books]
   ```

## Monitoring and Maintenance

### Log Files

- Location: `%TEMP%\calibre-rag-mcp-requests.log`
- Rotation: Manual (delete when too large)
- Level: All requests and errors

### Storage Management

```bash
# Check project sizes
dir projects /s

# Clean up old projects (manual)
rmdir /s projects\old_project_name
```

### Performance Monitoring

- Vector file sizes (projects/*/vectors.bin)
- Chunk counts per project
- Response times for context queries

## Troubleshooting Configuration

### Common Issues

1. **"Calibre not found"**
   - Check CALIBRE_LIBRARY path
   - Verify Calibre installation
   - Add calibredb.exe to system PATH

2. **"Module not found" errors**
   - Run: `npm install`
   - Check Node.js version (16+)
   - Clear node_modules and reinstall

3. **Embedding download fails**
   - Check internet connection
   - Clear cache: Delete `~/.cache/huggingface`
   - Try a different embedding model

4. **Out of memory errors**
   - Reduce CHUNK_SIZE
   - Reduce MAX_CONTEXT_CHUNKS
   - Use a lighter embedding model

5. **Slow performance**
   - Check vector file sizes
   - Reduce chunk overlap
   - Limit books per project

### Debug Mode

Enable detailed logging:

```javascript
const DEBUG = true; // Add to CONFIG object

// In log method:
if (DEBUG) {
  // Write to stderr: with a stdio-based MCP client, stdout carries protocol messages
  console.error(logMessage);
}
```

### Validation Commands

```bash
# Test Calibre connection
calibredb list --limit 5

# Test Node.js and dependencies
node -e "console.log(process.version)"
npm list

# Test MCP server
node test.js
```

## Advanced Configuration

### Custom Chunking Strategy

Modify the `intelligentChunk` method for domain-specific content:

```javascript
// For legal documents
const legalChunk = (content) => {
  // Split by sections, subsections, paragraphs
  // Preserve legal citations
  // Maintain clause structure
};

// For programming books
const codeChunk = (content) => {
  // Keep code blocks intact
  // Preserve function definitions
  // Maintain import statements
};
```

### Custom Embedding Pipeline

```javascript
async initializeCustomEmbedder() {
  // Use domain-specific models
  // Add preprocessing steps
  // Implement custom tokenization
}
```

### Integration with External Vector Databases

```javascript
// FAISS integration
const faiss = require('faiss-node');

// Pinecone integration
const pinecone = require('@pinecone-database/pinecone');

// Weaviate integration
const weaviate = require('weaviate-ts-client');
```
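
### Example: How CHUNK_SIZE and CHUNK_OVERLAP Interact

To make the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings from the Server Configuration section more concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is not the server's `intelligentChunk` method, whose implementation is not shown in this guide; the function name `chunkText` and its parameters are hypothetical and only illustrate how the two numbers relate.

```javascript
// Hypothetical helper: NOT the server's intelligentChunk implementation.
// Splits text into windows of `chunkSize` characters; each window starts
// `chunkSize - chunkOverlap` characters after the previous one.
function chunkText(text, chunkSize = 1000, chunkOverlap = 200) {
  if (chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be smaller than chunkSize');
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Usage: a 2500-character text with the default 1000/200 settings
const sample = 'x'.repeat(2500);
const chunks = chunkText(sample, 1000, 200);
console.log(chunks.length, chunks.map((c) => c.length)); // 3 [ 1000, 1000, 900 ]
```

Under this scheme, a larger overlap means a smaller step between windows, so the same book produces more chunks and therefore more vectors to store and search. That is one way to read the tuning advice above: reducing `CHUNK_OVERLAP` trades some cross-chunk context for a smaller index.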
