Calibre RAG MCP Server

by ispyridis

add_books_to_project

Add books to a RAG project for vectorization and context search, enabling semantic search and retrieval from your Calibre ebook library.

Instructions

Add books to a RAG project for vectorization and context search

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| project_name | Yes | Name of the project | |
| book_ids | Yes | Array of book IDs to add to the project | |
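
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request carrying the two required arguments. The helper name `buildAddBooksRequest`, the project name, and the book IDs are placeholders invented for this example and are not part of the server.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 payload an MCP client would send.
function buildAddBooksRequest(requestId, projectName, bookIds) {
    return {
        jsonrpc: '2.0',
        id: requestId,
        method: 'tools/call',
        params: {
            name: 'add_books_to_project',
            arguments: {
                project_name: projectName,
                book_ids: bookIds
            }
        }
    };
}

// Example values are made up for illustration.
const request = buildAddBooksRequest(1, 'my-research', [12, 47]);
console.log(JSON.stringify(request, null, 2));
```

Both arguments are listed as required in the schema, so a request that omits either one should be expected to fail.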

Implementation Reference

  • The core handler function that processes books for a RAG project: fetches metadata, extracts content (txt or OCR), intelligently chunks it, generates embeddings, saves vectors and metadata, and updates project config.
    async addBooksToProject(projectName, bookIds) {
        const project = this.projects.get(projectName);
        if (!project) {
            throw new Error(`Project '${projectName}' not found`);
        }
        
        const projectPath = path.join(CONFIG.RAG.PROJECTS_DIR, projectName);
        const chunksPath = path.join(projectPath, 'chunks');
        const vectorsPath = path.join(projectPath, 'vectors.bin');
        const metadataPath = path.join(projectPath, 'metadata.json');
        
        // Get book metadata
        const idQuery = `id:${bookIds.join(' OR id:')}`;
        const listResult = await this.runCalibreCommand([
            'list',
            '--fields', 'id,title,authors,formats',
            '--for-machine',
            '--search', idQuery
        ]);
        
        const books = JSON.parse(listResult || '[]');
        const allChunks = [];
        const allVectors = [];
        const allMetadata = [];
        
        for (const book of books) {
            this.log(`Processing book: ${book.title}`);
            
            let content = '';
            let contentSource = 'unknown';
            
            // Try text format first
            const txtPath = book.formats?.find(f => f.endsWith('.txt'));
            if (txtPath && fs.existsSync(txtPath)) {
                this.log(`Using text format: ${txtPath}`);
                content = fs.readFileSync(txtPath, 'utf8');
                contentSource = 'text';
            } else {
                // Fallback to OCR
                this.log(`No text format available for: ${book.title}, trying OCR...`);
                content = await this.processBookWithOCR(book);
                contentSource = 'ocr';
                
                if (!content || content.trim().length === 0) {
                    this.log(`OCR failed or no content extracted for: ${book.title}`);
                    continue;
                }
            }
            
            // Build per-book metadata, then chunk the extracted content
            const bookMetadata = {
                book_id: book.id,
                title: book.title,
                authors: book.authors,
                project: projectName,
                content_source: contentSource,
                content_length: content.length
            };
            
            const chunks = this.intelligentChunk(content, bookMetadata);
            
            // Generate embeddings for chunks
            for (const chunk of chunks) {
                try {
                    const embedding = await this.generateEmbedding(chunk.text);
                    
                    allChunks.push(chunk);
                    allVectors.push(embedding);
                    allMetadata.push(chunk.metadata);
                    
                    // Save individual chunk
                    fs.writeFileSync(
                        path.join(chunksPath, `${chunk.id}.json`),
                        JSON.stringify(chunk, null, 2)
                    );
                    
                    this.log(`Processed chunk ${chunk.id} from ${book.title}`);
                } catch (error) {
                    this.log(`Failed to process chunk ${chunk.id}: ${error.message}`);
                }
            }
        }
        
        // Save vectors and metadata
        if (allVectors.length > 0) {
            this.saveVectors(vectorsPath, allVectors);
            fs.writeFileSync(metadataPath, JSON.stringify(allMetadata, null, 2));
            
            // Update project config: merge book IDs without duplicates
            project.books = [...new Set([...project.books, ...bookIds])];
            // chunk_count reflects only the chunks produced by this call
            project.chunk_count = allChunks.length;
            project.last_updated = new Date().toISOString();
            
            fs.writeFileSync(
                path.join(projectPath, 'project.json'),
                JSON.stringify(project, null, 2)
            );
            
            this.projects.set(projectName, project);
        }
        
        return {
            // Counts every book returned by the Calibre query, including any skipped after a failed OCR pass
            processed_books: books.length,
            total_chunks: allChunks.length,
            project: project
        };
    }
  • JSON schema defining the input parameters for the 'add_books_to_project' tool: project_name (string) and book_ids (array of integers).
    inputSchema: {
        type: 'object',
        properties: {
            project_name: {
                type: 'string',
                description: 'Name of the project'
            },
            book_ids: {
                type: 'array',
                items: { type: 'integer' },
                description: 'Array of book IDs to add to the project'
            }
        },
        required: ['project_name', 'book_ids']
    }
  • server.js:1165-1186 (registration)
    Registration and dispatch logic in the tools/call handler: validates arguments, calls the addBooksToProject method, and sends success/error responses.
    case 'add_books_to_project': {
        // Block scope keeps these const declarations local to this case
        const projName = args.project_name;
        const bookIds = args.book_ids;
        
        // Reject a missing name, a non-array value, or an empty list of IDs
        if (!projName || !Array.isArray(bookIds) || bookIds.length === 0) {
            this.sendError(id, -32602, 'Missing or invalid required parameters: project_name, book_ids');
            return;
        }
        
        try {
            const result = await this.addBooksToProject(projName, bookIds);
            this.sendSuccess(id, {
                content: [{
                    type: 'text',
                    text: `Successfully processed ${result.processed_books} books and created ${result.total_chunks} chunks for project '${projName}'.\n\nProject is now ready for context search!`
                }],
                result: result
            });
        } catch (error) {
            this.sendError(id, -32603, error.message);
        }
        break;
    }
  • server.js:1030-1048 (registration)
    Tool registration in the tools/list response, including name, description, and input schema.
    {
        name: 'add_books_to_project',
        description: 'Add books to a RAG project for vectorization and context search',
        inputSchema: {
            type: 'object',
            properties: {
                project_name: {
                    type: 'string',
                    description: 'Name of the project'
                },
                book_ids: {
                    type: 'array',
                    items: { type: 'integer' },
                    description: 'Array of book IDs to add to the project'
                }
            },
            required: ['project_name', 'book_ids']
        }
    },
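
The input schema declares `project_name` as a string and `book_ids` as an array of integers. One way to enforce that before calling the handler is sketched below. `validateAddBooksArgs` is a hypothetical helper, not a function from server.js, the error strings are illustrative, and treating an empty array as invalid is an assumption that the schema itself does not state.

```javascript
// Hypothetical helper (not in server.js): returns an error string, or null if valid.
function validateAddBooksArgs(args) {
    if (!args || typeof args.project_name !== 'string' || args.project_name.trim() === '') {
        return 'project_name must be a non-empty string';
    }
    // Assumption: an empty list is rejected, since there would be nothing to process.
    if (!Array.isArray(args.book_ids) || args.book_ids.length === 0) {
        return 'book_ids must be a non-empty array';
    }
    if (!args.book_ids.every((value) => Number.isInteger(value))) {
        return 'book_ids must contain only integers';
    }
    return null;
}

console.log(validateAddBooksArgs({ project_name: 'demo', book_ids: [1, 2] }));
console.log(validateAddBooksArgs({ project_name: 'demo', book_ids: ['1'] }));
```

Returning a message rather than throwing would let a dispatcher pass the text straight to an error response with code -32602.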
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool adds books for vectorization and context search, implying a write operation, but doesn't cover critical aspects like permissions required, whether this is idempotent, rate limits, error handling, or what happens if books are already in the project. The description is minimal and misses key behavioral traits for a mutation tool.
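
Judging from the handler in the Implementation Reference, the project's book list is merged through a `Set`, so re-adding an ID does not duplicate it there, while `chunk_count` is simply assigned from the current call. The sketch below reproduces only that bookkeeping on a plain object. It is not the server's code, and whether stored vectors are replaced or appended depends on `saveVectors`, which is not shown.

```javascript
// Simplified stand-in for the project-config update performed by the handler.
function updateProjectConfig(project, bookIds, newChunkCount) {
    return {
        ...project,
        // Set-based merge: an ID that is already present is kept once
        books: [...new Set([...project.books, ...bookIds])],
        // Assigned, not accumulated: the previous count is discarded
        chunk_count: newChunkCount
    };
}

const before = { name: 'demo', books: [1, 2], chunk_count: 40 };
const after = updateProjectConfig(before, [2, 3], 25);
console.log(after.books);       // [ 1, 2, 3 ]
console.log(after.chunk_count); // 25
```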

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core action and purpose without wasted words. It's appropriately sized for the tool's complexity, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has no annotations, no output schema, and involves a mutation (adding books), the description is incomplete. It lacks details on behavioral traits, return values, error conditions, and integration with siblings. For a tool that modifies data, more context is needed to ensure safe and correct usage by an AI agent.
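
For reference, the handler in the Implementation Reference returns an object with `processed_books`, `total_chunks`, and `project`, and the dispatcher turns it into a text message. A sketch of that formatting step follows. `formatAddBooksResult` and the sample values are hypothetical.

```javascript
// Hypothetical helper: renders the first line of the dispatcher's success message.
function formatAddBooksResult(result, projName) {
    return `Successfully processed ${result.processed_books} books and created ` +
        `${result.total_chunks} chunks for project '${projName}'.`;
}

// Sample result object with made-up numbers, shaped like the handler's return value.
const sample = { processed_books: 2, total_chunks: 118, project: { name: 'demo' } };
console.log(formatAddBooksResult(sample, 'demo'));
```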

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters ('project_name' and 'book_ids') adequately. The description adds no additional meaning beyond what the schema provides, such as explaining what 'book IDs' refer to or constraints on 'project_name'. Baseline score of 3 is appropriate as the schema handles the heavy lifting.
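
In the handler shown in the Implementation Reference, the book IDs are joined into a search expression and passed to the Calibre `list` command via `--search`, which suggests they are the numeric IDs of books in the Calibre library. The sketch below isolates that one expression. The function name `buildIdQuery` is made up for illustration.

```javascript
// Mirrors the handler's expression: `id:${bookIds.join(' OR id:')}`
function buildIdQuery(bookIds) {
    return `id:${bookIds.join(' OR id:')}`;
}

console.log(buildIdQuery([3, 15, 42])); // id:3 OR id:15 OR id:42
```

Note that an empty array yields the bare string `id:`, so callers would probably want to reject empty input before building the query.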

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Add books') and the resource ('to a RAG project'), specifying the purpose for vectorization and context search. It distinguishes from siblings like 'create_project' or 'get_project_info' by focusing on adding existing books rather than creating projects or fetching information. However, it doesn't explicitly differentiate from potential similar tools like 'search_project_context' in terms of scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., whether the project must exist, if books need to be available), exclusions, or comparisons to siblings like 'search' or 'fetch'. Usage is implied by the action but lacks explicit context for selection.
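
One prerequisite can be read off the handler in the Implementation Reference: it looks the project up in an in-memory map and throws if the name is unknown, so the project has to exist before books are added. The sketch below imitates that guard with a plain `Map`. `requireProject` is an illustrative name, not a function from the server.

```javascript
// Illustrative guard modeled on the handler's first lines.
function requireProject(projects, projectName) {
    const project = projects.get(projectName);
    if (!project) {
        throw new Error(`Project '${projectName}' not found`);
    }
    return project;
}

// A made-up registry with a single project.
const projects = new Map([['demo', { name: 'demo', books: [] }]]);
console.log(requireProject(projects, 'demo').name);

try {
    requireProject(projects, 'missing');
} catch (error) {
    console.log(error.message); // Project 'missing' not found
}
```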

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ispyridis/calibre-rag-mcp-nodejs'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.