# add_documents
Upload and process documents into a Qdrant collection using a specified embedding service such as OpenAI or Ollama. Optional chunk size and overlap settings control how the text is split for semantic search.
## Instructions
Add documents to a Qdrant collection with specified embedding service
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| chunkOverlap | No | Overlap between chunks (optional) | |
| chunkSize | No | Size of text chunks (optional) | |
| collection | Yes | Name of the collection to add documents to | |
| embeddingService | Yes | Embedding service to use | |
| filePath | Yes | Path to the file to process | |
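
For illustration, a hypothetical invocation through an MCP client might look like the sketch below. The `client` instance, file path, and collection name are placeholders, not values from this repository, and the two chunking parameters may be omitted.

```typescript
// Hypothetical example call via an MCP TypeScript SDK client.
const result = await client.callTool({
  name: 'add_documents',
  arguments: {
    filePath: './docs/guide.md',   // file to read and chunk (placeholder path)
    collection: 'docs',            // created automatically if it does not exist
    embeddingService: 'ollama',    // 'openai' | 'openrouter' | 'fastembed' | 'ollama'
    chunkSize: 1000,               // optional
    chunkOverlap: 200,             // optional
  },
});
```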
## Implementation Reference
- src/index.ts:118-146 (registration): Registration of the 'add_documents' tool in the MCP server's list of tools, including name, description, and input schema.

  ```typescript
  name: 'add_documents',
  description: 'Add documents to a Qdrant collection with specified embedding service',
  inputSchema: {
    type: 'object',
    properties: {
      filePath: {
        type: 'string',
        description: 'Path to the file to process',
      },
      collection: {
        type: 'string',
        description: 'Name of the collection to add documents to',
      },
      embeddingService: {
        type: 'string',
        enum: ['openai', 'openrouter', 'fastembed', 'ollama'],
        description: 'Embedding service to use',
      },
      chunkSize: {
        type: 'number',
        description: 'Size of text chunks (optional)',
      },
      chunkOverlap: {
        type: 'number',
        description: 'Overlap between chunks (optional)',
      },
    },
    required: ['filePath', 'collection', 'embeddingService'],
  },
  ```
- src/index.ts:253-318 (handler): Main MCP tool handler for 'add_documents': reads the file, chunks the text, generates embeddings, ensures the collection exists, and delegates to the Qdrant service to store the vectors. A standalone sketch of the environment-variable lookup and the resulting upsert request follows this list.

  ```typescript
  private async handleAddDocuments(args: AddDocumentsArgs) {
    try {
      // Configure text processor if custom settings provided
      if (args.chunkSize) {
        this.textProcessor.setChunkSize(args.chunkSize);
      }
      if (args.chunkOverlap) {
        this.textProcessor.setChunkOverlap(args.chunkOverlap);
      }

      // Read and process the file
      const content = readFileSync(args.filePath, 'utf-8');
      const chunks = await this.textProcessor.processFile(content, args.filePath);

      // Create embedding service
      const embeddingService = createEmbeddingService({
        type: args.embeddingService,
        apiKey: process.env[`${args.embeddingService.toUpperCase()}_API_KEY`],
        endpoint: process.env[`${args.embeddingService.toUpperCase()}_ENDPOINT`],
      });

      // Generate embeddings
      const embeddings = await embeddingService.generateEmbeddings(
        chunks.map(chunk => chunk.text)
      );

      // Create collection if it doesn't exist
      const collections = await this.qdrantService.listCollections();
      if (!collections.includes(args.collection)) {
        await this.qdrantService.createCollection(args.collection, embeddingService.vectorSize);
      }

      // Add documents to collection
      await this.qdrantService.addDocuments(
        args.collection,
        chunks.map((chunk, i) => ({
          id: uuidv4(),
          vector: embeddings[i],
          payload: {
            text: chunk.text,
            ...chunk.metadata,
          },
        }))
      );

      return {
        content: [
          {
            type: 'text',
            text: `Successfully processed and added ${chunks.length} chunks to collection ${args.collection}`,
          },
        ],
      };
    } catch (error) {
      const errorMessage = error instanceof Error ? error.message : String(error);
      return {
        content: [
          {
            type: 'text',
            text: `Error adding documents: ${errorMessage}`,
          },
        ],
        isError: true,
      };
    }
  }
  ```
- src/index.ts:20-26 (schema): TypeScript interface defining the input arguments for the add_documents tool.

  ```typescript
  interface AddDocumentsArgs {
    filePath: string;
    collection: string;
    embeddingService: 'openai' | 'openrouter' | 'fastembed' | 'ollama';
    chunkSize?: number;
    chunkOverlap?: number;
  }
  ```
- src/services/qdrant.ts:94-138 (helper): Qdrant service helper that performs the actual upsert of document vectors and payloads into the collection.

  ```typescript
  async addDocuments(
    collection: string,
    documents: { id: string; vector: number[]; payload: Record<string, any> }[]
  ): Promise<void> {
    try {
      console.log('Attempting to add documents to Qdrant collection using direct fetch...');

      // Use direct fetch instead of the client
      const upsertUrl = `${this.url}/collections/${collection}/points`;
      console.log(`Fetching from: ${upsertUrl}`);

      const points = documents.map(doc => ({
        id: doc.id,
        vector: doc.vector,
        payload: doc.payload,
      }));

      const response = await fetch(upsertUrl, {
        method: 'PUT',
        headers: {
          'Content-Type': 'application/json',
          ...(this.apiKey ? { 'api-key': this.apiKey } : {})
        },
        // @ts-ignore - node-fetch supports timeout
        timeout: 10000, // 10 second timeout for potentially larger uploads
        body: JSON.stringify({ points })
      });

      if (!response.ok) {
        throw new Error(`HTTP error! Status: ${response.status}`);
      }

      const data = await response.json();
      console.log('Successfully added documents:', data);
    } catch (error) {
      console.error('Error in addDocuments:', error);
      if (error instanceof Error) {
        console.error(`${error.name}: ${error.message}`);
        console.error('Stack:', error.stack);
      }
      throw error;
    }
  }
  ```
- src/index.ts:74-85 (schema): Runtime type guard/validator for AddDocumentsArgs input in the tool handler.

  ```typescript
  private isAddDocumentsArgs(args: unknown): args is AddDocumentsArgs {
    if (!args || typeof args !== 'object') return false;
    const a = args as Record<string, unknown>;
    return (
      typeof a.filePath === 'string' &&
      typeof a.collection === 'string' &&
      typeof a.embeddingService === 'string' &&
      ['openai', 'openrouter', 'fastembed', 'ollama'].includes(a.embeddingService) &&
      (a.chunkSize === undefined || typeof a.chunkSize === 'number') &&
      (a.chunkOverlap === undefined || typeof a.chunkOverlap === 'number')
    );
  }
  ```
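
The handler and the Qdrant helper above touch two external systems: the embedding provider, whose credentials come from environment variables named after the chosen service, and Qdrant itself, which receives a plain REST upsert. The following is a minimal standalone sketch of those two touch points, assuming a local Qdrant instance at http://localhost:6333; the URL, collection name, vector values, and the `qdrantApiKey` variable are placeholders, not names taken from this repository.

```typescript
// Sketch only: mirrors the env-var lookup in handleAddDocuments and the
// PUT /collections/{collection}/points request in QdrantService.addDocuments.

// 1. Embedding-service credentials are read from env vars derived from the
//    service name, e.g. embeddingService: 'openai' => OPENAI_API_KEY / OPENAI_ENDPOINT.
const service = 'openai';
const embeddingApiKey = process.env[`${service.toUpperCase()}_API_KEY`];
const embeddingEndpoint = process.env[`${service.toUpperCase()}_ENDPOINT`];

// 2. Vectors are upserted into Qdrant with a plain REST call.
const qdrantUrl = 'http://localhost:6333';           // assumed local instance
const qdrantApiKey: string | undefined = undefined;  // set if your Qdrant requires one
const points = [
  {
    id: '1e7f0f6c-0000-4000-8000-000000000000',      // uuidv4() in the real handler
    vector: [0.01, -0.02, 0.03],                     // real vectors match the model's vectorSize
    payload: { text: 'example chunk text', source: 'docs/guide.md' },
  },
];
const response = await fetch(`${qdrantUrl}/collections/docs/points`, {
  method: 'PUT',
  headers: {
    'Content-Type': 'application/json',
    ...(qdrantApiKey ? { 'api-key': qdrantApiKey } : {}),
  },
  body: JSON.stringify({ points }),
});
if (!response.ok) throw new Error(`HTTP error! Status: ${response.status}`);
```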