query_documents
Search your local documents using natural language queries to find relevant information from PDF, DOCX, TXT, and Markdown files stored on your machine.
Instructions
Search through previously ingested documents (PDF, DOCX, TXT, MD) using semantic search. Returns relevant passages from documents in the BASE_DIR. Documents must be ingested first using ingest_file.
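
To make the calling convention concrete, here is a minimal sketch of the JSON-RPC payload an MCP client would send to invoke this tool. It assumes the standard MCP `tools/call` method; the helper name `buildQueryDocumentsCall` and the `id` value are hypothetical and do not come from the server's source.

```typescript
// Arguments accepted by query_documents, as described above.
interface QueryDocumentsArgs {
  query: string
  limit?: number
}

// Shape of an MCP `tools/call` request addressed to this tool.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: 'query_documents'; arguments: QueryDocumentsArgs }
}

// Hypothetical helper (not part of the server): builds the request payload.
function buildQueryDocumentsCall(id: number, query: string, limit?: number): ToolCallRequest {
  const args: QueryDocumentsArgs = { query }
  // Send `limit` only when the caller set it; otherwise the server applies its default.
  if (limit !== undefined) args.limit = limit
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'query_documents', arguments: args },
  }
}

const request = buildQueryDocumentsCall(1, 'transformer architecture', 3)
console.log(JSON.stringify(request, null, 2))
```

In practice an MCP client library normally builds this envelope for you; the sketch only shows which fields carry the tool name and its arguments.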
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Natural language search query (e.g., "transformer architecture", "API documentation") | |
| limit | No | Maximum number of results to return (default: 5, max recommended: 20) | 5 |
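
The table can be read as a small validation contract. The sketch below checks raw arguments against it and fills in the default of 5. The helper `parseQueryDocumentsInput` is hypothetical. Rejecting empty queries and non-integer limits is an assumption that is stricter than the published schema, which only requires a string and a number. The recommended maximum of 20 is advisory, so the sketch does not enforce it.

```typescript
interface QueryDocumentsInput {
  query: string
  limit?: number
}

// Hypothetical helper (not part of the server): validates raw tool arguments
// and returns them with `limit` always populated.
function parseQueryDocumentsInput(raw: unknown): Required<QueryDocumentsInput> {
  if (typeof raw !== 'object' || raw === null) {
    throw new TypeError('arguments must be an object')
  }
  const { query, limit: rawLimit } = raw as Record<string, unknown>

  // Assumption: an empty query is treated as invalid.
  if (typeof query !== 'string' || query.trim() === '') {
    throw new TypeError('"query" is required and must be a non-empty string')
  }

  let limit = 5 // default from the table above
  if (rawLimit !== undefined) {
    // Assumption: only positive integers make sense as a result count.
    if (typeof rawLimit !== 'number' || !Number.isInteger(rawLimit) || rawLimit < 1) {
      throw new TypeError('"limit" must be a positive integer when provided')
    }
    limit = rawLimit
  }
  return { query, limit }
}

const input = parseQueryDocumentsInput({ query: 'API documentation' })
console.log(input.limit) // 5, because no limit was supplied
```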
Implementation Reference
- src/server/index.ts:248-278 (handler): The handleQueryDocuments function implements the core logic of the query_documents tool. It embeds the input query using the Embedder, performs semantic search in the VectorStore, formats the top results (file path, chunk index, text, and score), and returns them as JSON in the MCP content format.

      async handleQueryDocuments(
        args: QueryDocumentsInput
      ): Promise<{ content: [{ type: 'text'; text: string }] }> {
        try {
          // Generate query embedding
          const queryVector = await this.embedder.embed(args.query)

          // Vector search
          const searchResults = await this.vectorStore.search(queryVector, args.limit || 5)

          // Format results
          const results: QueryResult[] = searchResults.map((result) => ({
            filePath: result.filePath,
            chunkIndex: result.chunkIndex,
            text: result.text,
            score: result.score,
          }))

          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify(results, null, 2),
              },
            ],
          }
        } catch (error) {
          console.error('Failed to query documents:', error)
          throw error
        }
      }
- src/server/index.ts:39-44 (schema): TypeScript interface defining the input parameters for the query_documents tool: a required 'query' string and an optional 'limit' number.

      export interface QueryDocumentsInput {
        /** Natural language query */
        query: string
        /** Number of results to retrieve (default 5) */
        limit?: number
      }
- src/server/index.ts:141-160 (registration): Registration of the query_documents tool in the ListToolsRequestHandler, including its name, description, and JSON schema for input validation. The excerpt starts inside the tool's object literal, so its opening brace is not shown.

      name: 'query_documents',
      description:
        'Search through previously ingested documents (PDF, DOCX, TXT, MD) using semantic search. Returns relevant passages from documents in the BASE_DIR. Documents must be ingested first using ingest_file.',
      inputSchema: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description:
              'Natural language search query (e.g., "transformer architecture", "API documentation")',
          },
          limit: {
            type: 'number',
            description:
              'Maximum number of results to return (default: 5, max recommended: 20)',
          },
        },
        required: ['query'],
      },
      },
- src/server/index.ts:213-216 (registration): Dispatch case in the CallToolRequestHandler that routes 'query_documents' calls to the handleQueryDocuments method.

      case 'query_documents':
        return await this.handleQueryDocuments(
          request.params.arguments as unknown as QueryDocumentsInput
        )
- src/server/index.ts:77-86 (schema): TypeScript interface defining the structure of individual search results returned by the query_documents tool.

      export interface QueryResult {
        /** File path */
        filePath: string
        /** Chunk index */
        chunkIndex: number
        /** Text */
        text: string
        /** Similarity score */
        score: number
      }
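
Because the handler serializes its results with JSON.stringify into a single text content item, a client has to parse that string to get QueryResult objects back. The following is a minimal client-side sketch under that assumption. The helpers `parseQueryResults` and `groupByFile` are hypothetical, and the sample response uses made-up placeholder values rather than real server output.

```typescript
// Mirrors the QueryResult interface listed above.
interface QueryResult {
  filePath: string
  chunkIndex: number
  text: string
  score: number
}

// Shape of the tool response as produced by the handler.
interface ToolTextResponse {
  content: { type: 'text'; text: string }[]
}

// Hypothetical helper: decodes the JSON string in the first content item.
function parseQueryResults(response: ToolTextResponse): QueryResult[] {
  const first = response.content[0]
  if (first === undefined) return []
  return JSON.parse(first.text) as QueryResult[]
}

// Hypothetical helper: groups passages by source file, preserving result order.
function groupByFile(results: QueryResult[]): Map<string, QueryResult[]> {
  const groups = new Map<string, QueryResult[]>()
  for (const result of results) {
    const bucket = groups.get(result.filePath) ?? []
    bucket.push(result)
    groups.set(result.filePath, bucket)
  }
  return groups
}

// Placeholder response for illustration only; paths, text, and scores are invented.
const sample: ToolTextResponse = {
  content: [
    {
      type: 'text',
      text: JSON.stringify(
        [
          { filePath: '/docs/a.md', chunkIndex: 0, text: 'first passage', score: 0.1 },
          { filePath: '/docs/b.pdf', chunkIndex: 4, text: 'second passage', score: 0.2 },
          { filePath: '/docs/a.md', chunkIndex: 2, text: 'third passage', score: 0.3 },
        ],
        null,
        2
      ),
    },
  ],
}

for (const [filePath, passages] of groupByFile(parseQueryResults(sample))) {
  console.log(filePath, passages.map((p) => p.chunkIndex))
}
```

One detail of the handler worth noting when choosing arguments: `args.limit || 5` treats a limit of 0 the same as an omitted limit, so a request with `limit: 0` returns up to 5 results rather than none.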