query_documents

Search local documents using keyword and semantic matching to find relevant information from PDF, DOCX, TXT, and Markdown files stored on your device.

Instructions

Search ingested documents. Your query words are matched exactly (keyword search). Your query meaning is matched semantically (vector search). Preserve specific terms from the user. Add context if the query is ambiguous. Results include score (0 = most relevant, higher = less relevant).
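Because a score of 0 marks the best match and larger values mark weaker ones, results are ranked in ascending score order. The following is a minimal sketch of that convention on the client side. The `rankByRelevance` helper and its `maxScore` cutoff are hypothetical and not part of the server, and the sample scores are invented for illustration.

```typescript
// Shape of one result, limited to the fields this tool is documented to return.
interface ScoredResult {
  filePath: string
  chunkIndex: number
  text: string
  score: number // 0 = most relevant, higher = less relevant
}

// Hypothetical helper (not part of the server): keep results at or below a
// caller-chosen score cutoff and order them best-first.
function rankByRelevance(results: ScoredResult[], maxScore: number): ScoredResult[] {
  return results
    .filter((r) => r.score <= maxScore) // filter() copies, so the input is not mutated
    .sort((a, b) => a.score - b.score) // ascending: lowest score first
}

// Illustrative data only.
const ranked = rankByRelevance(
  [
    { filePath: 'notes.md', chunkIndex: 2, text: 'b', score: 0.8 },
    { filePath: 'guide.pdf', chunkIndex: 0, text: 'a', score: 0.1 },
    { filePath: 'old.txt', chunkIndex: 5, text: 'c', score: 1.7 },
  ],
  1.0
)
```

A sensible cutoff depends on the embedding model and the distance metric in use, so treat any fixed threshold as something to tune rather than a given.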

Input Schema

Name	Required	Description	Default
`query`	Yes	Search query. Include specific terms and add context if needed.	none
`limit`	No	Maximum number of results to return. Recommended: 5 for precision, 10 for balance, 20 for broad exploration.	10
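As a sketch of how a client might pass these parameters, the snippet below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The `buildQueryRequest` helper and the sample query text are assumptions for illustration. Real clients usually construct this envelope through an SDK.

```typescript
// Arguments accepted by query_documents, per the input schema.
interface QueryDocumentsArgs {
  query: string
  limit?: number
}

// JSON-RPC 2.0 envelope for an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: 'query_documents'; arguments: QueryDocumentsArgs }
}

// Hypothetical helper: `limit` is left out when not given, so the
// documented default of 10 applies on the server side.
function buildQueryRequest(id: number, query: string, limit?: number): ToolCallRequest {
  const args: QueryDocumentsArgs = { query }
  if (limit !== undefined) {
    args.limit = limit
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'query_documents', arguments: args },
  }
}

// limit 5 for precision; omitted limit falls back to the default.
const precise = buildQueryRequest(1, 'TLS certificate rotation policy', 5)
const balanced = buildQueryRequest(2, 'TLS certificate rotation policy')
```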

Implementation Reference

src/server/index.ts:334-376 (handler)
The primary handler for the query_documents tool. It embeds the input query, runs a hybrid semantic and keyword search through the VectorStore, restores source information for raw-data files, and returns the results as formatted JSON in MCP text content.
    async handleQueryDocuments(
      args: QueryDocumentsInput
    ): Promise<{ content: [{ type: 'text'; text: string }] }> {
      try {
        // Generate query embedding
        const queryVector = await this.embedder.embed(args.query)

        // Hybrid search (vector + BM25 keyword matching)
        const searchResults = await this.vectorStore.search(queryVector, args.query, args.limit || 10)

        // Format results with source restoration for raw-data files
        const results: QueryResult[] = searchResults.map((result) => {
          const queryResult: QueryResult = {
            filePath: result.filePath,
            chunkIndex: result.chunkIndex,
            text: result.text,
            score: result.score,
          }

          // Restore source for raw-data files (ingested via ingest_data)
          if (isRawDataPath(result.filePath)) {
            const source = extractSourceFromPath(result.filePath)
            if (source) {
              queryResult.source = source
            }
          }

          return queryResult
        })

        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(results, null, 2),
            },
          ],
        }
      } catch (error) {
        console.error('Failed to query documents:', error)
        throw error
      }
    }
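One detail of the handler worth isolating is the `args.limit || 10` fallback. Because `||` treats every falsy value as missing, a limit of 0 also resolves to 10. The sketch below reproduces that expression in a standalone function next to a `??` variant for comparison. Both function names are hypothetical and neither appears in the source.

```typescript
const DEFAULT_LIMIT = 10

// Mirrors the `args.limit || 10` expression used by the handler:
// undefined, 0, and NaN all fall back to the default.
function resolveLimit(limit?: number): number {
  return limit || DEFAULT_LIMIT
}

// Alternative for comparison only: `??` falls back solely on
// undefined or null, so an explicit 0 is preserved.
function resolveLimitNullish(limit?: number): number {
  return limit ?? DEFAULT_LIMIT
}

const omitted = resolveLimit(undefined)
const explicit = resolveLimit(5)
const zeroWithOr = resolveLimit(0)
const zeroWithNullish = resolveLimitNullish(0)
```

For this tool the `||` behavior is arguably the more useful one, since a search limited to zero results would return nothing.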
src/server/index.ts:50-55 (schema)
TypeScript interface defining the input parameters for the query_documents tool: query string (required) and optional limit number.
    export interface QueryDocumentsInput {
      /** Natural language query */
      query: string
      /** Number of results to retrieve (default 10) */
      limit?: number
    }
src/server/index.ts:111-122 (schema)
TypeScript interface defining the output structure for query_documents results: filePath, chunkIndex, text, score, and optional source.
    export interface QueryResult {
      /** File path */
      filePath: string
      /** Chunk index */
      chunkIndex: number
      /** Text */
      text: string
      /** Similarity score */
      score: number
      /** Original source (only for raw-data files, e.g., URLs ingested via ingest_data) */
      source?: string
    }
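Since the handler serializes its results with `JSON.stringify` into a single text content item, a client has to parse that text back into `QueryResult` objects. The following is a minimal client-side sketch. `ToolTextResponse`, `parseQueryResults`, and `displayOrigin` are hypothetical names, and the sample path and URL are invented.

```typescript
// Result shape, restated here so the example is self-contained.
interface QueryResult {
  filePath: string
  chunkIndex: number
  text: string
  score: number
  source?: string
}

// Minimal view of the tool response: a list of text content items.
interface ToolTextResponse {
  content: { type: 'text'; text: string }[]
}

// Hypothetical helper: parse the JSON payload carried in the first text item.
function parseQueryResults(response: ToolTextResponse): QueryResult[] {
  const first = response.content[0]
  if (first === undefined) {
    return []
  }
  const parsed: unknown = JSON.parse(first.text)
  if (!Array.isArray(parsed)) {
    throw new Error('Expected a JSON array of results')
  }
  // Assumption: the elements match QueryResult. This does not validate each field.
  return parsed as QueryResult[]
}

// Prefer the restored original source (set only for raw-data files)
// over the local file path.
function displayOrigin(result: QueryResult): string {
  return result.source ?? result.filePath
}

// Illustrative response only.
const sample: ToolTextResponse = {
  content: [
    {
      type: 'text',
      text: JSON.stringify(
        [
          { filePath: '/data/raw/abc.md', chunkIndex: 0, text: 'hello', score: 0.2, source: 'https://example.com/post' },
          { filePath: '/docs/guide.pdf', chunkIndex: 3, text: 'world', score: 0.4 },
        ],
        null,
        2
      ),
    },
  ],
}
const origins = parseQueryResults(sample).map(displayOrigin)
```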
src/server/index.ts:189-207 (registration)
MCP tool registration in ListTools handler: defines name, detailed description, and JSON schema for input validation.
    {
      name: 'query_documents',
      description:
        'Search ingested documents. Your query words are matched exactly (keyword search). Your query meaning is matched semantically (vector search). Preserve specific terms from the user. Add context if the query is ambiguous. Results include score (0 = most relevant, higher = less relevant).',
      inputSchema: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'Search query. Include specific terms and add context if needed.',
          },
          limit: {
            type: 'number',
            description:
              'Maximum number of results to return (default: 10). Recommended: 5 for precision, 10 for balance, 20 for broad exploration.',
          },
        },
        required: ['query'],
      },
src/server/index.ts:296-299 (registration)
Tool dispatch in CallToolRequestHandler: routes query_documents calls to the handleQueryDocuments method with type-cast arguments.
    case 'query_documents':
      return await this.handleQueryDocuments(
        request.params.arguments as unknown as QueryDocumentsInput
      )
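The dispatch relies on a type cast, which the compiler accepts without any runtime check of the arguments. As a sketch of one alternative, a type guard can confirm the shape before the handler runs. `isQueryDocumentsInput` is a hypothetical function and is not present in the source.

```typescript
// Input shape, restated here so the example is self-contained.
interface QueryDocumentsInput {
  query: string
  limit?: number
}

// Hypothetical runtime check: true only when `query` is a string and
// `limit`, if present, is a number.
function isQueryDocumentsInput(value: unknown): value is QueryDocumentsInput {
  if (typeof value !== 'object' || value === null) {
    return false
  }
  const candidate = value as Record<string, unknown>
  if (typeof candidate['query'] !== 'string') {
    return false
  }
  return candidate['limit'] === undefined || typeof candidate['limit'] === 'number'
}

// Example: narrow untyped arguments before use.
const incoming: unknown = { query: 'backup retention policy', limit: 5 }
const accepted = isQueryDocumentsInput(incoming) ? incoming.query : null
```

A dispatcher using such a guard could reject malformed calls with a clear error instead of failing later inside the embedder or the vector store.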
