query_documents

Search local documents using keyword and semantic matching to find relevant information from PDF, DOCX, TXT, and Markdown files stored on your device.

Instructions

Search ingested documents. Your query words are matched exactly (keyword search). Your query meaning is matched semantically (vector search). Preserve specific terms from the user. Add context if the query is ambiguous. Results include score (0 = most relevant, higher = less relevant).
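Because a lower score means a closer match, a caller can sort results in ascending order and drop anything above a cutoff. The sketch below illustrates that reading of the score. The `keepRelevant` helper and the `0.5` cutoff are hypothetical and chosen only for illustration; the tool does not document a recommended threshold.

```typescript
// Shape of one result, mirroring the fields described on this page.
interface QueryResult {
  filePath: string
  chunkIndex: number
  text: string
  score: number
  source?: string
}

// Hypothetical helper: keep results at or below maxScore, best (lowest) first.
function keepRelevant(results: QueryResult[], maxScore: number): QueryResult[] {
  return results
    .filter((r) => r.score <= maxScore) // filter() copies, so the input is not mutated
    .sort((a, b) => a.score - b.score) // ascending: 0 = most relevant
}

// Made-up sample data, not real tool output.
const sample: QueryResult[] = [
  { filePath: '/docs/a.md', chunkIndex: 0, text: 'alpha', score: 0.8 },
  { filePath: '/docs/b.md', chunkIndex: 2, text: 'beta', score: 0.1 },
  { filePath: '/docs/c.md', chunkIndex: 1, text: 'gamma', score: 0.4 },
]

// 0.5 is an arbitrary illustrative cutoff, not a value recommended by the tool.
const relevant = keepRelevant(sample, 0.5)
console.log(relevant.map((r) => r.filePath))
```

A suitable cutoff depends on the embedding model and the corpus, so it is worth inspecting real scores before fixing a value.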

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query. Include specific terms and add context if needed. | |
| limit | No | Maximum number of results to return (default: 10). Recommended: 5 for precision, 10 for balance, 20 for broad exploration. | |
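As a rough illustration of how these two parameters travel to the server, the sketch below builds the JSON-RPC 2.0 `tools/call` message that an MCP client would send. The `buildQueryCall` helper, the type names, and the example queries are hypothetical; only the tool name, the `query` and `limit` parameters, and the recommended limits come from this page. Transport and client setup are out of scope.

```typescript
// Arguments accepted by the tool, per the input schema above.
interface QueryDocumentsArgs {
  query: string
  limit?: number
}

// Minimal shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: 'query_documents'; arguments: QueryDocumentsArgs }
}

// Hypothetical helper: omit `limit` entirely when the caller does not set it,
// so the server-side default (10) applies.
function buildQueryCall(id: number, query: string, limit?: number): ToolCallRequest {
  const args: QueryDocumentsArgs = { query }
  if (limit !== undefined) args.limit = limit
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'query_documents', arguments: args },
  }
}

const precise = buildQueryCall(1, 'retention policy for audit logs', 5) // precision
const broad = buildQueryCall(2, 'onboarding checklist', 20) // broad exploration
const fallback = buildQueryCall(3, 'BM25 ranking') // default limit

console.log(JSON.stringify(precise, null, 2))
```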

Implementation Reference

  • The primary handler function executing the query_documents tool. It embeds the input query, performs hybrid semantic and keyword search using the VectorStore, restores source information for raw data files, and returns formatted JSON results as MCP content.
    async handleQueryDocuments(
      args: QueryDocumentsInput
    ): Promise<{ content: [{ type: 'text'; text: string }] }> {
      try {
        // Generate query embedding
        const queryVector = await this.embedder.embed(args.query)

        // Hybrid search (vector + BM25 keyword matching)
        const searchResults = await this.vectorStore.search(queryVector, args.query, args.limit || 10)

        // Format results with source restoration for raw-data files
        const results: QueryResult[] = searchResults.map((result) => {
          const queryResult: QueryResult = {
            filePath: result.filePath,
            chunkIndex: result.chunkIndex,
            text: result.text,
            score: result.score,
          }

          // Restore source for raw-data files (ingested via ingest_data)
          if (isRawDataPath(result.filePath)) {
            const source = extractSourceFromPath(result.filePath)
            if (source) {
              queryResult.source = source
            }
          }

          return queryResult
        })

        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(results, null, 2),
            },
          ],
        }
      } catch (error) {
        console.error('Failed to query documents:', error)
        throw error
      }
    }
  • TypeScript interface defining the input parameters for the query_documents tool: query string (required) and optional limit number.
    export interface QueryDocumentsInput {
      /** Natural language query */
      query: string
      /** Number of results to retrieve (default 10) */
      limit?: number
    }
  • TypeScript interface defining the output structure for query_documents results: filePath, chunkIndex, text, score, and optional source.
    export interface QueryResult {
      /** File path */
      filePath: string
      /** Chunk index */
      chunkIndex: number
      /** Text */
      text: string
      /** Similarity score */
      score: number
      /** Original source (only for raw-data files, e.g., URLs ingested via ingest_data) */
      source?: string
    }
  • MCP tool registration in ListTools handler: defines name, detailed description, and JSON schema for input validation.
    {
      name: 'query_documents',
      description:
        'Search ingested documents. Your query words are matched exactly (keyword search). Your query meaning is matched semantically (vector search). Preserve specific terms from the user. Add context if the query is ambiguous. Results include score (0 = most relevant, higher = less relevant).',
      inputSchema: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'Search query. Include specific terms and add context if needed.',
          },
          limit: {
            type: 'number',
            description:
              'Maximum number of results to return (default: 10). Recommended: 5 for precision, 10 for balance, 20 for broad exploration.',
          },
        },
        required: ['query'],
      },
  • Tool dispatch in CallToolRequestHandler: routes query_documents calls to the handleQueryDocuments method with type-cast arguments.
    case 'query_documents':
      return await this.handleQueryDocuments(
        request.params.arguments as unknown as QueryDocumentsInput
      )
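The dispatch above relies on a type cast, which tells the compiler what to expect but performs no check at runtime. One possible alternative, sketched here, is a type guard that verifies the arguments before they reach the handler. The `isQueryDocumentsInput` and `parseQueryArgs` functions are hypothetical and not part of the server's code, and the non-empty query and positive-integer limit rules are stricter assumptions than the published schema, which only requires a string `query` and a numeric `limit`.

```typescript
// Input shape, as defined by the interface listed above.
interface QueryDocumentsInput {
  query: string
  limit?: number
}

// Hypothetical runtime guard: narrows `unknown` to QueryDocumentsInput.
function isQueryDocumentsInput(value: unknown): value is QueryDocumentsInput {
  if (typeof value !== 'object' || value === null) return false
  const candidate = value as Record<string, unknown>

  // Assumption: an empty or whitespace-only query is treated as invalid.
  const query = candidate['query']
  if (typeof query !== 'string' || query.trim() === '') return false

  // `limit` is optional; when present, assume it must be a positive integer.
  const limit = candidate['limit']
  if (limit === undefined) return true
  return typeof limit === 'number' && Number.isInteger(limit) && limit > 0
}

// Hypothetical wrapper: throw a descriptive error instead of casting blindly.
function parseQueryArgs(value: unknown): QueryDocumentsInput {
  if (!isQueryDocumentsInput(value)) {
    throw new Error('Invalid arguments for query_documents')
  }
  return value
}

const parsed = parseQueryArgs({ query: 'hybrid search', limit: 5 })
console.log(parsed.query, parsed.limit ?? 10)
```

With a guard like this in place, malformed calls would fail at the dispatch boundary rather than inside the embedding or search step.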

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/shinpr/mcp-local-rag'