Glama

Open Search MCP

by flyanima

pdf_discovery

Search and list PDF documents quickly by query, source, and type without full processing. Streamline discovery of academic papers, reports, manuals, and more.

Instructions

Discover PDF documents without full processing - fast PDF search and listing

Input Schema

Name          Required  Description                                         Default
documentType  No        Type of documents to search for                     any
maxResults    No        Maximum number of results to return (default: 20)   20
query         Yes       Search query for PDF document discovery
sources       No        Sources to search (arxiv, pubmed, web, all)         all

Input Schema (JSON Schema)

{
  "properties": {
    "documentType": {
      "description": "Type of documents to search for",
      "enum": [ "academic", "report", "manual", "any" ],
      "type": "string"
    },
    "maxResults": {
      "description": "Maximum number of results to return (default: 20)",
      "type": "number"
    },
    "query": {
      "description": "Search query for PDF document discovery",
      "type": "string"
    },
    "sources": {
      "description": "Sources to search (arxiv, pubmed, web, all)",
      "items": { "type": "string" },
      "type": "array"
    }
  },
  "required": [ "query" ],
  "type": "object"
}
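
As an illustration of what this schema accepts, the following TypeScript sketch checks an arguments object by hand. The names PdfDiscoveryArgs and validatePdfDiscoveryArgs are hypothetical and not part of Open Search MCP; a real server would more likely rely on a JSON Schema validator.

```typescript
// Hypothetical argument type mirroring the JSON Schema above.
type DocumentType = 'academic' | 'report' | 'manual' | 'any';

interface PdfDiscoveryArgs {
  query: string;
  maxResults?: number;
  sources?: string[];
  documentType?: DocumentType;
}

const DOCUMENT_TYPES: readonly string[] = ['academic', 'report', 'manual', 'any'];

// Returns a list of problems; an empty list means the input matches the schema.
function validatePdfDiscoveryArgs(input: unknown): string[] {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return ['arguments must be an object'];
  }
  const { query, maxResults, sources, documentType } = input as Record<string, unknown>;
  const errors: string[] = [];

  // query is the only required property.
  if (typeof query !== 'string') {
    errors.push('query is required and must be a string');
  }
  if (maxResults !== undefined && typeof maxResults !== 'number') {
    errors.push('maxResults must be a number');
  }
  if (sources !== undefined &&
      !(Array.isArray(sources) && sources.every(s => typeof s === 'string'))) {
    errors.push('sources must be an array of strings');
  }
  if (documentType !== undefined &&
      !(typeof documentType === 'string' && DOCUMENT_TYPES.indexOf(documentType) !== -1)) {
    errors.push('documentType must be one of: ' + DOCUMENT_TYPES.join(', '));
  }
  return errors;
}

// Example arguments (values chosen for illustration only).
const exampleArgs: PdfDiscoveryArgs = {
  query: 'transformer architectures',
  maxResults: 5,
  sources: ['arxiv'],
  documentType: 'academic',
};
console.log(validatePdfDiscoveryArgs(exampleArgs));
```

The sketch mirrors only the schema. The handler shown in the Implementation Reference additionally rejects an empty query string.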

Implementation Reference

  • The main handler function for the pdf_discovery tool. It validates the query, calls PDFProcessor.searchPDFs with the query and options (OCR disabled), and returns the discovered PDFs with metadata such as title, URL, source, and relevance score, without full processing.
    async function pdfDiscovery(args: ToolInput): Promise<ToolOutput> {
      const { query, maxResults = 20, sources = ['all'], documentType = 'any' } = args;

      try {
        logger.info(`Starting PDF discovery for: ${query}`);

        if (!query || typeof query !== 'string') {
          throw new Error('Query parameter is required and must be a string');
        }

        const pdfProcessor = new PDFProcessor();
        const searchOptions: PDFSearchOptions = {
          query,
          maxDocuments: maxResults,
          documentType: documentType as any,
          includeOCR: false, // Discovery doesn't need OCR
          sources: Array.isArray(sources) ? sources : [sources]
        };

        const pdfDocuments = await pdfProcessor.searchPDFs(searchOptions);

        const result: ToolOutput = {
          success: true,
          data: {
            query,
            documents: pdfDocuments.map(doc => ({
              id: doc.id,
              title: doc.title,
              url: doc.url,
              source: doc.source,
              relevanceScore: doc.relevanceScore,
              downloadUrl: doc.downloadUrl,
              fileSize: doc.fileSize
            })),
            totalFound: pdfDocuments.length,
            searchOptions: { documentType, sources: searchOptions.sources },
            searchedAt: new Date().toISOString()
          },
          metadata: { sources: ['pdf-discovery'], cached: false }
        };

        logger.info(`PDF discovery completed: ${pdfDocuments.length} documents found for ${query}`);
        return result;
      } catch (error) {
        logger.error(`Failed PDF discovery for ${query}:`, error);
        return {
          success: false,
          error: `Failed to discover PDFs: ${error instanceof Error ? error.message : 'Unknown error'}`,
          data: null,
          metadata: { sources: ['pdf-discovery'], cached: false }
        };
      }
    }
  • Creates the pdf_discovery tool using createTool, sets its inputSchema, and registers it with the tool registry via registry.registerTool.
    const pdfDiscoveryTool = createTool(
      'pdf_discovery',
      'Discover PDF documents without full processing - fast PDF search and listing',
      'pdf',
      'pdf-discovery',
      pdfDiscovery,
      {
        cacheTTL: 1800, // 30 minutes cache
        rateLimit: 15, // 15 requests per minute
        requiredParams: ['query'],
        optionalParams: ['maxResults', 'sources', 'documentType']
      }
    );

    pdfDiscoveryTool.inputSchema = {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query for PDF document discovery' },
        maxResults: { type: 'number', description: 'Maximum number of results to return (default: 20)' },
        sources: { type: 'array', items: { type: 'string' }, description: 'Sources to search (arxiv, pubmed, web, all)' },
        documentType: {
          type: 'string',
          description: 'Type of documents to search for',
          enum: ['academic', 'report', 'manual', 'any']
        }
      },
      required: ['query']
    };

    registry.registerTool(pdfDiscoveryTool);
  • Input schema for the pdf_discovery tool defining the expected parameters: query (required), maxResults, sources, documentType.
    pdfDiscoveryTool.inputSchema = {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query for PDF document discovery' },
        maxResults: { type: 'number', description: 'Maximum number of results to return (default: 20)' },
        sources: { type: 'array', items: { type: 'string' }, description: 'Sources to search (arxiv, pubmed, web, all)' },
        documentType: {
          type: 'string',
          description: 'Type of documents to search for',
          enum: ['academic', 'report', 'manual', 'any']
        }
      },
      required: ['query']
    };
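
To show how a caller might consume the handler's return value, here is a minimal sketch that models the result shape and lists the best-scoring documents. The interface names, the topTitles helper, and the field types are assumptions inferred from the fields the handler returns (for example, that relevanceScore is a number); they are not taken from the Open Search MCP type definitions.

```typescript
// Assumed shapes, inferred from the fields the handler returns.
interface DiscoveredPdf {
  id: string;
  title: string;
  url: string;
  source: string;
  relevanceScore: number;
  downloadUrl?: string;
  fileSize?: number;
}

interface PdfDiscoveryResult {
  success: boolean;
  error?: string;
  data: { query: string; documents: DiscoveredPdf[]; totalFound: number } | null;
}

// Hypothetical helper: titles of the top `limit` documents by relevance.
// Throws if the discovery call reported a failure.
function topTitles(result: PdfDiscoveryResult, limit: number): string[] {
  if (!result.success || result.data === null) {
    throw new Error(result.error || 'PDF discovery failed');
  }
  return [...result.data.documents] // copy so the caller's array is not reordered
    .sort((a, b) => b.relevanceScore - a.relevanceScore)
    .slice(0, limit)
    .map(doc => doc.title);
}

// Hand-written sample data for illustration only; not real search results.
const sample: PdfDiscoveryResult = {
  success: true,
  data: {
    query: 'graph neural networks',
    documents: [
      { id: 'a', title: 'Survey', url: 'https://example.com/a.pdf', source: 'web', relevanceScore: 0.4 },
      { id: 'b', title: 'Tutorial', url: 'https://example.com/b.pdf', source: 'arxiv', relevanceScore: 0.9 }
    ],
    totalFound: 2
  }
};
console.log(topTitles(sample, 1));
```

Checking success before touching data matters because the handler returns data: null together with an error message when the search fails.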

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/flyanima/open-search-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.