get_patent_statistics
Analyze chemical content statistics for patents to understand composition and annotations in the SureChEMBL database.
Instructions
Get statistical overview of chemical content in patents
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| document_id | Yes | Patent document ID for statistics | |
| include_annotations | No | Include detailed annotation statistics | true |
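
As a rough illustration of the schema above, the following sketch wraps the two arguments in an MCP-style `tools/call` request. The helper name `buildStatisticsRequest` and the document ID are hypothetical placeholders, not taken from the server source; only the tool name and argument names come from the schema.

```typescript
// Arguments accepted by get_patent_statistics, per the input schema table.
interface PatentStatisticsArgs {
  document_id: string;           // required
  include_annotations?: boolean; // optional; a missing value is treated as true
}

// Hypothetical helper: wraps the arguments in a JSON-RPC tools/call request.
function buildStatisticsRequest(id: number, args: PatentStatisticsArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "get_patent_statistics", arguments: args },
  };
}

// "WO-2020096695-A1" is a placeholder ID used only for illustration.
const request = buildStatisticsRequest(1, {
  document_id: "WO-2020096695-A1",
  include_annotations: false,
});

console.log(JSON.stringify(request, null, 2));
```

Setting `include_annotations` to `false` asks for the summary counts only; leaving it out should behave the same as passing `true`.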
Implementation Reference
- src/index.ts:523-534 (registration) Tool registration entry in the tools list, including name, description, and input schema definition.

  ```typescript
  {
    name: 'get_patent_statistics',
    description: 'Get statistical overview of chemical content in patents',
    inputSchema: {
      type: 'object',
      properties: {
        document_id: { type: 'string', description: 'Patent document ID for statistics' },
        include_annotations: { type: 'boolean', description: 'Include detailed annotation statistics (default: true)' },
      },
      required: ['document_id'],
    },
  },
  ```
- src/index.ts:576-577 (registration) Dispatch case in the CallToolRequestSchema handler that routes tool calls to the handler method.

  ```typescript
  case 'get_patent_statistics':
    return await this.handleGetPatentStatistics(args);
  ```
- src/index.ts:1120-1229 (handler) The core handler function that implements the tool logic: validates input, fetches the patent document from the SureChEMBL API, parses annotations from abstracts and descriptions, computes statistics on chemical content, and returns formatted JSON results.

  ```typescript
  private async handleGetPatentStatistics(args: any) {
    if (!isValidDocumentArgs(args)) {
      throw new McpError(ErrorCode.InvalidParams, 'Invalid document arguments');
    }

    try {
      const response = await this.apiClient.get(`/document/${args.document_id}/contents`);
      const document = response.data.data;

      if (!document) {
        throw new Error('Document not found');
      }

      const includeAnnotations = args.include_annotations !== false;

      // Extract basic document information
      const docInfo = document.contents?.patentDocument?.bibliographicData;
      const abstracts = document.contents?.patentDocument?.abstracts || [];
      const descriptions = document.contents?.patentDocument?.descriptions || [];

      // Collect all chemical annotations
      const allAnnotations: any[] = [];

      abstracts.forEach((abstract: any) => {
        if (abstract.section?.annotations) {
          abstract.section.annotations.forEach((annotation: any) => {
            allAnnotations.push({
              ...annotation,
              source: 'abstract',
              language: abstract.lang
            });
          });
        }
      });

      descriptions.forEach((description: any) => {
        if (description.section?.annotations) {
          description.section.annotations.forEach((annotation: any) => {
            allAnnotations.push({
              ...annotation,
              source: 'description',
              language: description.lang
            });
          });
        }
      });

      // Calculate statistics
      const chemicalAnnotations = allAnnotations.filter(a => a.category === 'chemical');
      const uniqueChemicals = [...new Set(chemicalAnnotations.map(a => a.name))];
      const chemicalFrequencies = chemicalAnnotations.reduce((acc: any, annotation: any) => {
        acc[annotation.name] = (acc[annotation.name] || 0) + 1;
        return acc;
      }, {});

      const statistics = {
        document_id: args.document_id,
        document_info: {
          title: docInfo?.inventionTitles?.find((t: any) => t.lang === 'EN')?.title || 'N/A',
          publication_number: docInfo?.publicationReference?.[0]?.ucid || 'N/A',
          publication_date: docInfo?.publicationReference?.[0]?.documentId?.[0]?.date || 'N/A'
        },
        content_statistics: {
          total_sections: abstracts.length + descriptions.length,
          abstract_sections: abstracts.length,
          description_sections: descriptions.length,
          languages: [...new Set([...abstracts, ...descriptions].map((s: any) => s.lang))]
        },
        chemical_statistics: {
          total_chemical_annotations: chemicalAnnotations.length,
          unique_chemicals_count: uniqueChemicals.length,
          most_frequent_chemicals: Object.entries(chemicalFrequencies)
            .sort(([,a], [,b]) => (b as number) - (a as number))
            .slice(0, 10)
            .map(([name, count]) => ({ name, count })),
          annotation_sources: {
            abstract: chemicalAnnotations.filter(a => a.source === 'abstract').length,
            description: chemicalAnnotations.filter(a => a.source === 'description').length
          }
        },
        annotation_categories: {
          chemical: chemicalAnnotations.length,
          other: allAnnotations.length - chemicalAnnotations.length,
          total: allAnnotations.length
        }
      };

      if (includeAnnotations) {
        (statistics as any).detailed_annotations = {
          chemical_annotations: chemicalAnnotations,
          unique_chemicals: uniqueChemicals,
          chemical_frequencies: chemicalFrequencies
        };
      }

      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(statistics, null, 2),
          },
        ],
      };
    } catch (error) {
      throw new McpError(
        ErrorCode.InternalError,
        `Failed to get patent statistics: ${error instanceof Error ? error.message : 'Unknown error'}`
      );
    }
  }
  ```
- src/index.ts:104-114 (schema) Type guard function for validating input arguments against the tool's input schema (document_id required, include_annotations optional). Used in the handler.

  ```typescript
  const isValidDocumentArgs = (
    args: any
  ): args is { document_id: string; include_annotations?: boolean } => {
    return (
      typeof args === 'object' &&
      args !== null &&
      typeof args.document_id === 'string' &&
      args.document_id.length > 0 &&
      (args.include_annotations === undefined || typeof args.include_annotations === 'boolean')
    );
  };
  ```
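
The counting step inside the handler can be tried in isolation. The following is a minimal sketch rather than the server's code: `summarizeChemicals`, the `Annotation` interface, and the sample data are hypothetical, and only the `name`, `category`, and `source` fields mirror what the handler reads.

```typescript
// Minimal annotation shape, limited to the fields the handler inspects.
interface Annotation {
  name: string;
  category: string;
  source: "abstract" | "description";
}

// Hypothetical helper reproducing the counting logic: keep chemical annotations,
// count occurrences per name, and return the `limit` most frequent names.
function summarizeChemicals(annotations: Annotation[], limit = 10) {
  const chemical = annotations.filter((a) => a.category === "chemical");

  // A prototype-free object avoids collisions with inherited keys such as "constructor".
  const frequencies: Record<string, number> = Object.create(null);
  for (const a of chemical) {
    frequencies[a.name] = (frequencies[a.name] || 0) + 1;
  }

  const names = Object.keys(frequencies);
  const mostFrequent = names
    .map((name) => ({ name, count: frequencies[name] }))
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);

  return { total: chemical.length, unique: names.length, mostFrequent };
}

// Made-up sample data for illustration only.
const sample: Annotation[] = [
  { name: "aspirin", category: "chemical", source: "abstract" },
  { name: "ethanol", category: "chemical", source: "description" },
  { name: "aspirin", category: "chemical", source: "description" },
  { name: "EGFR", category: "target", source: "description" },
];

const summary = summarizeChemicals(sample);
console.log(JSON.stringify(summary, null, 2));
```

With this sample, the non-chemical annotation is excluded from the counts, which corresponds to the `other` figure in the handler's `annotation_categories` output.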