mcp_analyze_schema
Analyzes document schemas in a CosmosDB container to identify data structure and types, using a specified sample size for efficient analysis.
Instructions
Analyze the schema of documents in a container to understand data structure and types
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| container_id | Yes | The ID of the container to analyze | |
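| sample_size | No | Number of documents to sample for analysis | 100 (as declared in the registered input schema; the handler itself falls back to 1000 when the argument is omitted) |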
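The arguments in the table map onto an MCP `tools/call` request. The following is a minimal sketch of how a client might assemble one, assuming the standard JSON-RPC 2.0 envelope used by MCP. The helper `buildAnalyzeSchemaCall`, the container name `orders`, and the positive-integer check on `sample_size` are illustrative assumptions and are not part of this server.

```typescript
// Arguments accepted by the tool, as listed in the input schema table.
interface AnalyzeSchemaArgs {
  container_id: string;
  sample_size?: number;
}

// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: AnalyzeSchemaArgs };
}

// Hypothetical client-side helper: validates the arguments and wraps them in a request.
function buildAnalyzeSchemaCall(id: number, args: AnalyzeSchemaArgs): ToolCallRequest {
  if (args.container_id.trim() === "") {
    throw new Error("container_id is required");
  }
  // Assumption: the value is interpolated into `SELECT TOP n`, so only positive integers make sense.
  if (
    args.sample_size !== undefined &&
    (!Number.isInteger(args.sample_size) || args.sample_size <= 0)
  ) {
    throw new Error("sample_size must be a positive integer");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "mcp_analyze_schema", arguments: args },
  };
}

// "orders" is a placeholder container ID.
const request = buildAnalyzeSchemaCall(1, { container_id: "orders", sample_size: 200 });
console.log(JSON.stringify(request, null, 2));
```

Passing `sample_size` explicitly, as here, avoids depending on the default, which differs between the registered schema (100) and the handler's fallback (1000).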
Implementation Reference
- src/tools/dataOperations.ts:122-172 (handler): Core handler function for the mcp_analyze_schema tool. It samples documents, analyzes properties recursively using helper functions, computes statistics, and returns a SchemaAnalysis.

  ```typescript
  export const mcp_analyze_schema = async (args: { container_id: string; sample_size?: number; }): Promise<ToolResult<SchemaAnalysis>> => {
    const { container_id, sample_size = 1000 } = args;
    console.log('Executing mcp_analyze_schema with:', args);

    try {
      const container = getContainer(container_id);

      // Get sample documents
      const query = `SELECT TOP ${sample_size} * FROM c`;
      const { resources: documents } = await container.items.query(query).fetchAll();

      if (documents.length === 0) {
        return {
          success: true,
          data: {
            sampleSize: 0,
            commonProperties: [],
            dataTypes: {},
            nestedStructures: []
          }
        };
      }

      // Analyze properties
      const propertyStats: Record<string, { count: number; types: Set<string>; nullCount: number; examples: any[] }> = {};
      const dataTypeCounts: Record<string, number> = {};

      documents.forEach(doc => {
        analyzeObject(doc, '', propertyStats, dataTypeCounts);
      });

      // Convert to results
      const commonProperties: PropertyAnalysis[] = Object.entries(propertyStats)
        .map(([name, stats]) => ({
          name,
          type: Array.from(stats.types).join(' | '),
          frequency: stats.count / documents.length,
          nullCount: stats.nullCount,
          examples: stats.examples.slice(0, 5)
        }))
        .sort((a, b) => b.frequency - a.frequency)
        .slice(0, 50); // Top 50 properties

      const schemaAnalysis: SchemaAnalysis = {
        sampleSize: documents.length,
        commonProperties,
        dataTypes: dataTypeCounts,
        nestedStructures: [] // Could be implemented for deeper analysis
      };

      return { success: true, data: schemaAnalysis };
    } catch (error: any) {
      console.error(`Error in mcp_analyze_schema for container ${container_id}: ${error.message}`);
      return { success: false, error: error.message };
    }
  };
  ```
- src/tools/dataOperations.ts:175-204 (helper): Recursive helper function that traverses document objects and collects property statistics and data types.

  ```typescript
  function analyzeObject(obj: any, prefix: string, propertyStats: Record<string, any>, dataTypeCounts: Record<string, number>, maxDepth = 3): void {
    if (maxDepth <= 0 || obj === null || obj === undefined) return;

    Object.entries(obj).forEach(([key, value]) => {
      const propName = prefix ? `${prefix}.${key}` : key;
      const valueType = getValueType(value);

      // Update data type counts
      dataTypeCounts[valueType] = (dataTypeCounts[valueType] || 0) + 1;

      // Update property stats
      if (!propertyStats[propName]) {
        propertyStats[propName] = { count: 0, types: new Set(), nullCount: 0, examples: [] };
      }

      propertyStats[propName].count++;
      propertyStats[propName].types.add(valueType);

      if (value === null || value === undefined) {
        propertyStats[propName].nullCount++;
      } else if (propertyStats[propName].examples.length < 5) {
        propertyStats[propName].examples.push(value);
      }

      // Recurse for objects
      if (valueType === 'object' && value !== null) {
        analyzeObject(value, propName, propertyStats, dataTypeCounts, maxDepth - 1);
      }
    });
  }
  ```
- src/tools/dataOperations.ts:207-213 (helper): Utility function that determines the type of a value for schema analysis.

  ```typescript
  function getValueType(value: any): string {
    if (value === null) return 'null';
    if (value === undefined) return 'undefined';
    if (Array.isArray(value)) return 'array';
    if (value instanceof Date) return 'date';
    return typeof value;
  }
  ```
- src/tools/types.ts:51-71 (schema): TypeScript interfaces defining the structure of the schema analysis output, including SchemaAnalysis, PropertyAnalysis, and NestedStructureAnalysis.

  ```typescript
  export interface SchemaAnalysis {
    sampleSize: number;
    commonProperties: PropertyAnalysis[];
    dataTypes: Record<string, number>;
    nestedStructures: NestedStructureAnalysis[];
  }

  export interface PropertyAnalysis {
    name: string;
    type: string;
    frequency: number;
    nullCount: number;
    examples: any[];
  }

  export interface NestedStructureAnalysis {
    path: string;
    type: 'object' | 'array';
    frequency: number;
    properties?: PropertyAnalysis[];
  }
  ```
- src/tools.ts:159-177 (registration): Tool registration entry in the MCP_COSMOSDB_TOOLS array, providing the name, description, and input schema for the mcp_analyze_schema tool.

  ```typescript
  {
    name: "mcp_analyze_schema",
    description: "Analyze the schema of documents in a container to understand data structure and types",
    inputSchema: {
      type: "object",
      properties: {
        container_id: {
          type: "string",
          description: "The ID of the container to analyze"
        },
        sample_size: {
          type: "number",
          description: "Number of documents to sample for analysis",
          default: 100
        }
      },
      required: ["container_id"]
    }
  }
  ```
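To see how the handler and helpers listed above behave on concrete data, the analysis can be re-created in memory without a Cosmos DB connection. This is a simplified sketch, not the repository code: it omits the `examples` and `dataTypes` bookkeeping and the top-50 cut, and `summarize` and `sampleDocs` are hypothetical names with made-up data standing in for the query results.

```typescript
interface PropertyStat {
  count: number;
  types: Set<string>;
  nullCount: number;
}

// Same classification order as the getValueType helper in the source.
function getValueType(value: unknown): string {
  if (value === null) return "null";
  if (value === undefined) return "undefined";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  return typeof value;
}

// Walks one document and records every dotted property path, up to maxDepth levels.
function analyzeObject(
  obj: Record<string, unknown>,
  prefix: string,
  stats: Record<string, PropertyStat>,
  maxDepth = 3
): void {
  if (maxDepth <= 0) return;
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const valueType = getValueType(value);
    if (!stats[path]) {
      stats[path] = { count: 0, types: new Set<string>(), nullCount: 0 };
    }
    const stat = stats[path];
    stat.count++;
    stat.types.add(valueType);
    if (value === null || value === undefined) stat.nullCount++;
    // Only plain objects are descended into; arrays are recorded but not traversed.
    if (valueType === "object") {
      analyzeObject(value as Record<string, unknown>, path, stats, maxDepth - 1);
    }
  }
}

// Hypothetical wrapper: aggregates stats over all documents and ranks by frequency.
function summarize(docs: Record<string, unknown>[]) {
  const stats: Record<string, PropertyStat> = {};
  docs.forEach(doc => analyzeObject(doc, "", stats));
  return Object.entries(stats)
    .map(([name, s]) => ({
      name,
      type: Array.from(s.types).join(" | "),
      frequency: s.count / docs.length, // share of sampled documents containing the path
      nullCount: s.nullCount,
    }))
    .sort((a, b) => b.frequency - a.frequency);
}

// Made-up documents standing in for the sampled query results.
const sampleDocs: Record<string, unknown>[] = [
  { id: "1", price: 9.5, address: { city: "Oslo", geo: { lat: 59.9, point: { x: 1 } } }, tags: ["a"] },
  { id: "2", price: "12.00", address: { city: null } },
  { id: "3", price: 4 },
  { id: "4" },
];

const summary = summarize(sampleDocs);
for (const row of summary) {
  console.log(`${row.name}: ${row.type} (frequency ${row.frequency}, nulls ${row.nullCount})`);
}
```

With these four documents, `id` has frequency 1 and `price` has frequency 0.75 with type `number | string`. The path `address.geo.point` is recorded but `address.geo.point.x` is not, because the default depth limit of 3 is exhausted at that level. `tags` is reported as `array` and its elements are never inspected.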