hendrickcastro

MCP CosmosDB

mcp_analyze_schema

Analyze document schemas in Azure CosmosDB containers to understand data structure and types by sampling documents.

Instructions

Analyze the schema of documents in a container to understand data structure and types

Input Schema

Name          Required  Description                                  Default
container_id  Yes       The ID of the container to analyze
sample_size   No        Number of documents to sample for analysis
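
To make the two inputs concrete, here is a minimal sketch of how a caller might check and normalize the arguments before invoking the tool. `normalizeArgs` is a hypothetical helper, not part of the server. The fallback of 100 follows the `default` declared in the tool's JSON schema definition; the handler itself falls back to 1000 when `sample_size` is omitted, so treat the value as an assumption.

```typescript
// Argument shape, mirroring the handler's parameter type.
interface AnalyzeSchemaArgs {
  container_id: string;
  sample_size?: number;
}

// Hypothetical helper (not part of the server): validates the arguments
// and fills in a default sample size when none is given.
function normalizeArgs(
  input: unknown,
  fallbackSampleSize = 100 // assumption: taken from the JSON schema's `default`
): Required<AnalyzeSchemaArgs> {
  if (typeof input !== 'object' || input === null) {
    throw new Error('arguments must be an object');
  }
  const { container_id, sample_size } = input as Partial<AnalyzeSchemaArgs>;
  if (typeof container_id !== 'string' || container_id.length === 0) {
    throw new Error('container_id is required and must be a non-empty string');
  }
  const size = sample_size ?? fallbackSampleSize;
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('sample_size must be a positive integer');
  }
  return { container_id, sample_size: size };
}

// Example: only the required argument is supplied.
const normalized = normalizeArgs({ container_id: 'users' });
console.log(normalized);
```

Rejecting non-integer or non-positive values up front keeps a malformed `sample_size` from ever reaching the query text.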

Implementation Reference

  • Main handler function implementing the schema analysis logic by sampling documents, analyzing properties recursively, and computing statistics.
    export const mcp_analyze_schema = async (args: { 
      container_id: string; 
      sample_size?: number; 
    }): Promise<ToolResult<SchemaAnalysis>> => {
      // Note: this fallback (1000) differs from the default of 100 declared in the tool's input schema.
      const { container_id, sample_size = 1000 } = args;
      console.log('Executing mcp_analyze_schema with:', args);
    
      try {
        const container = getContainer(container_id);
    
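        // Get sample documents; pass the limit as a query parameter rather than interpolating it into the SQL text
        const querySpec = {
          query: 'SELECT TOP @sampleSize * FROM c',
          parameters: [{ name: '@sampleSize', value: sample_size }]
        };
        const { resources: documents } = await container.items.query(querySpec).fetchAll();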
    
        if (documents.length === 0) {
          return { success: true, data: { sampleSize: 0, commonProperties: [], dataTypes: {}, nestedStructures: [] } };
        }
    
        // Analyze properties
        const propertyStats: Record<string, { count: number; types: Set<string>; nullCount: number; examples: any[] }> = {};
        const dataTypeCounts: Record<string, number> = {};
    
        documents.forEach(doc => {
          analyzeObject(doc, '', propertyStats, dataTypeCounts);
        });
    
        // Convert to results
        const commonProperties: PropertyAnalysis[] = Object.entries(propertyStats)
          .map(([name, stats]) => ({
            name,
            type: Array.from(stats.types).join(' | '),
            frequency: stats.count / documents.length,
            nullCount: stats.nullCount,
            examples: stats.examples.slice(0, 5)
          }))
          .sort((a, b) => b.frequency - a.frequency)
          .slice(0, 50); // Top 50 properties
    
        const schemaAnalysis: SchemaAnalysis = {
          sampleSize: documents.length,
          commonProperties,
          dataTypes: dataTypeCounts,
          nestedStructures: [] // Could be implemented for deeper analysis
        };
    
        return { success: true, data: schemaAnalysis };
      } catch (error: any) {
        console.error(`Error in mcp_analyze_schema for container ${container_id}: ${error.message}`);
        return { success: false, error: error.message };
      }
    };
  • Input JSON schema definition for the mcp_analyze_schema tool, defining parameters container_id (required) and sample_size (optional).
    {
      name: "mcp_analyze_schema",
      description: "Analyze the schema of documents in a container to understand data structure and types",
      inputSchema: {
        type: "object",
        properties: {
          container_id: {
            type: "string",
            description: "The ID of the container to analyze"
          },
          sample_size: {
            type: "number",
            description: "Number of documents to sample for analysis",
            default: 100
          }
        },
        required: ["container_id"]
      }
    }
  • Re-export of the mcp_analyze_schema handler from dataOperations.js, making it available for import as toolHandlers.
    export {
      mcp_execute_query,
      mcp_get_documents,
      mcp_get_document_by_id,
      mcp_analyze_schema
    } from './dataOperations.js';
  • src/server.ts:112-113 (registration)
    Dispatch case in server request handler that calls the mcp_analyze_schema tool handler.
    case 'mcp_analyze_schema':
        result = await toolHandlers.mcp_analyze_schema(input as any);
  • Recursive helper function to analyze object properties, track types, frequencies, and examples across sampled documents.
    function analyzeObject(obj: any, prefix: string, propertyStats: Record<string, any>, dataTypeCounts: Record<string, number>, maxDepth = 3): void {
      if (maxDepth <= 0 || obj === null || obj === undefined) return;
    
      Object.entries(obj).forEach(([key, value]) => {
        const propName = prefix ? `${prefix}.${key}` : key;
        const valueType = getValueType(value);
    
        // Update data type counts
        dataTypeCounts[valueType] = (dataTypeCounts[valueType] || 0) + 1;
    
        // Update property stats
        if (!propertyStats[propName]) {
          propertyStats[propName] = { count: 0, types: new Set(), nullCount: 0, examples: [] };
        }
    
        propertyStats[propName].count++;
        propertyStats[propName].types.add(valueType);
    
        if (value === null || value === undefined) {
          propertyStats[propName].nullCount++;
        } else if (propertyStats[propName].examples.length < 5) {
          propertyStats[propName].examples.push(value);
        }
    
        // Recurse for objects
        if (valueType === 'object' && value !== null) {
          analyzeObject(value, propName, propertyStats, dataTypeCounts, maxDepth - 1);
        }
      });
    }
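
The recursive helper relies on `getValueType`, which is not shown on this page. The sketch below fills that gap with an assumed implementation (the real one may classify values differently) and runs a condensed copy of the traversal over two made-up documents, to show how dotted property paths, type unions, and frequencies come out. The names `Stats`, `collect`, and `docs` are illustrative only.

```typescript
type Stats = { count: number; types: Set<string>; nullCount: number; examples: unknown[] };

// Assumed implementation: the actual getValueType is not shown in the source.
// It separates null and arrays from plain objects, which `typeof` alone would not.
function getValueType(value: unknown): string {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value;
}

// Condensed version of the traversal: records one entry per dotted path.
function collect(obj: unknown, prefix: string, stats: Record<string, Stats>, maxDepth = 3): void {
  if (maxDepth <= 0 || obj === null || obj === undefined) return;
  for (const [key, value] of Object.entries(obj as Record<string, unknown>)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const type = getValueType(value);
    if (!stats[path]) {
      stats[path] = { count: 0, types: new Set<string>(), nullCount: 0, examples: [] };
    }
    const entry = stats[path];
    entry.count++;
    entry.types.add(type);
    if (value === null || value === undefined) {
      entry.nullCount++;
    } else if (entry.examples.length < 5) {
      entry.examples.push(value);
    }
    // Only plain objects are descended into; arrays are recorded but not expanded.
    if (type === 'object') collect(value, path, stats, maxDepth - 1);
  }
}

// Made-up sample documents.
const docs = [
  { id: '1', profile: { age: 30, email: null } },
  { id: '2', profile: { age: '31' }, tags: ['a'] },
];

const stats: Record<string, Stats> = {};
docs.forEach(doc => collect(doc, '', stats));

// Same shape as the handler's per-property summary (examples omitted for brevity).
const summary = Object.entries(stats).map(([name, s]) => ({
  name,
  type: Array.from(s.types).join(' | '),
  frequency: s.count / docs.length,
  nullCount: s.nullCount,
}));
console.log(summary);
```

With these inputs, `profile.age` is reported with type `number | string` and frequency 1, while `profile.email` has frequency 0.5 and a null count of 1. Because `maxDepth` starts at 3 and decreases by one per level, paths longer than three segments are never recorded.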

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hendrickcastro/MCPCosmosDB'
