hendrickcastro

MCP CosmosDB

mcp_get_container_stats

Retrieve container statistics like document count, size, and partition key distribution for capacity planning and performance analysis.

Instructions

Get statistical information about a container including document count, estimated size, and partition key distribution. Use this for capacity planning and performance analysis. Example: mcp_get_container_stats({container_id: 'orders', sample_size: 500})
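
For capacity planning, a caller will usually inspect the returned partition key distribution for skew. The sketch below is illustrative only. The `ToolResult` shape is an assumption inferred from the handler's `{ success, data }` and `{ success, error }` returns, and `findHottestPartition` is a hypothetical helper that is not part of the server.

```typescript
// Result shapes mirrored from the implementation reference on this page.
interface PartitionKeyStats {
  partitionKeyValue: string;
  documentCount: number;
  sizeInKB: number;
}

interface ContainerStats {
  documentCount: number;
  sizeInKB: number;
  throughput?: number;
  partitionKeyStatistics?: PartitionKeyStats[];
}

// Assumed shape: the page shows only the two return statements, not the type itself.
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Hypothetical helper: returns the partition holding the largest share of the
// estimated documents, or null when the call failed or nothing was sampled.
function findHottestPartition(
  result: ToolResult<ContainerStats>
): { partitionKeyValue: string; share: number } | null {
  if (!result.success) return null;
  const partitions = result.data.partitionKeyStatistics ?? [];
  const total = partitions.reduce((sum, p) => sum + p.documentCount, 0);
  if (partitions.length === 0 || total === 0) return null;
  const hottest = partitions.reduce((a, b) => (b.documentCount > a.documentCount ? b : a));
  return {
    partitionKeyValue: hottest.partitionKeyValue,
    share: hottest.documentCount / total,
  };
}
```

A share close to 1 for a single key value would suggest that most sampled documents land in one logical partition. Because the statistics come from a sample, treat the figure as an estimate.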

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| container_id | Yes | The ID/name of the container to analyze | |
| sample_size | No | Number of documents to sample for statistics (higher = more accurate but slower) | 1000 |
| connection_id | No | ID of the connection to use. Use mcp_list_connections to see available connections. If not specified, uses the default connection. | |
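
The schema only declares `sample_size` as a number, and the handler interpolates it directly into `SELECT TOP ${sample_size}`. A caller may therefore want to normalize arguments before invoking the tool. The following is a minimal sketch under that assumption: `normalizeArgs` and `NormalizedArgs` are hypothetical names, and the positive-integer check is an added precaution rather than something the schema enforces.

```typescript
// Argument shape taken from the input schema.
interface ContainerStatsArgs {
  container_id: string;
  sample_size?: number;
  connection_id?: string;
}

// Hypothetical normalized form with the default already applied.
interface NormalizedArgs {
  container_id: string;
  sample_size: number;
  connection_id?: string;
}

const DEFAULT_SAMPLE_SIZE = 1000; // default stated in the schema

// Hypothetical helper: applies the schema default and rejects values that
// would not make sense as the `n` in `SELECT TOP n`.
function normalizeArgs(args: ContainerStatsArgs): NormalizedArgs {
  const { container_id, sample_size = DEFAULT_SAMPLE_SIZE, connection_id } = args;
  if (typeof container_id !== "string" || container_id.length === 0) {
    throw new Error("container_id is required");
  }
  // Reject NaN, Infinity, fractions, zero, and negative values.
  if (!isFinite(sample_size) || Math.floor(sample_size) !== sample_size || sample_size <= 0) {
    throw new Error("sample_size must be a positive integer");
  }
  return connection_id === undefined
    ? { container_id, sample_size }
    : { container_id, sample_size, connection_id };
}
```

Calling `normalizeArgs({ container_id: 'orders' })` yields a `sample_size` of 1000, matching the documented default.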

Implementation Reference

  • The handler function that executes the tool logic for mcp_get_container_stats. It counts total documents, samples documents to estimate size, and analyzes partition key distribution.
    /**
     * Get statistical information about a container (document count, size, partition distribution)
     */
    export const mcp_get_container_stats = async (args: { container_id: string; sample_size?: number; connection_id?: string }): Promise<ToolResult<ContainerStats>> => {
      const { container_id, sample_size = 1000, connection_id } = args;
      log(`Executing mcp_get_container_stats with: ${JSON.stringify(args)}`);
    
      try {
        const container = getContainer(container_id, connection_id);
        
        // Query to count total documents
        const countQuery = 'SELECT VALUE COUNT(1) FROM c';
        const { resources: countResult } = await container.items.query(countQuery).fetchAll();
        const documentCount = countResult[0] || 0;
    
        // Get partition key path for statistics
        const { resource: containerDef } = await container.read();
        if (!containerDef || !containerDef.partitionKey || !containerDef.partitionKey.paths || containerDef.partitionKey.paths.length === 0) {
          throw new Error(`Container ${container_id} does not have a valid partition key defined`);
        }
        const partitionKeyPath = containerDef.partitionKey.paths[0];
    
        // Sample documents to estimate size and analyze partitions
        const sampleQuery = `SELECT TOP ${sample_size} * FROM c`;
        const { resources: sampleDocs } = await container.items.query(sampleQuery).fetchAll();
    
        // Calculate estimated size based on sample
        let totalSampleSize = 0;
        const partitionStats: Record<string, { count: number; size: number }> = {};
    
        sampleDocs.forEach(doc => {
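          // Approximate the document size by its serialized length (characters, not exact bytes)
          const docSize = JSON.stringify(doc).length;
          totalSampleSize += docSize;
    
          // Get partition key value (path without its leading '/')
          const partitionValue = getNestedProperty(doc, partitionKeyPath.substring(1));
          // Use ?? rather than || so falsy keys such as 0, false, or '' are not bucketed as 'undefined'
          const partitionKey = String(partitionValue ?? 'undefined');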
    
          if (!partitionStats[partitionKey]) {
            partitionStats[partitionKey] = { count: 0, size: 0 };
          }
          partitionStats[partitionKey].count++;
          partitionStats[partitionKey].size += docSize;
        });
    
        // Estimate total size
        const avgDocSize = sampleDocs.length > 0 ? totalSampleSize / sampleDocs.length : 0;
        const estimatedSizeInKB = Math.round((documentCount * avgDocSize) / 1024);
    
        // Convert partition stats. Note: documentCount is extrapolated to the whole
        // container, while sizeInKB is the size observed in the sample only.
        const partitionKeyStatistics = Object.entries(partitionStats).map(([key, stats]) => ({
          partitionKeyValue: key,
          documentCount: Math.round((stats.count / sampleDocs.length) * documentCount),
          sizeInKB: Math.round(stats.size / 1024)
        }));
    
        const containerStats: ContainerStats = {
          documentCount,
          sizeInKB: estimatedSizeInKB,
          partitionKeyStatistics
        };
    
        return { success: true, data: containerStats };
      } catch (error: any) {
        log(`Error in mcp_get_container_stats for container ${container_id}: ${error.message}`);
        return { success: false, error: error.message };
      }
    };
  • The ContainerStats and PartitionKeyStats interfaces defining the return type shape for mcp_get_container_stats.
    export interface ContainerStats {
      documentCount: number;
      sizeInKB: number;
      throughput?: number;
      partitionKeyStatistics?: PartitionKeyStats[];
    }
    
    export interface PartitionKeyStats {
      partitionKeyValue: string;
      documentCount: number;
      sizeInKB: number;
    }
  • src/tools.ts:64-84 (registration)
    The tool registration definition with name, description, and inputSchema (container_id required, sample_size optional, connection_id optional).
    // 4. Container Statistics
    {
      name: "mcp_get_container_stats",
      description: "Get statistical information about a container including document count, estimated size, and partition key distribution. Use this for capacity planning and performance analysis. Example: mcp_get_container_stats({container_id: 'orders', sample_size: 500})",
      inputSchema: {
        type: "object",
        properties: {
          container_id: {
            type: "string",
            description: "The ID/name of the container to analyze"
          },
          sample_size: {
            type: "number",
            description: "Number of documents to sample for statistics (default: 1000, higher = more accurate but slower)",
            default: 1000
          },
          ...connectionIdProperty
        },
        required: ["container_id"]
      }
    },
  • Helper function getNestedProperty used to extract partition key values from nested document properties.
    // Helper function to get nested property from object
    function getNestedProperty(obj: any, path: string): any {
      return path.split('/').reduce((current, key) => {
        return current && current[key] !== undefined ? current[key] : undefined;
      }, obj);
    }
  • src/server.ts:130-132 (registration)
    The server-side routing/dispatch that maps the 'mcp_get_container_stats' tool name to the handler function.
    case 'mcp_get_container_stats':
        result = await toolHandlers.mcp_get_container_stats(input as any);
        break;
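
The sampling arithmetic in the handler can be tried without a Cosmos DB connection. The sketch below re-implements it as a pure function over an in-memory sample; `estimateStats` and `Doc` are hypothetical names, the sample documents are made up for illustration, and `getNestedProperty` is a typed variant of the helper listed above rather than the repository's exact code.

```typescript
type Doc = Record<string, unknown>;

// Walk a '/'-separated path (without the leading '/') through nested objects.
function getNestedProperty(obj: unknown, path: string): unknown {
  return path.split("/").reduce<unknown>((current, key) => {
    if (current !== null && typeof current === "object") {
      return (current as Record<string, unknown>)[key];
    }
    return undefined;
  }, obj);
}

// Hypothetical pure version of the handler's estimation step.
function estimateStats(sampleDocs: Doc[], documentCount: number, partitionKeyPath: string) {
  let totalSampleSize = 0;
  const perPartition: Record<string, { count: number; size: number }> = {};

  for (const doc of sampleDocs) {
    // Serialized length in characters, used as an approximation of bytes.
    const docSize = JSON.stringify(doc).length;
    totalSampleSize += docSize;

    const value = getNestedProperty(doc, partitionKeyPath.substring(1));
    const key = String(value ?? "undefined");
    if (!perPartition[key]) perPartition[key] = { count: 0, size: 0 };
    perPartition[key].count += 1;
    perPartition[key].size += docSize;
  }

  const avgDocSize = sampleDocs.length > 0 ? totalSampleSize / sampleDocs.length : 0;
  return {
    documentCount,
    // Container size: average sampled document size scaled to the full count.
    sizeInKB: Math.round((documentCount * avgDocSize) / 1024),
    partitionKeyStatistics: Object.keys(perPartition).map((key) => ({
      partitionKeyValue: key,
      // Sample share scaled to the full document count.
      documentCount: Math.round((perPartition[key].count / sampleDocs.length) * documentCount),
      // Size seen in the sample only (not extrapolated), as in the handler.
      sizeInKB: Math.round(perPartition[key].size / 1024),
    })),
  };
}

// Four made-up sample documents from a container assumed to hold 1000 documents.
const stats = estimateStats(
  [
    { id: "1", customer: { region: "eu" } },
    { id: "2", customer: { region: "eu" } },
    { id: "3", customer: { region: "eu" } },
    { id: "4", customer: { region: "us" } },
  ],
  1000,
  "/customer/region"
);
// "eu" holds 3 of 4 sampled documents, so its estimated documentCount is 750; "us" is 250.
console.log(stats.partitionKeyStatistics);
```

Because per-partition counts are scaled from the sample, partition key values that never appear in the sample are absent from the output, which is one reason a larger `sample_size` gives a more accurate distribution.
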
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description adds some behavioral context: it notes that higher sample_size is 'more accurate but slower'. But it omits details like read-only nature, authentication requirements, or potential impact on container performance. This is a marginal improvement over no description.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences plus a helpful example. Every sentence adds necessary information with no filler or repetition. Front-loaded with purpose, then usage, then example.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description lists the statistical outputs (document count, estimated size, partition key distribution), which adequately informs about return values. It could mention the return format (JSON) but is sufficient for a tool this simple.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already covers all parameters (100% coverage). The description adds value by providing an example with sample_size=500 and an additional note on the trade-off between accuracy and speed for sample_size. This goes beyond the schema's default description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('get') and resources ('statistical information about a container') and lists concrete metrics (document count, estimated size, partition key distribution). It clearly distinguishes from sibling tools like mcp_analyze_schema or query tools by focusing on stats for capacity planning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states 'Use this for capacity planning and performance analysis', providing a clear use case. However, it does not mention when not to use it or point to alternative tools (e.g., mcp_analyze_schema for schema details).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hendrickcastro/MCPCosmosDB'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.