MCP CosmosDB

mcp_get_container_stats

Retrieve container statistics like document count, size, and partition key distribution for capacity planning and performance analysis.

Instructions

Get statistical information about a container including document count, estimated size, and partition key distribution. Use this for capacity planning and performance analysis. Example: mcp_get_container_stats({container_id: 'orders', sample_size: 500})
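
On the wire, the example call above would typically travel as an MCP `tools/call` request. The following is a minimal sketch of that JSON-RPC payload; `buildStatsRequest` and `ToolCallRequest` are hypothetical names introduced here for illustration, and only the tool name and argument names come from this page.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of the server): builds the request body
// for mcp_get_container_stats, omitting optional arguments that are unset.
function buildStatsRequest(
  id: number,
  containerId: string,
  sampleSize?: number,
  connectionId?: string
): ToolCallRequest {
  const args: Record<string, unknown> = { container_id: containerId };
  if (sampleSize !== undefined) args["sample_size"] = sampleSize;
  if (connectionId !== undefined) args["connection_id"] = connectionId;
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "mcp_get_container_stats", arguments: args },
  };
}

// Same arguments as the example in the tool description.
const request = buildStatsRequest(1, "orders", 500);
console.log(JSON.stringify(request, null, 2));
```

Leaving `connection_id` out of the arguments lets the server fall back to its default connection, as described in the input schema.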

Input Schema


Name	Required	Description
`container_id`	Yes	The ID/name of the container to analyze
`sample_size`	No	Number of documents to sample for statistics (default: 1000, higher = more accurate but slower)
`connection_id`	No	ID of the connection to use. Use mcp_list_connections to see available connections. If not specified, uses the default connection.
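
The schema types `sample_size` as a plain number, and the handler shown further down interpolates it directly into a `SELECT TOP ...` query. A caller or wrapper may therefore want to normalize the arguments first. This is a sketch under that assumption; `normalizeStatsArgs` and `NormalizedStatsArgs` are hypothetical and do not exist in the server.

```typescript
interface StatsArgs {
  container_id: string;
  sample_size?: number;
  connection_id?: string;
}

interface NormalizedStatsArgs {
  container_id: string;
  sample_size: number;
  connection_id?: string;
}

const DEFAULT_SAMPLE_SIZE = 1000; // default documented in the input schema

// Hypothetical helper: applies the documented default and rejects values
// that would not make sense as a TOP clause (fractions, zero, negatives).
function normalizeStatsArgs(args: StatsArgs): NormalizedStatsArgs {
  if (typeof args.container_id !== "string" || args.container_id.trim() === "") {
    throw new Error("container_id is required and must be a non-empty string");
  }
  const sampleSize = args.sample_size ?? DEFAULT_SAMPLE_SIZE;
  if (!Number.isInteger(sampleSize) || sampleSize < 1) {
    throw new Error("sample_size must be a positive integer");
  }
  const normalized: NormalizedStatsArgs = {
    container_id: args.container_id,
    sample_size: sampleSize,
  };
  if (args.connection_id !== undefined) {
    normalized.connection_id = args.connection_id;
  }
  return normalized;
}

console.log(normalizeStatsArgs({ container_id: "orders" }));
```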

Implementation Reference

src/tools/containerAnalysis.ts:126-193 (handler)

The handler function that executes the tool logic for mcp_get_container_stats. It counts total documents, samples documents to estimate size, and analyzes partition key distribution.

/**
 * Get statistical information about a container (document count, size, partition distribution)
 */
export const mcp_get_container_stats = async (args: { container_id: string; sample_size?: number; connection_id?: string }): Promise<ToolResult<ContainerStats>> => {
  const { container_id, sample_size = 1000, connection_id } = args;
  log(`Executing mcp_get_container_stats with: ${JSON.stringify(args)}`);

  try {
    const container = getContainer(container_id, connection_id);
    
    // Query to count total documents
    const countQuery = 'SELECT VALUE COUNT(1) FROM c';
    const { resources: countResult } = await container.items.query(countQuery).fetchAll();
    const documentCount = countResult[0] || 0;

    // Get partition key path for statistics
    const { resource: containerDef } = await container.read();
    if (!containerDef || !containerDef.partitionKey || !containerDef.partitionKey.paths || containerDef.partitionKey.paths.length === 0) {
      throw new Error(`Container ${container_id} does not have a valid partition key defined`);
    }
    const partitionKeyPath = containerDef.partitionKey.paths[0];

    // Sample documents to estimate size and analyze partitions
    const sampleQuery = `SELECT TOP ${sample_size} * FROM c`;
    const { resources: sampleDocs } = await container.items.query(sampleQuery).fetchAll();

    // Calculate estimated size based on sample
    let totalSampleSize = 0;
    const partitionStats: Record<string, { count: number; size: number }> = {};

    sampleDocs.forEach(doc => {
      const docSize = JSON.stringify(doc).length;
      totalSampleSize += docSize;

      // Get partition key value
      const partitionValue = getNestedProperty(doc, partitionKeyPath.substring(1)); // Remove leading '/'
      const partitionKey = String(partitionValue || 'undefined');

      if (!partitionStats[partitionKey]) {
        partitionStats[partitionKey] = { count: 0, size: 0 };
      }
      partitionStats[partitionKey].count++;
      partitionStats[partitionKey].size += docSize;
    });

    // Estimate total size
    const avgDocSize = sampleDocs.length > 0 ? totalSampleSize / sampleDocs.length : 0;
    const estimatedSizeInKB = Math.round((documentCount * avgDocSize) / 1024);

    // Convert partition stats
    const partitionKeyStatistics = Object.entries(partitionStats).map(([key, stats]) => ({
      partitionKeyValue: key,
      documentCount: Math.round((stats.count / sampleDocs.length) * documentCount),
      sizeInKB: Math.round(stats.size / 1024)
    }));

    const containerStats: ContainerStats = {
      documentCount,
      sizeInKB: estimatedSizeInKB,
      partitionKeyStatistics
    };

    return { success: true, data: containerStats };
  } catch (error: any) {
    log(`Error in mcp_get_container_stats for container ${container_id}: ${error.message}`);
    return { success: false, error: error.message };
  }
};
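
The estimation arithmetic in the handler can be reproduced on in-memory data. The sketch below mirrors the same steps (character length of `JSON.stringify(doc)` as the size, average size times total count, sample share times total count) without touching Cosmos DB. `estimateFromSample` is a hypothetical function, it only reads a top-level partition key field, and the sample documents are made up.

```typescript
interface PartitionEstimate {
  partitionKeyValue: string;
  documentCount: number; // extrapolated to the whole container
  sizeInKB: number;      // size of the sampled documents only, as in the handler
}

// Hypothetical re-implementation of the handler's arithmetic for illustration.
function estimateFromSample(
  sampleDocs: Array<Record<string, unknown>>,
  partitionKeyField: string,
  documentCount: number
): { avgDocSize: number; sizeInKB: number; partitions: PartitionEstimate[] } {
  let totalSampleSize = 0;
  const perPartition = new Map<string, { count: number; size: number }>();

  for (const doc of sampleDocs) {
    // String length counts UTF-16 code units, so this approximates bytes.
    const docSize = JSON.stringify(doc).length;
    totalSampleSize += docSize;

    const key = String(doc[partitionKeyField] || "undefined");
    const entry = perPartition.get(key) ?? { count: 0, size: 0 };
    entry.count += 1;
    entry.size += docSize;
    perPartition.set(key, entry);
  }

  const avgDocSize = sampleDocs.length > 0 ? totalSampleSize / sampleDocs.length : 0;
  const sizeInKB = Math.round((documentCount * avgDocSize) / 1024);
  const partitions = Array.from(perPartition.entries()).map(([key, stats]) => ({
    partitionKeyValue: key,
    documentCount: Math.round((stats.count / sampleDocs.length) * documentCount),
    sizeInKB: Math.round(stats.size / 1024),
  }));
  return { avgDocSize, sizeInKB, partitions };
}

// Made-up sample: four 24-character documents, three in "eu" and one in "us".
const sampleDocs = [
  { id: "1", region: "eu" },
  { id: "2", region: "eu" },
  { id: "3", region: "eu" },
  { id: "4", region: "us" },
];
const estimate = estimateFromSample(sampleDocs, "region", 1000);
console.log(estimate);
```

For this sample and a total count of 1000, the average document size is 24 characters, the container estimate is 23 KB, and the extrapolated counts are 750 for "eu" and 250 for "us". Note that, as in the handler, the per-partition `documentCount` is scaled to the whole container while the per-partition `sizeInKB` covers only the sampled documents, so the two fields are not directly comparable.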

src/tools/types.ts:38-49 (schema)

The ContainerStats and PartitionKeyStats interfaces defining the return type shape for mcp_get_container_stats.

export interface ContainerStats {
  documentCount: number;
  sizeInKB: number;
  throughput?: number;
  partitionKeyStatistics?: PartitionKeyStats[];
}

export interface PartitionKeyStats {
  partitionKeyValue: string;
  documentCount: number;
  sizeInKB: number;
}
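
For capacity planning, a consumer of this return type might check how skewed the partition distribution is. A minimal sketch, assuming the interfaces above; `largestPartitionShare` is a hypothetical helper and the statistics are invented for the example.

```typescript
interface PartitionKeyStats {
  partitionKeyValue: string;
  documentCount: number;
  sizeInKB: number;
}

interface ContainerStats {
  documentCount: number;
  sizeInKB: number;
  throughput?: number;
  partitionKeyStatistics?: PartitionKeyStats[];
}

// Hypothetical helper: returns the partition key value holding the largest
// estimated share of documents, or undefined when nothing can be computed.
function largestPartitionShare(
  stats: ContainerStats
): { partitionKeyValue: string; share: number } | undefined {
  const partitions = stats.partitionKeyStatistics ?? [];
  if (partitions.length === 0 || stats.documentCount === 0) {
    return undefined;
  }
  const top = partitions.reduce((best, current) =>
    current.documentCount > best.documentCount ? current : best
  );
  return {
    partitionKeyValue: top.partitionKeyValue,
    share: top.documentCount / stats.documentCount,
  };
}

// Invented result, shaped like a successful tool response's data.
const exampleStats: ContainerStats = {
  documentCount: 1000,
  sizeInKB: 23,
  partitionKeyStatistics: [
    { partitionKeyValue: "eu", documentCount: 750, sizeInKB: 0 },
    { partitionKeyValue: "us", documentCount: 250, sizeInKB: 0 },
  ],
};
const hottest = largestPartitionShare(exampleStats);
console.log(hottest);
```

Because the counts are extrapolated from a sample, the share is an estimate; a larger `sample_size` should tighten it at the cost of a slower call.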

src/tools.ts:64-84 (registration)

The tool registration definition with name, description, and inputSchema (container_id required, sample_size optional, connection_id optional).

// 4. Container Statistics
{
  name: "mcp_get_container_stats",
  description: "Get statistical information about a container including document count, estimated size, and partition key distribution. Use this for capacity planning and performance analysis. Example: mcp_get_container_stats({container_id: 'orders', sample_size: 500})",
  inputSchema: {
    type: "object",
    properties: {
      container_id: {
        type: "string",
        description: "The ID/name of the container to analyze"
      },
      sample_size: {
        type: "number",
        description: "Number of documents to sample for statistics (default: 1000, higher = more accurate but slower)",
        default: 1000
      },
      ...connectionIdProperty
    },
    required: ["container_id"]
  }
},

src/tools/containerAnalysis.ts:195-201 (handler)

Helper function getNestedProperty used to extract partition key values from nested document properties.

// Helper function to get nested property from object
function getNestedProperty(obj: any, path: string): any {
  return path.split('/').reduce((current, key) => {
    return current && current[key] !== undefined ? current[key] : undefined;
  }, obj);
}
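
A short usage sketch of the helper, with a made-up document. It also shows one consequence of how the handler consumes the result: because the handler applies `String(partitionValue || 'undefined')`, falsy partition key values such as `0` end up in the "undefined" bucket.

```typescript
// Copy of the helper shown above, so the example is self-contained.
function getNestedProperty(obj: any, path: string): any {
  return path.split('/').reduce((current, key) => {
    return current && current[key] !== undefined ? current[key] : undefined;
  }, obj);
}

// Made-up document for illustration.
const doc = { id: "42", address: { city: "Lisbon" }, tenantId: 0 };

// A partition key path such as "/address/city" has its leading "/" stripped
// by the handler before the lookup.
const partitionKeyPath = "/address/city";
const city = getNestedProperty(doc, partitionKeyPath.substring(1));

// Missing intermediate properties resolve to undefined instead of throwing.
const missing = getNestedProperty(doc, "billing/country");

// The helper itself returns 0, but the handler's `||` fallback relabels it.
const rawTenant = getNestedProperty(doc, "tenantId");
const bucketLabel = String(rawTenant || "undefined");

console.log(city, missing, rawTenant, bucketLabel);
```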

src/server.ts:130-132 (registration)

The server-side routing/dispatch that maps the 'mcp_get_container_stats' tool name to the handler function.

```
case 'mcp_get_container_stats':
    result = await toolHandlers.mcp_get_container_stats(input as any);
    break;
```
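
What happens to `result` after the switch is not shown on this page. One plausible continuation, sketched here purely as an assumption, is to wrap the handler's `ToolResult` in the MCP tool-result shape (a `content` array of text items plus an optional `isError` flag). The `ToolResult` union below is inferred from the handler's two return statements, and `toCallToolResult` is a hypothetical name.

```typescript
// Inferred from the handler's return statements; the real type may differ.
type ToolResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Minimal subset of the MCP tool-result shape used in this sketch.
interface CallToolResult {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Hypothetical adapter: serializes successful data as JSON text and marks
// failures with isError so the client can distinguish them.
function toCallToolResult<T>(result: ToolResult<T>): CallToolResult {
  if (result.success) {
    return {
      content: [{ type: "text", text: JSON.stringify(result.data, null, 2) }],
    };
  }
  return {
    content: [{ type: "text", text: `Error: ${result.error}` }],
    isError: true,
  };
}

const okResult = toCallToolResult({ success: true, data: { documentCount: 3 } });
const failedResult = toCallToolResult<unknown>({ success: false, error: "Container not found" });
console.log(okResult, failedResult);
```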

Tool Definition Quality

A (4.2/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description adds some behavioral context: it notes that higher sample_size is 'more accurate but slower'. But it omits details like read-only nature, authentication requirements, or potential impact on container performance. This is a marginal improvement over no description.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
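
One way to close the behavioral gap noted above would be to state the read-only nature in the description and in MCP tool annotations. The sketch below is a suggestion, not the server's actual registration: the `ToolDefinition` interface is a local approximation of the MCP tool shape, the extended description wording is new, and the read-only claim rests only on the handler source shown on this page (a count query, a sample query, and a container read). The `connection_id` property is written out from the input table because the contents of `connectionIdProperty` are not shown.

```typescript
// Local approximation of an MCP tool definition, for illustration only.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string; default?: unknown }>;
    required: string[];
  };
  annotations?: { title?: string; readOnlyHint?: boolean };
}

const containerStatsTool: ToolDefinition = {
  name: "mcp_get_container_stats",
  description:
    "Get statistical information about a container including document count, " +
    "estimated size, and partition key distribution. Read-only: it runs a count " +
    "query and a sampled query and never modifies data. Larger sample_size values " +
    "read more documents and take longer. Use this for capacity planning and " +
    "performance analysis.",
  inputSchema: {
    type: "object",
    properties: {
      container_id: {
        type: "string",
        description: "The ID/name of the container to analyze",
      },
      sample_size: {
        type: "number",
        description: "Number of documents to sample for statistics (default: 1000)",
        default: 1000,
      },
      connection_id: {
        type: "string",
        description: "ID of the connection to use; defaults to the default connection",
      },
    },
    required: ["container_id"],
  },
  // readOnlyHint is a hint for clients, not an enforced guarantee.
  annotations: { title: "Container statistics", readOnlyHint: true },
};

console.log(containerStatsTool.annotations);
```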

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences plus a helpful example. Every sentence adds necessary information with no filler or repetition. Front-loaded with purpose, then usage, then example.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description lists the statistical outputs (document count, estimated size, partition key distribution), which adequately informs about return values. It could mention the return format (JSON) but is sufficient for a tool this simple.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already covers all parameters (100% coverage). The description adds value by providing an example with sample_size=500 and an additional note on the trade-off between accuracy and speed for sample_size. This goes beyond the schema's default description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('get') and resources ('statistical information about a container') and lists concrete metrics (document count, estimated size, partition key distribution). It clearly distinguishes from sibling tools like mcp_analyze_schema or query tools by focusing on stats for capacity planning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states 'Use this for capacity planning and performance analysis', providing a clear use case. However, it does not mention when not to use it or point to alternative tools (e.g., mcp_analyze_schema for schema details).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hendrickcastro/MCPCosmosDB'
