pab1it0

adx-mcp-server

sample_table_data

Get a random sample of rows from any Azure Data Explorer table, with control over the number of rows returned.

Instructions

Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).

Input Schema

Name          Required   Description   Default
table_name    Yes
sample_size   No

Output Schema

Name     Required   Description   Default
result   Yes

Implementation Reference

  • The main tool handler function for sample_table_data. It validates the table name and sample size, executes a KQL 'sample' query on Azure Data Explorer, and formats the results.
    @mcp.tool(description="Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).")
    async def sample_table_data(table_name: str, sample_size: int = 10) -> List[Dict[str, Any]]:
        """Get sample data from a table."""
        table_name = validate_table_name(table_name)
        sample_size = validate_sample_size(sample_size)
        logger.info("Sampling table data", table_name=table_name, sample_size=sample_size, database=config.database)
    
        if not config.cluster_url or not config.database:
            logger.error("Missing ADX configuration")
            raise ValueError("Azure Data Explorer configuration is missing. Please set ADX_CLUSTER_URL and ADX_DATABASE environment variables.")
    
        try:
            client = get_kusto_client()
            query = f"{table_name} | sample {sample_size}"
            result_set = client.execute(config.database, query)
            results = format_query_results(result_set)
            logger.info("Sample data retrieved successfully", table_name=table_name, row_count=len(results))
            return results
        except Exception as e:
            logger.error("Failed to sample table data", table_name=table_name, error=str(e), exception_type=type(e).__name__)
            raise
  • The @mcp.tool decorator registers sample_table_data as an MCP tool with a description.
    @mcp.tool(description="Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).")
  • Helper function validate_sample_size that validates sample_size is a positive integer.
    def validate_sample_size(sample_size: int) -> int:
        """Validate sample_size is a positive integer."""
        if not isinstance(sample_size, int) or sample_size <= 0:
            raise ValueError(f"sample_size must be a positive integer, got: {sample_size}")
        return sample_size
  • Helper function validate_table_name that validates table names against a safe regex pattern to prevent KQL injection.
    def validate_table_name(table_name: str) -> str:
        """Validate a KQL table name to prevent injection attacks.
    
        Allows simple identifiers (my_table) and dot-qualified names (database.table).
        Rejects any characters that could enable KQL injection.
        """
        if not table_name or not table_name.strip():
            raise ValueError("Table name cannot be empty")
        table_name = table_name.strip()
        if not _TABLE_NAME_PATTERN.match(table_name):
            raise ValueError(
                f"Invalid table name: '{table_name}'. "
                "Table names must contain only letters, digits, underscores, "
                "and dots (for qualified names like 'database.table')."
            )
        return table_name
  • Helper function format_query_results that converts Kusto query results into a list of dictionaries, used by sample_table_data.
    def format_query_results(result_set) -> List[Dict[str, Any]]:
        """
        Format Kusto query results into a list of dictionaries.
    
        Args:
            result_set: Raw result set from KustoClient
    
        Returns:
            List of dictionaries with column names as keys
        """
        if not result_set or not result_set.primary_results:
            logger.debug("Empty or null result set received")
            return []
    
        try:
            primary_result = result_set.primary_results[0]
            columns = [col.column_name for col in primary_result.columns]
    
            formatted_results = []
            for row in primary_result.rows:
                record = {}
                for i, value in enumerate(row):
                    record[columns[i]] = value
                formatted_results.append(record)
    
            logger.debug("Query results formatted", row_count=len(formatted_results), columns=columns)
            return formatted_results
        except Exception as e:
            logger.error(
                "Error formatting query results",
                error=str(e),
                exception_type=type(e).__name__
            )
            raise
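
The helpers listed above can be exercised without a cluster. The following is a minimal sketch, not the server's code: the regex assigned to `_TABLE_NAME_PATTERN` is an assumption (the excerpt does not show its definition), `build_sample_query` and `rows_to_dicts` are hypothetical names introduced here, the logging calls are omitted, and `StormEvents` is only an example table name. The stub result set mimics just the attributes that `format_query_results` reads (`primary_results`, `columns`, `column_name`, `rows`).

```python
import re
from types import SimpleNamespace
from typing import Any, Dict, List

# ASSUMPTION: stand-in pattern; the real _TABLE_NAME_PATTERN is not shown in the excerpt.
_TABLE_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def validate_table_name(table_name: str) -> str:
    """Accept simple or dot-qualified identifiers; reject everything else."""
    if not table_name or not table_name.strip():
        raise ValueError("Table name cannot be empty")
    table_name = table_name.strip()
    # fullmatch is used here so the stand-in pattern needs no anchors.
    if not _TABLE_NAME_PATTERN.fullmatch(table_name):
        raise ValueError(f"Invalid table name: '{table_name}'")
    return table_name


def validate_sample_size(sample_size: int) -> int:
    """Same check as the helper above: a positive integer."""
    if not isinstance(sample_size, int) or sample_size <= 0:
        raise ValueError(f"sample_size must be a positive integer, got: {sample_size}")
    return sample_size


def build_sample_query(table_name: str, sample_size: int = 10) -> str:
    """Hypothetical helper: build the KQL string the handler sends."""
    return f"{validate_table_name(table_name)} | sample {validate_sample_size(sample_size)}"


def rows_to_dicts(result_set) -> List[Dict[str, Any]]:
    """Hypothetical, logger-free equivalent of format_query_results."""
    if not result_set or not result_set.primary_results:
        return []
    primary = result_set.primary_results[0]
    columns = [col.column_name for col in primary.columns]
    return [{columns[i]: value for i, value in enumerate(row)} for row in primary.rows]


if __name__ == "__main__":
    print(build_sample_query("StormEvents", 5))

    # Stub standing in for a Kusto response; plain lists play the role of rows.
    fake = SimpleNamespace(primary_results=[SimpleNamespace(
        columns=[SimpleNamespace(column_name="State"), SimpleNamespace(column_name="EventCount")],
        rows=[["TEXAS", 3], ["OHIO", 1]],
    )])
    print(rows_to_dicts(fake))
```

Because the table name is interpolated directly into the query string, the validation step is what keeps input such as `Events | drop` out of the KQL text. One detail worth knowing: `bool` is a subclass of `int` in Python, so a value of `True` passes the `isinstance` check in `validate_sample_size`.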
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description alone must disclose behavior, yet it mentions only the sampling operation and the sample_size default. It does not say whether the operation is read-only, how the randomness works, or what sampling a large table implies. Transparency is minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, no redundant words. Efficiently communicates core function and key parameter.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Because an output schema exists, the description need not detail return values. It does omit context such as the source database name (only implied in the description), column selection, and ordering. That is sufficient for a simple sampling tool, but minimal.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must compensate. It explains sample_size (default: 10) but says nothing about table_name beyond the fact that it exists, so half of the parameters are undocumented in meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves a random sample of rows from a specified table in Azure Data Explorer. It uses a specific verb and resource, and implicitly distinguishes from sibling tools like list_tables or execute_query by specifying sampling behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies that the tool should be used when a random sample is needed, but it gives no explicit guidance on when to choose it over alternatives (e.g., execute_query for custom queries). It also says nothing about when not to use the tool or about prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
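
To make the gaps noted in the evaluation above concrete, here is one possible rewording of the tool description, offered as a sketch only. The text is hypothetical and is not what adx-mcp-server ships. Its claims are drawn from the implementation excerpt (the query is `<table_name> | sample <sample_size>` with no projection, and table names may be simple or dot-qualified) and from the sibling tool `execute_query` mentioned in the evaluation.

```python
# Hypothetical rewording; NOT the description shipped by adx-mcp-server.
REVISED_DESCRIPTION = (
    "Read-only. Returns a random sample of rows (all columns, no guaranteed order) "
    "from one table in the configured Azure Data Explorer database by running "
    "'<table_name> | sample <sample_size>'. "
    "table_name: a simple identifier (my_table) or a dot-qualified name (database.table). "
    "sample_size: positive integer, default 10. "
    "For filters, projections, or ordering, use execute_query instead."
)

if __name__ == "__main__":
    print(REVISED_DESCRIPTION)
```

A string like this could be passed as the `description` argument of the `@mcp.tool` decorator shown in the implementation reference. It is longer than the current two sentences, so it trades some of the conciseness score for coverage of behavior, parameters, and usage guidance.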

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pab1it0/adx-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.