sample_table_data
Get a random sample of rows from any Azure Data Explorer table, with control over the number of rows returned.
Instructions
Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).
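As a rough illustration of what the tool sends to the cluster, the sketch below composes the same query text the handler builds (`{table_name} | sample {sample_size}`). The helper name `build_sample_query` and the table name `StormEvents` are illustrative assumptions, not part of the server.

```python
def build_sample_query(table_name: str, sample_size: int = 10) -> str:
    """Compose the KQL text for a random sample (hypothetical helper name)."""
    # Same template the handler uses: pipe the table into KQL's `sample` operator.
    return f"{table_name} | sample {sample_size}"


print(build_sample_query("StormEvents"))      # StormEvents | sample 10
print(build_sample_query("StormEvents", 25))  # StormEvents | sample 25
```

Because the table name is interpolated directly into the query text, the handler validates it first; this sketch skips that step and should not be fed untrusted input.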
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
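| table_name | Yes | Name of the table to sample. Only letters, digits, underscores, and dots (for qualified names such as database.table) are accepted. | |
| sample_size | No | Number of rows to return. Must be a positive integer. | 10 |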
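To make the accepted range of sample_size concrete, the sketch below exercises the validate_sample_size helper listed under Implementation Reference. The function body is copied from that listing; the wrapper `is_accepted` and the candidate values are hypothetical additions for demonstration only.

```python
def validate_sample_size(sample_size: int) -> int:
    """Validate sample_size is a positive integer (copied from the listing)."""
    if not isinstance(sample_size, int) or sample_size <= 0:
        raise ValueError(f"sample_size must be a positive integer, got: {sample_size}")
    return sample_size


def is_accepted(value) -> bool:
    """Hypothetical wrapper: True if the value passes validation."""
    try:
        validate_sample_size(value)
        return True
    except ValueError:
        return False


# Print each candidate next to the validation outcome.
for candidate in (10, 1, 0, -5, "10", 2.5, True):
    print(repr(candidate), is_accepted(candidate))
```

Positive integers pass; zero, negatives, strings, and floats are rejected. Since `bool` is a subclass of `int` in Python, `True` also passes this check. Whether a boolean can actually reach the helper depends on how the MCP framework validates arguments upstream, which is not shown here.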
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | Sampled rows as a list of dictionaries keyed by column name. | |
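The shape of `result` can be sketched as follows. This is a simplified, logging-free variant of the server's format_query_results helper, using `dict(zip(...))` in place of its index loop. The name `rows_to_records`, the `SimpleNamespace` stand-in for a Kusto response, and the sample column names and values are all assumptions made for illustration.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def rows_to_records(result_set) -> List[Dict[str, Any]]:
    """Logging-free sketch of the column-to-value mapping (hypothetical name)."""
    if not result_set or not result_set.primary_results:
        return []
    primary = result_set.primary_results[0]
    columns = [col.column_name for col in primary.columns]
    # Pair each value with the column name at the same position.
    return [dict(zip(columns, row)) for row in primary.rows]


# Stand-in for a Kusto response: only the attributes the sketch reads are mocked.
fake_result = SimpleNamespace(
    primary_results=[
        SimpleNamespace(
            columns=[
                SimpleNamespace(column_name="State"),
                SimpleNamespace(column_name="EventType"),
            ],
            rows=[["TEXAS", "Flood"], ["OHIO", "Hail"]],
        )
    ]
)

records = rows_to_records(fake_result)
print(records)
```

Each returned row is one dictionary, so a caller can read a field with `records[0]["State"]`. An empty or missing result set yields an empty list.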
Implementation Reference
- src/adx_mcp_server/server.py:264-284 (handler) The main tool handler function for sample_table_data. It validates the table name and sample size, executes a KQL 'sample' query on Azure Data Explorer, and formats the results.

      @mcp.tool(description="Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).")
      async def sample_table_data(table_name: str, sample_size: int = 10) -> List[Dict[str, Any]]:
          """Get sample data from a table."""
          table_name = validate_table_name(table_name)
          sample_size = validate_sample_size(sample_size)
          logger.info("Sampling table data", table_name=table_name, sample_size=sample_size, database=config.database)

          if not config.cluster_url or not config.database:
              logger.error("Missing ADX configuration")
              raise ValueError("Azure Data Explorer configuration is missing. Please set ADX_CLUSTER_URL and ADX_DATABASE environment variables.")

          try:
              client = get_kusto_client()
              query = f"{table_name} | sample {sample_size}"
              result_set = client.execute(config.database, query)
              results = format_query_results(result_set)
              logger.info("Sample data retrieved successfully", table_name=table_name, row_count=len(results))
              return results
          except Exception as e:
              logger.error("Failed to sample table data", table_name=table_name, error=str(e), exception_type=type(e).__name__)
              raise

- src/adx_mcp_server/server.py:264-264 (registration) The @mcp.tool decorator registers sample_table_data as an MCP tool with a description.

      @mcp.tool(description="Retrieves a random sample of rows from the specified table in the Azure Data Explorer database. The sample_size parameter controls how many rows to return (default: 10).")

- src/adx_mcp_server/server.py:193-197 (helper) Helper function validate_sample_size that validates sample_size is a positive integer.

      def validate_sample_size(sample_size: int) -> int:
          """Validate sample_size is a positive integer."""
          if not isinstance(sample_size, int) or sample_size <= 0:
              raise ValueError(f"sample_size must be a positive integer, got: {sample_size}")
          return sample_size

- src/adx_mcp_server/server.py:176-191 (helper) Helper function validate_table_name that validates table names against a safe regex pattern to prevent KQL injection.

      def validate_table_name(table_name: str) -> str:
          """Validate a KQL table name to prevent injection attacks.

          Allows simple identifiers (my_table) and dot-qualified names (database.table).
          Rejects any characters that could enable KQL injection.
          """
          if not table_name or not table_name.strip():
              raise ValueError("Table name cannot be empty")
          table_name = table_name.strip()
          if not _TABLE_NAME_PATTERN.match(table_name):
              raise ValueError(
                  f"Invalid table name: '{table_name}'. "
                  "Table names must contain only letters, digits, underscores, "
                  "and dots (for qualified names like 'database.table')."
              )
          return table_name

- src/adx_mcp_server/server.py:139-172 (helper) Helper function format_query_results that converts Kusto query results into a list of dictionaries, used by sample_table_data.

      def format_query_results(result_set) -> List[Dict[str, Any]]:
          """
          Format Kusto query results into a list of dictionaries.

          Args:
              result_set: Raw result set from KustoClient

          Returns:
              List of dictionaries with column names as keys
          """
          if not result_set or not result_set.primary_results:
              logger.debug("Empty or null result set received")
              return []

          try:
              primary_result = result_set.primary_results[0]
              columns = [col.column_name for col in primary_result.columns]

              formatted_results = []
              for row in primary_result.rows:
                  record = {}
                  for i, value in enumerate(row):
                      record[columns[i]] = value
                  formatted_results.append(record)

              logger.debug("Query results formatted", row_count=len(formatted_results), columns=columns)
              return formatted_results
          except Exception as e:
              logger.error(
                  "Error formatting query results",
                  error=str(e),
                  exception_type=type(e).__name__
              )
              raise
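
The definition of `_TABLE_NAME_PATTERN` is not included in the excerpts above, so the following is only a guess at what such a pattern might look like, based on the rule stated in the error message (letters, digits, underscores, and dots for qualified names). The names `GUESSED_TABLE_NAME_PATTERN` and `looks_like_safe_table_name` are hypothetical, and the real pattern may be stricter or looser.

```python
import re

# ASSUMPTION: the real _TABLE_NAME_PATTERN is not shown in the listing.
# This guess encodes only what the error message describes: segments of
# letters, digits, and underscores, optionally joined by single dots.
GUESSED_TABLE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9_]+)*\Z")


def looks_like_safe_table_name(name: str) -> bool:
    """Hypothetical check mirroring validate_table_name's strip-then-match flow."""
    stripped = name.strip() if name else ""
    return bool(stripped) and GUESSED_TABLE_NAME_PATTERN.match(stripped) is not None


# Print each candidate next to the outcome under the guessed pattern.
for candidate in (
    "StormEvents",
    "Samples.StormEvents",
    " my_table ",
    "StormEvents | take 5",
    "T; .drop table T",
    "a..b",
    "",
):
    print(repr(candidate), looks_like_safe_table_name(candidate))
```

Under this guessed pattern, plain and dot-qualified identifiers pass, while anything containing spaces, pipes, semicolons, or empty segments is rejected. That is the property that keeps the interpolated `{table_name} | sample {sample_size}` query from carrying extra KQL operators or commands.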