Skip to main content
Glama

analyze_columns

Analyze dataset columns to reveal statistical insights, value distributions, and numeric metrics for up to 10 columns simultaneously.

Instructions

Analyzes specific columns in a dataset by running statistical queries and returning computed metrics. This tool allows users to get statistical information about a specific column, including value distribution, top values, and numeric statistics (for numeric columns). Supports analyzing up to 10 columns at once by specifying an array of column names in the 'columns' parameter. When multiple columns are specified, they will be analyzed together as a group, showing the distribution of their combined values. Use this tool before running queries to get a better understanding of the data in your dataset.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
environmentYesThe Honeycomb environment containing the dataset
datasetYesThe dataset containing the column to analyze
columnsYesThe names of the columns to analyze
timeRangeNoTime range in seconds to analyze. Default is 2 hours.

Implementation Reference

  • The main handler function for the 'analyze_columns' tool. It validates inputs, calls the API to perform column analysis, processes the results to compute statistics, top values, cardinality, and returns a formatted JSON response.
    handler: async (params: z.infer<typeof ColumnAnalysisSchema>) => { try { // Validate required parameters if (!params.environment) { throw new Error("Missing required parameter: environment"); } if (!params.dataset) { throw new Error("Missing required parameter: dataset"); } if (!params.columns || params.columns.length === 0) { throw new Error("Missing required parameter: columns"); } if (params.columns.length > 10) { throw new Error("Too many columns requested. Maximum is 10."); } // Execute the analysis via the API const result = await api.analyzeColumns(params.environment, params.dataset, params); // Initialize the response const simplifiedResponse: SimplifiedColumnAnalysis = { columns: params.columns, count: result.data?.results?.length || 0, totalEvents: 0, // Will be populated below if available }; // Add top values if we have results if (result.data?.results && result.data.results.length > 0) { const results = result.data.results as QueryResultValue[]; const firstResult = results[0]; try { // Calculate total events across all results const totalCount = results.reduce((sum, row) => { const count = row.COUNT as number | undefined; // Only add if it's a number, otherwise use 0 return sum + (typeof count === 'number' ? count : 0); }, 0); simplifiedResponse.totalEvents = totalCount; // Add top values with their counts and percentages simplifiedResponse.topValues = results.map(row => { // For multi-column analysis, combine values into a descriptive string const combinedValue = params.columns .map(col => { const colValue = row[col] !== undefined ? row[col] : null; return `${col}: ${colValue}`; }) .join(', '); const count = typeof row.COUNT === 'number' ? row.COUNT : 0; return { value: combinedValue, count, percentage: totalCount > 0 ? ((count / totalCount) * 100).toFixed(2) + '%' : '0%' }; }); // Initialize stats container for each numeric column const numericStats: Record<string, NumericStatsWithInterpretation> = {}; // Process numeric metrics for each column if available if (firstResult) { params.columns.forEach(column => { // Check if we have numeric metrics for this column const avgKey = `AVG(${column})`; if (avgKey in firstResult) { const stats: NumericStatistics = {}; // Extract metrics for this column if (typeof firstResult[avgKey] === 'number') stats.avg = firstResult[avgKey] as number; if (typeof firstResult[`P95(${column})`] === 'number') stats.p95 = firstResult[`P95(${column})`] as number; if (typeof firstResult[`MAX(${column})`] === 'number') stats.max = firstResult[`MAX(${column})`] as number; if (typeof firstResult[`MIN(${column})`] === 'number') stats.min = firstResult[`MIN(${column})`] as number; // Calculate range if we have min and max if (stats.min !== undefined && stats.max !== undefined) { stats.range = stats.max - stats.min; } // Only add if we have at least one stat if (Object.keys(stats).length > 0) { numericStats[column] = { ...stats, interpretation: generateInterpretation(stats, column) } as NumericStatsWithInterpretation; } } }); } // Add stats if we have any if (Object.keys(numericStats).length > 0) { simplifiedResponse.stats = numericStats; } // Add cardinality information (unique combinations of values) const uniqueValueCombinations = new Set(); results.forEach(row => { const combinationKey = params.columns .map(col => `${col}:${row[col] !== undefined ? row[col] : 'null'}`) .join('|'); uniqueValueCombinations.add(combinationKey); }); const uniqueCount = uniqueValueCombinations.size; simplifiedResponse.cardinality = { uniqueCount, classification: getCardinalityClassification(uniqueCount) }; } catch (processingError) { // Handle errors during result processing, but still return partial results console.error("Error processing column analysis results:", processingError); simplifiedResponse.processingError = `Error processing results: ${processingError instanceof Error ? processingError.message : String(processingError)}`; } } return { content: [ { type: "text", text: JSON.stringify(simplifiedResponse, null, 2), }, ], }; } catch (error) { return handleToolError(error, "analyze_columns"); } }
  • Zod schema defining the input parameters for the analyze_columns tool: environment, dataset, columns (array 1-10), optional timeRange.
    export const ColumnAnalysisSchema = z.object({ environment: z.string().min(1).trim().describe("The Honeycomb environment containing the dataset"), dataset: z.string().min(1).trim().describe("The dataset containing the column to analyze"), columns: z.array(z.string()).min(1).max(10).describe("The names of the columns to analyze"), timeRange: z.number().positive().optional().describe("Time range in seconds to analyze. Default is 2 hours."), });
  • The registerTools function imports and creates the analyze_columns tool (line 33), adds it to the tools array, and registers it with the MCP server in the loop, including specific validation for analyze_columns.
    export function registerTools(server: McpServer, api: HoneycombAPI) { const tools = [ // Dataset tools createListDatasetsTool(api), createListColumnsTool(api), // Query tools createRunQueryTool(api), createAnalyzeColumnsTool(api), // Board tools createListBoardsTool(api), createGetBoardTool(api), // Marker tools createListMarkersTool(api), // Recipient tools createListRecipientsTool(api), // SLO tools createListSLOsTool(api), createGetSLOTool(api), // Trigger tools createListTriggersTool(api), createGetTriggerTool(api), // Trace tools createTraceDeepLinkTool(api), // Instrumentation tools createInstrumentationGuidanceTool(api) ]; // Register each tool with the server for (const tool of tools) { // Register the tool with the server using type assertion to bypass TypeScript's strict type checking (server as any).tool( tool.name, tool.description, tool.schema, async (args: Record<string, any>, extra: any) => { try { // Validate and ensure required fields are present before passing to handler if (tool.name.includes("analyze_columns") && (!args.environment || !args.dataset || !args.columns)) { throw new Error("Missing required fields: environment, dataset, and columns are required"); } else if (tool.name.includes("run_query") && (!args.environment || !args.dataset)) { throw new Error("Missing required fields: environment and dataset are required"); } // Use type assertion to satisfy TypeScript's type checking const result = await tool.handler(args as any); // If the result already has the expected format, return it directly if (result && typeof result === 'object' && 'content' in result) { return result as any; } // Otherwise, format the result as expected by the SDK return { content: [ { type: "text", text: typeof result === 'string' ? result : JSON.stringify(result, null, 2), }, ], } as any; } catch (error) { // Format errors to match the SDK's expected format return { content: [ { type: "text", text: error instanceof Error ? error.message : String(error), }, ], isError: true, } as any; } } ); } }
  • HoneycombAPI method called by the tool handler to execute the actual column analysis query on the Honeycomb API, building the AnalysisQuery with breakdowns, calculations, and fetching results.
    async analyzeColumns( environment: string, datasetSlug: string, params: z.infer<typeof ColumnAnalysisSchema>, ) { // Get column information for each requested column const columnPromises = params.columns.map(columnName => this.getColumnByName(environment, datasetSlug, columnName) ); const columns = await Promise.all(columnPromises); const query: AnalysisQuery = { calculations: [{ op: "COUNT" }], breakdowns: [...params.columns], time_range: params.timeRange || 3600, limit: 10, }; // Only add orders if we have columns if (params.columns && params.columns.length > 0) { query.orders = [ { column: params.columns[0] as string, // Force type assertion order: "descending", } ]; } // Add numeric calculations for any numeric columns const numericColumns = columns.filter( col => col.type === "integer" || col.type === "float" ); numericColumns.forEach(column => { const numericCalculations: QueryCalculation[] = [ { op: "AVG", column: column.key_name }, { op: "P95", column: column.key_name }, { op: "MAX", column: column.key_name }, { op: "MIN", column: column.key_name }, ]; if (!query.calculations) { query.calculations = []; } query.calculations.push(...numericCalculations); }); try { const results = await this.queryAndWaitForResults( environment, datasetSlug, query, ); return { data: { results: results.data?.results || [], series: results.data?.series || [], }, links: results.links, }; } catch (error) { throw new Error( `Column analysis failed: ${error instanceof Error ? error.message : "Unknown error"}`, ); } }
  • Factory function that creates the tool object with name, description, schema, and handler for registration.
    export function createAnalyzeColumnsTool(api: HoneycombAPI) { return { name: "analyze_columns", description, schema: ColumnAnalysisSchema.shape, /** * Handles the analyze_column tool request * * @param params - The parameters for the column analysis * @returns A formatted response with column analysis data */ handler: async (params: z.infer<typeof ColumnAnalysisSchema>) => { try { // Validate required parameters if (!params.environment) { throw new Error("Missing required parameter: environment"); } if (!params.dataset) { throw new Error("Missing required parameter: dataset"); } if (!params.columns || params.columns.length === 0) { throw new Error("Missing required parameter: columns"); } if (params.columns.length > 10) { throw new Error("Too many columns requested. Maximum is 10."); } // Execute the analysis via the API const result = await api.analyzeColumns(params.environment, params.dataset, params); // Initialize the response const simplifiedResponse: SimplifiedColumnAnalysis = { columns: params.columns, count: result.data?.results?.length || 0, totalEvents: 0, // Will be populated below if available }; // Add top values if we have results if (result.data?.results && result.data.results.length > 0) { const results = result.data.results as QueryResultValue[]; const firstResult = results[0]; try { // Calculate total events across all results const totalCount = results.reduce((sum, row) => { const count = row.COUNT as number | undefined; // Only add if it's a number, otherwise use 0 return sum + (typeof count === 'number' ? count : 0); }, 0); simplifiedResponse.totalEvents = totalCount; // Add top values with their counts and percentages simplifiedResponse.topValues = results.map(row => { // For multi-column analysis, combine values into a descriptive string const combinedValue = params.columns .map(col => { const colValue = row[col] !== undefined ? row[col] : null; return `${col}: ${colValue}`; }) .join(', '); const count = typeof row.COUNT === 'number' ? row.COUNT : 0; return { value: combinedValue, count, percentage: totalCount > 0 ? ((count / totalCount) * 100).toFixed(2) + '%' : '0%' }; }); // Initialize stats container for each numeric column const numericStats: Record<string, NumericStatsWithInterpretation> = {}; // Process numeric metrics for each column if available if (firstResult) { params.columns.forEach(column => { // Check if we have numeric metrics for this column const avgKey = `AVG(${column})`; if (avgKey in firstResult) { const stats: NumericStatistics = {}; // Extract metrics for this column if (typeof firstResult[avgKey] === 'number') stats.avg = firstResult[avgKey] as number; if (typeof firstResult[`P95(${column})`] === 'number') stats.p95 = firstResult[`P95(${column})`] as number; if (typeof firstResult[`MAX(${column})`] === 'number') stats.max = firstResult[`MAX(${column})`] as number; if (typeof firstResult[`MIN(${column})`] === 'number') stats.min = firstResult[`MIN(${column})`] as number; // Calculate range if we have min and max if (stats.min !== undefined && stats.max !== undefined) { stats.range = stats.max - stats.min; } // Only add if we have at least one stat if (Object.keys(stats).length > 0) { numericStats[column] = { ...stats, interpretation: generateInterpretation(stats, column) } as NumericStatsWithInterpretation; } } }); } // Add stats if we have any if (Object.keys(numericStats).length > 0) { simplifiedResponse.stats = numericStats; } // Add cardinality information (unique combinations of values) const uniqueValueCombinations = new Set(); results.forEach(row => { const combinationKey = params.columns .map(col => `${col}:${row[col] !== undefined ? row[col] : 'null'}`) .join('|'); uniqueValueCombinations.add(combinationKey); }); const uniqueCount = uniqueValueCombinations.size; simplifiedResponse.cardinality = { uniqueCount, classification: getCardinalityClassification(uniqueCount) }; } catch (processingError) { // Handle errors during result processing, but still return partial results console.error("Error processing column analysis results:", processingError); simplifiedResponse.processingError = `Error processing results: ${processingError instanceof Error ? processingError.message : String(processingError)}`; } } return { content: [ { type: "text", text: JSON.stringify(simplifiedResponse, null, 2), }, ], }; } catch (error) { return handleToolError(error, "analyze_columns"); } } }; }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/honeycombio/honeycomb-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server