Skip to main content
Glama
honeycombio
by honeycombio

analyze_columns

Analyze dataset columns to reveal statistical insights, value distributions, and numeric metrics for up to 10 columns simultaneously.

Instructions

Analyzes specific columns in a dataset by running statistical queries and returning computed metrics. This tool allows users to get statistical information about a specific column, including value distribution, top values, and numeric statistics (for numeric columns). Supports analyzing up to 10 columns at once by specifying an array of column names in the 'columns' parameter. When multiple columns are specified, they will be analyzed together as a group, showing the distribution of their combined values. Use this tool before running queries to get a better understanding of the data in your dataset.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
environmentYesThe Honeycomb environment containing the dataset
datasetYesThe dataset containing the column to analyze
columnsYesThe names of the columns to analyze
timeRangeNoTime range in seconds to analyze. Default is 2 hours.

Implementation Reference

  • The main handler function for the 'analyze_columns' tool. It validates inputs, calls the API to perform column analysis, processes the results to compute statistics, top values, cardinality, and returns a formatted JSON response.
    handler: async (params: z.infer<typeof ColumnAnalysisSchema>) => {
      try {
        // Validate required parameters
        if (!params.environment) {
          throw new Error("Missing required parameter: environment");
        }
        if (!params.dataset) {
          throw new Error("Missing required parameter: dataset");
        }
        if (!params.columns || params.columns.length === 0) {
          throw new Error("Missing required parameter: columns");
        }
        if (params.columns.length > 10) {
          throw new Error("Too many columns requested. Maximum is 10.");
        }
        
        // Execute the analysis via the API
        const result = await api.analyzeColumns(params.environment, params.dataset, params);
        
        // Initialize the response
        const simplifiedResponse: SimplifiedColumnAnalysis = {
          columns: params.columns,
          count: result.data?.results?.length || 0,
          totalEvents: 0,  // Will be populated below if available
        };
        
        // Add top values if we have results
        if (result.data?.results && result.data.results.length > 0) {
          const results = result.data.results as QueryResultValue[];
          const firstResult = results[0];
          
          try {
            // Calculate total events across all results
            const totalCount = results.reduce((sum, row) => {
              const count = row.COUNT as number | undefined;
              // Only add if it's a number, otherwise use 0
              return sum + (typeof count === 'number' ? count : 0);
            }, 0);
            simplifiedResponse.totalEvents = totalCount;
            
            // Add top values with their counts and percentages
            simplifiedResponse.topValues = results.map(row => {
              // For multi-column analysis, combine values into a descriptive string
              const combinedValue = params.columns
                .map(col => {
                  const colValue = row[col] !== undefined ? row[col] : null;
                  return `${col}: ${colValue}`;
                })
                .join(', ');
              
              const count = typeof row.COUNT === 'number' ? row.COUNT : 0;
              
              return {
                value: combinedValue,
                count,
                percentage: totalCount > 0 ? 
                  ((count / totalCount) * 100).toFixed(2) + '%' : 
                  '0%'
              };
            });
            
            // Initialize stats container for each numeric column
            const numericStats: Record<string, NumericStatsWithInterpretation> = {};
            
            // Process numeric metrics for each column if available
            if (firstResult) {
              params.columns.forEach(column => {
                // Check if we have numeric metrics for this column
                const avgKey = `AVG(${column})`;
                if (avgKey in firstResult) {
                  const stats: NumericStatistics = {};
                  
                  // Extract metrics for this column
                  if (typeof firstResult[avgKey] === 'number') stats.avg = firstResult[avgKey] as number;
                  if (typeof firstResult[`P95(${column})`] === 'number') stats.p95 = firstResult[`P95(${column})`] as number;
                  if (typeof firstResult[`MAX(${column})`] === 'number') stats.max = firstResult[`MAX(${column})`] as number;
                  if (typeof firstResult[`MIN(${column})`] === 'number') stats.min = firstResult[`MIN(${column})`] as number;
                  
                  // Calculate range if we have min and max
                  if (stats.min !== undefined && stats.max !== undefined) {
                    stats.range = stats.max - stats.min;
                  }
                  
                  // Only add if we have at least one stat
                  if (Object.keys(stats).length > 0) {
                    numericStats[column] = {
                      ...stats,
                      interpretation: generateInterpretation(stats, column)
                    } as NumericStatsWithInterpretation;
                  }
                }
              });
            }
            
            // Add stats if we have any
            if (Object.keys(numericStats).length > 0) {
              simplifiedResponse.stats = numericStats;
            }
            
            // Add cardinality information (unique combinations of values)
            const uniqueValueCombinations = new Set();
            
            results.forEach(row => {
              const combinationKey = params.columns
                .map(col => `${col}:${row[col] !== undefined ? row[col] : 'null'}`)
                .join('|');
              uniqueValueCombinations.add(combinationKey);
            });
            
            const uniqueCount = uniqueValueCombinations.size;
            
            simplifiedResponse.cardinality = {
              uniqueCount,
              classification: getCardinalityClassification(uniqueCount)
            };
          } catch (processingError) {
            // Handle errors during result processing, but still return partial results
            console.error("Error processing column analysis results:", processingError);
            simplifiedResponse.processingError = `Error processing results: ${processingError instanceof Error ? processingError.message : String(processingError)}`;
          }
        }
        
        return {
          content: [
            {
              type: "text",
              text: JSON.stringify(simplifiedResponse, null, 2),
            },
          ],
        };
      } catch (error) {
        return handleToolError(error, "analyze_columns");
      }
    }
  • Zod schema defining the input parameters for the analyze_columns tool: environment, dataset, columns (array 1-10), optional timeRange.
    export const ColumnAnalysisSchema = z.object({
      environment: z.string().min(1).trim().describe("The Honeycomb environment containing the dataset"),
      dataset: z.string().min(1).trim().describe("The dataset containing the column to analyze"),
      columns: z.array(z.string()).min(1).max(10).describe("The names of the columns to analyze"),
      timeRange: z.number().positive().optional().describe("Time range in seconds to analyze. Default is 2 hours."),
    });
  • The registerTools function imports and creates the analyze_columns tool (line 33), adds it to the tools array, and registers it with the MCP server in the loop, including specific validation for analyze_columns.
    export function registerTools(server: McpServer, api: HoneycombAPI) {
      const tools = [
        // Dataset tools
        createListDatasetsTool(api),
        createListColumnsTool(api),
    
        // Query tools
        createRunQueryTool(api),
        createAnalyzeColumnsTool(api),
    
        // Board tools
        createListBoardsTool(api),
        createGetBoardTool(api),
    
        // Marker tools
        createListMarkersTool(api),
    
        // Recipient tools
        createListRecipientsTool(api),
    
        // SLO tools
        createListSLOsTool(api),
        createGetSLOTool(api),
    
        // Trigger tools
        createListTriggersTool(api),
        createGetTriggerTool(api),
        
        // Trace tools
        createTraceDeepLinkTool(api),
        
        // Instrumentation tools
        createInstrumentationGuidanceTool(api)
      ];
    
      // Register each tool with the server
      for (const tool of tools) {
        // Register the tool with the server using type assertion to bypass TypeScript's strict type checking
        (server as any).tool(
          tool.name,
          tool.description,
          tool.schema, 
          async (args: Record<string, any>, extra: any) => {
            try {
              // Validate and ensure required fields are present before passing to handler
              if (tool.name.includes("analyze_columns") && (!args.environment || !args.dataset || !args.columns)) {
                throw new Error("Missing required fields: environment, dataset, and columns are required");
              } else if (tool.name.includes("run_query") && (!args.environment || !args.dataset)) {
                throw new Error("Missing required fields: environment and dataset are required");
              }
              
              // Use type assertion to satisfy TypeScript's type checking
              const result = await tool.handler(args as any);
              
              // If the result already has the expected format, return it directly
              if (result && typeof result === 'object' && 'content' in result) {
                return result as any;
              }
              
              // Otherwise, format the result as expected by the SDK
              return {
                content: [
                  {
                    type: "text",
                    text: typeof result === 'string' ? result : JSON.stringify(result, null, 2),
                  },
                ],
              } as any;
            } catch (error) {
              // Format errors to match the SDK's expected format
              return {
                content: [
                  {
                    type: "text",
                    text: error instanceof Error ? error.message : String(error),
                  },
                ],
                isError: true,
              } as any;
            }
          }
        );
      }
    }
  • HoneycombAPI method called by the tool handler to execute the actual column analysis query on the Honeycomb API, building the AnalysisQuery with breakdowns, calculations, and fetching results.
    async analyzeColumns(
      environment: string,
      datasetSlug: string,
      params: z.infer<typeof ColumnAnalysisSchema>,
    ) {
      // Get column information for each requested column
      const columnPromises = params.columns.map(columnName => 
        this.getColumnByName(environment, datasetSlug, columnName)
      );
      
      const columns = await Promise.all(columnPromises);
      
      const query: AnalysisQuery = {
        calculations: [{ op: "COUNT" }],
        breakdowns: [...params.columns],
        time_range: params.timeRange || 3600,
        limit: 10,
      };
      
      // Only add orders if we have columns
      if (params.columns && params.columns.length > 0) {
        query.orders = [
          {
            column: params.columns[0] as string, // Force type assertion
            order: "descending",
          }
        ];
      }
    
      // Add numeric calculations for any numeric columns
      const numericColumns = columns.filter(
        col => col.type === "integer" || col.type === "float"
      );
      
      numericColumns.forEach(column => {
        const numericCalculations: QueryCalculation[] = [
          { op: "AVG", column: column.key_name },
          { op: "P95", column: column.key_name },
          { op: "MAX", column: column.key_name },
          { op: "MIN", column: column.key_name },
        ];
        
        if (!query.calculations) {
          query.calculations = [];
        }
        query.calculations.push(...numericCalculations);
      });
    
      try {
        const results = await this.queryAndWaitForResults(
          environment,
          datasetSlug,
          query,
        );
        return {
          data: {
            results: results.data?.results || [],
            series: results.data?.series || [],
          },
          links: results.links,
        };
      } catch (error) {
        throw new Error(
          `Column analysis failed: ${error instanceof Error ? error.message : "Unknown error"}`,
        );
      }
    }
  • Factory function that creates the tool object with name, description, schema, and handler for registration.
    export function createAnalyzeColumnsTool(api: HoneycombAPI) {
      return {
        name: "analyze_columns",
        description,
        schema: ColumnAnalysisSchema.shape,
        /**
         * Handles the analyze_column tool request
         * 
         * @param params - The parameters for the column analysis
         * @returns A formatted response with column analysis data
         */
        handler: async (params: z.infer<typeof ColumnAnalysisSchema>) => {
          try {
            // Validate required parameters
            if (!params.environment) {
              throw new Error("Missing required parameter: environment");
            }
            if (!params.dataset) {
              throw new Error("Missing required parameter: dataset");
            }
            if (!params.columns || params.columns.length === 0) {
              throw new Error("Missing required parameter: columns");
            }
            if (params.columns.length > 10) {
              throw new Error("Too many columns requested. Maximum is 10.");
            }
            
            // Execute the analysis via the API
            const result = await api.analyzeColumns(params.environment, params.dataset, params);
            
            // Initialize the response
            const simplifiedResponse: SimplifiedColumnAnalysis = {
              columns: params.columns,
              count: result.data?.results?.length || 0,
              totalEvents: 0,  // Will be populated below if available
            };
            
            // Add top values if we have results
            if (result.data?.results && result.data.results.length > 0) {
              const results = result.data.results as QueryResultValue[];
              const firstResult = results[0];
              
              try {
                // Calculate total events across all results
                const totalCount = results.reduce((sum, row) => {
                  const count = row.COUNT as number | undefined;
                  // Only add if it's a number, otherwise use 0
                  return sum + (typeof count === 'number' ? count : 0);
                }, 0);
                simplifiedResponse.totalEvents = totalCount;
                
                // Add top values with their counts and percentages
                simplifiedResponse.topValues = results.map(row => {
                  // For multi-column analysis, combine values into a descriptive string
                  const combinedValue = params.columns
                    .map(col => {
                      const colValue = row[col] !== undefined ? row[col] : null;
                      return `${col}: ${colValue}`;
                    })
                    .join(', ');
                  
                  const count = typeof row.COUNT === 'number' ? row.COUNT : 0;
                  
                  return {
                    value: combinedValue,
                    count,
                    percentage: totalCount > 0 ? 
                      ((count / totalCount) * 100).toFixed(2) + '%' : 
                      '0%'
                  };
                });
                
                // Initialize stats container for each numeric column
                const numericStats: Record<string, NumericStatsWithInterpretation> = {};
                
                // Process numeric metrics for each column if available
                if (firstResult) {
                  params.columns.forEach(column => {
                    // Check if we have numeric metrics for this column
                    const avgKey = `AVG(${column})`;
                    if (avgKey in firstResult) {
                      const stats: NumericStatistics = {};
                      
                      // Extract metrics for this column
                      if (typeof firstResult[avgKey] === 'number') stats.avg = firstResult[avgKey] as number;
                      if (typeof firstResult[`P95(${column})`] === 'number') stats.p95 = firstResult[`P95(${column})`] as number;
                      if (typeof firstResult[`MAX(${column})`] === 'number') stats.max = firstResult[`MAX(${column})`] as number;
                      if (typeof firstResult[`MIN(${column})`] === 'number') stats.min = firstResult[`MIN(${column})`] as number;
                      
                      // Calculate range if we have min and max
                      if (stats.min !== undefined && stats.max !== undefined) {
                        stats.range = stats.max - stats.min;
                      }
                      
                      // Only add if we have at least one stat
                      if (Object.keys(stats).length > 0) {
                        numericStats[column] = {
                          ...stats,
                          interpretation: generateInterpretation(stats, column)
                        } as NumericStatsWithInterpretation;
                      }
                    }
                  });
                }
                
                // Add stats if we have any
                if (Object.keys(numericStats).length > 0) {
                  simplifiedResponse.stats = numericStats;
                }
                
                // Add cardinality information (unique combinations of values)
                const uniqueValueCombinations = new Set();
                
                results.forEach(row => {
                  const combinationKey = params.columns
                    .map(col => `${col}:${row[col] !== undefined ? row[col] : 'null'}`)
                    .join('|');
                  uniqueValueCombinations.add(combinationKey);
                });
                
                const uniqueCount = uniqueValueCombinations.size;
                
                simplifiedResponse.cardinality = {
                  uniqueCount,
                  classification: getCardinalityClassification(uniqueCount)
                };
              } catch (processingError) {
                // Handle errors during result processing, but still return partial results
                console.error("Error processing column analysis results:", processingError);
                simplifiedResponse.processingError = `Error processing results: ${processingError instanceof Error ? processingError.message : String(processingError)}`;
              }
            }
            
            return {
              content: [
                {
                  type: "text",
                  text: JSON.stringify(simplifiedResponse, null, 2),
                },
              ],
            };
          } catch (error) {
            return handleToolError(error, "analyze_columns");
          }
        }
      };
    }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/honeycombio/honeycomb-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server