Skip to main content
Glama
guangxiangdebizi

XLSX Reader MCP

analyze_xlsx

Analyze Excel files to extract structure, data types, statistics, and sheet contents. Optionally preview rows and inspect column data types for deeper insights.

Instructions

Analyze Excel files to get detailed information about structure, data types, statistics, and sheet contents.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
filePathYesPath to the Excel file to analyze
includePreviewNoWhether to include a preview of the data (default: true)
previewRowsNoNumber of rows to preview (default: 5, max: 20)
analyzeDataTypesNoWhether to analyze data types in each column (default: true)

Implementation Reference

  • The main tool handler for 'analyze_xlsx' - defines the tool schema (name, description, parameters) and the run() function that reads and analyzes Excel files, producing a detailed report including sheet info, column data types, statistics, and previews.
    export const xlsxAnalyzer = {
      name: "analyze_xlsx",
      description: "Analyze Excel files to get detailed information about structure, data types, statistics, and sheet contents.",
      parameters: {
        type: "object",
        properties: {
          filePath: {
            type: "string",
            description: "Path to the Excel file to analyze"
          },
          includePreview: {
            type: "boolean",
            description: "Whether to include a preview of the data (default: true)"
          },
          previewRows: {
            type: "number",
            description: "Number of rows to preview (default: 5, max: 20)"
          },
          analyzeDataTypes: {
            type: "boolean",
            description: "Whether to analyze data types in each column (default: true)"
          }
        },
        required: ["filePath"]
      },
      
      async run(args: { 
        filePath: string; 
        includePreview?: boolean;
        previewRows?: number;
        analyzeDataTypes?: boolean;
      }) {
        try {
          // Parameter validation
          if (!args.filePath) {
            throw new Error("File path is required");
          }
    
          // Check if file exists
          if (!fs.existsSync(args.filePath)) {
            throw new Error(`File not found: ${args.filePath}`);
          }
    
          // Check file extension
          const ext = path.extname(args.filePath).toLowerCase();
          if (!['.xlsx', '.xls', '.xlsm'].includes(ext)) {
            throw new Error("File must be an Excel file (.xlsx, .xls, or .xlsm)");
          }
    
          // Get file stats
          const stats = fs.statSync(args.filePath);
          const fileSize = (stats.size / 1024).toFixed(2); // KB
    
          // Read the workbook
          const workbook = XLSX.readFile(args.filePath);
          const sheetNames = workbook.SheetNames;
          
          const includePreview = args.includePreview !== false;
          const previewRows = Math.min(args.previewRows || 5, 20); // Default 5, max 20
          const analyzeDataTypes = args.analyzeDataTypes !== false;
    
          // Analyze each sheet
          const sheetAnalysis: any[] = [];
          
          for (const sheetName of sheetNames) {
            const worksheet = workbook.Sheets[sheetName];
            const range = XLSX.utils.decode_range(worksheet['!ref'] || 'A1');
            
            // Get sheet dimensions
            const totalRows = range.e.r + 1;
            const totalCols = range.e.c + 1;
            
            // Convert to array for analysis
            let rawData = XLSX.utils.sheet_to_json(worksheet, { 
              header: 1,
              defval: ''
            }) as any[][];
            
            // Helper function to check if a row is empty
            const isEmptyRow = (row: any[]): boolean => {
              return row.every(cell => cell === '' || cell == null || cell === undefined);
            };
            
            // Get original statistics
            const originalTotalRows = rawData.length;
            
            // Filter out completely empty rows for analysis
            const data = rawData.filter((row, index) => {
              // Always keep the first row (header)
              if (index === 0) return true;
              // Filter out empty rows
              return !isEmptyRow(row);
            });
            
            const dataRows = data.length > 0 ? data.length - 1 : 0; // Exclude header
            const skippedEmptyRows = originalTotalRows - data.length;
            
            // Analyze data types if requested
            let columnAnalysis: any[] = [];
            if (analyzeDataTypes && data.length > 0) {
              const headers = data[0] || [];
              const dataRows = data.slice(1);
              
              columnAnalysis = headers.map((header: any, colIndex: number) => {
                const columnData = dataRows.map(row => row[colIndex]).filter(cell => cell !== '' && cell != null);
                
                if (columnData.length === 0) {
                  return {
                    column: colIndex + 1,
                    header: header || `Column ${colIndex + 1}`,
                    dataType: 'empty',
                    nonEmptyCount: 0,
                    emptyCount: dataRows.length,
                    uniqueValues: 0
                  };
                }
                
                // Analyze data types
                const types = {
                  number: 0,
                  string: 0,
                  date: 0,
                  boolean: 0
                };
                
                columnData.forEach(cell => {
                  if (typeof cell === 'number') {
                    types.number++;
                  } else if (typeof cell === 'boolean') {
                    types.boolean++;
                  } else if (cell instanceof Date || (typeof cell === 'string' && !isNaN(Date.parse(cell)))) {
                    types.date++;
                  } else {
                    types.string++;
                  }
                });
                
                // Determine primary type
                const primaryType = Object.entries(types).reduce((a, b) => types[a[0] as keyof typeof types] > types[b[0] as keyof typeof types] ? a : b)[0];
                
                // Get unique values count
                const uniqueValues = new Set(columnData).size;
                
                return {
                  column: colIndex + 1,
                  header: header || `Column ${colIndex + 1}`,
                  dataType: primaryType,
                  nonEmptyCount: columnData.length,
                  emptyCount: dataRows.length - columnData.length,
                  uniqueValues,
                  typeDistribution: types
                };
              });
            }
            
            // Get preview data
            let preview = '';
            if (includePreview && data.length > 0) {
              // Limit preview data and ensure we don't exceed available data
              const maxPreviewRows = Math.min(previewRows + 1, data.length, 21); // +1 for header, max 21 total
              const previewData = data.slice(0, maxPreviewRows);
              const maxCols = Math.max(...previewData.map(row => row.length));
              const colWidths = Array(maxCols).fill(0);
              
              // Calculate column widths
              previewData.forEach(row => {
                row.forEach((cell: any, index: number) => {
                  const cellStr = String(cell || '');
                  colWidths[index] = Math.max(colWidths[index], Math.min(cellStr.length, 20)); // Max width 20
                });
              });
              
              // Format preview table
              const formatRow = (row: any[]) => {
                return '| ' + row.map((cell: any, index: number) => {
                  const cellStr = String(cell || '');
                  const truncated = cellStr.length > 20 ? cellStr.substring(0, 17) + '...' : cellStr;
                  return truncated.padEnd(colWidths[index]);
                }).join(' | ') + ' |';
              };
              
              const separator = '|' + colWidths.map(width => '-'.repeat(width + 2)).join('|') + '|';
              
              preview = previewData.map((row: any[], index: number) => {
                const formattedRow = formatRow(row);
                if (index === 0) {
                  return formattedRow + '\n' + separator;
                }
                return formattedRow;
              }).join('\n');
            }
            
            sheetAnalysis.push({
              name: sheetName,
              dimensions: {
                rows: totalRows,
                columns: totalCols
              },
              originalTotalRows,
              dataRows,
              skippedEmptyRows,
              hasData: data.length > 0,
              columnAnalysis,
              preview
            });
          }
          
          // Generate comprehensive report
          const report = [
            `# 📊 Excel File Analysis Report`,
            ``,
            `## 📁 File Information`,
            `- **File:** ${path.basename(args.filePath)}`,
            `- **Full Path:** ${args.filePath}`,
            `- **Size:** ${fileSize} KB`,
            `- **Format:** ${ext.toUpperCase()}`,
            `- **Last Modified:** ${stats.mtime.toLocaleString()}`,
            ``,
            `## 📋 Workbook Overview`,
            `- **Total Sheets:** ${sheetNames.length}`,
            `- **Sheet Names:** ${sheetNames.join(', ')}`,
            ``
          ];
          
          // Add detailed sheet analysis
          sheetAnalysis.forEach((sheet, index) => {
            report.push(`## 📄 Sheet ${index + 1}: "${sheet.name}"`);
            report.push(``);
            report.push(`### Basic Information`);
            report.push(`- **Total Rows:** ${sheet.originalTotalRows} (including header)`);
            report.push(`- **Data Rows:** ${sheet.dataRows} (excluding header)`);
            if (sheet.skippedEmptyRows > 0) {
              report.push(`- **Empty Rows Filtered:** ${sheet.skippedEmptyRows}`);
            }
            report.push(`- **Columns:** ${sheet.dimensions.columns}`);
            report.push(`- **Dimensions:** ${sheet.dimensions.rows} rows × ${sheet.dimensions.columns} columns (after filtering)`);
            report.push(`- **Has Data:** ${sheet.hasData ? 'Yes' : 'No'}`);
            report.push(``);
            
            if (analyzeDataTypes && sheet.columnAnalysis.length > 0) {
              report.push(`### Column Analysis`);
              report.push(``);
              sheet.columnAnalysis.forEach((col: any) => {
                report.push(`**${col.header}** (Column ${col.column})`);
                report.push(`- Primary Type: ${col.dataType}`);
                report.push(`- Non-empty: ${col.nonEmptyCount}, Empty: ${col.emptyCount}`);
                report.push(`- Unique Values: ${col.uniqueValues}`);
                if (col.typeDistribution) {
                  const dist = Object.entries(col.typeDistribution)
                    .filter(([_, count]) => (count as number) > 0)
                    .map(([type, count]) => `${type}: ${count}`)
                    .join(', ');
                  if (dist) report.push(`- Type Distribution: ${dist}`);
                }
                report.push(``);
              });
            }
            
            if (includePreview && sheet.preview) {
              report.push(`### Data Preview (First ${previewRows} rows)`);
              report.push(``);
              report.push('```');
              report.push(sheet.preview);
              report.push('```');
              report.push(``);
            }
            
            if (index < sheetAnalysis.length - 1) {
              report.push(`---`);
              report.push(``);
            }
          });
          
          return {
            content: [{
              type: "text",
              text: report.join('\n')
            }]
          };
          
        } catch (error: any) {
          return {
            content: [{
              type: "text",
              text: `❌ **Error analyzing Excel file:** ${error.message}`
            }],
            isError: true
          };
        }
      }
    };
  • Tool registration metadata including name 'analyze_xlsx', description, and JSON Schema parameters (filePath required, optional includePreview, previewRows, analyzeDataTypes).
    export const xlsxAnalyzer = {
      name: "analyze_xlsx",
      description: "Analyze Excel files to get detailed information about structure, data types, statistics, and sheet contents.",
      parameters: {
        type: "object",
        properties: {
          filePath: {
            type: "string",
            description: "Path to the Excel file to analyze"
          },
          includePreview: {
            type: "boolean",
            description: "Whether to include a preview of the data (default: true)"
          },
          previewRows: {
            type: "number",
            description: "Number of rows to preview (default: 5, max: 20)"
          },
          analyzeDataTypes: {
            type: "boolean",
            description: "Whether to analyze data types in each column (default: true)"
          }
        },
        required: ["filePath"]
      },
  • src/index.ts:7-7 (registration)
    Import of xlsxAnalyzer from the tool module. This is how the tool is wired into the server.
    import { xlsxAnalyzer } from "./tools/xlsx-analyzer.js";
  • src/index.ts:18-33 (registration)
    Tool list registration: the 'analyze_xlsx' tool is registered in the ListToolsRequestSchema handler with its name, description, and inputSchema.
    server.setRequestHandler(ListToolsRequestSchema, async () => {
      return {
        tools: [
          {
            name: xlsxReader.name,
            description: xlsxReader.description,
            inputSchema: xlsxReader.parameters
          },
          {
            name: xlsxAnalyzer.name,
            description: xlsxAnalyzer.description,
            inputSchema: xlsxAnalyzer.parameters
          }
        ]
      };
    });
  • src/index.ts:36-45 (registration)
    Tool call dispatch: the 'analyze_xlsx' case routes to xlsxAnalyzer.run() in the CallToolRequestSchema handler.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      switch (request.params.name) {
        case "read_xlsx":
          return await xlsxReader.run(request.params.arguments as any);
        case "analyze_xlsx":
          return await xlsxAnalyzer.run(request.params.arguments as any);
        default:
          throw new Error(`Unknown tool: ${request.params.name}`);
      }
    });
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It does not mention that the tool is read-only (non-destructive), potential performance impacts on large files, or any restrictions. The name implies analysis but lacks explicit safety guarantees.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, focused sentence that immediately conveys the tool's purpose without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description adequately summarizes the tool's functionality but does not mention output format, error handling, or file format constraints. Given the absence of output schema and annotations, it could be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers 100% of parameters with descriptions; the tool description adds no additional meaning beyond the schema. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes Excel files to obtain detailed information about structure, data types, statistics, and sheet contents, distinguishing it from the sibling tool 'read_xlsx' which likely reads data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (when analysis of structure/types/statistics is needed) but does not explicitly state when not to use or name the sibling as an alternative for raw reading.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/guangxiangdebizi/xlsx-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server