Skip to main content
Glama

data_cleaner

Clean and standardize Excel/CSV data by removing whitespace, fixing dates, standardizing formats, and eliminating empty rows to improve data quality.

Instructions

Batch data cleaning operations with intelligent detection of common data quality issues

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
filePathYesPath to the CSV or Excel file
operationsNoArray of cleaning operations to apply
previewNoShow preview before applying changes (default: false)
sheetNoSheet name for Excel files (optional)

Implementation Reference

  • src/index.ts:1271-1272 (registration)
    Registration of the data_cleaner tool in the main CallToolRequestHandler switch statement, dispatching to ExcelWorkflowHandler.cleanData method.
    case 'data_cleaner': return await this.excelWorkflowHandler.cleanData(toolArgs);
  • Input schema for the data_cleaner tool, defining required filePath and optional operations array (with enum of cleaning types), preview flag, and sheet.
    name: 'data_cleaner', description: 'Batch data cleaning operations with intelligent detection of common data quality issues', inputSchema: { type: 'object', properties: { filePath: { type: 'string', description: 'Path to the CSV or Excel file' }, operations: { type: 'array', items: { type: 'string', enum: [ 'trim_whitespace', 'fix_dates', 'standardize_numbers', 'remove_empty_rows', 'standardize_phone_formats', 'standardize_names', 'remove_special_chars', 'fix_currency' ] }, description: 'Array of cleaning operations to apply' }, preview: { type: 'boolean', description: 'Show preview before applying changes (default: false)' }, sheet: { type: 'string', description: 'Sheet name for Excel files (optional)' } }, required: ['filePath'] } },
  • Core handler function for data_cleaner tool. Parses args, reads file, applies specified cleaning operations using private helper methods, tracks changes, and returns JSON summary with preview option and change log.
    async cleanData(args: ToolArgs): Promise<ToolResponse> { try { const { filePath, operations = [], preview = false, sheet } = args; if (!filePath) { return { content: [{ type: 'text', text: JSON.stringify({ success: false, error: 'Missing required parameter: filePath' }, null, 2) }] }; } // Read the file const data = await readFileContent(filePath, sheet); if (data.length === 0) { return { content: [{ type: 'text', text: JSON.stringify({ success: false, error: 'File is empty or could not be read' }, null, 2) }] }; } const headers = data[0]; let workingData = data.map(row => [...row]); // Deep copy const changes: any[] = []; // Apply each cleaning operation for (const operation of operations) { switch (operation) { case 'trim_whitespace': workingData = this.trimWhitespace(workingData, changes); break; case 'remove_empty_rows': workingData = this.removeEmptyRows(workingData, changes); break; case 'standardize_phone_formats': workingData = this.standardizePhoneFormats(workingData, changes); break; case 'fix_dates': workingData = this.fixDates(workingData, changes); break; case 'standardize_names': workingData = this.standardizeNames(workingData, changes); break; case 'remove_special_chars': workingData = this.removeSpecialChars(workingData, changes); break; case 'fix_currency': workingData = this.fixCurrency(workingData, changes); break; } } const result = { success: true, operation: 'data_cleaner', preview, summary: { originalRows: data.length, cleanedRows: workingData.length, operationsApplied: operations, changesCount: changes.length }, changes: changes.slice(0, 100), // Limit to first 100 changes for readability data: preview ? workingData.slice(0, 10) : undefined // Show first 10 rows in preview }; return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] }; } catch (error) { return { content: [{ type: 'text', text: JSON.stringify({ success: false, error: error instanceof Error ? error.message : 'Unknown error', operation: 'data_cleaner' }, null, 2) }] }; } }
  • Helper method for 'trim_whitespace' operation: trims whitespace from string cells and logs changes.
    private trimWhitespace(data: any[][], changes: any[]): any[][] { return data.map((row, rowIndex) => row.map((cell, colIndex) => { if (typeof cell === 'string') { const trimmed = cell.trim(); if (trimmed !== cell) { changes.push({ operation: 'trim_whitespace', row: rowIndex, col: colIndex, before: cell, after: trimmed }); } return trimmed; } return cell; }) ); }
  • Helper method for 'remove_empty_rows' operation: filters out completely empty rows (excluding header) and logs removals.
    private removeEmptyRows(data: any[][], changes: any[]): any[][] { const result = data.filter((row, index) => { const isEmpty = row.every(cell => !cell || String(cell).trim() === ''); if (isEmpty && index > 0) { // Don't remove header changes.push({ operation: 'remove_empty_rows', row: index, action: 'removed' }); return false; } return true; }); return result; }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ishayoyo/excel-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server