# get_file_structure

Analyze file structure and retrieve comprehensive metadata and statistics.

## Overview

The `get_file_structure` tool provides detailed information about a file's structure, including line count, size, encoding, and statistical analysis. This is essential for understanding file characteristics before processing.

## Usage

```json
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/data/large-dataset.csv"
  }
}
```

## Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `filePath` | string | Yes | - | Absolute or relative path to the file |

## Response Format

```typescript
{
  filePath: string;        // Absolute path to file
  fileName: string;        // File name without path
  fileSize: number;        // Size in bytes
  totalLines: number;      // Total line count
  encoding: string;        // Detected encoding (e.g., 'utf8')
  detectedType: string;    // File type (e.g., 'log', 'csv', 'json')
  chunkSize: number;       // Recommended lines per chunk
  totalChunks: number;     // Total number of chunks
  created: Date;           // File creation date
  modified: Date;          // Last modified date
  statistics: {
    avgLineLength: number; // Average characters per line
    maxLineLength: number; // Longest line length
    minLineLength: number; // Shortest line length
    emptyLines: number;    // Count of empty lines
  };
}
```

## Examples

### Basic File Analysis

Analyze a log file:

```json
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/var/log/system.log"
  }
}
```

Response:

```json
{
  "filePath": "/var/log/system.log",
  "fileName": "system.log",
  "fileSize": 10485760,
  "totalLines": 25000,
  "encoding": "utf8",
  "detectedType": "log",
  "chunkSize": 500,
  "totalChunks": 50,
  "created": "2024-01-01T00:00:00.000Z",
  "modified": "2024-01-10T15:30:00.000Z",
  "statistics": {
    "avgLineLength": 120,
    "maxLineLength": 512,
    "minLineLength": 45,
    "emptyLines": 150
  }
}
```

### CSV File Structure

Analyze a CSV dataset:

```json
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/data/transactions.csv"
  }
}
```

Response:

```json
{
  "filePath": "/data/transactions.csv",
  "fileName": "transactions.csv",
  "fileSize": 52428800,
  "totalLines": 500000,
  "encoding": "utf8",
  "detectedType": "csv",
  "chunkSize": 1000,
  "totalChunks": 500,
  "created": "2024-01-05T08:00:00.000Z",
  "modified": "2024-01-10T14:00:00.000Z",
  "statistics": {
    "avgLineLength": 105,
    "maxLineLength": 250,
    "minLineLength": 95,
    "emptyLines": 0
  }
}
```

### Code File Analysis

Analyze a TypeScript file:

```json
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/code/app.ts"
  }
}
```

Response:

```json
{
  "filePath": "/code/app.ts",
  "fileName": "app.ts",
  "fileSize": 65536,
  "totalLines": 1250,
  "encoding": "utf8",
  "detectedType": "code",
  "chunkSize": 300,
  "totalChunks": 5,
  "created": "2024-01-01T10:00:00.000Z",
  "modified": "2024-01-10T16:45:00.000Z",
  "statistics": {
    "avgLineLength": 52,
    "maxLineLength": 180,
    "minLineLength": 0,
    "emptyLines": 85
  }
}
```

## File Type Detection

The tool automatically detects file type based on extension:

| Extension | Detected Type | Chunk Size | Typical Use Case |
|-----------|--------------|------------|------------------|
| .txt | text | 500 | Plain text files |
| .log | log | 500 | Application logs |
| .csv | csv | 1000 | Data exports |
| .json | json | 100 | Configuration, API data |
| .xml | xml | 200 | Structured data |
| .md | markdown | 500 | Documentation |
| .ts, .js, .py, .java | code | 300 | Source code |
| .yml, .yaml | config | 300 | Configuration |
| .sql | sql | 300 | Database scripts |
| .sh, .bash | shell | 300 | Shell scripts |
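The detection logic itself lives inside the server, so the exact behavior may differ, but as a rough sketch the table above can be expressed as a simple extension lookup. Everything below is illustrative: `detectFileType`, the lookup table, and the fallback to `text` with a 500-line chunk size for unknown extensions are assumptions, not the server's actual code.

```typescript
import { extname } from "node:path";

// Hypothetical lookup mirroring the table above; the real server may
// also inspect file contents or handle extensions not listed here.
const TYPE_BY_EXTENSION: Record<string, { detectedType: string; chunkSize: number }> = {
  ".txt":  { detectedType: "text",     chunkSize: 500 },
  ".log":  { detectedType: "log",      chunkSize: 500 },
  ".csv":  { detectedType: "csv",      chunkSize: 1000 },
  ".json": { detectedType: "json",     chunkSize: 100 },
  ".xml":  { detectedType: "xml",      chunkSize: 200 },
  ".md":   { detectedType: "markdown", chunkSize: 500 },
  ".ts":   { detectedType: "code",     chunkSize: 300 },
  ".js":   { detectedType: "code",     chunkSize: 300 },
  ".py":   { detectedType: "code",     chunkSize: 300 },
  ".java": { detectedType: "code",     chunkSize: 300 },
  ".yml":  { detectedType: "config",   chunkSize: 300 },
  ".yaml": { detectedType: "config",   chunkSize: 300 },
  ".sql":  { detectedType: "sql",      chunkSize: 300 },
  ".sh":   { detectedType: "shell",    chunkSize: 300 },
  ".bash": { detectedType: "shell",    chunkSize: 300 },
};

function detectFileType(filePath: string): { detectedType: string; chunkSize: number } {
  // Assumed fallback: treat unknown extensions as plain text.
  return TYPE_BY_EXTENSION[extname(filePath).toLowerCase()] ?? { detectedType: "text", chunkSize: 500 };
}
```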
## Use Cases

### 1. Pre-Processing Assessment

Determine optimal processing strategy:

```typescript
const structure = await get_file_structure({ filePath: "/data/large.csv" });

if (structure.fileSize > 100_000_000) {
  // Use streaming approach
  console.log("Large file detected, using streaming");
} else {
  // Can load into memory
  console.log("Small file, loading directly");
}

console.log(`Will process in ${structure.totalChunks} chunks`);
```

### 2. Resource Planning

Calculate processing time and memory requirements:

```typescript
const structure = await get_file_structure({ filePath: "/logs/app.log" });

const estimatedMemory = structure.statistics.avgLineLength * structure.chunkSize;
const estimatedTime = structure.totalChunks * 100; // 100ms per chunk

console.log(`Memory per chunk: ${estimatedMemory} bytes`);
console.log(`Estimated processing time: ${estimatedTime}ms`);
```

### 3. Data Quality Check

Identify potential issues:

```typescript
const structure = await get_file_structure({ filePath: "/data/import.csv" });

// Check for unusual line lengths
if (structure.statistics.maxLineLength > 10000) {
  console.warn("Unusually long lines detected");
}

// Check for empty lines
const emptyLinePercent = (structure.statistics.emptyLines / structure.totalLines) * 100;
if (emptyLinePercent > 10) {
  console.warn(`${emptyLinePercent}% empty lines`);
}
```

### 4. File Comparison

Compare multiple files:

```typescript
const file1 = await get_file_structure({ filePath: "/data/old.csv" });
const file2 = await get_file_structure({ filePath: "/data/new.csv" });

console.log(`Line difference: ${file2.totalLines - file1.totalLines}`);
console.log(`Size difference: ${file2.fileSize - file1.fileSize} bytes`);
```

### 5. Archive Decision

Determine whether a file should be archived:

```typescript
const structure = await get_file_structure({ filePath: "/logs/old.log" });

const daysSinceModified = (Date.now() - structure.modified.getTime()) / (1000 * 60 * 60 * 24);

if (daysSinceModified > 30 && structure.fileSize > 10_000_000) {
  console.log("Consider archiving this file");
}
```

## Statistics Interpretation

### Average Line Length

Indicates file structure:

- **< 50 chars**: Likely structured data or code
- **50-200 chars**: Normal text/logs
- **> 200 chars**: Verbose logs or JSON

### Max Line Length

Warns about potential issues:

- **> 1000 chars**: May cause performance issues
- **> 10000 chars**: Consider pre-processing

### Empty Lines

Indicates formatting:

- **0%**: Dense data files (CSV, JSON)
- **5-10%**: Normal code/text
- **> 20%**: Sparse formatting or issues

## Performance

| File Size | Analysis Time | Memory Usage | Notes |
|-----------|--------------|--------------|-------|
| < 1MB | < 50ms | Minimal | Full scan |
| 1-10MB | 50-200ms | < 10MB | Streaming |
| 10-100MB | 200-1000ms | < 50MB | Line counting |
| 100MB-1GB | 1-5s | < 100MB | Optimized scan |
| > 1GB | 5-30s | < 200MB | Progressive |

## Error Handling

### File Not Found

```json
{
  "error": "File not found: /path/to/file.csv",
  "code": "ENOENT"
}
```

### Permission Denied

```json
{
  "error": "Permission denied: /root/protected.log",
  "code": "EACCES"
}
```

### Unsupported File Type

```json
{
  "error": "Binary file not supported: /data/image.png",
  "code": "UNSUPPORTED_TYPE"
}
```
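How these errors surface depends on the MCP client in use. As a rough sketch, assuming the call rejects with an error object carrying the `code` values shown above (an assumption, not a documented guarantee), a caller might branch on them like this:

```typescript
// Sketch only: get_file_structure is invoked however your MCP client
// exposes the tool, as in the examples above. The error shape (an
// exception with a `code` property) is an assumption.
async function analyzeSafely(filePath: string) {
  try {
    return await get_file_structure({ filePath });
  } catch (err: any) {
    switch (err?.code) {
      case "ENOENT":
        console.error(`Missing file, skipping: ${filePath}`);
        return null;
      case "EACCES":
        console.error(`No read permission for: ${filePath}`);
        return null;
      case "UNSUPPORTED_TYPE":
        console.error(`Binary or unsupported file: ${filePath}`);
        return null;
      default:
        throw err; // Unknown failure: let it propagate
    }
  }
}
```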
## Best Practices

### 1. Always Analyze Before Processing

Check file structure before heavy operations:

```typescript
// Good: Analyze first
const structure = await get_file_structure({ filePath });
console.log(`Processing ${structure.totalChunks} chunks`);

// Then process
for (let i = 0; i < structure.totalChunks; i++) {
  await process_chunk(filePath, i);
}
```

### 2. Cache Structure Information

Structure rarely changes, so cache it:

```typescript
const structureCache = new Map();

async function getStructureCached(filePath) {
  if (!structureCache.has(filePath)) {
    const structure = await get_file_structure({ filePath });
    structureCache.set(filePath, structure);
  }
  return structureCache.get(filePath);
}
```

### 3. Validate File Size

Check size before processing:

```typescript
const structure = await get_file_structure({ filePath });

if (structure.fileSize > MAX_FILE_SIZE) {
  throw new Error(`File too large: ${structure.fileSize} bytes`);
}
```

### 4. Use Statistics for Optimization

Adapt chunk size based on line length:

```typescript
const structure = await get_file_structure({ filePath });

const optimalChunkSize = structure.statistics.avgLineLength < 100
  ? 1000 // Short lines, larger chunks
  : 300; // Long lines, smaller chunks
```

## See Also

- [Tools Overview](/api/reference) - All available tools
- [read_large_file_chunk](/api/read-chunk) - Read file chunks
- [search_in_large_file](/api/search) - Search within files
- [get_file_summary](/api/summary) - Quick file overview
- [Performance Guide](/guide/performance) - Optimization tips
- [Best Practices](/guide/best-practices) - Usage recommendations
