# Usage Examples for Large File MCP Server
## Example 1: Analyzing Large Log Files
### Scenario: Application Log Analysis
You have a 2GB application log file and need to find all errors related to database connections.
```json
// Step 1: Get file structure
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/var/log/app/application.log"
  }
}

// Response shows: 5,000,000 lines, recommended chunk size: 500 lines

// Step 2: Search for database errors
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/var/log/app/application.log",
    "pattern": "(ERROR|FATAL).*database",
    "regex": true,
    "maxResults": 100,
    "contextBefore": 3,
    "contextAfter": 3
  }
}

// Step 3: Navigate to specific error for detailed context
{
  "tool": "navigate_to_line",
  "arguments": {
    "filePath": "/var/log/app/application.log",
    "lineNumber": 2543891,
    "contextLines": 20
  }
}
```
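If you are driving these tools from your own client code, the three calls above can be chained so the search result feeds the navigation step. The sketch below assumes the official TypeScript MCP SDK (`@modelcontextprotocol/sdk`) and a local stdio launch via `node dist/index.js`; the launch command and the way you parse the returned content blocks are assumptions, not part of this server's documented interface.
```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Assumption: the server runs as a local stdio process started with
  // `node dist/index.js`; adjust to however your server is actually launched.
  const transport = new StdioClientTransport({ command: "node", args: ["dist/index.js"] });
  const client = new Client({ name: "log-analysis-example", version: "1.0.0" }, { capabilities: {} });
  await client.connect(transport);

  // Step 1: structure first, so later calls can pick sensible chunk sizes.
  await client.callTool({
    name: "get_file_structure",
    arguments: { filePath: "/var/log/app/application.log" },
  });

  // Step 2: regex search for database-related errors with 3 lines of context.
  const search = await client.callTool({
    name: "search_in_large_file",
    arguments: {
      filePath: "/var/log/app/application.log",
      pattern: "(ERROR|FATAL).*database",
      regex: true,
      maxResults: 100,
      contextBefore: 3,
      contextAfter: 3,
    },
  });
  console.log(search.content); // content blocks; parse them to pick a line number

  // Step 3: widen the view around one interesting hit.
  await client.callTool({
    name: "navigate_to_line",
    arguments: { filePath: "/var/log/app/application.log", lineNumber: 2543891, contextLines: 20 },
  });

  await client.close();
}

main().catch(console.error);
```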
## Example 2: Processing Large CSV Datasets
### Scenario: Sales Data Analysis
You have a 5GB CSV file with 50 million rows of sales data.
```json
// Step 1: Analyze file structure
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/data/sales/2024_sales.csv"
  }
}

// Step 2: Read header and first chunk
{
  "tool": "read_large_file_chunk",
  "arguments": {
    "filePath": "/data/sales/2024_sales.csv",
    "chunkIndex": 0,
    "linesPerChunk": 1000,
    "includeLineNumbers": true
  }
}

// Step 3: Search for specific product
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/data/sales/2024_sales.csv",
    "pattern": "PROD-12345",
    "caseSensitive": true,
    "maxResults": 1000
  }
}

// Step 4: Stream entire file for processing
{
  "tool": "stream_large_file",
  "arguments": {
    "filePath": "/data/sales/2024_sales.csv",
    "chunkSize": 1048576,
    "maxChunks": 50
  }
}
```
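A common follow-up is mapping a row number (for example, one returned by the product search) back to the chunk that contains it. This is a minimal sketch assuming a hypothetical `callTool(name, args)` helper, e.g. a thin wrapper around the SDK call shown in Example 1, and 0-based chunk indices over 1-based line numbers, consistent with the `(125000 / 500) - 1` arithmetic in Example 5; the specific row number is made up for illustration.
```typescript
// Hypothetical helper (assumption): forwards one tool call to the server and
// returns its parsed result; wire this to your own MCP client.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const csv = "/data/sales/2024_sales.csv";
const linesPerChunk = 1000;

// The header travels with chunk 0, so read it once and reuse the column order.
await callTool("read_large_file_chunk", {
  filePath: csv,
  chunkIndex: 0,
  linesPerChunk,
  includeLineNumbers: true,
});

// Mapping a 1-based row number to its chunk is integer arithmetic:
// row 3,700,042 lands in chunk 3700 when chunks hold 1000 lines.
const rowOfInterest = 3_700_042; // hypothetical row number from an earlier search
const chunkIndex = Math.floor((rowOfInterest - 1) / linesPerChunk);
await callTool("read_large_file_chunk", {
  filePath: csv,
  chunkIndex,
  linesPerChunk,
  includeLineNumbers: true,
});
```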
## Example 3: Code Navigation in Large Codebases
### Scenario: Finding Function Definitions
You have a large TypeScript codebase with files over 10,000 lines.
```json
// Step 1: Get file summary
{
  "tool": "get_file_summary",
  "arguments": {
    "filePath": "/project/src/core/engine.ts"
  }
}

// Step 2: Search for all function definitions
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/project/src/core/engine.ts",
    "pattern": "^\\s*(export\\s+)?(async\\s+)?function\\s+\\w+",
    "regex": true,
    "maxResults": 500,
    "contextBefore": 1,
    "contextAfter": 5
  }
}

// Step 3: Navigate to specific function
{
  "tool": "navigate_to_line",
  "arguments": {
    "filePath": "/project/src/core/engine.ts",
    "lineNumber": 3456,
    "contextLines": 30
  }
}

// Step 4: Search for TODOs and FIXMEs
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/project/src/core/engine.ts",
    "pattern": "TODO|FIXME|HACK|XXX",
    "regex": true,
    "contextBefore": 2,
    "contextAfter": 2
  }
}
```
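The function-definition search lends itself to building a quick jump table for the file. The sketch below assumes the same hypothetical `callTool` helper and that search results expose a `matches` array with `lineNumber` and `line` fields; those field names are assumptions to adapt to the actual response.
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const file = "/project/src/core/engine.ts";

// Turn the function-definition search into a quick outline of the file.
const result = await callTool("search_in_large_file", {
  filePath: file,
  pattern: "^\\s*(export\\s+)?(async\\s+)?function\\s+\\w+",
  regex: true,
  maxResults: 500,
});

// Assumed result shape: adapt the field names to the actual payload.
interface FunctionMatch { lineNumber: number; line: string; }
const outline = ((result.matches ?? []) as FunctionMatch[])
  .map((m) => `${m.lineNumber}: ${m.line.trim()}`);
console.log(outline.join("\n"));

// Any entry in the outline can then be opened with navigate_to_line.
await callTool("navigate_to_line", { filePath: file, lineNumber: 3456, contextLines: 30 });
```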
## Example 4: JSON Data Processing
### Scenario: Large JSON Array Processing
You have a 3GB JSON file with a large array of objects.
```json
// Step 1: Check file structure
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/data/api_responses.json"
  }
}

// Step 2: Read in small chunks (JSON has long lines)
{
  "tool": "read_large_file_chunk",
  "arguments": {
    "filePath": "/data/api_responses.json",
    "chunkIndex": 0,
    "linesPerChunk": 100
  }
}

// Step 3: Search for specific JSON keys
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/data/api_responses.json",
    "pattern": "\"userId\":\\s*\"user-12345\"",
    "regex": true,
    "contextBefore": 5,
    "contextAfter": 10
  }
}

// Step 4: Stream for complete processing
{
  "tool": "stream_large_file",
  "arguments": {
    "filePath": "/data/api_responses.json",
    "chunkSize": 524288,
    "maxChunks": 20
  }
}
```
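When processing streamed chunks on the client side, a pattern can straddle two chunks, so keep a small carry-over tail between iterations. This sketch assumes the hypothetical `callTool` helper, a `chunks[].data` field on the stream result, and an exact `"userId": "user-12345"` spelling; all three are assumptions about payloads and data, not documented behavior.
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const file = "/data/api_responses.json";
const needle = '"userId": "user-12345"'; // assumes this exact spacing in the data

// Count occurrences across streamed chunks. The `chunks[].data` field name is
// an assumption about stream_large_file's response. The carry-over tail keeps
// matches that straddle a chunk boundary from being missed.
const stream = await callTool("stream_large_file", {
  filePath: file,
  chunkSize: 524288,
  maxChunks: 20,
});

let carry = "";
let count = 0;
for (const chunk of (stream.chunks ?? []) as { data: string }[]) {
  const text = carry + chunk.data;
  count += text.split(needle).length - 1;
  carry = text.slice(Math.max(0, text.length - (needle.length - 1)));
}
console.log(`Found ${count} occurrences of ${needle}`);
```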
## Example 5: System Log Monitoring
### Scenario: Real-time Error Detection
Monitor recent entries in a constantly growing log file.
```json
// Step 1: Get file metadata
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/var/log/syslog"
  }
}

// Response: totalLines: 125000

// Step 2: Read last chunk (most recent entries)
{
  "tool": "read_large_file_chunk",
  "arguments": {
    "filePath": "/var/log/syslog",
    "chunkIndex": 249, // (125000 / 500) - 1
    "includeLineNumbers": true
  }
}

// Step 3: Search for errors in recent entries only
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/var/log/syslog",
    "pattern": "error|critical|fatal",
    "regex": true,
    "caseSensitive": false,
    "startLine": 120000,
    "endLine": 125000,
    "contextBefore": 3,
    "contextAfter": 3
  }
}
```
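Rather than hard-coding chunk 249, a client can derive the last chunk index from the line count reported by `get_file_structure`. A minimal sketch, assuming the hypothetical `callTool` helper and a `totalLines` field matching the example response above (the exact field name is an assumption):
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const log = "/var/log/syslog";
const linesPerChunk = 500;

// Last chunk index for 0-based chunks: ceil(totalLines / linesPerChunk) - 1.
// With 125,000 lines and 500-line chunks this gives 249, matching Step 2.
const structure = await callTool("get_file_structure", { filePath: log });
const lastChunk = Math.ceil(structure.totalLines / linesPerChunk) - 1;

await callTool("read_large_file_chunk", {
  filePath: log,
  chunkIndex: lastChunk,
  linesPerChunk,
  includeLineNumbers: true,
});

// Search only the tail of the file (roughly the last 5,000 lines, as in Step 3).
await callTool("search_in_large_file", {
  filePath: log,
  pattern: "error|critical|fatal",
  regex: true,
  caseSensitive: false,
  startLine: Math.max(1, structure.totalLines - 5000),
  endLine: structure.totalLines,
  contextBefore: 3,
  contextAfter: 3,
});
```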
## Example 6: Multi-File Analysis
### Scenario: Searching Across Multiple Log Files
Process multiple log files from different dates.
```json
// For each file, run structure analysis
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/logs/app-2024-01-01.log"
  }
}
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/logs/app-2024-01-02.log"
  }
}

// Search each file for the pattern
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/logs/app-2024-01-01.log",
    "pattern": "OutOfMemoryError",
    "maxResults": 50
  }
}
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/logs/app-2024-01-02.log",
    "pattern": "OutOfMemoryError",
    "maxResults": 50
  }
}

// Get summary statistics for comparison
{
  "tool": "get_file_summary",
  "arguments": {
    "filePath": "/logs/app-2024-01-01.log"
  }
}
{
  "tool": "get_file_summary",
  "arguments": {
    "filePath": "/logs/app-2024-01-02.log"
  }
}
```
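The same three calls repeat per file, so a small loop keeps multi-day comparisons tidy. A sketch assuming the hypothetical `callTool` helper and a `matches` array on the search result (an assumption about the payload):
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

// Repeat the structure/search/summary sequence per file and collect hit counts.
const days = ["2024-01-01", "2024-01-02"];
const oomHits: Record<string, number> = {};

for (const day of days) {
  const filePath = `/logs/app-${day}.log`;
  await callTool("get_file_structure", { filePath });
  const search = await callTool("search_in_large_file", {
    filePath,
    pattern: "OutOfMemoryError",
    maxResults: 50,
  });
  await callTool("get_file_summary", { filePath });
  // `matches` is an assumed field name on the search result.
  oomHits[filePath] = Array.isArray(search.matches) ? search.matches.length : 0;
}

console.table(oomHits);
```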
## Example 7: Performance-Optimized Large File Processing
### Scenario: Processing a 10GB File Efficiently
```json
// Step 1: Analyze file to understand size
{
  "tool": "get_file_structure",
  "arguments": {
    "filePath": "/data/huge_dataset.txt"
  }
}

// Response: 200,000,000 lines, 10GB

// Step 2: Stream in large chunks for processing
{
  "tool": "stream_large_file",
  "arguments": {
    "filePath": "/data/huge_dataset.txt",
    "chunkSize": 10485760, // 10MB chunks
    "maxChunks": 100,
    "startOffset": 0
  }
}

// Step 3: Continue from offset (for next batch)
{
  "tool": "stream_large_file",
  "arguments": {
    "filePath": "/data/huge_dataset.txt",
    "chunkSize": 10485760,
    "maxChunks": 100,
    "startOffset": 1048576000 // ~1GB: resumes right after the 100 x 10MB chunks read in Step 2
  }
}

// Step 4: If looking for specific data, use targeted search
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/data/huge_dataset.txt",
    "pattern": "critical_marker",
    "maxResults": 10,
    "startLine": 1000000,
    "endLine": 2000000
  }
}
```
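Because each batch reads at most `maxChunks * chunkSize` bytes, the next `startOffset` can be computed instead of hard-coded. A sketch with the hypothetical `callTool` helper; in practice the total size would come from `get_file_structure`, and the exact field that reports it is an assumption, so a constant stands in for it here.
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const file = "/data/huge_dataset.txt";
const chunkSize = 10 * 1024 * 1024;             // 10MB, as in Step 2
const maxChunks = 100;                          // so each batch covers ~1GB
const fileSizeBytes = 10 * 1024 * 1024 * 1024;  // in practice, take this from get_file_structure

// Each stream_large_file call reads at most maxChunks * chunkSize bytes, so the
// next batch starts exactly where the previous one ended.
for (let startOffset = 0; startOffset < fileSizeBytes; startOffset += chunkSize * maxChunks) {
  const batch = await callTool("stream_large_file", {
    filePath: file,
    chunkSize,
    maxChunks,
    startOffset,
  });
  // ...process `batch` here before requesting the next one...
}
```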
## Example 8: Code Review Workflow
### Scenario: Reviewing Large PR Changes
```json
// Step 1: Get file overview
{
  "tool": "get_file_summary",
  "arguments": {
    "filePath": "/project/src/components/Dashboard.tsx"
  }
}

// Step 2: Find all React components
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/project/src/components/Dashboard.tsx",
    "pattern": "^(export\\s+)?(const|function)\\s+\\w+.*=.*React",
    "regex": true,
    "contextBefore": 2,
    "contextAfter": 10
  }
}

// Step 3: Find all useEffect hooks (plain string search, so no escaping needed)
{
  "tool": "search_in_large_file",
  "arguments": {
    "filePath": "/project/src/components/Dashboard.tsx",
    "pattern": "useEffect(",
    "regex": false,
    "contextBefore": 3,
    "contextAfter": 15
  }
}

// Step 4: Navigate to specific component
{
  "tool": "navigate_to_line",
  "arguments": {
    "filePath": "/project/src/components/Dashboard.tsx",
    "lineNumber": 567,
    "contextLines": 25
  }
}
```
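During review it helps to quantify the hits before jumping around. A sketch assuming the hypothetical `callTool` helper and a `matches` array of `{ lineNumber }` objects (an assumption about the search payload):
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

const file = "/project/src/components/Dashboard.tsx";

// Count useEffect hooks, then jump straight to the first one.
const hooks = await callTool("search_in_large_file", {
  filePath: file,
  pattern: "useEffect(",
  regex: false,
  contextBefore: 3,
  contextAfter: 15,
});

// `matches[].lineNumber` is an assumed field name on the search result.
const matches = (hooks.matches ?? []) as { lineNumber: number }[];
console.log(`${matches.length} useEffect hooks in ${file}`);

if (matches.length > 0) {
  await callTool("navigate_to_line", {
    filePath: file,
    lineNumber: matches[0].lineNumber,
    contextLines: 25,
  });
}
```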
## Best Practices
### 1. Start with File Structure
Always begin by calling `get_file_structure` to understand:
- File size
- Total lines
- Recommended chunk size
- File type
### 2. Use Appropriate Chunk Sizes
- **Small files (<1000 lines)**: Use default or smaller chunks
- **Medium files (1K-100K lines)**: Use recommended chunk size
- **Large files (>100K lines)**: Use larger chunks or streaming, as sketched below
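One way to encode these thresholds in client code (a sketch; the concrete values other than the thresholds above are illustrative assumptions, not values defined by the server):
```typescript
// Strategy picker for the thresholds above. The 500-line fallback and the 1MB
// stream chunk size are illustrative assumptions, not values the server defines.
type ReadStrategy =
  | { kind: "chunks"; linesPerChunk: number }
  | { kind: "stream"; chunkSize: number };

function pickStrategy(totalLines: number, recommendedChunkSize = 500): ReadStrategy {
  if (totalLines < 1_000) {
    return { kind: "chunks", linesPerChunk: Math.max(1, totalLines) }; // small: one read
  }
  if (totalLines <= 100_000) {
    return { kind: "chunks", linesPerChunk: recommendedChunkSize };    // medium: follow the recommendation
  }
  return { kind: "stream", chunkSize: 1_048_576 };                     // large: stream instead
}
```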
### 3. Optimize Search Queries
- Use `startLine` and `endLine` to limit search range
- Set reasonable `maxResults` to avoid overwhelming output
- Use `regex: false` for simple string searches (faster)
- Adjust `contextBefore` and `contextAfter` based on needs
### 4. Leverage Caching
- Enable caching for frequently accessed files
- Adjust `CACHE_SIZE` based on available memory
- Use appropriate `CACHE_TTL` for your use case
### 5. Handle Very Large Files
- Use `stream_large_file` for files >1GB
- Process in batches using `startOffset` and `maxBytes`
- Consider `maxChunks` to limit memory usage
### 6. Error Handling
Always check for errors in responses:
```json
{
  "error": "File not accessible: /path/to/file",
  "code": "TOOL_EXECUTION_ERROR"
}
```
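A minimal guard that matches the error shape above, assuming the hypothetical `callTool` helper; whether failures arrive as an in-band `error` field or as a thrown exception depends on your client wrapper, so both paths are handled:
```typescript
// Hypothetical helper (assumption): one MCP tool call in, parsed result out.
declare function callTool(name: string, args: Record<string, unknown>): Promise<any>;

try {
  const result = await callTool("get_file_structure", { filePath: "/path/to/file" });
  if (result && typeof result.error === "string") {
    // Error reported in-band, matching the shape shown above.
    console.error(`Tool failed (${result.code ?? "unknown code"}): ${result.error}`);
  } else {
    // ...safe to use `result` here...
  }
} catch (err) {
  // Transport- or protocol-level failure surfaced by the client itself.
  console.error("Tool call failed:", err);
}
```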
## Performance Tips
1. **Cache Warm-up**: Read file structure first to warm cache
2. **Batch Operations**: Group related operations together
3. **Incremental Processing**: Use offsets for very large files
4. **Limit Results**: Use `maxResults` to prevent memory issues
5. **Context Balance**: Don't use excessive `contextBefore`/`contextAfter`