# Large File MCP Server

A Model Context Protocol (MCP) server for intelligent handling of large files, with smart chunking, navigation, and streaming capabilities.
## Features

- **Smart Chunking** - Automatically determines the optimal chunk size based on file type
- **Intelligent Navigation** - Jump to specific lines with surrounding context
- **Powerful Search** - Regex support with context lines before and after matches
- **File Analysis** - Comprehensive metadata and statistical analysis
- **Memory Efficient** - Stream files of any size without loading them into memory
- **Performance Optimized** - Built-in LRU caching for frequently accessed chunks
- **Type Safe** - Written in TypeScript with strict typing
- **Cross-Platform** - Works on Windows, macOS, and Linux
## Installation

Install the package from npm, or use it directly with npx:
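A typical setup looks like the following; the package name `large-file-mcp-server` is illustrative, so substitute the name the project is actually published under:

```bash
# Global install (package name is illustrative)
npm install -g large-file-mcp-server

# Or run on demand without installing
npx large-file-mcp-server
```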
## Quick Start
### Claude Code CLI

Add the MCP server using the CLI, verify the installation, and remove it if no longer needed:
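With the Claude Code CLI these steps map to `claude mcp` subcommands; the server name `large-file` and the package name are illustrative:

```bash
# Add the server
claude mcp add large-file -- npx -y large-file-mcp-server

# Verify it is registered
claude mcp list

# Remove it if needed
claude mcp remove large-file
```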
**MCP Scopes:**

- `local` - Available only in the current project directory
- `user` - Available globally for all projects
- `project` - Defined in `.mcp.json` for team sharing
### Claude Desktop

Add an entry to your `claude_desktop_config.json`:
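A minimal entry, assuming the server is launched via npx under an illustrative package name:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"]
    }
  }
}
```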
**Config file locations:**

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Restart Claude Desktop after editing.
### Other AI Platforms

**Gemini:**
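The Gemini CLI declares MCP servers under an `mcpServers` key in `~/.gemini/settings.json`; this sketch assumes the same illustrative package name as above:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"]
    }
  }
}
```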
## Usage

Once configured, you can use natural language to interact with large files:
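For example, prompts along these lines would route to the server's tools (file paths are illustrative):

```text
"Read the first chunk of /var/log/app.log"
"Search for ERROR in /var/log/app.log with 3 lines of context"
"Jump to line 4250 in /data/export.csv"
```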
## Available Tools
### read_large_file_chunk

Read a specific chunk of a large file with intelligent chunking.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkIndex` (optional): Zero-based chunk index (default: 0)
- `linesPerChunk` (optional): Lines per chunk (auto-detected if not provided)
- `includeLineNumbers` (optional): Include line numbers (default: false)
Example:
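A hypothetical invocation (the path and values are illustrative):

```json
{
  "filePath": "/var/log/app.log",
  "chunkIndex": 2,
  "linesPerChunk": 500,
  "includeLineNumbers": true
}
```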
### search_in_large_file

Search for patterns in large files with context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `pattern` (required): Search pattern
- `caseSensitive` (optional): Case-sensitive search (default: false)
- `regex` (optional): Treat the pattern as a regex (default: false)
- `maxResults` (optional): Maximum results (default: 100)
- `contextBefore` (optional): Context lines before each match (default: 2)
- `contextAfter` (optional): Context lines after each match (default: 2)
Example:
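A hypothetical invocation (the path and pattern are illustrative):

```json
{
  "filePath": "/var/log/app.log",
  "pattern": "ERROR|FATAL",
  "regex": true,
  "maxResults": 50,
  "contextBefore": 3,
  "contextAfter": 3
}
```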
### get_file_structure

Analyze file structure and get comprehensive metadata.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, recommended chunk size, and sample lines.
### navigate_to_line

Jump to a specific line with surrounding context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `lineNumber` (required): Line number to navigate to (1-indexed)
- `contextLines` (optional): Context lines before/after (default: 5)
### get_file_summary

Get a comprehensive statistical summary of a file.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, character statistics, and word count.
### stream_large_file

Stream a file in chunks, for processing very large files.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkSize` (optional): Chunk size in bytes (default: 64KB)
- `startOffset` (optional): Starting byte offset (default: 0)
- `maxChunks` (optional): Maximum chunks to return (default: 10)
## Supported File Types

The server intelligently detects and optimizes for:

- Text files (`.txt`) - 500 lines/chunk
- Log files (`.log`) - 500 lines/chunk
- Code files (`.ts`, `.js`, `.py`, `.java`, `.cpp`, `.go`, `.rs`, etc.) - 300 lines/chunk
- CSV files (`.csv`) - 1000 lines/chunk
- JSON files (`.json`) - 100 lines/chunk
- XML files (`.xml`) - 200 lines/chunk
- Markdown files (`.md`) - 500 lines/chunk
- Configuration files (`.yml`, `.yaml`, `.sh`, `.bash`) - 300 lines/chunk
## Configuration

Customize behavior using environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `CHUNK_SIZE` | Default lines per chunk | 500 |
| | Overlap between chunks | 10 |
| | Maximum file size in bytes | 10GB |
| | Cache size in bytes | 100MB |
| | Cache TTL in milliseconds | 5 minutes |
| `CACHE_ENABLED` | Enable/disable caching | true |
Example with custom settings (Claude Desktop):
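A sketch of a Claude Desktop entry with custom settings, using the `CHUNK_SIZE` and `CACHE_ENABLED` variables described above (server and package names are illustrative):

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"],
      "env": {
        "CHUNK_SIZE": "1000",
        "CACHE_ENABLED": "true"
      }
    }
  }
}
```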
Example with custom settings (Claude Code CLI):
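With the Claude Code CLI, environment variables are passed via `-e` when adding the server (server and package names are illustrative):

```bash
claude mcp add large-file -e CHUNK_SIZE=1000 -e CACHE_ENABLED=true -- npx -y large-file-mcp-server
```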
## Examples
### Analyzing Log Files

The AI will use the search tool to find patterns and provide context around each match.

### Code Navigation

Uses regex search to locate function definitions with surrounding code context.

### CSV Data Exploration

Returns metadata, line count, sample rows, and the recommended chunk size.

### Large File Processing

Uses streaming mode to handle very large files efficiently.
## Performance

### Caching

- LRU cache with configurable size (default 100MB)
- TTL-based expiration (default 5 minutes)
- 80-90% hit rate for repeated access
- Significant performance improvement for frequently accessed files

### Memory Management

- Streaming architecture - files are read line by line, never fully loaded
- Configurable chunk sizes - adjust based on your use case
- Smart buffering - minimal memory footprint for search operations
### File Size Handling

| File Size | Operation Time | Method |
|-----------|----------------|--------|
| < 1MB | < 100ms | Direct read |
| 1-100MB | < 500ms | Streaming |
| 100MB-1GB | 1-3s | Streaming + cache |
| > 1GB | Progressive | AsyncGenerator |
## Development

### Building from Source
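Assuming a standard npm/TypeScript setup (the repository URL placeholder and script names are assumptions):

```bash
git clone <repository-url>
cd large-file-mcp-server
npm install
npm run build
```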
### Development Mode
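A watch-mode script is typical for this kind of project (the script name is an assumption):

```bash
npm run dev
```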
### Project Structure
## Troubleshooting

### File not accessible

Ensure the file path is absolute and that the file has read permissions:
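A quick check from the shell; the path is illustrative, and this self-contained sketch creates a demo file first so the checks can run:

```shell
# Demo setup only: create a file so the checks below have something to test
FILE="/tmp/example.log"
printf 'demo\n' > "$FILE"

# The path must be absolute (i.e. start with /)
case "$FILE" in /*) echo "absolute path" ;; *) echo "relative path" ;; esac

# The file must be readable
test -r "$FILE" && echo "readable" || echo "not readable"
```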
### Out of memory

- Reduce the `CHUNK_SIZE` environment variable
- Disable the cache with `CACHE_ENABLED=false`
- Use `stream_large_file` for very large files
### Slow search performance

- Reduce the `maxResults` parameter
- Use `startLine` and `endLine` to limit the search range
- Ensure caching is enabled
### Claude Code CLI: MCP server not found

Check whether the server is installed; if it is not listed, reinstall it, then check the server's health:
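The checks above, with illustrative server and package names:

```bash
# Is the server installed?
claude mcp list

# Reinstall if it is missing
claude mcp add large-file -- npx -y large-file-mcp-server

# Inspect a single server's status
claude mcp get large-file
```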
## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

### Development Workflow

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Ensure the code builds and lints successfully
5. Submit a pull request

See CONTRIBUTING.md for detailed guidelines.
## License

MIT
## Support

- **Issues:** GitHub Issues
- **Documentation:** This README and inline code documentation
- **Examples:** See the `examples/` directory
## Acknowledgments

Built with the Model Context Protocol SDK. Made for the AI developer community.