# Large File MCP Server

A Model Context Protocol (MCP) server for intelligent handling of large files, with smart chunking, navigation, and streaming capabilities.
## Features

- **Smart Chunking** - Automatically determines the optimal chunk size based on file type
- **Intelligent Navigation** - Jump to specific lines with surrounding context
- **Powerful Search** - Regex support with context lines before and after matches
- **File Analysis** - Comprehensive metadata and statistical analysis
- **Memory Efficient** - Stream files of any size without loading them into memory
- **Performance Optimized** - Built-in LRU caching for frequently accessed chunks
- **Type Safe** - Written in TypeScript with strict typing
- **Cross-Platform** - Works on Windows, macOS, and Linux
## Installation

Install the package from npm, or use it directly with npx:
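A typical setup looks like the following; the package name `large-file-mcp-server` is illustrative, so substitute the name the project is actually published under:

```bash
# Global install (package name is illustrative)
npm install -g large-file-mcp-server

# Or run on demand without installing
npx large-file-mcp-server
```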
## Quick Start
### Claude Code CLI

Add the MCP server using the CLI, verify the installation, and remove it if no longer needed:
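With the Claude Code CLI these steps map to `claude mcp` subcommands; the server name `large-file` and the package name are illustrative:

```bash
# Add the server
claude mcp add large-file -- npx -y large-file-mcp-server

# Verify it is registered
claude mcp list

# Remove it if needed
claude mcp remove large-file
```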
**MCP Scopes:**

- `local` - Available only in the current project directory
- `user` - Available globally for all projects
- `project` - Defined in `.mcp.json` for team sharing
### Claude Desktop

Add an entry to your `claude_desktop_config.json`:
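A minimal entry, assuming the server is launched via npx under an illustrative package name:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"]
    }
  }
}
```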
**Config file locations:**

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Restart Claude Desktop after editing.
### Other AI Platforms

**Gemini:**
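The Gemini CLI declares MCP servers under an `mcpServers` key in `~/.gemini/settings.json`; this sketch assumes the same illustrative package name as above:

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"]
    }
  }
}
```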
## Usage

Once configured, you can use natural language to interact with large files:
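For example, prompts along these lines would route to the server's tools (file paths are illustrative):

```text
"Read the first chunk of /var/log/app.log"
"Search for ERROR in /var/log/app.log with 3 lines of context"
"Jump to line 4250 in /data/export.csv"
```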
## Available Tools
### read_large_file_chunk

Read a specific chunk of a large file with intelligent chunking.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkIndex` (optional): Zero-based chunk index (default: 0)
- `linesPerChunk` (optional): Lines per chunk (auto-detected if not provided)
- `includeLineNumbers` (optional): Include line numbers (default: false)
Example:
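A hypothetical invocation (the path and values are illustrative):

```json
{
  "filePath": "/var/log/app.log",
  "chunkIndex": 2,
  "linesPerChunk": 500,
  "includeLineNumbers": true
}
```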
### search_in_large_file

Search for patterns in large files with context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `pattern` (required): Search pattern
- `caseSensitive` (optional): Case-sensitive search (default: false)
- `regex` (optional): Treat the pattern as a regex (default: false)
- `maxResults` (optional): Maximum results (default: 100)
- `contextBefore` (optional): Context lines before each match (default: 2)
- `contextAfter` (optional): Context lines after each match (default: 2)
Example:
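A hypothetical invocation (the path and pattern are illustrative):

```json
{
  "filePath": "/var/log/app.log",
  "pattern": "ERROR|FATAL",
  "regex": true,
  "maxResults": 50,
  "contextBefore": 3,
  "contextAfter": 3
}
```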
### get_file_structure

Analyze file structure and get comprehensive metadata.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, recommended chunk size, and sample lines.
### navigate_to_line

Jump to a specific line with surrounding context.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `lineNumber` (required): Line number to navigate to (1-indexed)
- `contextLines` (optional): Context lines before/after (default: 5)
### get_file_summary

Get a comprehensive statistical summary of a file.

**Parameters:**

- `filePath` (required): Absolute path to the file

**Returns:** File metadata, line statistics, character statistics, and word count.
### stream_large_file

Stream a file in chunks, for processing very large files.

**Parameters:**

- `filePath` (required): Absolute path to the file
- `chunkSize` (optional): Chunk size in bytes (default: 64KB)
- `startOffset` (optional): Starting byte offset (default: 0)
- `maxChunks` (optional): Maximum chunks to return (default: 10)
## Supported File Types

The server intelligently detects and optimizes for:

- Text files (`.txt`) - 500 lines/chunk
- Log files (`.log`) - 500 lines/chunk
- Code files (`.ts`, `.js`, `.py`, `.java`, `.cpp`, `.go`, `.rs`, etc.) - 300 lines/chunk
- CSV files (`.csv`) - 1000 lines/chunk
- JSON files (`.json`) - 100 lines/chunk
- XML files (`.xml`) - 200 lines/chunk
- Markdown files (`.md`) - 500 lines/chunk
- Configuration files (`.yml`, `.yaml`, `.sh`, `.bash`) - 300 lines/chunk
## Configuration

Customize behavior using environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `CHUNK_SIZE` | Default lines per chunk | 500 |
| | Overlap between chunks | 10 |
| | Maximum file size in bytes | 10GB |
| | Cache size in bytes | 100MB |
| | Cache TTL in milliseconds | 5 minutes |
| `CACHE_ENABLED` | Enable/disable caching | true |
Example with custom settings (Claude Desktop):
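A sketch of a Claude Desktop entry with custom settings, using the `CHUNK_SIZE` and `CACHE_ENABLED` variables described above (server and package names are illustrative):

```json
{
  "mcpServers": {
    "large-file": {
      "command": "npx",
      "args": ["-y", "large-file-mcp-server"],
      "env": {
        "CHUNK_SIZE": "1000",
        "CACHE_ENABLED": "true"
      }
    }
  }
}
```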
Example with custom settings (Claude Code CLI):
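With the Claude Code CLI, environment variables are passed via `-e` when adding the server (server and package names are illustrative):

```bash
claude mcp add large-file -e CHUNK_SIZE=1000 -e CACHE_ENABLED=true -- npx -y large-file-mcp-server
```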
## Examples
### Analyzing Log Files

The AI will use the search tool to find patterns and provide context around each match.

### Code Navigation

Uses regex search to locate function definitions with surrounding code context.

### CSV Data Exploration

Returns metadata, line count, sample rows, and the recommended chunk size.

### Large File Processing

Uses streaming mode to handle very large files efficiently.
## Performance

### Caching

- LRU cache with configurable size (default 100MB)
- TTL-based expiration (default 5 minutes)
- 80-90% hit rate for repeated access
- Significant performance improvement for frequently accessed files

### Memory Management

- Streaming architecture - files are read line by line, never fully loaded
- Configurable chunk sizes - adjust based on your use case
- Smart buffering - minimal memory footprint for search operations
### File Size Handling

| File Size | Operation Time | Method |
|-----------|----------------|--------|
| < 1MB | < 100ms | Direct read |
| 1-100MB | < 500ms | Streaming |
| 100MB-1GB | 1-3s | Streaming + cache |
| > 1GB | Progressive | AsyncGenerator |
## Development

### Building from Source
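Assuming a standard npm/TypeScript setup (the repository URL placeholder and script names are assumptions):

```bash
git clone <repository-url>
cd large-file-mcp-server
npm install
npm run build
```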
### Development Mode
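A watch-mode script is typical for this kind of project (the script name is an assumption):

```bash
npm run dev
```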
### Project Structure
## Troubleshooting

### File not accessible

Ensure the file path is absolute and that the file has read permissions:
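A quick check from the shell; the path is illustrative, and this self-contained sketch creates a demo file first so the checks can run:

```shell
# Demo setup only: create a file so the checks below have something to test
FILE="/tmp/example.log"
printf 'demo\n' > "$FILE"

# The path must be absolute (i.e. start with /)
case "$FILE" in /*) echo "absolute path" ;; *) echo "relative path" ;; esac

# The file must be readable
test -r "$FILE" && echo "readable" || echo "not readable"
```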
### Out of memory

- Reduce the `CHUNK_SIZE` environment variable
- Disable the cache with `CACHE_ENABLED=false`
- Use `stream_large_file` for very large files
### Slow search performance

- Reduce the `maxResults` parameter
- Use `startLine` and `endLine` to limit the search range
- Ensure caching is enabled
### Claude Code CLI: MCP server not found

Check whether the server is installed; if it is not listed, reinstall it, then check the server's health:
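The checks above, with illustrative server and package names:

```bash
# Is the server installed?
claude mcp list

# Reinstall if it is missing
claude mcp add large-file -- npx -y large-file-mcp-server

# Inspect a single server's status
claude mcp get large-file
```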
## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

### Development Workflow

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Ensure the code builds and lints successfully
5. Submit a pull request

See CONTRIBUTING.md for detailed guidelines.
## License

MIT
## Support

- **Issues:** GitHub Issues
- **Documentation:** This README and inline code documentation
- **Examples:** See the `examples/` directory
## Acknowledgments

Built with the Model Context Protocol SDK. Made for the AI developer community.