local-explorer-mcp

Research Project: Making Local File System Search Smarter for AI Agents

An experimental MCP (Model Context Protocol) server that explores how AI assistants can intelligently search local file systems using native Unix/Linux commands. This research project aims to improve speed, token efficiency, and security with agent-optimized workflows inspired by octocode-mcp.


🎯 Built-in AI Prompts

Two powerful prompts to auto-generate comprehensive documentation for your codebase:

📐 `generate_architecture_markdown` - Create ARCHITECTURE.md

Automatically generates a complete architecture document by intelligently exploring your codebase:

🔍 Identifies project type, language, and scale
🗺️ Maps entry points, core components, and system boundaries
📊 Discovers architectural patterns and key abstractions
⚡ Token-efficient - uses pagination and discovery-first approach
🎯 Comprehensive - covers structure, invariants, cross-cutting concerns, and ADRs

Perfect for: New contributors, architecture reviews, documentation generation, understanding complex codebases

🤖 `generate_agents_markdown` - Create AGENTS.md

Automatically generates AI-friendly guidance for coding assistants by analyzing your project:

✅ Discovers existing agent configs (CLAUDE.md, .cursorrules, etc.) and merges them
🔒 Maps permissions - which files AI can edit, ask first, or never touch
📋 Extracts setup commands, testing workflows, and style guidelines
🎨 Documents code style, commit format, and PR requirements
📝 Creates comprehensive yet scannable guidelines with visual indicators (✅ ⚠️ 🚫)

Perfect for: AI-assisted development, team onboarding, consistent coding practices, project automation

How to Use

Just ask your AI assistant:

"Use the generate_architecture_markdown prompt to create an ARCHITECTURE.md for this project"
"Use the generate_agents_markdown prompt to create an AGENTS.md for this project"

The prompts work with all the tools below to intelligently explore your codebase, extract key information, and generate comprehensive, accurate documentation automatically!

Why local-explorer-mcp?

Faster codebase research with fewer tokens compared to traditional file reading approaches.

Native Performance: Leverages Unix commands (ripgrep, find, ls) for fast searches
Token Optimized: Automatic pagination, minification, and smart chunking reduce token usage
Security First: Multi-layer validation prevents path traversal, command injection, and sensitive file access
Agent Optimized: Purpose-built workflows with decision trees and hints for AI assistants
Bulk Operations: Process multiple queries in parallel for efficiency gains
Flexible & Extensible: Use these tools in your own MCP server for any use case!

📦 Installation

For Claude Desktop / Claude Code

Quick install with Claude CLI:

claude mcp add -s user local-explorer-mcp npx 'local-explorer-mcp@latest'

Or manually add to your MCP settings configuration file with:

Command: npx
Args: local-explorer-mcp@latest
Environment variable WORKSPACE_ROOT: Set to your project directory path

Requirements

Node.js: >=18.0.0
ripgrep (required for local_ripgrep tool): https://github.com/BurntSushi/ripgrep
- ✅ Already available in Claude Code and Cursor (no installation needed!)
- macOS: brew install ripgrep
- Ubuntu/Debian: apt-get install ripgrep
- Windows: choco install ripgrep
- Other platforms: See ripgrep installation guide

🚀 Quick Start

Note: This is a research project exploring better ways to search local file systems with AI agents. The tools and patterns are flexible and can be adapted for your own use cases!

Once installed, AI assistants can use these tools automatically to:

Explore directory structure - See what files and folders exist in your project
Search for code patterns - Find functions, classes, or any text across all files
Find files by criteria - Locate files by name, size, or modification date
Read file content efficiently - Get exactly the content you need without loading entire files

🛠️ Tools Overview

Four tools for efficient local file system research. Each one wraps a native command or a Node.js file API for speed and returns token-optimized output for AI agents.

Quick Reference

| Tool | Purpose | Unix Command | Pagination |
|------|---------|--------------|------------|
| `local_view_structure` | Directory exploration | `ls` / `fs.readdir` | ✅ Yes |
| `local_ripgrep` | Pattern search | `ripgrep` | ✅ Auto |
| `local_find_files` | File discovery | `find` | ✅ Yes |
| `local_fetch_content` | Content reading | `fs.readFile` | ✅ Yes |

All tools support bulk operations - process multiple queries in parallel for maximum efficiency.

1. `local_view_structure` - Directory Exploration

Explore your codebase structure using fast directory listing. Get an organized view of your files and folders before diving into content.

What It Does

Shows you the files and directories in your project with useful details like size, date, and permissions. You can sort by name, size, time, or file extension. Works great for both shallow and deep exploration of your project structure, and can display results as a visual tree or simple list.

Best For

Understanding how your project is organized
Finding the largest files or most recently changed files
Identifying important entry points and key directories
Getting a quick overview before diving deeper
Exploring complex monorepo structures

Main Options

Path: Which directory to explore (required)
Depth: How many levels deep to look (1 for just that folder, up to 5 for deep exploration)
Sort by: Order results by name, size, time, or file extension
Tree view: Show results as a visual tree (enabled by default)
Filters: Show only files, only directories, or match specific patterns
Hidden files: Include or exclude hidden files (starting with .)
Details: Show extra information like file sizes and dates
Pagination: For large directories, control how much to show at once

Example Workflow

Start with a high-level overview, then drill down into specific directories that interest you. If you're looking for large files, sort by size. If you want to see recent activity, sort by time.

2. `local_ripgrep` - Pattern Search

Search for code patterns across your entire codebase with ripgrep. Find functions, classes, TODOs, or any text pattern quickly, even in large repositories.

Note: Requires ripgrep to be installed on your system. Already available in Claude Code and Cursor!

What It Does

Searches through all your code files for specific text or patterns. You can use simple text searches or powerful regex patterns to find exactly what you're looking for. It shows you where matches are found with surrounding context lines to help you understand the code. Can search across all files or filter by file type, and handles huge codebases efficiently.

Best For

Finding where functions or classes are defined or used
Locating TODO comments, error messages, or specific strings
Understanding how code patterns are used across your project
Quick discovery of files without reading everything
Extracting specific code sections

Main Options

Search basics:

Pattern: What to search for (can be text or regex pattern)
Path: Where to search (the root directory)
Files only: Just list which files contain matches (very fast for discovery)

Search modes:

Discovery mode: Find which files contain your pattern (fastest)
Detailed mode: Show actual matches with context
Paginated mode: Handle large result sets

Pattern matching:

Fixed string: Search for exact text (not regex)
Smart case: Lowercase searches are case-insensitive, mixed case is exact
Case insensitive: Always ignore case
Whole word: Only match complete words

Filters:

File type: Limit to specific file types (like TypeScript, JavaScript, Python)
Include/exclude patterns: Filter by file name patterns
Exclude directories: Skip folders like node_modules

Context:

Context lines: Show surrounding lines for better understanding
Max matches per file: Limit results to keep output manageable
Pagination: Split large results into pages

Recommended Workflow

Start with a discovery search to find which files contain your pattern. This is very fast because it doesn't show the actual matches. Then, once you know which files are relevant, do a detailed search with context lines to see the actual code. Finally, if you need more context, read the full file content.
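The discovery and detailed modes can be pictured as two different `rg` invocations. The sketch below is illustrative only: the `RipgrepQuery` shape and `buildRipgrepArgs` are hypothetical names, not the server's actual code (only `filesOnly`, `maxMatchesPerFile`, `excludeDir`, and `path` appear as option names elsewhere in this README). The ripgrep flags themselves are standard.

```typescript
// Hypothetical query shape for illustration; field names other than
// filesOnly, maxMatchesPerFile, excludeDir and path are assumptions.
interface RipgrepQuery {
  pattern: string;
  path: string;
  filesOnly?: boolean;
  fixedString?: boolean;
  smartCase?: boolean;
  caseInsensitive?: boolean;
  wholeWord?: boolean;
  type?: string;
  excludeDir?: string[];
  contextLines?: number;
  maxMatchesPerFile?: number;
}

// Translate a query into an argument array for `rg` (no shell involved).
function buildRipgrepArgs(q: RipgrepQuery): string[] {
  const args: string[] = [];
  if (q.filesOnly) args.push("--files-with-matches");
  if (q.fixedString) args.push("--fixed-strings");
  if (q.caseInsensitive) args.push("--ignore-case");
  else if (q.smartCase) args.push("--smart-case");
  if (q.wholeWord) args.push("--word-regexp");
  if (q.type) args.push("--type", q.type);
  for (const dir of q.excludeDir ?? []) args.push("--glob", `!${dir}/**`);
  if (!q.filesOnly) {
    // Match output only makes sense when we are not in discovery mode.
    args.push("--line-number");
    if (q.contextLines !== undefined) args.push("--context", String(q.contextLines));
    if (q.maxMatchesPerFile !== undefined) args.push("--max-count", String(q.maxMatchesPerFile));
  }
  // "--" ends option parsing so a pattern starting with "-" is not read as a flag.
  args.push("--", q.pattern, q.path);
  return args;
}

// Step 1: discovery (which files mention the pattern?)
const discovery = buildRipgrepArgs({ pattern: "TODO", path: "src", filesOnly: true });

// Step 2: detailed search with context, limited per file
const detailed = buildRipgrepArgs({
  pattern: "export class",
  path: "src",
  fixedString: true,
  contextLines: 2,
  maxMatchesPerFile: 5,
  excludeDir: ["node_modules"],
});

console.log(discovery.join(" "));
console.log(detailed.join(" "));
```

Passing the arguments as an array, rather than a single command string, keeps the pattern from being interpreted by a shell.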

3. `local_find_files` - Advanced File Discovery

Find files using powerful filtering options. Search by name, size, modification time, permissions—all in one query.

What It Does

Locates files based on various criteria like name patterns, how recently they were modified, their size, and their permissions. You can combine multiple criteria to narrow down exactly what you're looking for. Perfect for finding config files, large files eating up space, or files that were recently changed.

Best For

Finding configuration files (like *.config.js or .env files)
Locating large files that might be taking up too much space
Finding files that were recently modified (great for tracking changes)
Discovering executable scripts in your project
Combining multiple search criteria for precise results

Main Options

File name matching:

Name: Search by file name (case-sensitive)
Iname: Case-insensitive name search
Names: Search for multiple name patterns at once
Regex: Use regular expressions for complex patterns
Path pattern: Match the full file path

Time-based filters:

Modified within: Files changed in the last X days/hours/minutes (e.g., "7d" for 7 days)
Modified before: Files changed more than X time ago
Accessed within: Files opened recently

Size filters:

Size greater than: Files larger than X (e.g., "10M" for 10 megabytes)
Size less than: Files smaller than X (e.g., "100k" for 100 kilobytes)

File attributes:

Type: Filter by files, directories, or symbolic links
Permissions: Match specific permission patterns
Executable: Find only executable files
Readable/Writable: Filter by accessibility
Empty: Find empty files or directories

Search control:

Max depth: How many levels deep to search
Limit: Maximum number of results
Exclude directories: Skip certain folders (like node_modules)
Details: Include size, dates, and permissions in results

Usage Examples

Find recent TypeScript files: Search for files ending in .ts that were modified in the last week, and show their details.

Find large files: Look for files bigger than 1MB and sort them by size to see what's taking up space.

Find config files: Search for multiple config file patterns in your project, but don't go too deep into subdirectories.

Find executable scripts: Locate all executable files in a scripts folder.
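As a rough illustration of how such filters could map onto `find`, here is a minimal sketch. The `FindQuery` shape, `toMinutes`, and `buildFindArgs` are hypothetical names chosen for this example and may differ from the real tool schema. The `find` tests used (`-maxdepth`, `-type`, `-name`, `-iname`, `-mmin`, `-size`, `-empty`) are available in both GNU and BSD find.

```typescript
// Hypothetical query shape; the real tool schema may use different names.
interface FindQuery {
  path: string;
  name?: string;
  iname?: string;
  type?: "f" | "d" | "l";
  modifiedWithin?: string; // e.g. "7d", "12h", "30m"
  sizeGreater?: string; // e.g. "10M"
  sizeLess?: string; // e.g. "100k"
  maxDepth?: number;
  empty?: boolean;
}

// Convert "7d" / "12h" / "30m" into minutes for find's -mmin test.
function toMinutes(span: string): number {
  const match = /^(\d+)([dhm])$/.exec(span);
  if (!match) throw new Error(`Invalid time span: ${span}`);
  const factor = { d: 1440, h: 60, m: 1 }[match[2] as "d" | "h" | "m"];
  return Number(match[1]) * factor;
}

function buildFindArgs(q: FindQuery): string[] {
  const args = [q.path];
  // -maxdepth is a global option, so it goes before the tests.
  if (q.maxDepth !== undefined) args.push("-maxdepth", String(q.maxDepth));
  if (q.type) args.push("-type", q.type);
  if (q.name) args.push("-name", q.name);
  if (q.iname) args.push("-iname", q.iname);
  // "-mmin -N" means "modified less than N minutes ago".
  if (q.modifiedWithin) args.push("-mmin", `-${toMinutes(q.modifiedWithin)}`);
  // Note: find rounds sizes up to whole units before comparing.
  if (q.sizeGreater) args.push("-size", `+${q.sizeGreater}`);
  if (q.sizeLess) args.push("-size", `-${q.sizeLess}`);
  if (q.empty) args.push("-empty");
  return args;
}

// Recent TypeScript files: *.ts modified in the last week
const recentTs = buildFindArgs({ path: "src", name: "*.ts", type: "f", modifiedWithin: "7d" });

// Large files: anything over 1 MB
const largeFiles = buildFindArgs({ path: ".", type: "f", sizeGreater: "1M" });

console.log(["find", ...recentTs].join(" "));
console.log(["find", ...largeFiles].join(" "));
```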

4. `local_fetch_content` - Smart Content Reading

Read file content efficiently with automatic optimization. Get exactly what you need—full files or specific sections matching a pattern.

What It Does

Reads files intelligently based on what you need. You can read entire small files, or extract just specific sections from larger files by searching for a pattern. It automatically optimizes the content by removing unnecessary comments and whitespace to reduce token usage. Handles large files gracefully with pagination so you never run out of memory.

Best For

Reading configuration files
Extracting specific functions or classes after finding them with search
Getting file content with minimal token usage
Reading sections of large files without loading the entire thing
Following up on search results to see more context

Main Options

Reading modes (choose one):

Full content: Read the entire file (best for small config files)
Match string: Extract only sections that contain specific text (most efficient)
Match string context lines: How many lines before and after the match to include (default is 5)

Optimization:

Minified: Remove comments and extra whitespace to save tokens (enabled by default, but turn off for JSON/YAML files where formatting matters)

Pagination (for large files):

Character length: Maximum characters to return at once
Character offset: Where to start reading from (for reading files in chunks)

Usage Examples

Read a config file: Get the full contents of a small configuration file. Great for package.json, tsconfig.json, etc.

Extract a specific function: After searching for a function name, extract just that function with surrounding context. This is the most efficient way to read code.

Read a large file in chunks: For huge files, read them page by page using pagination. Specify how many characters to read and where to start.

Read without minification: When reading structured data like JSON or YAML, disable minification to preserve the formatting.
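The two reading strategies above, character-based pagination and match-based extraction, can be sketched in a few lines. This is an illustrative approximation, not the tool's implementation: `paginate` and `extractMatches` are hypothetical helpers, and the 4000-character default page size is an assumption. The `charOffset`, `charLength`, and `matchString` names and the default of 5 context lines come from this README.

```typescript
// Return one window of text, mirroring the charOffset / charLength options.
// The default page size of 4000 characters is an assumption for this sketch.
function paginate(content: string, charOffset = 0, charLength = 4000) {
  const page = content.slice(charOffset, charOffset + charLength);
  const nextOffset = charOffset + page.length;
  return { page, nextOffset, hasMore: nextOffset < content.length };
}

// Return only the lines containing matchString plus surrounding context,
// merging windows that overlap or touch so no line is repeated.
function extractMatches(content: string, matchString: string, contextLines = 5): string[] {
  const lines = content.split("\n");
  const ranges: Array<[number, number]> = [];
  lines.forEach((line, i) => {
    if (!line.includes(matchString)) return;
    const start = Math.max(0, i - contextLines);
    const end = Math.min(lines.length - 1, i + contextLines);
    const last = ranges[ranges.length - 1];
    if (last && start <= last[1] + 1) last[1] = Math.max(last[1], end);
    else ranges.push([start, end]);
  });
  return ranges.map(([s, e]) => lines.slice(s, e + 1).join("\n"));
}

const sample = ["a", "b", "foo", "c", "d", "e", "f", "foo2", "g"].join("\n");
const sections = extractMatches(sample, "foo", 1);
const firstPage = paginate("abcdefghij", 0, 4);
const lastPage = paginate("abcdefghij", 8, 4);
console.log(sections, firstPage, lastPage);
```

A caller reading a large file in chunks would keep requesting pages, passing the previous `nextOffset` as the new offset, until `hasMore` is false.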

Integration with ripgrep

Works great with ripgrep! First use ripgrep to find where code appears, then use the match information to extract just that section with fetch_content. This gives you focused results without reading entire files.

💡 Usage Examples

Complete Workflows

Understanding a New Codebase

When exploring a new project, start by getting the big picture, then drill down:

Get project structure: View the directory tree sorted by size to understand the major components
Find entry points: Look for common entry files like index.ts, main.ts, or app.ts
Search for key patterns: Use ripgrep to find where classes and functions are exported
Read key files: Once you've identified important files, read their contents

Finding and Fixing a Bug

When tracking down a bug, follow the trail:

Find error message: Search for the error text across your codebase with context lines
Find related files: Look for files with relevant names that were recently modified
Read implementation: Extract the specific function or class that's causing issues

Code Refactoring

When renaming or changing code across multiple files:

Find all usages: First, discover which files use the old function name
Get detailed matches: Then search again with context to see how it's being used
Read affected files: Extract the relevant sections from each file to understand the changes needed

Performance Analysis

When investigating performance or code quality:

Find large files: View directory structure sorted by size to identify potential problem areas
Find recently modified files: Look for files changed in the last day to understand recent activity
Search for TODO comments: Find TODO, FIXME, or HACK comments to see what needs attention

🔒 Security Features

Multi-layer security prevents unauthorized access and command injection.

Command Whitelisting

Only safe Unix commands are allowed:

rg (ripgrep) - fast pattern search (used by local_ripgrep)
find - file discovery (used by local_find_files)
ls - directory listing (used by local_view_structure)

Path Protection

Workspace restriction: All operations limited to WORKSPACE_ROOT directory
Path traversal prevention: .. and symlink attacks blocked
Symlink validation: Links must stay within workspace boundaries
Absolute path resolution: All paths canonicalized before use
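A minimal sketch of the workspace containment check follows. It is not the project's `pathValidator.ts`; `resolveInsideWorkspace` is a hypothetical helper that covers only the lexical part (resolving `..` and rejecting anything outside the root). A complete check would also resolve symlinks, for example with `fs.realpathSync`, before comparing.

```typescript
import * as path from "node:path";

// Resolve a requested path against the workspace root and reject it if the
// result falls outside the root. Lexical check only: symlinks are NOT
// resolved here and would need a realpath step in a real validator.
function resolveInsideWorkspace(workspaceRoot: string, requested: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) throw new Error(`Path outside workspace: ${requested}`);
  return target;
}

// Small helper for the demo below.
function isAllowed(root: string, requested: string): boolean {
  try {
    resolveInsideWorkspace(root, requested);
    return true;
  } catch {
    return false;
  }
}

console.log(isAllowed("/ws", "src/index.ts"));
console.log(isAllowed("/ws", "../etc/passwd"));
console.log(isAllowed("/ws", "/ws-other/file"));
```

Comparing with `path.relative` instead of a plain string prefix avoids treating a sibling directory such as `/ws-other` as if it were inside `/ws`.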

Automatic Sensitive File Filtering

Blocks access to sensitive files and directories:

Secrets & Credentials:

.env, .env.* files
credentials.json, secrets.*
Private keys: *.pem, *.key, id_rsa
Certificates: *.crt, *.cer
config.json with sensitive data

Dependencies:

node_modules/, vendor/, __pycache__/
.yarn/, .pnp.*

Build Artifacts:

dist/, build/, out/, target/
coverage/, .next/, .nuxt/
*.min.js, *.bundle.js

Version Control:

.git/, .svn/, .hg/, .bzr/

IDE & System:

.vscode/, .idea/, .DS_Store
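To make the filtering rules concrete, here is a simplified sketch built from the lists above. It is not the project's `ignoredPathFilter.ts`: `isIgnoredPath` and the exact regular expressions are assumptions, and the content-dependent rule for `config.json` is left out because it cannot be decided from the path alone.

```typescript
// Directory names taken from the lists above; any path segment matching
// one of these is treated as ignored.
const IGNORED_DIRS = new Set([
  "node_modules", "vendor", "__pycache__", ".yarn",
  "dist", "build", "out", "target", "coverage", ".next", ".nuxt",
  ".git", ".svn", ".hg", ".bzr",
  ".vscode", ".idea",
]);

// File-name patterns (matched against the last path segment only).
const IGNORED_FILE_PATTERNS: RegExp[] = [
  /^\.env(\..+)?$/, // .env, .env.*
  /^credentials\.json$/,
  /^secrets\..+$/, // secrets.*
  /\.(pem|key|crt|cer)$/, // private keys and certificates
  /^id_rsa$/,
  /^\.pnp\..+$/, // .pnp.*
  /\.(min|bundle)\.js$/, // *.min.js, *.bundle.js
  /^\.DS_Store$/,
];

function isIgnoredPath(relativePath: string): boolean {
  const segments = relativePath.split(/[\\/]+/).filter(Boolean);
  const fileName = segments[segments.length - 1] ?? "";
  return (
    segments.some((segment) => IGNORED_DIRS.has(segment)) ||
    IGNORED_FILE_PATTERNS.some((pattern) => pattern.test(fileName))
  );
}

for (const p of [".env.local", "src/index.ts", "node_modules/x/index.js", "keys/id_rsa"]) {
  console.log(p, isIgnoredPath(p));
}
```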

Resource Limits

Timeout: Configurable per operation (default: 30 seconds)
Output size: Configurable maximum per command (default: 10MB)
Memory: Configurable global limit with per-operation tracking (default: 100MB)
Token limits: Auto-pagination to stay within MCP limits
Cache: Configurable TTL to prevent memory exhaustion (default: 15 minutes)

Command Injection Prevention

No shell interpretation - commands executed directly
All arguments validated and escaped
No command chaining (;, &&, ||, |) allowed
Input sanitization with strict regex patterns
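The combination of an allowlist, direct execution without a shell, and resource limits might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's `commandValidator.ts`: `assertAllowedCommand` and `runAllowed` are hypothetical names. The 30-second timeout and 10 MB output cap mirror the defaults listed under Resource Limits, and `execFile` is Node's standard API for running a binary without shell interpretation.

```typescript
import { execFile } from "node:child_process";

// Only the three commands named in this README are permitted.
const ALLOWED_COMMANDS = new Set(["rg", "find", "ls"]);

function assertAllowedCommand(command: string): void {
  if (!ALLOWED_COMMANDS.has(command)) {
    throw new Error(`Command not allowed: ${command}`);
  }
}

// Run an allowlisted command with an argument array. execFile does not spawn
// a shell, so characters such as ; && || | in arguments are passed through
// as literal text instead of being interpreted as command chaining.
function runAllowed(command: string, args: string[]): Promise<string> {
  assertAllowedCommand(command);
  return new Promise((resolve, reject) => {
    execFile(
      command,
      args,
      { timeout: 30_000, maxBuffer: 10 * 1024 * 1024 }, // 30 s, 10 MB
      (error, stdout) => {
        if (error) reject(error);
        else resolve(stdout);
      },
    );
  });
}

// Example usage (not executed here):
// runAllowed("ls", ["-la", "."]).then(console.log);
```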

⚙️ Configuration

Environment Variables

You can customize the behavior with these environment variables:

WORKSPACE_ROOT (recommended): The root directory for all file operations. If unset, it defaults to the current working directory.
DEBUG (optional): Set to "true" to enable detailed logging for troubleshooting
CACHE_TTL (optional): How long to cache results in seconds (default: 900 = 15 minutes)
MEMORY_LIMIT (optional): Maximum memory usage in megabytes (default: 100)

MCP Configuration

Claude Desktop (macOS/Windows)

Edit your Claude Desktop configuration file. On macOS it is at ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows it is at %APPDATA%\Claude\claude_desktop_config.json.

Add the local-explorer MCP server with your project path as the WORKSPACE_ROOT.
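As a sketch of what that entry could contain, the snippet below builds the `mcpServers` object from the command, args, and environment variable listed under Installation, then prints it as JSON. The project path is a placeholder, and the server key `local-explorer` is simply the name used in this section.

```typescript
// Placeholder project path; replace with your own directory.
const workspaceRoot = "/path/to/your/project";

const config = {
  mcpServers: {
    "local-explorer": {
      command: "npx",
      args: ["local-explorer-mcp@latest"],
      env: { WORKSPACE_ROOT: workspaceRoot },
    },
  },
};

// Print JSON that can be merged into claude_desktop_config.json.
console.log(JSON.stringify(config, null, 2));
```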

Claude Code

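For a project-scoped setup, add the server to .mcp.json in your project root, or use the claude mcp add command shown under Installation. Set WORKSPACE_ROOT to your project directory; if your client expands ${workspaceFolder}, you can use it to pick up the current project directory automatically.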

Cursor

Edit your Cursor settings and add the MCP server configuration. Like Claude Code, you can use ${workspaceFolder} for the current project directory.

⚡ Performance & Optimization

Token Optimization Strategies

Start with discovery - Use filesOnly mode to find files before reading content
Use pattern matching - Extract specific sections instead of full files
Enable minification - Significant token reduction for code files
Bulk operations - Process multiple queries in parallel
Smart pagination - Automatic chunking of large results

Caching

Configurable TTL for repeated operations (default: 15 minutes)
Memory-efficient - LRU eviction when limit reached
Automatic - No configuration needed
Cache keys include all parameters for accuracy
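A compact sketch of a cache with these properties is shown below. It is not the project's `cache.ts`: `TtlLruCache` is a hypothetical class, and it evicts by entry count for simplicity, whereas the README describes eviction driven by a memory limit. The injectable clock exists only to make the behavior easy to test.

```typescript
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  // Build a key from the tool name and ALL parameters, sorted so that
  // property order does not produce different keys for the same query.
  static key(tool: string, params: Record<string, unknown>): string {
    const sorted = Object.keys(params).sort().map((k) => [k, params[k]]);
    return `${tool}:${JSON.stringify(sorted)}`;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark as most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // The first key in the Map is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

// 15-minute TTL, matching the README default; capacity of 100 is arbitrary.
const cache = new TtlLruCache<string>(15 * 60 * 1000, 100);
const cacheKey = TtlLruCache.key("local_ripgrep", { pattern: "TODO", path: "src" });
cache.set(cacheKey, "cached result");
console.log(cache.get(cacheKey));
```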

Memory Management

Global limit: Configurable total across all operations (default: 100MB)
Per-operation tracking: Each query monitored independently
Automatic cleanup: Old results freed when limit approached
Graceful degradation: Pagination kicks in for large results

🛠️ Development

Setup

    # Clone repository
    git clone https://github.com/yourusername/local-explorer-mcp.git
    cd local-explorer-mcp

    # Install dependencies
    npm install
    # or
    yarn install

    # Build
    npm run build

    # Run tests
    npm test

    # Watch mode
    npm run test:watch

    # Coverage report
    npm run test:coverage

Project Structure

    local-explorer-mcp/
    ├── src/
    │   ├── commands/              # Unix command builders
    │   │   ├── BaseCommandBuilder.ts
    │   │   ├── FindCommandBuilder.ts
    │   │   ├── LsCommandBuilder.ts
    │   │   └── RipgrepCommandBuilder.ts
    │   ├── tools/                 # MCP tool implementations
    │   │   ├── local_fetch_content.ts
    │   │   ├── local_find_files.ts
    │   │   ├── local_ripgrep.ts
    │   │   ├── local_view_structure.ts
    │   │   └── toolsManager.ts
    │   ├── security/              # Security validation layers
    │   │   ├── commandValidator.ts
    │   │   ├── pathValidator.ts
    │   │   ├── ignoredPathFilter.ts
    │   │   └── securityConstants.ts
    │   ├── utils/                 # Utility functions
    │   │   ├── bulkOperations.ts
    │   │   ├── cache.ts
    │   │   ├── pagination.ts
    │   │   ├── minifier.ts
    │   │   └── memoryManager.ts
    │   ├── scheme/                # Zod validation schemas
    │   ├── types.ts               # TypeScript type definitions
    │   ├── constants.ts           # Configuration constants
    │   └── index.ts               # MCP server entry point
    ├── tests/                     # Test files
    ├── dist/                      # Compiled output
    ├── rollup.config.js           # Build configuration
    ├── tsconfig.json              # TypeScript config
    ├── package.json
    └── README.md

Scripts

    # Development
    npm run build:dev       # Build without obfuscation
    npm run build:watch     # Watch mode

    # Testing
    npm test                # Run all tests
    npm run test:watch      # Watch mode
    npm run test:coverage   # Coverage report
    npm run test:ui         # Visual test UI

    # Code Quality
    npm run lint            # Check linting
    npm run lint:fix        # Fix linting issues
    npm run format          # Format code
    npm run format:check    # Check formatting

    # Debugging
    npm run debug           # Run with MCP inspector

Testing

Comprehensive test suite includes:

Unit tests: Individual tool and utility functions
Integration tests: End-to-end tool workflows
Security tests: Command injection, path traversal attacks
Performance tests: Token usage, pagination, caching
Edge cases: Symlinks, large files, special characters

    # Run specific test file
    npx vitest run tests/tools/local_ripgrep.test.ts

    # Run tests matching pattern
    npx vitest run -t "ripgrep"

    # Watch mode for TDD
    npm run test:watch

Contributing

We welcome contributions! Please see our Contributing Guidelines.

Quick Start

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes
Add tests for new functionality
Run tests: npm test
Run linter: npm run lint:fix
Commit: git commit -m 'Add amazing feature'
Push: git push origin feature/amazing-feature
Open a Pull Request

Development Guidelines

Write tests for all new features
Follow TypeScript strict mode
Use meaningful variable and function names
Add JSDoc comments for public APIs
Keep functions small and focused
Validate all user inputs with Zod schemas
Follow security best practices

🐛 Troubleshooting

Common Issues

"ripgrep not found"

Problem: local_ripgrep tool fails with "ripgrep not found" error.

Solution: Install ripgrep:

    # macOS
    brew install ripgrep

    # Ubuntu/Debian
    apt-get install ripgrep

    # Windows
    choco install ripgrep

    # Or, with a Rust toolchain installed
    cargo install ripgrep

"Permission denied" errors

Problem: Operations fail with permission errors.

Solution: Check WORKSPACE_ROOT permissions and ensure the directory is accessible:

    # Check permissions
    ls -la /path/to/workspace

    # Fix permissions if needed
    chmod 755 /path/to/workspace

"Path outside workspace" errors

Problem: Tool rejects valid paths with "outside workspace" error.

Solution:

Ensure WORKSPACE_ROOT is set correctly
Use absolute paths or paths relative to workspace root
Check for symlinks pointing outside workspace

MCP server not starting

Problem: Server fails to start in Claude Desktop/Code.

Solution:

Check Node.js version: node --version (must be >=18.0.0)
Clear npm cache: npm cache clean --force
Reinstall: npm install -g local-explorer-mcp@latest
Check MCP configuration syntax in settings
View logs in Claude Desktop: Help → Show Logs

"Token limit exceeded" warnings

Problem: Responses are truncated with token warnings.

Solution:

Use filesOnly: true for discovery first
Enable pagination with charLength and charOffset
Use matchString instead of fullContent for large files
Lower the maxMatchesPerFile limit in local_ripgrep so each file contributes fewer matches

Slow performance

Problem: Operations take longer than expected.

Solution:

Use more specific search patterns
Limit search scope with path parameter
Exclude large directories with excludeDir
Use depth: 1 for shallow directory scans
Enable caching (on by default)

Debug Mode

Enable debug logging for troubleshooting by setting the DEBUG environment variable to "true" in your MCP server configuration.

Debug logs show:

Command execution details
Path resolution steps
Security validation results
Performance metrics
Cache hit/miss statistics

Getting Help

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: Full Docs

📊 Feature Comparison

| Feature | local-explorer | Standard File Reading | Improvement |
|---------|----------------|-----------------------|-------------|
| Search speed | Fast (native commands) | Slower (sequential reads) | Significantly faster |
| Token usage | Optimized | Higher | Token reduction |
| Parallel operations | ✅ Yes | ❌ Sequential | Much faster |
| Pattern matching | ✅ Regex + context | ❌ Manual | Native support |
| Security validation | ✅ Multi-layer | ⚠️ Basic | Enterprise-grade |
| Token optimization | ✅ Auto | ❌ Manual | Automatic |
| Large file handling | ✅ Pagination | ❌ Memory issues | Reliable |
| Caching | ✅ Configurable TTL | ❌ None | Faster repeat |

🎯 Use Cases

For AI Assistants

Codebase exploration and understanding
Finding function/class definitions and usages
Locating configuration files
Analyzing project structure
Code refactoring research
Bug investigation
Documentation generation
Security auditing

For Developers

Fast code search without IDE
CI/CD pipeline integration
Automated code analysis
Project scaffolding tools
Documentation generators
Code quality tools
Migration helpers

Extensibility

Build Your Own MCP Server! These tools and patterns can be reused for:

Custom file system exploration workflows
Domain-specific search tools
Automated analysis pipelines
Integration with other MCP servers
Any use case requiring intelligent local file system access

📄 License

MIT License - see LICENSE.md for details.

🙏 Acknowledgments

Inspired by octocode-mcp research methodologies
Built on Model Context Protocol by Anthropic
Powered by ripgrep for blazing-fast search

🔗 Links

npm: local-explorer-mcp

Made with ❤️ by the Octocode Team

Built for AI assistants, optimized for developers.
