README.md•14.7 kB
# MCP File Browser Server
A secure, high-performance Model Context Protocol (MCP) server that enables Large Language Models to safely browse and interact with local file systems. Built with security-first architecture and comprehensive access controls.
[](https://modelcontextprotocol.io)
[](./.specify/memory/constitution.md)
[](#status)
## 🚀 Features
### Core Capabilities
- **📁 Directory Browsing**: List directories with comprehensive metadata (size, permissions, modification date, type)
- **📄 File Reading**: Stream large files with encoding detection and size validation
- **🔍 File Search**: Search by name, extension, and content patterns
- **🔗 Symbolic Link Support**: Intelligent handling of symbolic links with clear metadata marking
- **⚡ High Performance**: Handles directories with 10,000+ files in under 2 seconds
### Security & Access Control
- **🛡️ Path Validation**: Prevents directory traversal attacks with canonical path resolution
- **🔐 Permission-Based Access**: Granular permission levels (read-only, read-write, admin)
- **📋 Allowlist/Denylist**: Configurable directory access controls
- **📊 Audit Logging**: Comprehensive logging of all file access attempts
- **🚫 System Protection**: Never exposes sensitive directories (/etc, /root, /home) by default
### Performance & Reliability
- **💾 Intelligent Caching**: Configurable caching for frequently accessed directories and metadata
- **🌊 Streaming Support**: Memory-efficient handling of large files
- **⚙️ Resource Management**: Configurable limits for memory usage, file size, and directory depth
- **📈 Monitoring**: Health checks and performance metrics
- **🔄 Concurrent Operations**: Support for multiple simultaneous LLM clients
## 📋 Requirements
### System Requirements
- **Operating System**: Linux, macOS, or Windows
- **Memory**: Minimum 100MB RAM (configurable)
- **Storage**: Minimal disk space for server and cache
- **Permissions**: Standard file system access permissions
### Dependencies
- MCP Protocol compliance
- File system access capabilities
- Logging and monitoring support
## 🛠️ Installation
### Prerequisites
```bash
# Ensure you have the necessary development tools
# (Installation instructions will be added as development progresses)
```
### Quick Start
```bash
# Clone the repository
git clone https://github.com/ntufar/mcp.git
cd mcp
# Install dependencies
npm install
# Configure the server
# (see docs/deployment.md for production configs, or use defaults for examples)
# Start the server
# Development (ts-node)
npm run dev
# Or build and run
npm run build && npm start
```
### LLM Integration Examples
You can try end-to-end examples with real LLM clients:
- Claude Desktop: see `examples/llm-integration/claude-desktop.md`
- OpenAI GPT-4: see `examples/llm-integration/openai-gpt.md`
- Local LLM (Ollama): see `examples/llm-integration/ollama-local.md`
Helper scripts:
```bash
# Run a complete demo that exercises the server end-to-end
npm run example:complete
# Run example integration tests (mock client over stdio)
npm run example:test
```
## ⚙️ Configuration
### Basic Configuration
```json
{
"server": {
"name": "mcp-file-browser",
"version": "1.0.0"
},
"security": {
"allowedPaths": ["/home/user/documents", "/home/user/projects"],
"deniedPaths": ["/etc", "/root", "/sys"],
"maxFileSize": "100MB",
"maxDirectoryDepth": 10
},
"performance": {
"cacheSize": "50MB",
"cacheTTL": 300,
"maxConcurrentOperations": 50
}
}
```
### Security Settings
- **allowedPaths**: Directories the server can access
- **deniedPaths**: Directories explicitly blocked
- **maxFileSize**: Maximum file size for operations
- **maxDirectoryDepth**: Maximum directory traversal depth
## 🔧 Usage
### MCP Protocol Integration
The server exposes the following MCP tools and resources:
#### Tools
- `list_directory` - List directory contents with metadata
- `read_file` - Read file contents with streaming support
- `search_files` - Search files by name, extension, or content
- `get_file_metadata` - Retrieve detailed file information
- `check_permissions` - Verify file/directory access permissions
#### Resources
- Directory listings with structured metadata
- File contents with encoding detection
- Search results with relevance scoring
- System health and performance metrics
### Example Usage
```javascript
// List directory contents
const directoryListing = await mcpClient.callTool('list_directory', {
path: '/home/user/projects',
includeHidden: false,
maxDepth: 2
});
// Read file with streaming
const fileContent = await mcpClient.callTool('read_file', {
path: '/home/user/documents/example.txt',
encoding: 'utf-8',
maxSize: '10MB'
});
// Search files
const searchResults = await mcpClient.callTool('search_files', {
query: 'configuration',
searchPath: '/home/user',
fileTypes: ['.json', '.yaml', '.conf']
});
```
## 🏗️ Architecture
### Core Components
- **Security Layer**: Path validation, permission checking, audit logging
- **File System Interface**: Abstraction layer for file operations
- **Caching Engine**: Intelligent metadata and content caching
- **MCP Protocol Handler**: Standard MCP tools and resources
- **Monitoring System**: Health checks and performance metrics
### Data Flow
```
LLM Client → MCP Protocol → Security Validation → File System → Cache → Response
```
### Architecture Diagram
```mermaid
flowchart LR
subgraph Client
A["LLM Client (Claude/GPT/Gemini/Ollama)"]
end
subgraph Protocol
B["MCP Protocol (stdio/jsonrpc)"]
end
subgraph Server[MCP File Browser Server]
direction TB
S1["MCPServer (Tools Router)"] --> MW["ErrorHandler Middleware"]
subgraph Services
direction LR
PV[PathValidationService]
PM[PermissionService]
DL[DirectoryService]
FL[FileService]
FS[FileSearchService]
MD[MetadataService]
RS[ResponseService]
VL[ValidationService]
end
MW --> PV
PV --> PM
PM --> DL
PM --> FL
PM --> FS
PM --> MD
DL --> RS
FL --> RS
FS --> RS
MD --> RS
subgraph Perf[Performance]
C[CacheService]
CI["CacheInvalidation (chokidar)"]
HM[HealthMonitoring]
RM[ResourceManager]
end
DL <--> C
FL <--> C
FS <--> C
MD <--> C
CI --> C
HM --> S1
RM --> S1
subgraph Security[Security & Audit]
AL[AuditLoggingService]
end
S1 --> AL
DL --> AL
FL --> AL
FS --> AL
MD --> AL
end
subgraph OS[Operating System]
FSYS[(File System)]
end
A --> B --> S1
DL --> FSYS
FL --> FSYS
FS --> FSYS
MD --> FSYS
```
## 🧪 Testing
### Comprehensive Testing Strategy
- **Test-Driven Development (TDD)**: Tests written before implementation for all file system operations
- **90%+ Code Coverage**: Mandatory coverage for file system operation modules
- **Mock File Systems**: Deterministic unit tests using mock file systems
- **Real LLM Integration**: Testing with actual LLM clients (Claude, GPT-4, Gemini, Qwen, OpenCode)
### Test Coverage
- **Unit Tests**: Individual component testing with mock file systems
- **Integration Tests**: End-to-end MCP protocol testing
- **Security Tests**: Path traversal, permission boundary testing, attack prevention
- **Performance Tests**: Large directory and file handling with specific benchmarks
- **Concurrent Tests**: Multi-client operation testing
- **Contract Tests**: API contract validation with OpenAPI specification
- **LLM Integration Tests**: Real LLM client workflow validation
### LLM Client Testing
- **Universal Compatibility**: Testing across all major LLM platforms
- **Workflow Validation**: Multi-step file analysis tasks with LLMs
- **Error Handling**: LLM error recovery and user experience validation
- **Performance Under Load**: Multiple LLM clients with concurrent operations
- **User Experience**: Response quality and task completion efficiency
### Running Tests
```bash
# Run all tests
npm test
# Run specific test suites
npm run test:unit
npm run test:integration
npm run test:security
npm run test:performance
npm run test:contract
npm run test:llm
# Run LLM integration tests
npm run test:llm:claude
npm run test:llm:gpt4
npm run test:llm:gemini
npm run test:llm:qwen
```
### Deployment
For production deployment (systemd/launchd, config hardening, security best practices), see:
- `docs/deployment.md`
## 📊 Performance Benchmarks
### Directory Operations
- **Small directories** (< 100 files): < 100ms
- **Medium directories** (100-1,000 files): < 500ms
- **Large directories** (1,000-10,000 files): < 2 seconds
- **Massive directories** (> 10,000 files): < 5 seconds
### File Operations
- **File metadata retrieval**: < 50ms
- **Small file reading** (< 1MB): < 100ms
- **Large file streaming** (> 100MB): Streaming with < 10MB memory usage
### Resource Usage
- **Memory footprint**: < 100MB for normal operations
- **Concurrent clients**: 50+ simultaneous operations
- **Cache efficiency**: 90%+ hit rate for frequently accessed data
## 🔒 Security Considerations
### Built-in Security Features
- **Path Traversal Protection**: Canonical path resolution prevents `../` attacks
- **Permission Respect**: Never bypasses operating system security
- **Audit Trail**: Comprehensive logging of all operations
- **Resource Limits**: Prevents resource exhaustion attacks
- **Input Validation**: Strict validation of all user inputs
### Security Best Practices
1. **Configure allowlists** for production deployments
2. **Monitor audit logs** for suspicious access patterns
3. **Regular security updates** and vulnerability scanning
4. **Principle of least privilege** for file system access
5. **Encryption at rest** for sensitive configuration data
## 🤝 Contributing
### Development Guidelines
We follow strict development principles outlined in our [Constitution](./.specify/memory/constitution.md):
- **Security-First Architecture**: All changes must enhance security
- **Test-Driven Development**: TDD mandatory for all file operations
- **Performance Standards**: Maintain established benchmarks
- **MCP Protocol Compliance**: Full adherence to MCP standards
### Getting Started
1. Read the [Constitution](./.specify/memory/constitution.md)
2. Review the [Specification](./specs/002-plan-implementation-for/spec.md) (64 requirements)
3. Study the [Implementation Tasks](./specs/002-plan-implementation-for/tasks.md) (T001-T064)
4. Follow the [Implementation Plan](./specs/002-plan-implementation-for/plan.md) with TDD approach
5. Set up development environment
6. Execute tasks in order: Setup → Tests → Implementation → Integration → Polish
7. Run comprehensive tests to ensure everything works
8. Submit a pull request with security review
### Task Execution Strategy
- **Testing-First**: Complete Phase 3.2 (16 test tasks) before any implementation
- **Parallel Execution**: 30+ tasks can run simultaneously (marked with [P])
- **Constitutional Compliance**: Follow security-first, TDD, and performance principles
- **LLM Integration**: Include real LLM client testing throughout development
### Code Quality Requirements
- **90%+ test coverage** for file system operations (constitutional requirement)
- **Test-Driven Development (TDD)** mandatory for all file operations
- **LLM integration testing** with real LLM clients
- **Security-focused code reviews** for all file access code
- **Performance benchmarks** maintained with each release
- **Comprehensive documentation** for all public APIs
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support
### Documentation
- [Constitution](./.specify/memory/constitution.md) - Project governance and principles
- [Specification](./specs/002-plan-implementation-for/spec.md) - Detailed feature requirements (64 requirements)
- [Implementation Plan](./specs/002-plan-implementation-for/plan.md) - Comprehensive planning and LLM testing strategy
- [Implementation Tasks](./specs/002-plan-implementation-for/tasks.md) - 64 detailed implementation tasks (T001-T064)
- [Data Model](./specs/002-plan-implementation-for/data-model.md) - Complete entity definitions and relationships
- [API Contracts](./specs/002-plan-implementation-for/contracts/) - OpenAPI specification and contract tests
- [Quickstart Guide](./specs/002-plan-implementation-for/quickstart.md) - Step-by-step validation scenarios
- [API Documentation](./docs/api.md) - MCP tools and resources reference
### Getting Help
- **Issues**: Report bugs and feature requests on GitHub
- **Discussions**: Ask questions in GitHub Discussions
- **Security**: Report security issues privately via email
## 🗺️ Roadmap
### Phase 1: Foundation & Planning ✅
- [x] Project constitution and governance principles
- [x] Comprehensive specification (64 requirements)
- [x] Architecture design and data modeling
- [x] API contracts and testing strategy
- [x] LLM integration testing planning
### Phase 2: Core Implementation 🚧
- [ ] **64 Implementation Tasks Generated** (T001-T064)
- [ ] Testing infrastructure setup (TDD approach)
- [ ] Security layer implementation
- [ ] File system interface development
- [ ] MCP protocol integration
- [ ] Performance optimization
### Phase 3: Advanced Features 📋
- [ ] LLM client integration testing
- [ ] Advanced search capabilities
- [ ] Enhanced caching system
- [ ] Comprehensive test suite execution
### Phase 4: Enterprise Features 📋
- [ ] Multi-user support
- [ ] Advanced monitoring and analytics
- [ ] Enterprise security features
- [ ] Production deployment
## 📈 Status
**Current Status**: Tasks Generated - Ready for Implementation
- **Specification**: ✅ Complete (v1.0.0) - 64 requirements (16 functional + 48 testing)
- **Architecture**: ✅ Complete - Comprehensive data model and API contracts
- **Testing Strategy**: ✅ Complete - TDD approach with LLM integration testing
- **Implementation Plan**: ✅ Complete - 40-45 tasks with testing-first approach
- **Implementation Tasks**: ✅ Complete - 64 detailed tasks (T001-T064) generated
- **Core Implementation**: 🚀 Ready to Begin
- **Documentation**: 🚧 In Progress
---
**Built with ❤️ by [Nicolai Tufar](https://github.com/ntufar) for the MCP ecosystem**
*Empowering LLMs with secure, efficient file system access while maintaining the highest standards of security and performance.*