# IntelliDiff MCP Server
An intelligent file and folder comparison MCP server with advanced text normalization and duplicate detection capabilities.
## Features
- **File Comparison**: CRC32-based exact comparison and smart text comparison with normalization
- **Folder Comparison**: Recursive directory comparison with orphan detection
- **Duplicate Detection**: Find identical files within directories
- **Text Normalization**: Handle case, whitespace, tabs, line endings, and Unicode differences
- **Line-Level Analysis**: Detailed diff output with line ranges and targeted file reading
- **Clean Output**: Markdown-formatted text responses instead of JSON bloat
- **Security**: Workspace root validation prevents path traversal attacks
- **Performance**: Streaming for large files, configurable limits, symlink loop prevention
## Installation
```bash
# Clone or download the project
cd intellidiff-mcp
# Install with uv
uv init --python 3.12
uv add "fastmcp>=2.11"
# Run the server
uv run python intellidiff_server.py /path/to/workspace/root
```
## Project Structure
The server is built with a clean modular architecture:
- **`intellidiff_server.py`** (52 lines) - Main server entry point and tool registration
- **`workspace_security.py`** - Path validation and workspace boundary enforcement
- **`file_operations.py`** - Core file utilities (CRC32, text detection, normalization)
- **`tools.py`** - Individual MCP tool implementations
- **`folder_operations.py`** - Folder comparison and duplicate detection logic
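For orientation, here is a minimal sketch of how the entry point might wire these modules together with FastMCP. The module split mirrors the list above, but `register_tools` and the other details in this sketch are illustrative, not the actual code.
```python
# Illustrative wiring sketch only; the real intellidiff_server.py registers all six tools.
import sys
from fastmcp import FastMCP

from tools import register_tools  # hypothetical helper wrapping the tool implementations

mcp = FastMCP("intellidiff")

if __name__ == "__main__":
    workspace_root = sys.argv[1]          # /path/to/workspace/root passed on the command line
    register_tools(mcp, workspace_root)   # tool logic lives in tools.py / folder_operations.py
    mcp.run()                             # stdio transport by default
```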
## MCP Configuration
### Local/stdio Configuration
```json
{
  "mcpServers": {
    "intellidiff": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--directory", "/path/to/intellidiff-mcp", "python", "intellidiff_server.py", "/workspace/root"]
    }
  }
}
```
### Local/stdio Configuration with Environment Variables
```json
{
  "mcpServers": {
    "intellidiff": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--directory", "/path/to/intellidiff-mcp", "python", "intellidiff_server.py", "/workspace/root"],
      "env": {
        "INTELLIDIFF_MAX_TEXT_SIZE": "5242880",
        "INTELLIDIFF_MAX_BINARY_SIZE": "1073741824",
        "INTELLIDIFF_MAX_DEPTH": "15",
        "INTELLIDIFF_CHUNK_SIZE": "32768"
      }
    }
  }
}
```
### Remote/HTTP Configuration
```json
{
  "mcpServers": {
    "intellidiff": {
      "type": "http",
      "url": "http://localhost:8000/mcp/"
    }
  }
}
```
Place this configuration in:
- VS Code: `.vscode/mcp.json` (project) or user settings
- Claude Desktop: `claude_desktop_config.json`
- Cursor: `.cursor/mcp.json` (project) or `~/.cursor/mcp.json` (user)
- LM Studio: `~/.lmstudio/mcp.json`
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `INTELLIDIFF_MAX_TEXT_SIZE` | 10485760 (10MB) | Maximum size for text file comparison |
| `INTELLIDIFF_MAX_BINARY_SIZE` | 1073741824 (1GB) | Maximum size for binary file CRC32 |
| `INTELLIDIFF_MAX_DEPTH` | 10 | Maximum directory recursion depth |
| `INTELLIDIFF_CHUNK_SIZE` | 65536 (64KB) | File reading chunk size |
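These limits are read from the environment at startup. A minimal sketch of how such defaults can be resolved (the constant names here are illustrative, not the actual ones in the server code):
```python
import os

# Defaults match the table above; the real constant names in the server may differ.
MAX_TEXT_SIZE   = int(os.environ.get("INTELLIDIFF_MAX_TEXT_SIZE", 10 * 1024 * 1024))  # 10 MB
MAX_BINARY_SIZE = int(os.environ.get("INTELLIDIFF_MAX_BINARY_SIZE", 1024 ** 3))       # 1 GB
MAX_DEPTH       = int(os.environ.get("INTELLIDIFF_MAX_DEPTH", 10))
CHUNK_SIZE      = int(os.environ.get("INTELLIDIFF_CHUNK_SIZE", 64 * 1024))            # 64 KB
```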
## Tools
### `validate_workspace_path`
Validate that a path is within the workspace root.
**Parameters:**
- `path` (string): Path to validate
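Called the same way as the other tools through the FastMCP client used in the usage examples below; the commented output is illustrative.
```python
# Hypothetical call; the exact response text depends on the server's formatting.
result = await client.call_tool("validate_workspace_path", {
    "path": "../outside/secret.txt"
})
print(result.content[0].text)  # e.g. reports that the path escapes the workspace root
```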
### `get_file_hash`
Get CRC32 hash and basic information about a file.
**Parameters:**
- `file_path` (string): Path to the file
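A quick call using the same client pattern; the described output fields are illustrative.
```python
# Hypothetical call; the field names in the response are illustrative.
result = await client.call_tool("get_file_hash", {
    "file_path": "file1.txt"
})
print(result.content[0].text)  # e.g. CRC32, size, and text/binary detection for file1.txt
```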
### `compare_files`
Compare two files with various modes and options.
**Parameters:**
- `left_path` (string): Path to first file
- `right_path` (string): Path to second file
- `mode` (string): Comparison mode - "exact", "smart_text", or "binary"
- `ignore_blank_lines` (boolean): Skip empty lines during comparison
- `ignore_newline_differences` (boolean): Normalize line endings
- `ignore_whitespace` (boolean): Ignore leading/trailing whitespace
- `ignore_case` (boolean): Case-insensitive comparison
- `normalize_tabs` (boolean): Convert tabs to spaces
- `unicode_normalize` (boolean): Apply Unicode NFKC normalization
### `compare_folders`
Compare two folder structures recursively.
**Parameters:**
- `left_path` (string): Path to first folder
- `right_path` (string): Path to second folder
- `max_depth` (integer): Maximum recursion depth (default: from env var)
- `include_binary` (boolean): Include binary files in comparison
- `comparison_mode` (string): "exact" or "smart_text"
### `find_identical_files`
Find files with identical content within a folder.
**Parameters:**
- `folder_path` (string): Path to folder to scan
- `max_depth` (integer): Maximum recursion depth (default: from env var)
### `read_file_lines`
Read specific line ranges from a text file with optional context.
**Parameters:**
- `file_path` (string): Path to the text file
- `start_line` (integer): Starting line number (1-based, default: 1)
- `end_line` (integer): Ending line number (1-based, default: end of file)
- `context_lines` (integer): Additional context lines before/after range (default: 0)
## Usage Examples
### Compare Two Files
```python
# Exact comparison - clean markdown output
result = await client.call_tool("compare_files", {
    "left_path": "file1.txt",
    "right_path": "file2.txt",
    "mode": "exact"
})
print(result.content[0].text)
# Output: ✅ **Exact Comparison**
# 📁 Left: file1.txt (CRC32: abc123)
# 📁 Right: file2.txt (CRC32: abc123)
# 🔍 Result: Identical
# Smart text comparison with normalization
result = await client.call_tool("compare_files", {
    "left_path": "file1.txt",
    "right_path": "file2.txt",
    "mode": "smart_text",
    "ignore_case": True,
    "ignore_whitespace": True,
    "normalize_tabs": True
})
print(result.content[0].text)
# Output: ✅ **Smart Text Comparison - Identical**
# 📁 Left: file1.txt (1.2KB)
# 📁 Right: file2.txt (1.3KB)
# 🔍 Result: Identical (normalized: case, whitespace, tabs)
```
### Compare Folders
```python
result = await client.call_tool("compare_folders", {
    "left_path": "folder_a",
    "right_path": "folder_b",
    "max_depth": 5
})
# Folder comparison returns structured data for programmatic access
summary = result.data["summary"]
orphans = result.data["orphans"]
identical_files = result.data["identical_files"]
```
### Find Duplicates
```python
result = await client.call_tool("find_identical_files", {
    "folder_path": "my_folder",
    "max_depth": 10
})
# Duplicate detection returns structured data for analysis
duplicates = result.data["duplicates"]
wasted_bytes = result.data["summary"]["total_wasted_bytes"]
```
### Read Specific Lines
```python
# Read lines 10-20 with 2 lines of context
result = await client.call_tool("read_file_lines", {
    "file_path": "my_file.txt",
    "start_line": 10,
    "end_line": 20,
    "context_lines": 2
})
# Clean line-numbered output with >>> markers for requested range
print(result.content[0].text)
# Output: 8| function setup() {
# 9| console.log("Starting...");
# >>> 10| const data = loadData();
# >>> 11| if (!data) {
# >>> 12| throw new Error("No data");
# >>> 13| }
# 14| }
```
### Working with Diff Results
```python
# Compare files and get detailed diff information
result = await client.call_tool("compare_files", {
    "left_path": "file1.txt",
    "right_path": "file2.txt",
    "mode": "smart_text"
})
# Access structured diff data
if not result.structured_content["identical"]:
    change_summary = result.structured_content["change_summary"]

    # Get affected line ranges
    left_ranges = change_summary["line_ranges"]["left_affected"]
    right_ranges = change_summary["line_ranges"]["right_affected"]

    # Read specific sections that changed
    for range_info in left_ranges:
        lines_result = await client.call_tool("read_file_lines", {
            "file_path": "file1.txt",
            "start_line": range_info["start"],
            "end_line": range_info["end"],
            "context_lines": 3
        })
        print(f"Changed section: {lines_result.content[0].text}")
```
## Security
- All file paths are validated against the workspace root
- Path traversal attacks are prevented through path resolution
- Symlink loops are detected and avoided
- File size limits prevent memory exhaustion
- All operations are read-only
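Conceptually, the boundary check resolves the candidate path (following symlinks and `..` segments) and requires the result to stay under the workspace root. A minimal sketch of that idea, not the exact code in `workspace_security.py`:
```python
from pathlib import Path

def is_within_workspace(candidate: str, workspace_root: Path) -> bool:
    """Resolve symlinks and '..' segments, then require the path to stay under the root."""
    resolved = Path(candidate).resolve()
    try:
        resolved.relative_to(workspace_root.resolve())
        return True
    except ValueError:
        return False
```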
## Performance
- Streaming I/O for large files
- Early exit on size mismatches
- CRC32 caching for repeated operations
- Configurable chunk sizes and limits
- Progress reporting for large operations
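The streaming behaviour amounts to hashing files in fixed-size chunks rather than loading them whole. A minimal sketch of chunked CRC32 using the configurable chunk size, not the exact code in `file_operations.py`:
```python
import zlib
from pathlib import Path

def crc32_of_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Compute CRC32 incrementally so large files never sit in memory all at once."""
    crc = 0
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"
```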
## License
MIT License - see LICENSE file for details.