GDAL MCP

0022-workspace-scoping-and-access-control.md•11.5 KiB

# ADR-0022: Workspace Scoping and Access Control **Status**: Accepted **Date**: 2025-09-30 **Deciders**: Jordan Godau **Tags**: #security #access-control #mcp #workspaces ## Context MCP servers often run with elevated privileges and can access any file on the system. This poses security risks: 1. **Unintended data access**: Analysts may have sensitive data in various directories 2. **Path traversal**: Malicious or accidental access to system files 3. **Multi-tenant concerns**: Different users/projects should have isolated data access 4. **Compliance**: Data governance may require strict access boundaries **Current state**: GDAL MCP tools accept any file path without validation. A tool could potentially access: - `/etc/passwd` (system files) - `/home/other_user/private/` (other users' data) - Any path the server process can read **MCP specification** defines a "roots" capability where clients can declare filesystem boundaries, but: - Clients expose roots to servers (optional) - Servers must respect and enforce boundaries - Defense in depth requires server-side validation ## Decision **Implement workspace scoping** with configurable allowed directories: ### 1. **Configuration Sources** (Priority Order) **Priority 1**: Environment variable (highest) ```bash GDAL_MCP_WORKSPACES="/data/projects:/home/user/gis:/mnt/shared" ``` **Priority 2**: Config file `gdal-mcp.yaml` (if exists in CWD or `~/.config/gdal-mcp/`) ```yaml workspaces: - /data/projects - /home/user/gis - /mnt/shared/geospatial ``` **Priority 3**: No configuration = allow all (backward compatible, but warn in logs) ### 2. **Path Validation Rules** ```python def validate_path(path: str, workspaces: list[Path]) -> Path: """Validate and resolve path against allowed workspaces. Args: path: User-provided path (relative or absolute) workspaces: List of allowed workspace directories Returns: Resolved absolute path Raises: ToolError: If path is outside allowed workspaces """ # Resolve to absolute path (handles .., symlinks) resolved = Path(path).resolve() # If no workspaces configured, allow all (backward compatible) if not workspaces: return resolved # Check if path is within any allowed workspace for workspace in workspaces: try: resolved.relative_to(workspace) return resolved # ✓ Path is allowed except ValueError: continue # Path is outside all workspaces - DENY raise ToolError( f"Access denied: '{path}' is outside allowed workspaces. " f"Allowed directories: {', '.join(str(w) for w in workspaces)}" ) ``` ### 3. **Integration Points** **Every tool that accesses files must validate paths:** ```python # raster.info async def _info(uri: str, ctx: Context | None = None) -> Info: # Validate path before opening validated_path = validate_path(uri, get_workspaces()) with rasterio.open(str(validated_path)) as ds: ... ``` **Validation happens in core logic functions** (not MCP wrappers) to ensure: - Direct function calls (tests) are also validated - Consistent security regardless of entry point ### 4. **MCP Roots Capability** (Optional Future Enhancement) FastMCP may expose `roots` capability to declare workspaces to clients: ```python # Future: Expose workspaces as MCP roots mcp.add_roots([ Root(uri=f"file://{ws}", name=ws.name) for ws in get_workspaces() ]) ``` Benefits: - Clients can show workspace boundaries in UI - Better UX (users know where they can access data) - Standards-compliant MCP implementation ## Rationale ### Why Environment Variables? **Pros**: - ✅ Simple deployment (no config files to manage) - ✅ Works with Docker, systemd, MCP client configs - ✅ Easy to override per-instance - ✅ Standard practice (12-factor apps) **Cons**: - ⚠️ Limited to simple list (no per-workspace metadata) - ⚠️ Not ideal for complex configs ### Why Config File? **Pros**: - ✅ Structured configuration (YAML) - ✅ Can add per-workspace metadata (read-only, labels, etc.) - ✅ Version-controllable **Cons**: - ⚠️ Requires file management - ⚠️ Harder to override per-instance ### Why Both? **Best of both worlds**: - Env var for simple/production deployments - Config file for complex scenarios - Env var takes precedence (easy override) ### Why "No Config = Allow All"? **Backward compatibility**: - ✅ Existing deployments continue working - ✅ Development/testing doesn't require setup - ✅ Users can opt-in to security **But**: - ⚠️ Log clear WARNING when running unrestricted - ⚠️ Documentation emphasizes production should configure workspaces ### Why Validate in Core Functions? **Defense in depth**: ```python # Validation happens here (core logic) async def _info(uri: str, ctx: Context | None = None): validated_path = validate_path(uri, get_workspaces()) ... # Not here (MCP wrapper) async def info(uri: str, ctx: Context | None = None): return await _info(uri, ctx) # Already validated ``` **Benefits**: - ✅ Tests also enforce security - ✅ Direct function calls are secured - ✅ No way to bypass validation ## Implementation Pattern ### Configuration Loader ```python # src/config.py from pathlib import Path import os import yaml def get_workspaces() -> list[Path]: """Load allowed workspace directories from config. Priority: 1. GDAL_MCP_WORKSPACES env var (colon-separated) 2. gdal-mcp.yaml config file 3. Empty list (allow all, with warning) """ # Priority 1: Environment variable env_workspaces = os.getenv("GDAL_MCP_WORKSPACES") if env_workspaces: return [ Path(ws.strip()).resolve() for ws in env_workspaces.split(":") if ws.strip() ] # Priority 2: Config file config_paths = [ Path.cwd() / "gdal-mcp.yaml", Path.home() / ".config/gdal-mcp/config.yaml", ] for config_path in config_paths: if config_path.exists(): with open(config_path) as f: config = yaml.safe_load(f) if "workspaces" in config: return [ Path(ws).resolve() for ws in config["workspaces"] ] # Priority 3: No config - allow all (with warning) import logging logging.warning( "No workspace configuration found. All paths allowed. " "For production, set GDAL_MCP_WORKSPACES or create gdal-mcp.yaml" ) return [] ``` ### Path Validator ```python # src/validation.py from pathlib import Path from fastmcp.exceptions import ToolError from src.config import get_workspaces def validate_path(path: str, workspaces: list[Path] | None = None) -> Path: """Validate path against allowed workspaces.""" if workspaces is None: workspaces = get_workspaces() resolved = Path(path).resolve() # No restrictions if no workspaces configured if not workspaces: return resolved # Check each workspace for workspace in workspaces: try: resolved.relative_to(workspace) return resolved # ✓ Allowed except ValueError: continue # ✗ Denied raise ToolError( f"Access denied: '{path}' is outside allowed workspaces.\n" f"Allowed directories:\n" + "\n".join(f" - {ws}" for ws in workspaces) + "\n\nConfigure workspaces via GDAL_MCP_WORKSPACES environment variable." ) ``` ### Tool Integration ```python # src/tools/raster/info.py from src.middleware import validate_path async def _info(uri: str, band: int | None = None, ctx: Context | None = None): # Validate before any file access validated_path = validate_path(uri) if ctx: await ctx.info(f"📂 Opening raster: {validated_path}") with rasterio.open(str(validated_path)) as ds: ... ``` ## Security Properties ### Threat Model | Threat | Mitigation | |--------|-----------| | **Path traversal** | Resolve symbolic links, normalize `..` | | **Absolute paths** | Validate against workspace roots | | **Relative paths** | Resolve to absolute before validation | | **Symlinks** | `.resolve()` follows links to real path | | **Race conditions** | Path validated at function entry | ### Attack Scenarios **Scenario 1**: Attacker tries `../../etc/passwd` ```python validate_path("../../etc/passwd", [Path("/data/projects")]) # Resolves to: /etc/passwd # Check: /etc/passwd relative to /data/projects → ValueError # Result: ToolError (Access denied) ``` **Scenario 2**: Attacker uses absolute path ```python validate_path("/home/victim/private.tif", [Path("/data/projects")]) # Already absolute: /home/victim/private.tif # Check: Not under /data/projects → ValueError # Result: ToolError (Access denied) ``` **Scenario 3**: Attacker uses symlink ```python # /data/projects/link -> /etc/passwd validate_path("/data/projects/link", [Path("/data/projects")]) # Resolves to: /etc/passwd (via .resolve()) # Check: /etc/passwd relative to /data/projects → ValueError # Result: ToolError (Access denied) ``` **Scenario 4**: Legitimate access ```python validate_path("/data/projects/dem.tif", [Path("/data/projects")]) # Resolves to: /data/projects/dem.tif # Check: /data/projects/dem.tif relative to /data/projects → Success # Result: Path("/data/projects/dem.tif") ``` ## Consequences **Positive**: - ✅ **Security**: Prevent unauthorized file access - ✅ **Compliance**: Enforceable data boundaries - ✅ **Multi-tenant**: Isolate user workspaces - ✅ **Auditability**: Clear access control rules - ✅ **Backward compatible**: No config = allow all - ✅ **Flexible**: Env var or config file - ✅ **MCP-aligned**: Can expose via roots capability **Negative**: - ⚠️ **Configuration complexity**: Users must configure workspaces - ⚠️ **Error messages**: Users may be confused by access denials - ⚠️ **Performance**: Path resolution on every file access (minimal ~microseconds) **Neutral**: - Development/testing works without config (allow all) - Production deployments should always configure workspaces ## Alternatives Considered **Alternative 1**: No validation (current state) - ❌ Rejected: Security risk too high for production **Alternative 2**: Client-side only (MCP roots) - ❌ Rejected: Clients can't be trusted, server must enforce **Alternative 3**: Whitelist specific files - ❌ Rejected: Too restrictive, users work with dynamic datasets **Alternative 4**: Chroot/containerization only - ❌ Rejected: Defense in depth, not all deployments use containers **Alternative 5**: Read-only mode - ❌ Rejected: Tools need write access (convert, reproject) ## Migration Path **Phase 1** (This ADR): Implement validation with opt-in - Default: No restrictions (backward compatible) - Warning logged when unrestricted **Phase 2**: Documentation and examples - Update README with workspace configuration - Provide example configs for common scenarios - Add to QUICKSTART.md **Phase 3** (Future): Consider default restriction - Require explicit configuration for production - Default to CWD if no config (safer than allow-all) ## Related - **MCP Specification**: Roots capability - **ADR-0020**: Context-driven tool design (error messages via ToolError) - **ADR-0021**: LLM-optimized descriptions (explain access errors to LLMs) ## References - [MCP Roots Specification](https://modelcontextprotocol.io) - [OWASP Path Traversal](https://owasp.org/www-community/attacks/Path_Traversal) - [12-Factor App: Config](https://12factor.net/config)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Wayfinder-Foundry/gdal-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

0022-workspace-scoping-and-access-control.md•11.5 KiB