Kiwi MCP

kiwi-mcp
docs

TOOL_CHAIN_VALIDATION_DESIGN.md•28.2 KiB

# Tool Chain Validation Design: Package Manager Architecture **Date**: 2026-01-24 **Status**: ✅ Implemented **Goal**: Implement bun/npm-style integrity verification and parent→child validation for tool execution chains --- ## Executive Summary Transform the tool execution system into a **package-manager-style architecture** where: 1. **Only 2 primitives are hardcoded**: `subprocess` and `http_client` 2. **Everything else is data**: Runtimes, scripts, APIs are all data-driven manifests 3. **Parent tools define child validation**: Each tool specifies schemas for what children it accepts 4. **Hash verification at every step**: Tamper-resistant execution with content-addressed integrity 5. **Lockfile support**: Reproducible execution with pinned versions and integrities --- ## Current State Analysis ### What Exists | Component | Status | Location | | ------------------- | ----------------- | ------------------------------------------------------------- | | Chain resolution | ✅ Works | `ToolRegistry.resolve_chain()` → `resolve_executor_chain` RPC | | Content hash | ⚠️ Non-canonical | `sha256(str(manifest))` in `ToolRegistry.publish()` | | File hashes | ✅ Stored | `tool_version_files.sha256` | | Manifest validation | ✅ Layer 1 | `ToolValidator` in validators.py | | Runtime validation | ⚠️ Primitive only | `PrimitiveExecutor._validate_runtime_params()` | ### What's Missing | Requirement | Current State | | ------------------------------ | ----------------------- | | Hash verification at execution | ❌ No verification | | Parent→child schemas | ❌ Not implemented | | Canonical hashing | ❌ Uses `str(manifest)` | | Lockfile format | ❌ Not implemented | | Version constraints | ❌ No semver resolution | --- ## Architecture Design ### Core Concepts ``` ┌─────────────────────────────────────────────────────────────────┐ │ TOOL PACKAGE MODEL │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ Tool Version = Immutable Package │ │ ├── Manifest (YAML/JSON) │ │ ├── Files[] (with individual sha256) │ │ ├── Integrity (canonical hash of manifest + file hashes) │ │ └── Child Schema (what children this tool accepts) │ │ │ │ Execution Chain = Dependency Graph │ │ user_script → python_runtime → subprocess │ │ ↑ ↑ ↑ │ │ (validated) (validated) (primitive) │ │ by parent by parent hardcoded │ │ │ │ Lockfile = Resolved + Pinned + Verified Chain │ │ │ └─────────────────────────────────────────────────────────────────┘ ``` ### The Two Hardcoded Primitives Only these two classes contain execution logic. Everything else is data: ```python # kiwi_mcp/primitives/subprocess.py - HARDCODED class SubprocessPrimitive: """Execute shell commands. The ONLY subprocess execution logic.""" async def execute(self, config: Dict, params: Dict) -> SubprocessResult: # Real subprocess.create_subprocess_exec() call ... # kiwi_mcp/primitives/http_client.py - HARDCODED class HttpClientPrimitive: """Make HTTP requests. The ONLY HTTP execution logic.""" async def execute(self, config: Dict, params: Dict) -> HttpResult: # Real httpx client call ... ``` ### Everything Else is Data Runtimes like `python_runtime` are NOT hardcoded - they're manifest data: ```yaml # python_runtime manifest (DATA, not code!) tool_id: python_runtime version: "1.4.0" tool_type: runtime executor: subprocess # delegates to subprocess primitive config: command: "python3" base_args: ["-u"] venv: enabled: true auto_create: true path: ".venv" # THIS IS NEW: defines what children we accept validation: child_schemas: - match: tool_type: script schema: type: object properties: manifest: type: object properties: language: const: python entrypoint: type: string pattern: "^[^\\0]+\\.py$" required: [entrypoint] required: [manifest] ``` --- ## Integrity System Design ### Canonical Content Hash Replace the current non-deterministic hashing with canonical JSON: ```python def compute_tool_integrity(tool_id: str, version: str, manifest: Dict, files: List[Dict]) -> str: """ Compute deterministic integrity hash for a tool version. The integrity changes if ANY execution-relevant byte changes. """ # Sort files for determinism sorted_files = sorted(files, key=lambda f: f["path"]) # Build canonical payload payload = { "tool_id": tool_id, "version": version, "manifest": manifest, # Will be sorted by json.dumps "files": [ { "path": f["path"], "sha256": f["sha256"], "is_executable": f.get("is_executable", False) } for f in sorted_files ] } # Canonical JSON: sorted keys, no extra whitespace canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")) return hashlib.sha256(canonical.encode()).hexdigest() ``` ### Verification at Every Chain Step ```python class IntegrityVerifier: """Verifies tool integrity at every step of the execution chain.""" async def verify_chain(self, chain: List[Dict]) -> VerificationResult: """ Verify integrity of every tool in the chain. Args: chain: Resolved chain from leaf to primitive Returns: VerificationResult with success/failure and details """ for i, tool in enumerate(chain): # Recompute integrity from manifest + files computed = compute_tool_integrity( tool["tool_id"], tool["version"], tool["manifest"], tool.get("files", []) ) stored = tool.get("content_hash") if computed != stored: return VerificationResult( success=False, error=f"Integrity mismatch for {tool['tool_id']}@{tool['version']}: " f"computed={computed[:12]}, stored={stored[:12]}", failed_at=i ) return VerificationResult(success=True) ``` --- ## Parent→Child Validation Design ### Schema Definition in Manifest Each tool can define schemas for the children it accepts: ```yaml validation: child_schemas: - match: tool_type: script # Optional: constrain by namespace, tags, etc. schema: { ... JSON Schema ... } - match: tool_type: api schema: { ... different schema for APIs ... } ``` ### Validation Engine ```python class ChainValidator: """Validates parent→child relationships in execution chains.""" def __init__(self, schema_validator: SchemaValidator): self.schema_validator = schema_validator def validate_chain(self, chain: List[Dict]) -> ValidationResult: """ Validate each adjacent pair (parent, child) in the chain. The chain is ordered: [leaf_script, runtime, ..., primitive] So we validate: runtime validates leaf_script, etc. """ issues = [] # Iterate pairs: (child, parent) for i in range(len(chain) - 1): child = chain[i] parent = chain[i + 1] result = self._validate_pair(parent, child) if not result["valid"]: issues.extend(result["issues"]) return ValidationResult( valid=len(issues) == 0, issues=issues ) def _validate_pair(self, parent: Dict, child: Dict) -> Dict: """Validate that parent accepts child according to its schema.""" parent_manifest = parent.get("manifest", {}) validation_config = parent_manifest.get("validation", {}) child_schemas = validation_config.get("child_schemas", []) if not child_schemas: return { "valid": False, "issues": [f"Parent '{parent['tool_id']}' must define child_schemas to validate children"] } # Find matching schema for child child_type = child.get("tool_type") for schema_def in child_schemas: match_criteria = schema_def.get("match", {}) if self._matches(child, match_criteria): # Found matching schema - validate child against it schema = schema_def.get("schema") return self.schema_validator.validate(child, schema) # No matching schema found return { "valid": False, "issues": [ f"Parent '{parent['tool_id']}' has no schema matching child " f"'{child['tool_id']}' (type: {child_type})" ] } def _matches(self, child: Dict, criteria: Dict) -> bool: """Check if child matches the match criteria.""" for key, value in criteria.items(): if child.get(key) != value: return False return True ``` --- ## Lockfile Design ### Format ```json { "lockfile_version": 1, "generated_at": "2026-01-24T12:00:00Z", "root": { "tool_id": "my_data_processor", "version": "2.1.0", "integrity": "sha256:a1b2c3d4e5f6..." }, "resolved_chain": [ { "tool_id": "my_data_processor", "version": "2.1.0", "integrity": "sha256:a1b2c3d4e5f6...", "executor": "python_runtime" }, { "tool_id": "python_runtime", "version": "1.4.0", "integrity": "sha256:f6e5d4c3b2a1...", "executor": "subprocess" }, { "tool_id": "subprocess", "version": "1.0.0", "integrity": "sha256:0123456789ab...", "executor": null } ], "registry": { "url": "https://mrecfyfjpwzrzxoiooew.supabase.co", "fetched_at": "2026-01-24T12:00:00Z" } } ``` ### Usage ```python class LockfileManager: """Manages lockfile creation and consumption.""" async def freeze(self, tool_id: str, version: Optional[str] = None) -> Dict: """ Resolve and freeze a tool chain into a lockfile. """ # Resolve chain chain = await self.registry.resolve_chain(tool_id) # Verify all integrities verification = await self.verifier.verify_chain(chain) if not verification.success: raise IntegrityError(verification.error) # Validate all parent→child relationships validation = self.chain_validator.validate_chain(chain) if not validation.valid: raise ValidationError(validation.issues) # Build lockfile return { "lockfile_version": 1, "generated_at": datetime.now(timezone.utc).isoformat(), "root": { "tool_id": chain[0]["tool_id"], "version": chain[0]["version"], "integrity": chain[0]["content_hash"] }, "resolved_chain": [ { "tool_id": t["tool_id"], "version": t["version"], "integrity": t["content_hash"], "executor": t.get("executor_id") } for t in chain ], "registry": { "url": self.registry.url, "fetched_at": datetime.now(timezone.utc).isoformat() } } async def execute_with_lockfile(self, lockfile: Dict, params: Dict) -> ExecutionResult: """ Execute using a lockfile - requires exact version/integrity matches. """ for entry in lockfile["resolved_chain"]: # Fetch exact version tool = await self.registry.get(entry["tool_id"], entry["version"]) if not tool: raise LockfileError(f"Tool not found: {entry['tool_id']}@{entry['version']}") # Verify integrity matches lockfile if tool["content_hash"] != entry["integrity"]: raise IntegrityError( f"Integrity mismatch for {entry['tool_id']}: " f"lockfile={entry['integrity'][:12]}, " f"registry={tool['content_hash'][:12]}" ) # All verified - execute return await self.executor.execute(lockfile["root"]["tool_id"], params) ``` --- ## Database Changes ### tool_versions Table Updates ```sql -- Add integrity field (canonical hash including files) ALTER TABLE tool_versions ADD COLUMN integrity TEXT; -- Add child validation schemas to manifest structure -- (No schema change needed - validation goes in manifest JSONB) -- Index for integrity lookups CREATE INDEX idx_tool_versions_integrity ON tool_versions(integrity); ``` ### resolve_executor_chain RPC Update The RPC needs to return file hashes for integrity verification: ```sql CREATE OR REPLACE FUNCTION resolve_executor_chain(p_tool_id TEXT) RETURNS TABLE ( depth INT, tool_id TEXT, version TEXT, tool_type TEXT, executor_id TEXT, manifest JSONB, content_hash TEXT, integrity TEXT, file_hashes JSONB -- NEW: Array of {path, sha256} ) AS $$ WITH RECURSIVE chain AS ( -- Base case: the requested tool SELECT 0 as depth, t.tool_id, tv.version, t.tool_type, t.executor_id, tv.manifest, tv.content_hash, tv.integrity, ( SELECT jsonb_agg(jsonb_build_object( 'path', tvf.path, 'sha256', tvf.sha256, 'is_executable', tvf.is_executable )) FROM tool_version_files tvf WHERE tvf.tool_version_id = tv.id ) as file_hashes FROM tools t JOIN tool_versions tv ON t.id = tv.tool_id AND tv.is_latest = true WHERE t.tool_id = p_tool_id UNION ALL -- Recursive case: follow executor_id SELECT c.depth + 1, t.tool_id, tv.version, t.tool_type, t.executor_id, tv.manifest, tv.content_hash, tv.integrity, ( SELECT jsonb_agg(jsonb_build_object( 'path', tvf.path, 'sha256', tvf.sha256, 'is_executable', tvf.is_executable )) FROM tool_version_files tvf WHERE tvf.tool_version_id = tv.id ) as file_hashes FROM chain c JOIN tools t ON t.tool_id = c.executor_id JOIN tool_versions tv ON t.id = tv.tool_id AND tv.is_latest = true WHERE c.executor_id IS NOT NULL ) SELECT * FROM chain ORDER BY depth; $$ LANGUAGE sql; ``` --- ## Example: python_runtime → user_script Chain ### 1. The Primitive (Hardcoded) ```python # subprocess primitive - the ONLY hardcoded subprocess logic class SubprocessPrimitive: config_schema = { "type": "object", "properties": { "command": {"type": "string"}, "args": {"type": "array", "items": {"type": "string"}}, "env": {"type": "object"}, "cwd": {"type": "string"}, "timeout": {"type": "integer", "default": 300} }, "required": ["command"] } ``` ### 2. python_runtime Manifest (Data) ```yaml tool_id: python_runtime version: "1.4.0" tool_type: runtime executor: subprocess config: command: "python3" base_args: ["-u"] install_deps: true venv: enabled: true path: ".venv" # Child validation - what scripts we accept validation: child_schemas: - match: tool_type: script schema: type: object properties: tool_id: type: string pattern: "^[a-z][a-z0-9_]*$" tool_type: const: script manifest: type: object properties: language: const: python entrypoint: type: string pattern: "\\.py$" requires: type: array items: type: string required: [entrypoint] required: [tool_id, tool_type, manifest] ``` ### 3. User Script Manifest (Data) ```yaml tool_id: data_processor version: "2.1.0" tool_type: script executor: python_runtime manifest: language: python entrypoint: process_data.py requires: - pandas>=2.0 - numpy config: timeout: 1800 env: PYTHONPATH: "/app/lib" ``` ### 4. Execution Flow ``` 1. Request: execute("data_processor", {"input": "data.csv"}) 2. Resolve chain: [data_processor@2.1.0, python_runtime@1.4.0, subprocess@1.0.0] 3. Verify integrity (EVERY step): ✓ data_processor: sha256:abc... matches ✓ python_runtime: sha256:def... matches ✓ subprocess: sha256:012... matches 4. Validate parent→child (EVERY pair): ✓ python_runtime accepts data_processor (matches script schema) ✓ subprocess accepts python_runtime (matches runtime schema) 5. Merge configs (child overrides parent): { command: "python3", args: ["-u", "process_data.py"], env: {PYTHONPATH: "/app/lib"}, timeout: 1800 } 6. Execute via subprocess primitive ``` --- ## Implementation Status ### Phase 1: Canonical Integrity ✅ COMPLETE 1. ✅ **Created `integrity.py`** - `kiwi_mcp/primitives/integrity.py` 2. ✅ **Updated `ToolRegistry.publish()`** - Now uses `compute_tool_integrity()` 3. ✅ **Database migration** - `002_integrity_verification.sql` applied 4. ⏳ **Backfill** - Existing tools use old hashes (will be recomputed on next publish) ### Phase 2: Integrity Verification ✅ COMPLETE 1. ✅ **Created `IntegrityVerifier`** - `kiwi_mcp/primitives/integrity_verifier.py` 2. ✅ **Updated `resolve_executor_chain` RPC** - Returns `version`, `content_hash`, `file_hashes` 3. ✅ **Added verification to `PrimitiveExecutor.execute()`** - Configurable via `verify_integrity` flag 4. ✅ **Added tests** - 31 tests in `test_chain_validation.py` ### Phase 3: Parent→Child Validation ✅ COMPLETE 1. ✅ **Created `ChainValidator`** - `kiwi_mcp/primitives/chain_validator.py` 2. ⏳ **Add `validation.child_schemas`** to runtime manifests - Template ready, needs deployment 3. ✅ **Integrated into `PrimitiveExecutor.execute()`** - Configurable via `validate_chain` flag ### Phase 4: Lockfile Support ✅ COMPLETE 1. ✅ **Created `LockfileManager`** - `kiwi_mcp/primitives/lockfile.py` 2. ⏳ **Add `freeze` command** to tool handler - API ready, CLI integration pending 3. ✅ **Created lockfile format** - Version 1 with full chain + integrity --- ## Migration Path ### Backward Compatibility 1. **Integrity field optional initially** - fall back to content_hash if missing 2. **Child schemas optional** - allow execution if parent has no schemas (with warning) 3. **Gradual enforcement** - add schemas to runtimes one at a time ### Timeline - Week 1: Phases 1-2 (integrity system) - Week 2: Phases 3-4 (validation + lockfiles) - Week 3: Schema definitions for all builtin tools - Week 4: Strict enforcement mode (fail if no child schema) --- ## Success Criteria 1. ✅ Only subprocess + http_client contain execution code 2. ✅ All other tools are pure data (manifests in DB or files) 3. ✅ Hash verification at every chain step 4. ✅ Parent tools define child validation schemas 5. ✅ Lockfiles enable reproducible execution 6. ✅ Same input + lockfile = identical execution (determinism) --- ## Performance Analysis & Caching Strategy ### Existing Caches | Cache | Location | What It Caches | TTL | Status | | ----------------------------------- | ---------------------- | --------------------- | ----------- | --------------- | | `ChainResolver._chain_cache` | executor.py:43 | Resolved chains | ∞ (session) | ⚠️ Needs update | | `SchemaValidator._compiled_schemas` | schema_validator.py:16 | Compiled JSON schemas | ∞ (session) | ✅ Reusable | | `SchemaCache` | mcp/schema_cache.py | MCP tool schemas | 1 hour | ✅ Keep | | Embedding caches | storage/vector/ | Text embeddings | ∞ (session) | ✅ Keep | ### Performance Hot Spots #### 1. Hash Computation (NEW - Low Risk) ``` Operation: sha256 of manifest + file hashes Speed: ~400MB/s on modern CPU Typical manifest: ~2KB Time: ~0.005ms per tool Mitigation: Compute ONCE at publish, verify at execute ``` #### 2. Chain Resolution (EXISTING - Needs Update) ``` Current: ChainResolver caches resolved chains ✓ Problem: Cache doesn't include file_hashes needed for integrity Solution: Extend cache to store full chain data including: - manifest - file_hashes - content_hash/integrity - version ``` #### 3. Integrity Verification (NEW - Medium Risk) ``` Operation: Recompute integrity for each chain link Typical chain: 3 tools (script → runtime → primitive) Time per verification: ~0.01ms Mitigation: Cache verified integrities by content_hash - Key: content_hash - Value: {verified: true, verified_at: timestamp} - Skip re-verification if already verified this session ``` #### 4. Parent→Child Schema Validation (NEW - Low Risk) ``` Operation: JSON Schema validation per chain pair Typical chain: 2 validations (script↔runtime, runtime↔primitive) Time per validation: ~0.1ms (schemas are compiled and cached) Mitigation: SchemaValidator already caches compiled schemas ✓ ``` ### Updated Caching Architecture ```python class ChainResolver: """Resolves and caches executor chains with full integrity data.""" def __init__(self, registry): self.registry = registry # Existing: chain cache by tool_id self._chain_cache: Dict[str, List[Dict]] = {} # NEW: verified integrity cache # Key: content_hash, Value: verification result self._integrity_cache: Dict[str, VerificationResult] = {} # NEW: validation cache # Key: (parent_hash, child_hash), Value: validation result self._validation_cache: Dict[Tuple[str, str], ValidationResult] = {} async def resolve(self, tool_id: str) -> List[Dict]: """Resolve chain with full data for integrity checks.""" if tool_id in self._chain_cache: return self._chain_cache[tool_id] # Updated RPC returns file_hashes chain = await self.registry.resolve_chain(tool_id) self._chain_cache[tool_id] = chain return chain def is_integrity_verified(self, content_hash: str) -> bool: """Check if we've already verified this integrity.""" return content_hash in self._integrity_cache def is_pair_validated(self, parent_hash: str, child_hash: str) -> bool: """Check if we've already validated this parent→child pair.""" return (parent_hash, child_hash) in self._validation_cache ``` ### Cache Invalidation Strategy ```python class CacheInvalidation: """When to invalidate caches.""" # Chain cache: invalidate when tool is updated def on_tool_publish(self, tool_id: str): self.chain_resolver._chain_cache.pop(tool_id, None) # Also invalidate any chains that include this tool for cached_id, chain in list(self.chain_resolver._chain_cache.items()): if any(t["tool_id"] == tool_id for t in chain): self.chain_resolver._chain_cache.pop(cached_id, None) # Integrity cache: invalidate when content changes (rare) def on_tool_content_change(self, old_hash: str, new_hash: str): self.chain_resolver._integrity_cache.pop(old_hash, None) # Validation cache: invalidate when parent schema changes def on_schema_change(self, parent_hash: str): keys_to_remove = [ k for k in self._validation_cache if k[0] == parent_hash ] for k in keys_to_remove: self._validation_cache.pop(k, None) ``` ### Performance Benchmarks (Expected) | Operation | Without Cache | With Cache | Improvement | | -------------------------- | ------------- | ---------- | ----------- | | Chain resolution | 50-100ms (DB) | <1ms | 50-100x | | Integrity verification | 0.03ms | 0.001ms | 30x | | Schema validation | 0.3ms | 0.001ms | 300x | | **Total first execution** | ~100ms | ~100ms | - | | **Total cached execution** | ~100ms | <5ms | 20x | ### Memory Considerations ```python # Estimated memory usage per cached item CACHE_MEMORY = { "chain_entry": 5_000, # ~5KB per chain (3 tools with manifests) "integrity_entry": 100, # ~100 bytes (hash + timestamp) "validation_entry": 200, # ~200 bytes (result + issues) "compiled_schema": 2_000, # ~2KB per compiled schema } # For 1000 unique tools with 100 schemas: # Chain cache: 1000 * 5KB = 5MB # Integrity cache: 1000 * 100B = 100KB # Validation cache: 1000 * 200B = 200KB # Schema cache: 100 * 2KB = 200KB # TOTAL: ~5.5MB - very reasonable ``` ### Bun.js-Style Optimizations (Future) These can be added later if needed: 1. **Parallel integrity verification**: Verify all chain links concurrently 2. **Lazy file hash loading**: Only fetch file_hashes when integrity check fails 3. **Content-addressed artifact cache**: Cache tool artifacts by integrity hash on disk 4. **Incremental validation**: Only re-validate changed portions of chain --- ## Appendix: Files Created/Modified ### New Files | File | Purpose | | ------------------------------------------------ | ---------------------------------------------- | | `kiwi_mcp/primitives/integrity.py` | Canonical content-addressed hashing | | `kiwi_mcp/primitives/integrity_verifier.py` | Hash verification at every chain step | | `kiwi_mcp/primitives/chain_validator.py` | Parent→child schema validation | | `kiwi_mcp/primitives/lockfile.py` | Lockfile management for reproducible execution | | `docs/migrations/002_integrity_verification.sql` | Database migration for updated RPCs | | `tests/primitives/test_chain_validation.py` | 31 tests for the new system | ### Modified Files | File | Changes | | --------------------------------- | ------------------------------------------------------- | | `kiwi_mcp/primitives/executor.py` | Added integrity verification, chain validation, caching | | `kiwi_mcp/primitives/__init__.py` | Exports for all new modules | | `kiwi_mcp/api/tool_registry.py` | Uses `compute_tool_integrity()` in publish | ### Existing Files (Reference) | File | Purpose | | ------------------------------------ | ------------------------------------------ | | `kiwi_mcp/primitives/subprocess.py` | Hardcoded subprocess primitive (unchanged) | | `kiwi_mcp/primitives/http_client.py` | Hardcoded HTTP primitive (unchanged) | | `kiwi_mcp/utils/validators.py` | ToolValidator - Layer 1 validation | | `kiwi_mcp/utils/schema_validator.py` | JSON Schema engine |

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/leolilley/kiwi-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

TOOL_CHAIN_VALIDATION_DESIGN.md•28.2 KiB