# Code Execution Interface API Documentation

## Overview

The Code Execution Interface provides a token-efficient Python API for direct memory operations, achieving 85-95% token reduction compared to MCP tool calls.

**Version:** 1.0.0
**Status:** Phase 1 (Core Operations)
**Issue:** [#206](https://github.com/doobidoo/mcp-memory-service/issues/206)

## Token Efficiency

| Operation | MCP Tools | Code Execution | Reduction |
|-----------|-----------|----------------|-----------|
| search(5 results) | ~2,625 tokens | ~385 tokens | **85%** |
| store() | ~150 tokens | ~15 tokens | **90%** |
| health() | ~125 tokens | ~20 tokens | **84%** |

**Annual Savings (Conservative):**

- 10 users x 5 sessions/day x 365 days x 6,000 tokens = 109.5M tokens/year
- At $0.15/1M tokens: **$16.43/year saved** per 10-user deployment
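The same arithmetic can be adapted to other deployment sizes. The sketch below simply re-derives the figures above from the stated assumptions (user count, sessions per day, and ~6,000 tokens saved per session); it is illustrative, not a measurement.

```python
# Illustrative estimate only -- re-derives the "Annual Savings" figures above.
users = 10
sessions_per_day = 5
tokens_saved_per_session = 6_000   # assumed per-session savings
price_per_million_tokens = 0.15    # USD

tokens_saved_per_year = users * sessions_per_day * 365 * tokens_saved_per_session
dollars_saved_per_year = tokens_saved_per_year / 1_000_000 * price_per_million_tokens

print(f"~{tokens_saved_per_year / 1e6:.1f}M tokens/year, ~${dollars_saved_per_year:.2f}/year")
# Roughly 109.5M tokens and about $16 per year for this 10-user example.
```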
## Installation

The API is included in mcp-memory-service v8.18.2+. No additional installation is required.

```bash
# Ensure you have the latest version
pip install --upgrade mcp-memory-service
```

## Quick Start

```python
from mcp_memory_service.api import search, store, health

# Store a memory (15 tokens)
hash = store("Implemented OAuth 2.1 authentication", tags=["auth", "feature"])
print(f"Stored: {hash}")
# Output: Stored: abc12345

# Search memories (385 tokens for 5 results)
results = search("authentication", limit=5)
print(f"Found {results.total} memories")
for m in results.memories:
    print(f"  {m.hash}: {m.preview[:50]}... (score: {m.score:.2f})")

# Health check (20 tokens)
info = health()
print(f"Backend: {info.backend}, Status: {info.status}, Count: {info.count}")
```

## API Reference

### Core Operations

#### search()

Semantic search with compact results.

```python
def search(
    query: str,
    limit: int = 5,
    tags: Optional[List[str]] = None
) -> CompactSearchResult:
    """
    Search memories using semantic similarity.

    Args:
        query: Search query text (natural language)
        limit: Maximum results to return (default: 5)
        tags: Optional list of tags to filter results

    Returns:
        CompactSearchResult with memories, total count, and query

    Raises:
        ValueError: If query is empty or limit is invalid
        RuntimeError: If storage backend unavailable

    Token Cost: ~25 tokens + ~73 tokens per result

    Example:
        >>> results = search("recent architecture decisions", limit=3)
        >>> for m in results.memories:
        ...     print(f"{m.hash}: {m.preview}")
    """
```

**Performance:**

- First call: ~50ms (includes storage initialization)
- Subsequent calls: ~5-10ms (connection reused)

#### store()

Store a new memory.

```python
def store(
    content: str,
    tags: Optional[Union[str, List[str]]] = None,
    memory_type: str = "note"
) -> str:
    """
    Store a new memory.

    Args:
        content: Memory content text
        tags: Single tag or list of tags (optional)
        memory_type: Memory type classification (default: "note")

    Returns:
        8-character content hash

    Raises:
        ValueError: If content is empty
        RuntimeError: If storage operation fails

    Token Cost: ~15 tokens

    Example:
        >>> hash = store(
        ...     "Fixed authentication bug",
        ...     tags=["bug", "auth"],
        ...     memory_type="fix"
        ... )
        >>> print(f"Stored: {hash}")
        Stored: abc12345
    """
```

**Performance:**

- First call: ~50ms (includes storage initialization)
- Subsequent calls: ~10-20ms (includes embedding generation)

#### health()

Service health and status check.

```python
def health() -> CompactHealthInfo:
    """
    Get service health and status.

    Returns:
        CompactHealthInfo with status, count, and backend

    Token Cost: ~20 tokens

    Example:
        >>> info = health()
        >>> if info.status == 'healthy':
        ...     print(f"{info.count} memories in {info.backend}")
    """
```

**Performance:**

- First call: ~50ms (includes storage initialization)
- Subsequent calls: ~5ms (cached stats)

### Data Types

#### CompactMemory

Minimal memory representation (91% token reduction).

```python
class CompactMemory(NamedTuple):
    hash: str              # 8-character content hash
    preview: str           # First 200 characters
    tags: tuple[str, ...]  # Immutable tags tuple
    created: float         # Unix timestamp
    score: float           # Relevance score (0.0-1.0)
```

**Token Cost:** ~73 tokens (vs ~820 for full Memory object)

#### CompactSearchResult

Search result container.

```python
class CompactSearchResult(NamedTuple):
    memories: tuple[CompactMemory, ...]  # Immutable results
    total: int                           # Total results count
    query: str                           # Original query

    def __repr__(self) -> str:
        return f"SearchResult(found={self.total}, shown={len(self.memories)})"
```

**Token Cost:** ~10 tokens + (73 x num_memories)

#### CompactHealthInfo

Service health information.

```python
class CompactHealthInfo(NamedTuple):
    status: str   # 'healthy' | 'degraded' | 'error'
    count: int    # Total memories
    backend: str  # 'sqlite_vec' | 'cloudflare' | 'hybrid'
```

**Token Cost:** ~20 tokens

## Usage Examples

### Basic Search

```python
from mcp_memory_service.api import search

# Simple search
results = search("authentication", limit=5)
print(f"Found {results.total} memories")

# Search with tag filter
results = search("database", limit=10, tags=["architecture"])
for m in results.memories:
    if m.score > 0.7:  # High relevance only
        print(f"{m.hash}: {m.preview}")
```

### Batch Store

```python
from mcp_memory_service.api import store

# Store multiple memories
changes = [
    "Implemented OAuth 2.1 authentication",
    "Added JWT token validation",
    "Fixed session timeout bug"
]

for change in changes:
    hash_val = store(change, tags=["changelog", "auth"])
    print(f"Stored: {hash_val}")
```

### Health Monitoring

```python
from mcp_memory_service.api import health

info = health()

if info.status != 'healthy':
    print(f"⚠️ Service degraded: {info.status}")
    print(f"Backend: {info.backend}")
    print(f"Memory count: {info.count}")
else:
    print(f"✅ Service healthy ({info.count} memories in {info.backend})")
```

### Error Handling

```python
from mcp_memory_service.api import search, store

content = "Implemented OAuth 2.1 authentication"  # content to be stored

try:
    # Store with validation
    if not content.strip():
        raise ValueError("Content cannot be empty")
    hash_val = store(content, tags=["test"])

    # Search with error handling
    results = search("query", limit=5)
    if results.total == 0:
        print("No results found")
    else:
        for m in results.memories:
            print(f"{m.hash}: {m.preview}")

except ValueError as e:
    print(f"Validation error: {e}")
except RuntimeError as e:
    print(f"Storage error: {e}")
```

## Performance Optimization

### Connection Reuse

The API automatically reuses storage connections for optimal performance:

```python
from mcp_memory_service.api import search, store

# First call: ~50ms (initialization)
store("First memory", tags=["test"])

# Subsequent calls: ~10ms (reuses connection)
store("Second memory", tags=["test"])
store("Third memory", tags=["test"])

# Search also reuses the connection: ~5ms
results = search("test", limit=5)
```

### Limit Result Count

```python
# For quick checks, use small limits
results = search("query", limit=3)   # ~240 tokens

# For comprehensive results, use larger limits
results = search("query", limit=20)  # ~1,470 tokens
```

## Backward Compatibility

The Code Execution API works alongside existing MCP tools without breaking changes:

- **MCP tools continue working** - No deprecation or removal
- **Gradual migration** - Adopt code execution incrementally
- **Fallback mechanism** - Tools remain available if code execution fails (see the sketch below)
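The fallback behavior above is a property of the deployment rather than a dedicated API. As a minimal sketch, a hook might prefer the code-execution path and drop back to MCP tools only on failure; the `call_mcp_retrieve` helper below is a hypothetical placeholder for whatever mechanism the caller already uses to invoke the `retrieve_memory` MCP tool.

```python
# Illustrative fallback pattern -- not part of the shipped API.

def call_mcp_retrieve(query: str, limit: int):
    """Placeholder for the caller's existing MCP-tool path (e.g. retrieve_memory)."""
    raise NotImplementedError("wire this to your MCP client")


def find_memories(query: str, limit: int = 5):
    """Prefer the token-efficient code-execution path; fall back to MCP tools."""
    try:
        from mcp_memory_service.api import search    # may raise ImportError
        result = search(query, limit=limit)           # may raise RuntimeError
        return [(m.hash, m.preview) for m in result.memories]
    except (ImportError, RuntimeError) as exc:
        print(f"Code execution unavailable ({exc}); falling back to MCP tools")
        return call_mcp_retrieve(query, limit=limit)
```

Catching `ImportError` alongside `RuntimeError` covers both an older package without the `api` module and an unavailable storage backend.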
## Migration Guide

### From MCP Tools to Code Execution

**Before (MCP Tool):**

```javascript
// Node.js hook using MCP client
const result = await mcpClient.callTool('retrieve_memory', {
  query: 'architecture',
  limit: 5,
  similarity_threshold: 0.7
});
// Result: ~2,625 tokens
```

**After (Code Execution):**

```python
# Python code in hook
from mcp_memory_service.api import search

results = search('architecture', limit=5)
# Result: ~385 tokens (85% reduction)
```

## Troubleshooting

### Storage Initialization Errors

```python
from mcp_memory_service.api import health

info = health()
if info.status == 'error':
    print(f"Storage backend {info.backend} not available")
    # Check configuration:
    # - DATABASE_PATH set correctly
    # - Storage backend initialized
    # - Permissions on database directory
```

### Import Errors

```bash
# Ensure mcp-memory-service is installed
pip list | grep mcp-memory-service

# Verify version (requires 8.18.2+)
python -c "import mcp_memory_service; print(mcp_memory_service.__version__)"
```

### Performance Issues

```python
import time
from mcp_memory_service.api import search

# Measure performance
start = time.perf_counter()
results = search("query", limit=5)
duration_ms = (time.perf_counter() - start) * 1000

if duration_ms > 100:
    print(f"⚠️ Slow search: {duration_ms:.1f}ms (expected: <50ms)")
    # Possible causes:
    # - Cold start (first call after initialization)
    # - Large database requiring optimization
    # - Embedding model not cached
```

## Future Enhancements (Roadmap)

### Phase 2: Extended Operations

- `search_by_tag()` - Tag-based filtering
- `recall()` - Natural language time queries
- `delete()` - Delete by content hash
- `update()` - Update memory metadata

### Phase 3: Advanced Features

- `store_batch()` - Batch store operations
- `search_iter()` - Streaming search results
- Document ingestion API
- Memory consolidation triggers

## Related Documentation

- [Research Document](/docs/research/code-execution-interface-implementation.md)
- [Implementation Summary](/docs/research/code-execution-interface-summary.md)
- [Issue #206](https://github.com/doobidoo/mcp-memory-service/issues/206)
- [CLAUDE.md](/CLAUDE.md) - Project instructions

## Support

For issues, questions, or contributions:

- GitHub Issues: https://github.com/doobidoo/mcp-memory-service/issues
- Documentation: https://github.com/doobidoo/mcp-memory-service/wiki

## License

Copyright 2024 Heinrich Krupp

Licensed under the Apache License, Version 2.0
