# YNAB Local Repository with Differential Sync

## Overview

A local-first repository pattern that serves YNAB data from memory while using background differential sync to maintain consistency. The repository mirrors YNAB data locally, enabling instant reads without API latency.

## Core Design Principles

1. **Local-First**: All reads from in-memory repository, zero API calls during normal operation
2. **Differential Sync**: Use YNAB's `server_knowledge` to fetch only changes, not full datasets
3. **Repository Pattern**: Speaks only YNAB SDK models, no MCP-specific types
4. **Single Budget**: Server operates on one budget specified by `YNAB_BUDGET` env var
5. **Background Sync**: Updates happen out-of-band, never blocking MCP tool calls

## Key Constraints

- **YNAB is source of truth**: Local repository is read-only mirror, never modifies data
- **Eventually consistent**: Repository converges to YNAB state within sync interval
- **Handle stale knowledge**: When server returns 409, fall back to full refresh
- **Thread-safe**: Multiple MCP tools can read concurrently during sync
- **Memory-only initially**: Start with dicts, defer persistence to later

## Repository Interface

```python
import threading
from collections.abc import Callable
from datetime import date, datetime

import ynab


class YNABRepository:
    """Local repository for YNAB data with background differential sync."""

    def __init__(self, budget_id: str, api_client_factory: Callable):
        self.budget_id = budget_id  # Set once from YNAB_BUDGET env var
        self.api_client_factory = api_client_factory

        # In-memory storage - simple dicts
        self._data: dict[str, list] = {}  # entity_type -> list of entities
        self._server_knowledge: dict[str, int] = {}  # entity_type -> server_knowledge
        self._lock = threading.RLock()
        self._last_sync: datetime | None = None

    # Data access - returns YNAB SDK models directly
    def get_accounts(self) -> list[ynab.Account]: ...
    def get_categories(self) -> list[ynab.CategoryGroupWithCategories]: ...
    def get_transactions(self, since_date: date | None = None) -> list[ynab.TransactionDetail]: ...
    def get_payees(self) -> list[ynab.Payee]: ...
    def get_budget_month(self, month: date) -> ynab.MonthDetail: ...

    # Sync management
    def sync(self) -> None: ...  # Fetch deltas and update repository
    def needs_sync(self) -> bool: ...  # Check if sync is needed
    def last_sync_time(self) -> datetime | None: ...
```

## How Differential Sync Works

### Initial Load

```python
# First call without last_knowledge_of_server
response = api.get_accounts(budget_id)
# Returns: all accounts + server_knowledge: 100

self._data["accounts"] = response.data.accounts
self._server_knowledge["accounts"] = response.data.server_knowledge
```

### Delta Sync

```python
# Subsequent calls with last_knowledge_of_server
response = api.get_accounts(budget_id, last_knowledge_of_server=100)
# Returns: only changed accounts + server_knowledge: 101

# Apply changes: add/update/remove based on response
```

### Applying Deltas

```python
def apply_deltas(current: list, deltas: list) -> list:
    entity_map = {e.id: e for e in current}

    for delta in deltas:
        if delta.deleted:
            entity_map.pop(delta.id, None)
        else:
            entity_map[delta.id] = delta  # Add or update

    return list(entity_map.values())
```

## Budget Configuration Change

### Environment Variables

- **OLD**: `YNAB_DEFAULT_BUDGET` (optional, with fallback logic)
- **NEW**: `YNAB_BUDGET` (required for server startup)

The server will fail to start if `YNAB_BUDGET` is not set, making configuration explicit and removing ambiguity.

### Tool Signature Simplification

Remove the `budget_id` parameter from all MCP tools, since the server operates on a single budget:

```python
# Before
@mcp.tool()
def list_accounts(budget_id: str | None = None, limit: int = 100, offset: int = 0):
    budget_id = budget_id_or_default(budget_id)
    ...

# After
@mcp.tool()
def list_accounts(limit: int = 100, offset: int = 0):
    # No budget_id needed - using server's configured budget
    ...
```

This change applies to all tools: `list_accounts`, `list_categories`, `list_transactions`, `list_payees`, `get_budget_month`, etc.
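The "After" signature relies on a budget ID that is resolved once at startup. A minimal sketch of such a fail-fast check follows; `require_budget_id` is a hypothetical helper name used only for illustration, and the exact error type and message are assumptions rather than the project's actual behavior.

```python
import os
from collections.abc import Mapping


def require_budget_id(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured budget ID, or raise if YNAB_BUDGET is missing.

    Hypothetical helper for illustration; not part of the YNAB SDK.
    """
    budget_id = env.get("YNAB_BUDGET", "").strip()
    if not budget_id:
        # Failing here keeps a misconfigured server from starting at all.
        raise RuntimeError("YNAB_BUDGET must be set before starting the server")
    return budget_id
```

Accepting the environment as a parameter keeps the check easy to exercise in tests without mutating `os.environ`.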
The `list_budgets` tool also becomes unnecessary and should be removed, since the server operates on a single configured budget.

### MCP Instructions Update

Simplify the MCP server instructions to remove budget_id complexity:

```python
mcp = FastMCP[None](
    name="YNAB",
    instructions="""
    Access to your YNAB budget data including accounts, categories, and transactions.
    The server operates on the budget configured via YNAB_BUDGET environment variable.
    All data is served from a local repository that syncs with YNAB in the background.
    """
)
```

## Integration with MCP Tools

### Current Pattern (Direct API with budget_id)

```python
@mcp.tool()
def list_accounts(budget_id: str | None = None):
    budget_id = budget_id_or_default(budget_id)
    with get_ynab_client() as api_client:
        accounts_api = ynab.AccountsApi(api_client)
        response = accounts_api.get_accounts(budget_id)
        # Process and return
```

### New Pattern (Repository without budget_id)

```python
# Global repository instance for the configured budget
_repository: YNABRepository | None = None

def get_repository() -> YNABRepository:
    global _repository
    if _repository is None:
        budget_id = os.environ["YNAB_BUDGET"]  # Required at startup
        _repository = YNABRepository(
            budget_id=budget_id,
            api_client_factory=get_ynab_client
        )
        # Initial sync to populate
        _repository.sync()
    return _repository

@mcp.tool()
def list_accounts(limit: int = 100, offset: int = 0):  # No budget_id parameter
    repo = get_repository()

    # Trigger background sync if needed (non-blocking)
    if repo.needs_sync():
        threading.Thread(target=repo.sync).start()

    # Return data instantly from repository
    accounts = repo.get_accounts()

    # Apply existing filtering/pagination
    return process_accounts(accounts)
```

## Critical Implementation Details

### Entity Types to Sync

- `accounts` - All accounts in budget
- `categories` - Category groups with nested categories
- `transactions` - Transaction history (consider date limits)
- `payees` - All payees
- `scheduled_transactions` - Scheduled/recurring transactions
- `budget_months` - Month-specific budget data (current/last/next)

### Thread Safety

- Use `threading.RLock()` for all repository data access
- Sync updates entire entity list atomically
- Reads can happen during sync (old data until sync completes)

### Error Handling

- **Network failure**: Continue serving stale data, retry sync later
- **409 (stale knowledge)**: Clear entity type, fetch all without `last_knowledge_of_server`
- **429 (rate limit)**: YNAB allows 200 requests/hour per token. Use exponential backoff, track request count
- **Invalid token**: Fail gracefully, log error, serve cached data

### Memory Management

- Typical budget: ~1-5MB in memory
- Consider transaction date limits (e.g., last 2 years only)
- Clear old budget month data (keep current + last + next)

## Benefits

- **Performance**: Sub-millisecond reads vs 100-500ms API calls
- **Reliability**: Works offline, degrades gracefully
- **Efficiency**: 60-80% fewer API calls after initial sync
- **User Experience**: Instant responses in MCP tools

## Future Considerations

- **Persistence**: SQLite for data survival across restarts
- **Selective Sync**: Only sync entity types actually used
- **Smart Scheduling**: Sync more frequently during business hours
- **Multi-Budget**: Support switching between budgets efficiently

## Migration Notes

### ✅ Completed Breaking Changes

1. **Environment variable**: `YNAB_DEFAULT_BUDGET` → `YNAB_BUDGET` (now required) ✅
2. **Tool signatures**: Remove `budget_id` parameter from all tools ✅
3. **Tool removal**: Delete `list_budgets` tool entirely ✅
4. **Error handling**: Server fails to start without `YNAB_BUDGET` ✅
5. **Test infrastructure**: Updated with pytest-env for environment variable support ✅

### User Impact

- Users must set `YNAB_BUDGET` before starting the server ✅
- LLMs no longer need to handle budget selection logic ✅
- Simpler, cleaner tool interfaces without optional budget_id parameters ✅

### Implementation Status

- **Phase 0: Budget ID Removal** ✅ COMPLETED
  - All 57 tests passing with 100% coverage
  - Clean foundation ready for repository pattern implementation
- **Phase 1: Repository Pattern** ✅ COMPLETED
  - ✅ YNABRepository class created with differential sync
  - ✅ Thread-safe data access with RLock
  - ✅ Delta application for add/update/delete operations
  - ✅ Lazy initialization per entity type
  - ✅ Server integration - all tools use repository
  - ✅ Background sync (non-blocking, triggered when data is stale)
  - ✅ needs_sync() method for staleness detection
  - ✅ Proper error handling (ConflictException, 429 rate limiting, fallback)
  - ✅ Initial population at server startup
  - ✅ Python logging with structured error handling
  - ✅ Test coverage migration (all 97 tests passing with 100% coverage)
- **Phase 2: Test Quality Improvements** ✅ COMPLETED
  - ✅ Hoisted all inline imports to top of test files
  - ✅ Removed unhelpful comments that just repeated code
  - ✅ Fixed poor test patterns (replaced try/except: pass with pytest.raises)
  - ✅ Consolidated duplicate test helper functions into conftest.py
  - ✅ Eliminated code duplication across 8+ test files
  - ✅ Maintained 100% test coverage throughout cleanup

## Success Criteria

1. MCP tools never wait for API calls during normal operation
2. Repository stays synchronized within 5 minutes of YNAB changes
3. All existing MCP tool functionality works unchanged (except budget_id removal)
4. Memory usage stays under 10MB for typical budgets
5. Graceful degradation when YNAB API is unavailable
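
The staleness check behind criterion 2 and the 409 fallback can be modelled without the YNAB SDK. What follows is a minimal sketch under stated assumptions, not the project's implementation: `EntityCache`, `Entity`, `StaleKnowledgeError`, and `fake_fetch` are hypothetical names, the five-minute interval is taken from the success criteria, and locking is omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, Optional


class StaleKnowledgeError(Exception):
    """Hypothetical stand-in for a 409 'stale knowledge' response."""


@dataclass
class Entity:
    id: str
    name: str
    deleted: bool = False


# fetch(last_knowledge) -> (entities, server_knowledge); None requests a full load.
Fetch = Callable[[Optional[int]], tuple[list[Entity], int]]


@dataclass
class EntityCache:
    fetch: Fetch
    sync_interval: timedelta = timedelta(minutes=5)  # assumed from the success criteria
    entities: dict[str, Entity] = field(default_factory=dict)
    knowledge: Optional[int] = None
    last_sync: Optional[datetime] = None

    def needs_sync(self, now: datetime) -> bool:
        # Stale if never synced or if the interval has elapsed.
        return self.last_sync is None or now - self.last_sync >= self.sync_interval

    def sync(self, now: datetime) -> None:
        try:
            changes, knowledge = self.fetch(self.knowledge)
        except StaleKnowledgeError:
            # Stale knowledge: drop local state and reload everything.
            self.entities.clear()
            changes, knowledge = self.fetch(None)
        for entity in changes:
            if entity.deleted:
                self.entities.pop(entity.id, None)
            else:
                self.entities[entity.id] = entity  # Add or update
        self.knowledge = knowledge
        self.last_sync = now


def fake_fetch(last_knowledge: Optional[int]) -> tuple[list[Entity], int]:
    """Illustrative fake server: full load, one delta, then stale knowledge."""
    if last_knowledge is None:
        return [Entity("a", "Checking"), Entity("b", "Savings")], 100
    if last_knowledge == 100:
        return [Entity("b", "Savings", deleted=True), Entity("c", "Cash")], 101
    raise StaleKnowledgeError


cache = EntityCache(fetch=fake_fetch)
start = datetime(2024, 1, 1, 12, 0)
cache.sync(start)                          # full load
cache.sync(start + timedelta(minutes=5))   # delta: removes "b", adds "c"
```

Passing `now` explicitly keeps the staleness logic deterministic in tests; a real repository would more likely read the clock itself and guard both methods with its lock.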
