Hostaway MCP Server

hostaway-mcp
specs
005-project-brownfield-hardening

research.md•20.1 kB

# Research & Technical Decisions: MCP Server Context Window Protection **Feature**: 005-project-brownfield-hardening **Date**: 2025-10-15 **Status**: Complete ## Overview This document captures research findings and technical decisions for implementing context window protection in the Hostaway MCP server. Each decision balances performance, accuracy, security, and maintainability based on the project's constitution and constraints. --- ## R001: Token Estimation Strategies ### Question What's the optimal balance between estimation speed and accuracy for token counting? ### Options Evaluated | Option | Speed | Accuracy | Complexity | Decision | |--------|-------|----------|------------|----------| | **Character-based (4 chars/token)** | ⚡ <10ms | ~±20% | Low | ✅ **SELECTED** | | Byte-based (3 bytes/token) | ⚡ <5ms | ~±25% | Low | Rejected | | tiktoken library | 🐌 50-200ms | ~±5% | Medium | Rejected | | ML model prediction | 🐌 100-500ms | ~±10% | High | Rejected | ### Decision **Selected**: Character-based estimation (1 token ≈ 4 characters) with 20% safety margin **Rationale**: 1. **Performance**: <20ms for 100KB responses (meets FR-013 requirement) 2. **Accuracy**: Within 20% for 90% of responses is acceptable for context protection 3. **Simplicity**: No external dependencies, easy to implement and test 4. **Safety Margin**: Built-in 20% buffer prevents edge case overflows 5. **Monitoring**: 1% sampling validates actual vs estimated for ongoing tuning **Implementation**: ```python def estimate_tokens(text: str) -> int: """Estimate Claude token count from text. Uses 4 characters per token heuristic with 20% safety margin. Accurate within ±20% for 90% of responses. """ char_count = len(text) base_estimate = char_count // 4 safety_margin = int(base_estimate * 0.20) return base_estimate + safety_margin ``` **Alternatives Considered**: - **tiktoken library**: Too slow (50-200ms), would violate FR-013 (<20ms requirement). While more accurate (~±5%), the latency cost is unacceptable for real-time request processing. - **Byte-based**: Marginally faster but less accurate (~±25% vs ±20%). Character counting is fast enough in modern Python. - **ML model**: Highest accuracy but 100-500ms latency. Massive complexity overhead for minimal benefit in this use case. **Validation Strategy**: - Sample 1% of responses and compare estimated vs actual (using tiktoken offline) - Log discrepancies >30% for analysis - Retrain estimation formula if accuracy drops below 85% --- ## R002: Cursor Encoding Format ### Question How should pagination cursors be encoded securely and efficiently? ### Options Evaluated | Option | Security | Size | Performance | Decision | |--------|----------|------|-------------|----------| | **Base64 + HMAC-SHA256** | High | ~100 bytes | ⚡ <1ms | ✅ **SELECTED** | | JWT (signed) | High | ~200 bytes | 🐌 2-5ms | Rejected | | Encrypted position | Very High | ~150 bytes | 🐌 3-8ms | Rejected | | Hash-based (no signature) | Low | ~50 bytes | ⚡ <1ms | Rejected | ### Decision **Selected**: Base64-encoded JSON with HMAC-SHA256 signature **Rationale**: 1. **Security**: HMAC prevents tampering while remaining stateless 2. **Performance**: <1ms encoding/decoding (well within 50ms pagination overhead budget) 3. **Size**: ~100 bytes fits comfortably in URL query parameters 4. **Debuggability**: Base64 is human-decodable for troubleshooting (signature prevents abuse) 5. **Stateless**: No server-side signature key rotation complexity **Implementation**: ```python import base64 import hmac import json from hashlib import sha256 def encode_cursor(offset: int, timestamp: float, secret: str) -> str: """Encode pagination cursor with HMAC signature.""" payload = {"offset": offset, "ts": timestamp} payload_json = json.dumps(payload) signature = hmac.new( secret.encode(), payload_json.encode(), sha256 ).hexdigest() cursor_data = {"payload": payload, "sig": signature} cursor_json = json.dumps(cursor_data) return base64.urlsafe_b64encode(cursor_json.encode()).decode() def decode_cursor(cursor: str, secret: str) -> dict: """Decode and verify cursor. Raises ValueError if invalid.""" try: cursor_json = base64.urlsafe_b64decode(cursor.encode()).decode() cursor_data = json.loads(cursor_json) payload = cursor_data["payload"] provided_sig = cursor_data["sig"] # Verify signature expected_sig = hmac.new( secret.encode(), json.dumps(payload).encode(), sha256 ).hexdigest() if not hmac.compare_digest(expected_sig, provided_sig): raise ValueError("Invalid cursor signature") return payload except Exception as e: raise ValueError(f"Malformed cursor: {e}") ``` **Cursor Payload**: ```json { "offset": 100, // Current position in result set "ts": 1697452800.0, // Cursor creation timestamp "order": "created_desc" // Sort order for consistency } ``` **Alternatives Considered**: - **JWT**: More overhead (~200 bytes), slower (2-5ms), and overkill for pagination cursors. JWT is designed for authentication tokens, not ephemeral pagination state. - **Encrypted position**: Highest security but unnecessary complexity. Pagination cursors don't contain sensitive data - they're just positions. Signature prevents tampering which is sufficient. - **Hash-based (no signature)**: Smaller (~50 bytes) but vulnerable to tampering. Client could craft cursors to access arbitrary pages, bypassing business logic. **Security Properties**: - Tamper-proof: HMAC signature prevents modification - Opaque: Base64 encoding hides internal structure (but not encrypted - intentional for debugging) - Expirable: Timestamp enables 10-minute TTL enforcement - Stateless: No server-side storage required for verification --- ## R003: Summarization Techniques ### Question What summarization approaches work best for structured API responses? ### Options Evaluated | Option | Quality | Speed | Flexibility | Decision | |--------|---------|-------|-------------|----------| | **Field Projection** | Medium | ⚡ <5ms | High | ✅ **SELECTED** (Primary) | | **Extractive (Top-K)** | Medium | ⚡ <10ms | Medium | ✅ **SELECTED** (Secondary) | | Abstractive (LLM) | High | 🐌 500-2000ms | Low | Rejected | | Hybrid (Projection + LLM) | High | 🐌 100-500ms | Medium | Rejected | ### Decision **Selected**: Hybrid approach with Field Projection (primary) + Extractive summarization (secondary) **Rationale**: 1. **Structured Data**: Field projection perfect for JSON responses (most MCP tool outputs) 2. **Unstructured Text**: Extractive summarization for long descriptions/notes 3. **Performance**: Both techniques <20ms (meets token estimation budget) 4. **Predictability**: Deterministic output (no LLM variability) 5. **Maintainability**: Simple rule-based logic, no external model dependencies **Implementation Strategy**: ### Field Projection (for structured JSON) **Approach**: Define projection maps per endpoint specifying essential fields ```python # Per-endpoint projection maps LISTING_SUMMARY_FIELDS = [ "id", "name", "status", "bedrooms", "price.basePrice", "location.city", "location.country" ] BOOKING_SUMMARY_FIELDS = [ "id", "status", "guestName", "checkInDate", "checkOutDate", "totalPrice", "propertyId" ] def project_fields(obj: dict, field_paths: list[str]) -> dict: """Extract specified fields from nested JSON.""" result = {} for path in field_paths: parts = path.split('.') value = obj for part in parts: value = value.get(part, {}) # Rebuild nested structure if '.' in path: # e.g., "price.basePrice" -> {"price": {"basePrice": value}} *parents, leaf = parts nested = {leaf: value} for parent in reversed(parents): nested = {parent: nested} result.update(nested) else: result[path] = value return result ``` ### Extractive Summarization (for long text fields) **Approach**: Sentence extraction based on position and keyword relevance ```python def summarize_text(text: str, max_length: int = 200) -> str: """Extract key sentences from long text. Uses simple heuristics: - First sentence (often contains core info) - Sentences with domain keywords (if space permits) - Truncate with "... [XX more words]" indicator """ if len(text) <= max_length: return text sentences = text.split('. ') summary = sentences[0] # Always include first sentence # Add keyword-rich sentences if budget allows keywords = ['guest', 'booking', 'price', 'check-in', 'amenities'] for sentence in sentences[1:]: if any(kw in sentence.lower() for kw in keywords): if len(summary) + len(sentence) + 2 <= max_length: summary += '. ' + sentence if len(summary) < len(text): remaining_words = len(text.split()) - len(summary.split()) summary += f" ... [{remaining_words} more words]" return summary ``` **Alternatives Considered**: - **Abstractive (LLM)**: Highest quality but 500-2000ms latency. Would violate token estimation budget (<20ms). Also introduces external dependency (LLM API) and non-deterministic output. - **Hybrid (Projection + LLM)**: Better quality but still 100-500ms. Complexity not justified for preview mode - users can fetch full details if needed. **Projection Map Strategy**: - Per-endpoint maps defined in `src/mcp/schemas/projection_maps.py` - Essential fields: IDs, names, status, primary metrics (price, dates, counts) - Omitted fields: Verbose descriptions, full addresses, detailed breakdowns, metadata - Trade-off: ~70% size reduction while retaining 95% usability for initial triage --- ## R004: Configuration Hot-Reload Mechanism ### Question How to reload configuration without dropping in-flight requests? ### Options Evaluated | Option | Safety | Complexity | Platform Support | Decision | |--------|--------|------------|------------------|----------| | **File Watcher (watchdog)** | High | Low | Cross-platform | ✅ **SELECTED** | | Signal Handler (SIGHUP) | High | Low | Unix only | Rejected | | External Config Service | Very High | High | All | Rejected | | No hot-reload (restart required) | Medium | Very Low | All | Rejected | ### Decision **Selected**: File watcher with atomic config swap (watchdog library) **Rationale**: 1. **Safety**: Atomic read-validate-swap ensures no partial config states 2. **Simplicity**: Single-file watch with minimal code (<100 lines) 3. **Cross-platform**: Works on Linux, macOS, Windows (unlike SIGHUP) 4. **No Service Disruption**: In-flight requests use old config, new requests use new config 5. **Fast Reload**: <100ms from file write to config active (meets FR-026) **Implementation**: ```python from watchdog.observers import Observer from watchdog.events import FileSystemEventHandler import threading from pathlib import Path class ConfigReloader(FileSystemEventHandler): def __init__(self, config_path: Path, validator: callable): self.config_path = config_path self.validator = validator self.current_config = None self.lock = threading.RLock() def on_modified(self, event): if event.src_path == str(self.config_path): self.reload() def reload(self): """Reload and validate config atomically.""" try: # Read new config new_config = self._read_config(self.config_path) # Validate before swapping self.validator(new_config) # Atomic swap with self.lock: old_config = self.current_config self.current_config = new_config logger.info( "Configuration reloaded successfully", extra={"old": old_config, "new": new_config} ) except Exception as e: logger.error( "Configuration reload failed, keeping previous config", extra={"error": str(e)} ) def get_config(self): """Thread-safe config access.""" with self.lock: return self.current_config # Usage observer = Observer() reloader = ConfigReloader( config_path=Path("/opt/hostaway-mcp/config.yaml"), validator=lambda c: HostawayConfig.model_validate(c) ) observer.schedule(reloader, path="/opt/hostaway-mcp", recursive=False) observer.start() ``` **Alternatives Considered**: - **SIGHUP**: Unix-only, not portable to Windows environments. Also requires process signal handling complexity. - **External Config Service** (e.g., Consul, etcd): Massive complexity overhead for a single-server application. Would require distributed systems expertise and operational overhead. - **No hot-reload**: Forces full server restart for config changes. Unacceptable for production (causes downtime, drops connections). **Validation Strategy**: - Pydantic schema validation before applying new config - If validation fails, log error and retain previous config (fail-safe) - Config changes logged with full before/after values for audit trail --- ## R005: Cursor Storage Backend ### Question Where to store pagination state for 10-minute TTL? ### Options Evaluated | Option | Performance | Complexity | Dependencies | Decision | |--------|-------------|------------|--------------|----------| | **In-Memory Dict + TTL** | ⚡ <1ms | Low | None | ✅ **SELECTED** (Initial) | | Redis (external) | ⚡ 2-5ms | Medium | Redis server | Future | | Database Table | 🐌 10-50ms | Medium | PostgreSQL | Rejected | | File Cache | 🐌 5-20ms | Low | None | Rejected | ### Decision **Selected**: In-memory dictionary with TTL cleanup (initial), Redis migration path (future) **Rationale**: 1. **Performance**: <1ms lookup (well within 50ms pagination overhead budget) 2. **Simplicity**: No external dependencies for v1 rollout 3. **Sufficient for Single Server**: Current deployment is single-instance 4. **Migration Path**: Can swap to Redis when scaling to multi-instance 5. **Meets Requirements**: 10-minute TTL, cursor invalidation, acceptable for brownfield enhancement **Implementation**: ```python import time from dataclasses import dataclass from threading import RLock from typing import Optional @dataclass class CursorEntry: payload: dict expiry: float # Unix timestamp class InMemoryCursorStorage: """Thread-safe in-memory cursor storage with TTL.""" def __init__(self, ttl_seconds: int = 600): # 10 minutes self.ttl_seconds = ttl_seconds self._storage: dict[str, CursorEntry] = {} self._lock = RLock() def store(self, cursor_id: str, payload: dict) -> None: """Store cursor with TTL.""" expiry = time.time() + self.ttl_seconds with self._lock: self._storage[cursor_id] = CursorEntry(payload, expiry) def retrieve(self, cursor_id: str) -> Optional[dict]: """Retrieve cursor if not expired.""" with self._lock: entry = self._storage.get(cursor_id) if not entry: return None if time.time() > entry.expiry: # Expired, clean up del self._storage[cursor_id] return None return entry.payload def cleanup_expired(self) -> int: """Remove expired cursors. Returns count removed.""" now = time.time() with self._lock: expired = [ cid for cid, entry in self._storage.items() if now > entry.expiry ] for cid in expired: del self._storage[cid] return len(expired) # Background cleanup task (runs every minute) import asyncio async def cleanup_expired_cursors(storage: InMemoryCursorStorage): while True: await asyncio.sleep(60) removed = storage.cleanup_expired() if removed > 0: logger.debug(f"Cleaned up {removed} expired cursors") ``` **Alternatives Considered**: - **Redis**: Better for multi-instance deployments but adds external dependency. Not needed for current single-server setup. Migration path documented for future scaling. - **Database Table**: Too slow (10-50ms per lookup) and unnecessary persistence. Pagination cursors are ephemeral by design. - **File Cache**: Slower than in-memory and adds filesystem I/O. No benefits over in-memory for ephemeral data. **Redis Migration Strategy** (for future multi-instance scaling): ```python # Future: Redis-backed storage (drop-in replacement) import aioredis class RedisCursorStorage: def __init__(self, redis_url: str, ttl_seconds: int = 600): self.redis = aioredis.from_url(redis_url) self.ttl_seconds = ttl_seconds async def store(self, cursor_id: str, payload: dict) -> None: await self.redis.setex( f"cursor:{cursor_id}", self.ttl_seconds, json.dumps(payload) ) async def retrieve(self, cursor_id: str) -> Optional[dict]: data = await self.redis.get(f"cursor:{cursor_id}") return json.loads(data) if data else None ``` **Decision Points for Redis Migration**: 1. Multi-instance deployment required 2. Cursor sharing across instances needed 3. Higher availability requirements (Redis cluster) 4. Current performance acceptable (no premature optimization) --- ## Additional Decisions ### Middleware Ordering **Decision**: Token-aware middleware → Pagination middleware → Response **Rationale**: - Token estimation must see full response before pagination is applied - Pagination envelope is added last (wraps final response) - Backwards compatible: middleware chain extensible ### Feature Flag Granularity **Decision**: Per-endpoint feature flags with global defaults **Implementation**: ```python class ContextProtectionConfig(BaseSettings): # Global defaults pagination_enabled: bool = True summarization_enabled: bool = True # Per-endpoint overrides endpoint_overrides: dict[str, dict] = { "/api/listings": {"pagination_enabled": True, "page_size": 50}, "/api/bookings": {"pagination_enabled": True, "page_size": 100}, "/api/analytics": {"summarization_enabled": False} # Always full } ``` **Rationale**: Allows gradual rollout and endpoint-specific tuning while maintaining sane defaults. --- ## Dependencies Added ### Python Libraries ```toml [project.dependencies] # Existing dependencies preserved # New additions for context protection: watchdog = ">=3.0.0" # File watcher for config hot-reload ``` **Note**: No additional dependencies required. Character-based token estimation, Base64 encoding, and in-memory storage all use Python standard library. **Future Dependencies** (if needed): - `redis` or `aioredis` - if migrating to Redis cursor storage - `tiktoken` - if validation sampling needs actual token counts - `prometheus-client` - if metrics backend needs Prometheus format --- ## Research Validation Checklist - [x] **R001**: Token estimation strategy selected (character-based + 20% margin) - [x] **R002**: Cursor encoding format selected (Base64 + HMAC-SHA256) - [x] **R003**: Summarization techniques selected (field projection + extractive) - [x] **R004**: Hot-reload mechanism selected (file watcher with atomic swap) - [x] **R005**: Cursor storage selected (in-memory with Redis migration path) - [x] All decisions align with constitution principles (type safety, async performance, security) - [x] All decisions meet performance targets (<50ms pagination, <20ms estimation) - [x] All decisions maintain backwards compatibility (additive changes only) **Status**: Research complete. Ready for Phase 1 (data model and contract design).

Latest Blog Posts

What Is Context Bloat in MCP?
By Om-Shree-0709 on December 16, 2025.
mcp
Context Bloat
MCP Moves to the Linux Foundation: Neutral Stewardship for Agentic Infrastructure
By Om-Shree-0709 on December 15, 2025.
mcp
anthropic
Linux Foundation
Code Execution with MCP: Architecting Agentic Efficiency
By Om-Shree-0709 on December 14, 2025.
mcp
Token bloat

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/darrentmorgan/hostaway-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server