# WebSocket Daemon + stdio Shim (MCP)

Date: 2025-09-09

Owner: EX-AI-MCP-Server

## Goals
- Support multiple concurrent applications connecting to MCP without stdio contention
- Keep VS Code Augment compatibility (stdio from the client’s perspective)
- Provide a shared, long-lived daemon to reduce RAM and keep caches warm
- Maintain strong local isolation and simple auth

## Scope (skeleton now, iterative hardening later)
- Loopback WebSocket daemon (ws://127.0.0.1:<port>) that multiplexes sessions
- Stdio shim that bridges JSON-RPC over stdio <-> WebSocket for VS Code
- Session manager with per-session quotas, logging, and graceful shutdown
- Health endpoint and minimal metrics (JSONL) in ./logs

## Transport & Protocol
- WebSocket text frames carrying JSON-RPC 2.0 messages (newline-safe JSON)
- Each client attaches an initial handshake: {type:"hello", session_id, auth_token?}
- The daemon tags/isolates per session_id; forwards tool calls to server core
- Backpressure: per-session queue limits (e.g., 32 inflight)

## Security
- Loopback only (127.0.0.1). Optional bearer token via EXAI_WS_TOKEN env
- Rotate token by restarting daemon; store token in shim env only
- Later: Windows Credential Manager integration if needed

## Process Model
- Daemon: one process. Single-instance guard via lock file at logs/ws_daemon.pid
- Client shim: one per MCP client/window; no global lock

## Dependencies (to confirm via EXAI)
- websockets (server + client) OR aiohttp (server) + websocket-client (client)
- Python 3.11+

## Files (proposed)
- src/daemon/ws_server.py # WebSocket daemon (asyncio)
- src/daemon/session_manager.py # Tracks sessions, quotas, routing
- src/daemon/router.py # Minor routing/dispatch integration
- scripts/stdio_ws_shim.py # Stdio<->WS bridge for VS Code MCP
- scripts/run_ws_daemon.py # Run daemon locally (dev)
- scripts/run_ws_shim.py # Launch shim (used by MCP config)
- docs/architecture/ws_daemon/WS_Daemon_Skeleton.md # this doc

## Config
- EXAI_WS_PORT=8765 (default)
- EXAI_WS_TOKEN=<random>
- EXAI_WS_BACKPRESSURE=32
- LOG_LEVEL=INFO

## MVP Test Plan
1) Start daemon: `python -X utf8 scripts/run_ws_daemon.py`
2) Configure MCP client to run shim as the server command (stdio)
3) Open 2 VS Code windows; both connect through shim to single daemon
4) Verify basic tools work (listmodels, version, chat ping)
5) Check logs: one daemon process; two sessions; no contention

## Open Questions for EXAI
- Prefer websockets vs aiohttp for production stability on Windows? (memory, perf)
- Any pitfalls for JSON-RPC over WS (framing, long responses)?
- Minimal, safe auth approach for loopback? Token-only sufficient?
- Recommended backpressure defaults for Tier-2 Kimi + GLM rate limits?
- Health/metrics: what minimal counters/latency histograms are most useful?

## Risks
- New dependency; must manage lifecycle
- Session leaks if shims crash (add idle timeout)
- Rate limit spikes if multiple windows send heavy requests simultaneously

## Next Steps (once EXAI confirms)
- Implement daemon and shim skeletons (no heavy features)
- Add health and minimal metrics
- Integrate session_id tagging and quotas
- Validate with two VS Code windows concurrently
