You are building a complete Model Context Protocol (MCP) server that acts as a bridge between web LLM clients and my local resources (local LLMs + local “code nodes”). Deliver a production-ready Node.js/TypeScript project that I can run on my machine and expose to a cloud LLM via a single MCP endpoint.
# Goals
- Provide a single MCP server exposing tools for:
1) Local LLM inference via Ollama (http://localhost:11434) and optionally LM Studio (http://localhost:1234/v1).
2) Local “code nodes” execution: either (a) HTTP POST to a local service, or (b) spawn a shell command safely with a timeout and whitelist.
3) Optional HTTP fetch passthrough for cloud services I host (“Cloud Code”).
- The server must run locally, but be reachable from a browser-based LLM via WebSocket. Include a simple HTTP+WebSocket transport so the web LLM can connect to ws://MYHOST:PORT/mcp.
# Constraints & assumptions
- Use TypeScript + Node 18+.
- Use an MCP server SDK if available (e.g., the "@modelcontextprotocol/sdk" npm package); if the exact package name differs, resolve and pin the correct one. If no stable SDK fits, implement JSON-RPC 2.0 with the MCP method shapes (tools/list, tools/call, resources/list, etc.).
- Do not rely on experimental, undocumented APIs without a fallback.
- Provide .env-driven config.
# Project layout
- package.json (with scripts: dev, build, start)
- tsconfig.json
- src/server.ts (entry point; starts HTTP + WebSocket; registers tools)
- src/mcp.ts (MCP plumbing: JSON-RPC handler, tools registry)
- src/tools/ollama.ts (POST to /api/generate or /api/chat per config)
- src/tools/lmstudio.ts (OpenAI-compatible endpoint; requires API base in .env)
- src/tools/codeNode.ts (two modes: HTTP POST to local service; or spawn a whitelisted command)
- src/tools/httpFetch.ts (safe HTTP fetch passthrough with allowlist)
- src/util/logger.ts
- src/util/validators.ts (zod schemas for inputs)
- .env.example
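
A minimal sketch of how src/mcp.ts could tie this layout together; `ToolDefinition`, `registerTool`, and `handleRequest` are illustrative names (not SDK APIs), and error codes follow plain JSON-RPC 2.0 conventions:

```ts
// src/mcp.ts — minimal JSON-RPC 2.0 plumbing and tool registry (illustrative names).
export interface JsonRpcRequest {
  jsonrpc: "2.0";
  id?: number | string | null;
  method: string;
  params?: unknown;
}

export interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
}

export interface ToolDefinition<T = unknown> {
  name: string;
  description: string;
  inputSchema: object;            // JSON Schema advertised via tools/list
  parse: (args: unknown) => T;    // runtime validation (e.g., zod schema.parse)
  run: (args: T) => Promise<unknown>;
}

const registry = new Map<string, ToolDefinition<any>>();

export function registerTool(tool: ToolDefinition<any>): void {
  registry.set(tool.name, tool);
}

export async function handleRequest(req: JsonRpcRequest): Promise<JsonRpcResponse> {
  const reply = (result?: unknown, error?: JsonRpcResponse["error"]): JsonRpcResponse =>
    ({ jsonrpc: "2.0", id: req.id ?? null, ...(error ? { error } : { result }) });

  try {
    switch (req.method) {
      case "ping":
        return reply({});
      case "tools/list":
        return reply({
          tools: [...registry.values()].map(({ name, description, inputSchema }) =>
            ({ name, description, inputSchema })),
        });
      case "tools/call": {
        const { name, arguments: args } =
          (req.params ?? {}) as { name: string; arguments?: unknown };
        const tool = registry.get(name);
        if (!tool) return reply(undefined, { code: -32602, message: `Unknown tool: ${name}` });
        return reply(await tool.run(tool.parse(args)));
      }
      default:
        return reply(undefined, { code: -32601, message: `Method not found: ${req.method}` });
    }
  } catch (err) {
    return reply(undefined, { code: -32000, message: (err as Error).message });
  }
}
```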
# Tools to expose (MCP "tools")
1) tool: "local_llm.generate"
input schema:
- provider: "ollama" | "lmstudio"
- model: string
- prompt: string
- temperature?: number
- stream?: boolean
behavior:
- If provider=ollama: call OLLAMA_BASE (default http://localhost:11434) on the correct route; support both /api/generate and /api/chat, selected by the OLLAMA_USE_CHAT boolean in .env.
- If provider=lmstudio: call LM Studio's OpenAI-compatible /v1/chat/completions or /v1/completions endpoint (pick one; document the choice in the README). Use the optional LMSTUDIO_API_KEY from .env and handle the no-key case gracefully (most local setups don't need one).
- Return { text, tokens, model, latencyMs }.
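
A sketch of the Ollama branch of this tool, assuming the non-streaming /api/generate and /api/chat response shapes (`response`, `message.content`, `eval_count`); exact fields should be verified against the installed Ollama version, and the LM Studio branch would follow the same pattern against its OpenAI-compatible endpoint:

```ts
// src/tools/ollama.ts — sketch of the provider=ollama branch (non-streaming).
const OLLAMA_BASE = process.env.OLLAMA_BASE ?? "http://127.0.0.1:11434";
const USE_CHAT = process.env.OLLAMA_USE_CHAT === "true";

export interface GenerateResult {
  text: string;
  tokens: number | undefined;
  model: string;
  latencyMs: number;
}

export async function ollamaGenerate(
  model: string,
  prompt: string,
  temperature?: number,
): Promise<GenerateResult> {
  const started = Date.now();
  const route = USE_CHAT ? "/api/chat" : "/api/generate";
  const body = USE_CHAT
    ? { model, messages: [{ role: "user", content: prompt }], stream: false, options: { temperature } }
    : { model, prompt, stream: false, options: { temperature } };

  const res = await fetch(`${OLLAMA_BASE}${route}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}: ${await res.text()}`);

  // /api/generate puts the text in `response`; /api/chat nests it under `message.content`.
  const data = (await res.json()) as any;
  return {
    text: USE_CHAT ? data.message?.content ?? "" : data.response ?? "",
    tokens: data.eval_count,          // completion token count reported by Ollama
    model: data.model ?? model,
    latencyMs: Date.now() - started,
  };
}
```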
2) tool: "code_node.exec_http"
input schema:
- url: string (must match ALLOWLIST_URLS regex or be in ALLOWLIST_HOSTS)
- method?: "POST"|"GET" (default POST)
- headers?: Record<string,string>
- body?: any
- timeoutMs?: number (default 30000)
behavior:
- Perform fetch with timeout, return { status, headers, body }.
- Reject if URL is not allowed by allowlist.
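
A sketch of the HTTP mode, assuming Node 18's global fetch and AbortSignal.timeout; `execHttp` and `isAllowed` are illustrative names:

```ts
// src/tools/codeNode.ts (HTTP mode) — sketch of the allowlist check plus a timed fetch.
const ALLOWLIST_HOSTS = (process.env.ALLOWLIST_HOSTS ?? "").split(",").filter(Boolean);
const ALLOWLIST_URLS = process.env.ALLOWLIST_URLS ? new RegExp(process.env.ALLOWLIST_URLS) : null;

function isAllowed(raw: string): boolean {
  const url = new URL(raw); // throws on malformed input
  return ALLOWLIST_HOSTS.includes(url.hostname) || (ALLOWLIST_URLS?.test(raw) ?? false);
}

export async function execHttp(input: {
  url: string;
  method?: "POST" | "GET";
  headers?: Record<string, string>;
  body?: unknown;
  timeoutMs?: number;
}) {
  if (!isAllowed(input.url)) {
    throw Object.assign(new Error(`URL not in allowlist: ${input.url}`), { status: 403 });
  }
  const res = await fetch(input.url, {
    method: input.method ?? "POST",
    headers: input.headers,
    body: input.body === undefined ? undefined : JSON.stringify(input.body),
    // AbortSignal.timeout is built into Node 18+.
    signal: AbortSignal.timeout(input.timeoutMs ?? 30_000),
  });
  return {
    status: res.status,
    headers: Object.fromEntries(res.headers.entries()),
    // Best effort: return parsed JSON when the response parses, raw text otherwise.
    body: await res.text().then((t) => { try { return JSON.parse(t); } catch { return t; } }),
  };
}
```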
3) tool: "code_node.exec_local"
input schema:
- cmd: string (must be in the ALLOWED_BINARIES list)
- args?: string[]
- cwd?: string
- timeoutMs?: number (default 15000)
behavior:
- Spawn using child_process with a safe, explicit ALLOWED_BINARIES list from .env (e.g., python, node, bash, my-tool).
- Kill on timeout. Return { exitCode, stdout, stderr, durationMs }.
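
A sketch of the local-exec mode using child_process.spawn without a shell; `execLocal` is an illustrative name:

```ts
// src/tools/codeNode.ts (local mode) — sketch of whitelisted spawn with kill-on-timeout.
import { spawn } from "node:child_process";

const ALLOWED_BINARIES = (process.env.ALLOWED_BINARIES ?? "").split(",").filter(Boolean);

export function execLocal(input: {
  cmd: string;
  args?: string[];
  cwd?: string;
  timeoutMs?: number;
}): Promise<{ exitCode: number | null; stdout: string; stderr: string; durationMs: number }> {
  if (!ALLOWED_BINARIES.includes(input.cmd)) {
    return Promise.reject(
      Object.assign(new Error(`'${input.cmd}' is not in ALLOWED_BINARIES`), { status: 403 }),
    );
  }
  const started = Date.now();
  return new Promise((resolve, reject) => {
    // No shell: arguments are passed verbatim, which avoids injection via args.
    const child = spawn(input.cmd, input.args ?? [], { cwd: input.cwd, shell: false });
    let stdout = "";
    let stderr = "";
    const timer = setTimeout(() => child.kill("SIGKILL"), input.timeoutMs ?? 15_000);

    child.stdout.on("data", (chunk) => (stdout += chunk));
    child.stderr.on("data", (chunk) => (stderr += chunk));
    child.on("error", (err) => { clearTimeout(timer); reject(err); });
    child.on("close", (exitCode) => {
      clearTimeout(timer);
      resolve({ exitCode, stdout, stderr, durationMs: Date.now() - started });
    });
  });
}
```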
4) tool: "cloud_http.fetch" (optional passthrough to my own Cloud Code)
input schema like exec_http but with a different allowlist.
# Transport
- Expose HTTP GET /healthz that returns 200.
- Expose WebSocket at /mcp that implements MCP JSON-RPC 2.0 framing. Include basic ping/pong keepalive.
- Log minimal connection info and tool invocations (scrub secrets).
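
A transport sketch using Node's http module plus the `ws` package, reusing `handleRequest` from the plumbing sketch above; origin checks, body-size limits, and rate limiting from the security section would hook in here:

```ts
// src/server.ts — sketch of the HTTP + WebSocket transport (assumes the `ws` package).
import { createServer } from "node:http";
import { WebSocketServer, type WebSocket } from "ws";
import { handleRequest, type JsonRpcRequest } from "./mcp.js"; // names from the sketch above

const PORT = Number(process.env.PORT ?? 8765);
const HOST = process.env.HOST ?? "127.0.0.1";

const http = createServer((req, res) => {
  if (req.method === "GET" && req.url === "/healthz") {
    res.writeHead(200, { "content-type": "text/plain" }).end("ok");
  } else {
    res.writeHead(404).end();
  }
});

const wss = new WebSocketServer({ server: http, path: "/mcp" });

wss.on("connection", (socket: WebSocket, req) => {
  console.log(`connection from ${req.socket.remoteAddress}`); // minimal, no payload logging

  // Keepalive: ping every 30s, drop peers that never answer with pong.
  let alive = true;
  socket.on("pong", () => (alive = true));
  const keepalive = setInterval(() => {
    if (!alive) return socket.terminate();
    alive = false;
    socket.ping();
  }, 30_000);

  socket.on("message", async (raw) => {
    try {
      const request = JSON.parse(raw.toString()) as JsonRpcRequest;
      socket.send(JSON.stringify(await handleRequest(request)));
    } catch {
      socket.send(JSON.stringify({
        jsonrpc: "2.0", id: null, error: { code: -32700, message: "Parse error" },
      }));
    }
  });
  socket.on("close", () => clearInterval(keepalive));
});

http.listen(PORT, HOST, () => console.log(`ws://${HOST}:${PORT}/mcp ready`));
```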
# Security & safety
- Never allow arbitrary file reads/writes.
- Enforce allowlists:
- ALLOWLIST_HOSTS and/or ALLOWLIST_URLS for HTTP tools.
- ALLOWED_BINARIES for local exec.
- Enforce payload size limits (e.g., 5 MB).
- CORS: allow origins via ORIGIN_ALLOWLIST (comma-separated).
- Add rate limiting (simple token bucket per connection).
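
A sketch of the per-connection token bucket; the module name and the capacity/refill defaults are illustrative:

```ts
// src/util/rateLimit.ts — sketch of a per-connection token bucket (illustrative module name).
export class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(
    private readonly capacity = 20,      // burst size
    private readonly refillPerSec = 5,   // sustained calls per second
  ) {
    this.tokens = capacity;
  }

  /** Returns true if the call is allowed, false if the caller should get a rate-limit error. */
  take(): boolean {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.lastRefill) / 1000) * this.refillPerSec,
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Usage: create one bucket per WebSocket connection and check bucket.take()
// before dispatching each tools/call; reject with a JSON-RPC error otherwise.
```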
# Configuration (.env.example)
PORT=8765
HOST=127.0.0.1
ORIGIN_ALLOWLIST=http://localhost:3000,https://my-web-llm.example
ALLOWLIST_HOSTS=localhost,127.0.0.1
ALLOWLIST_URLS=
ALLOWED_BINARIES=python,node
OLLAMA_BASE=http://127.0.0.1:11434
OLLAMA_USE_CHAT=false
LMSTUDIO_BASE=http://127.0.0.1:1234/v1
LMSTUDIO_API_KEY=
MAX_BODY_BYTES=5242880
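
A sketch of parsing this file with zod (assuming the `dotenv` package for loading .env); defaults mirror the values above, and the boolean is parsed explicitly because naive coercion would treat the string "false" as true:

```ts
// src/util/validators.ts (config part) — sketch of .env parsing with zod.
import "dotenv/config";           // loads .env into process.env
import { z } from "zod";

const csv = z.string().default("")
  .transform((s) => s.split(",").map((v) => v.trim()).filter(Boolean));

const EnvSchema = z.object({
  PORT: z.coerce.number().default(8765),
  HOST: z.string().default("127.0.0.1"),
  ORIGIN_ALLOWLIST: csv,
  ALLOWLIST_HOSTS: csv,
  ALLOWLIST_URLS: z.string().default(""),
  ALLOWED_BINARIES: csv,
  OLLAMA_BASE: z.string().url().default("http://127.0.0.1:11434"),
  OLLAMA_USE_CHAT: z.string().default("false").transform((v) => v === "true"),
  LMSTUDIO_BASE: z.string().url().default("http://127.0.0.1:1234/v1"),
  LMSTUDIO_API_KEY: z.string().optional(),
  MAX_BODY_BYTES: z.coerce.number().default(5 * 1024 * 1024),
});

export const config = EnvSchema.parse(process.env);
```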
# Implementation notes
- Prefer fastify or express + ws for transport. Use zod for input validation. For HTTP, use the fetch built into Node 18+ (undici) or node-fetch.
- For MCP: implement methods:
- "tools/list": returns the four tools above with JSON Schemas.
- "tools/call": executes a tool by name with validated args.
- "session/ready" (if needed by SDK) and simple "ping".
- Include a minimal MCP README that documents how a web LLM client should connect (ws://HOST:PORT/mcp) and the exact payload shapes for tools/list and tools/call.
- Provide typed response shapes and good errors.
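
For the README, the documented frames could look like the following; standard JSON-RPC 2.0 framing is assumed, with tools/call params shaped as { name, arguments }:

```ts
// Example frames a web LLM client would send over ws://HOST:PORT/mcp.
const listRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
  params: {},
};
// Expected result shape: { tools: [{ name, description, inputSchema }, ...] }

const callRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "local_llm.generate",
    arguments: { provider: "ollama", model: "llama3", prompt: "hello", temperature: 0.2 },
  },
};
// Expected result shape: { text, tokens, model, latencyMs }
```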
# Developer UX
- README.md with:
- “Quick start” (npm i, npm run dev).
- Example WebSocket client snippet (browser) that calls tools/list then tools/call for local_llm.generate with Ollama and LM Studio.
- A curl example for /healthz and a Node script demonstrating a tools/call over WebSocket.
- Troubleshooting: CORS, allowlists, timeouts, common 4xx/5xx.
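
A browser-side sketch of that client snippet, using the frames shown earlier; the host/port and the minimal promise-based `rpc` helper are illustrative:

```ts
// Browser sketch: connect, list tools, then call local_llm.generate.
const ws = new WebSocket("ws://127.0.0.1:8765/mcp");
let nextId = 1;
const pending = new Map<number, (result: unknown) => void>();

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  pending.get(msg.id)?.(msg.error ?? msg.result); // resolve with the result or the error object
  pending.delete(msg.id);
};

function rpc(method: string, params: unknown): Promise<unknown> {
  const id = nextId++;
  return new Promise((resolve) => {
    pending.set(id, resolve);
    ws.send(JSON.stringify({ jsonrpc: "2.0", id, method, params }));
  });
}

ws.onopen = async () => {
  console.log(await rpc("tools/list", {}));
  console.log(await rpc("tools/call", {
    name: "local_llm.generate",
    arguments: { provider: "ollama", model: "llama3", prompt: "hello" },
  }));
};
```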
# Acceptance tests (must pass)
- "npm run dev" starts on PORT from .env and prints ws://HOST:PORT/mcp ready.
- GET /healthz returns 200.
- Connecting via WebSocket and sending tools/list returns the four tools with schemas.
- Calling local_llm.generate with provider=ollama, model=llama3, prompt="hello" returns text.
- Calling code_node.exec_local with cmd=python and args=["-V"] returns exitCode 0 and stdout containing "Python".
- Blocking test: code_node.exec_local with cmd=rm should return 403 with an error explaining that rm is not in ALLOWED_BINARIES.
- Blocking test: code_node.exec_http to a disallowed host returns 403.
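
A sketch of a Node smoke-check covering the first few items, assuming the `ws` package as a client; the script name and env handling are illustrative:

```ts
// scripts/acceptance.ts — illustrative smoke check for the first acceptance items.
import WebSocket from "ws";

const HOST = process.env.HOST ?? "127.0.0.1";
const PORT = process.env.PORT ?? "8765";

async function main() {
  // GET /healthz returns 200.
  const health = await fetch(`http://${HOST}:${PORT}/healthz`);
  console.log("healthz:", health.status === 200 ? "ok" : `unexpected status ${health.status}`);

  // tools/list over WebSocket returns the four tools.
  const ws = new WebSocket(`ws://${HOST}:${PORT}/mcp`);
  await new Promise((resolve) => ws.once("open", resolve));
  ws.send(JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list", params: {} }));
  const raw = await new Promise<string>((resolve) =>
    ws.once("message", (data) => resolve(data.toString())),
  );
  const tools = JSON.parse(raw).result?.tools ?? [];
  console.log("tools/list:", tools.length === 4 ? "ok" : `expected 4 tools, got ${tools.length}`);
  ws.close();
}

main().catch((err) => { console.error(err); process.exit(1); });
```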
# Deliverables
- Complete repo with the files listed.
- Fully typed TS code.
- No TODOs left; runnable out of the box.