knitbrain
Optimizes OpenAI API requests through a proxy that compresses token usage, especially for past turns and bulk content, while keeping instructions verbatim and enabling retrieval of originals.
Knit Brain
Local-first MCP server that gives any AI coding agent per-project memory, workflow intelligence, and always-on, lossless token & context optimization — entirely on your machine, zero cloud.
Pure TypeScript. Two runtime dependencies. No Python, no native binaries, no network beyond npm install.
Why
Coding agents burn context on bulk they rarely re-read in full — large files, logs, JSON, stale tool output, old turns. Knit Brain shrinks that bulk to a navigable skeleton while keeping the exact original one lookup away:
your context window lasts dramatically longer,
nothing is ever lost — compression is reversible via a local content-addressed store (CCR),
your instructions and governance text are never touched (protected verbatim).
Measured on real mixed files: 106,268 → 15,316 tokens (85.6% saved), every byte recoverable. Savings are workload-dependent; redundant JSON/code compresses hardest, declaration-only files pass through untouched (output is never larger than input — enforced).
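The savings figure above follows directly from the two token counts. A minimal sketch of the arithmetic (the helper name `savedPercent` is illustrative, not part of Knit Brain):

```typescript
// Percentage of tokens saved, rounded to one decimal place.
// `savedPercent` is a hypothetical helper used only to check the figure above.
function savedPercent(inputTokens: number, outputTokens: number): number {
  if (inputTokens <= 0) throw new RangeError("inputTokens must be positive");
  const saved = (inputTokens - outputTokens) / inputTokens;
  return Math.round(saved * 1000) / 10;
}

console.log(savedPercent(106_268, 15_316)); // 85.6
```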
How it works
One brain, two doors, one lossless store:
MCP server (knitbrain): 21 tools covering memory (learnings, session handoff), a knowledge graph (imports/exports/dependents), workflow classification, project-specific agent generation, a shared team board, a context-window meter (warns and tells the agent to save a handoff before the window overflows), and explicit optimize/retrieve. Every data payload flows through one dispatch chokepoint, where it is compressed in a structure-preserving way (JSON keeps its schema, code keeps its signatures) and tagged with a ⟨ccr:hash⟩ handle.
Proxy (knitbrain-proxy): a loopback HTTP proxy in front of the LLM API (provider auto-detected per request: Anthropic /v1/messages, OpenAI /v1/chat/completions). It compresses the full request, with old turns compressed harder than recent ones and pasted bulk inside your message compressed while your directive stays verbatim, then streams the response back.
CCR store: content-addressed (SHA-256 = handle), integrity-checked on every read, atomic writes, tiered retention (hot → cold gzip archive → budgeted purge). The pristine original is always one retrieve away, which is what makes aggressive compression safe.
Self-tuning: a feedback loop watches which compressed payloads actually get retrieved and backs off per content kind. A wrong tuning only costs efficiency, never correctness.
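To make the content-addressed idea concrete, here is a minimal in-memory sketch, assuming the handle is the hex SHA-256 of the original bytes and that every read re-hashes the stored bytes. The `ContentStore` class and its method names are hypothetical and are not Knit Brain's actual API; the real store also persists to disk with atomic writes and tiered retention, which this sketch omits.

```typescript
import { createHash } from "node:crypto";

// Hex SHA-256 digest of a byte buffer.
function sha256(bytes: Buffer): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical in-memory stand-in for a content-addressed store.
class ContentStore {
  private readonly blobs = new Map<string, Buffer>();

  // Store the original and return its handle (the hash of its bytes).
  put(original: string): string {
    const bytes = Buffer.from(original, "utf8");
    const handle = sha256(bytes);
    this.blobs.set(handle, bytes);
    return handle;
  }

  // Return the original, re-hashing on read so corruption is detected.
  retrieve(handle: string): string {
    const bytes = this.blobs.get(handle);
    if (bytes === undefined) throw new Error(`unknown handle: ${handle}`);
    if (sha256(bytes) !== handle) throw new Error(`integrity check failed: ${handle}`);
    return bytes.toString("utf8");
  }
}

const store = new ContentStore();
const original = JSON.stringify({ rows: [{ id: 0 }, { id: 1 }, { id: 2 }] });
const handle = store.put(original);

console.log(handle.length); // 64 (hex characters in a SHA-256 digest)
console.log(store.retrieve(handle) === original); // true
```

Because the handle is derived from the content, storing the same payload twice yields the same handle, and a compressed view only needs to carry that handle for the original to stay recoverable.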
Quickstart
npm install -g knitbrain
# in your project:
knitbrain setup # detects your platform (Claude Code / Cursor / VS Code / Codex)
# and writes its NATIVE integration: .mcp.json, slash commands,
# rules files — non-clobbering
knitbrain dashboard # live local dashboard (127.0.0.1:8790): context meter,
# tokens saved, CCR tiers, self-tuning stats, team board
# optional — route LLM requests through the optimizer (API-key setups):
knitbrain-proxy # listens on 127.0.0.1:8788
export ANTHROPIC_BASE_URL=http://127.0.0.1:8788
# teams — shared optimized sessions (one URL + one token):
knitbrain hub # start the team hub (host runs this once)
knitbrain join <hub-url> <token> <name> # everyone else; postings mirror automatically
Guarantees (enforced by gated tests, not promises)
Lossless: every compressed payload recovers byte-for-byte from CCR; the round-trip test gates the build.
Never-expand: output tokens ≤ input tokens, always.
Governance verbatim: protocol/classification text is never skeletonized.
Local-first: proxy binds 127.0.0.1 only; nothing leaves your machine.
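One way a never-expand rule like the one above can be enforced is a guard that falls back to the input whenever compression does not help. The sketch below is illustrative only: `estimateTokens` is a crude whitespace-based stand-in for a real tokenizer, and `neverExpand` and `Compressor` are hypothetical names, not Knit Brain's internals.

```typescript
// Hypothetical token estimate: whitespace-separated words. Real tokenizers differ.
function estimateTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

type Compressor = (input: string) => string;

// Never-expand guard: return the input unchanged when the output is not smaller or equal.
function neverExpand(compress: Compressor): Compressor {
  return (input) => {
    const output = compress(input);
    return estimateTokens(output) <= estimateTokens(input) ? output : input;
  };
}

// A deliberately bad compressor that makes its input longer.
const padded: Compressor = (input) => `${input} extra padding words`;
const guarded = neverExpand(padded);

const sample = "declaration only file";
console.log(guarded(sample) === sample); // true: the guard passed the input through
```

A test in this style, run against every compressor, is the kind of check that can gate a build on the guarantee rather than relying on convention.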
Development
npm install
npm run verify # typecheck → lint → test → build → bench (all must pass)
npm run e2e # built-artifact E2E: stdio session + real-file compression
npm run audit:prod # cold-start proof: clone → install → pack → installed binaries → all 21 tools
Current proof status: 106 tests passing, and the production audit (audit:prod) passes: fresh clone, clean install, packed tarball installed into a new project, all 21 tools and both binaries verified working. One opt-in test (live LLM endpoint) requires your own API key: KNITBRAIN_LIVE_TEST=1 ANTHROPIC_API_KEY=… npm test.
License
MIT © Piyush Dua