# cachly — AI Cognitive Brain

The cachly-mcp-server provides persistent AI memory and managed Redis/Valkey caching for AI coding assistants, enabling long-term context retention across sessions while offering full cache management capabilities.

## 🧠 Session & Memory

- Start/end sessions with full briefings and summaries (`session_start`, `session_end`, `session_handoff`)
- Store and retrieve lessons from attempts (`learn_from_attempts`, `recall_best_solution`)
- Cache and recall project context, architecture, and file summaries (`remember_context`, `recall_context`, `list_remembered`, `forget_context`) — see the sketch after this list
- Natural-language semantic search across all brain data (`smart_recall`)
- Auto-classify session observations into lessons (`auto_learn_session`)
- AI brain health check with recommendations (`brain_doctor`)
- Associate git file changes with brain knowledge (`sync_file_changes`)
## ⚙️ Instance Management

- Create, list, inspect, and delete managed Valkey/Redis instances across pricing tiers
- Retrieve connection strings and check API/auth status
## 🗄️ Cache Operations

- Standard key-value operations with TTL (`cache_get`, `cache_set`, `cache_delete`)
- Bulk pipeline operations (`cache_mget`, `cache_mset`), key listing, TTL inspection, existence checks
- Real-time stats: memory, hit/miss rate, ops/sec (`cache_stats`)
- Distributed locking via Redlock-lite (`cache_lock_acquire`, `cache_lock_release`) — see the sketch after this list
- LLM token stream caching and replay (`cache_stream_set`, `cache_stream_get`)
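A hedged sketch of the lock pattern these tools enable. The argument names (`key`, `ttl_ms`, `token`) and the result shape are illustrative assumptions, not cachly's documented schema:

```typescript
// Illustrative distributed-lock pattern over the cachly lock tools.
// Argument names and result fields are assumptions for the sketch.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";

async function withLock(client: Client, key: string, work: () => Promise<void>) {
  const res = await client.callTool({
    name: "cache_lock_acquire",
    arguments: { key, ttl_ms: 30_000 }, // lock auto-expires if we crash
  });
  const token = (res as any).structuredContent?.token; // assumed fencing token

  try {
    await work(); // critical section — only one holder at a time
  } finally {
    await client.callTool({
      name: "cache_lock_release",
      arguments: { key, token }, // release only with the matching token
    });
  }
}
```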
## 🔍 Semantic Cache

- Find cached entries by meaning using pgvector HNSW with hybrid BM25+vector search (`semantic_search`) — see the sketch after this list
- Auto-classify prompts into namespaces (`detect_namespace`)
- Pre-warm semantic cache and index local source files for AI codebase navigation (`cache_warmup`, `index_project`)
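This is what enables the "reduce LLM API calls" use case: check the semantic cache before paying for a model call. A sketch under assumed argument shapes (`query`, `limit`, the score threshold, and the result format are all illustrative):

```typescript
// Sketch: consult the semantic cache before making an LLM call.
// Tool argument shapes and result fields are assumptions.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";

async function cachedAnswer(
  client: Client,
  prompt: string,
  llm: (p: string) => Promise<string> // your existing LLM call
): Promise<string> {
  // Look for a previously cached answer with similar meaning.
  const hit = await client.callTool({
    name: "semantic_search",
    arguments: { query: prompt, limit: 1 },
  });
  const best = (hit as any).structuredContent?.results?.[0];
  if (best && best.score > 0.9) return best.value; // close enough — reuse it

  // Miss: call the model, then cache the answer for a day.
  const answer = await llm(prompt);
  await client.callTool({
    name: "cache_set",
    arguments: { key: `llm:${Date.now()}`, value: answer, ttl: 86_400 },
  });
  return answer;
}
```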
## 👥 Team & Global Knowledge

- Share lessons across team members with attribution (`team_learn`, `team_recall`)
- Store cross-project universal lessons (`global_learn`, `global_recall`)
- Publish anonymized lessons to, and import from, the Cachly Public Brain community knowledge base

## 🚀 Setup & Automation

- One-command setup of a 3-layer AI memory system with auto-generated copilot instructions and MCP config (`setup_ai_memory`)
- Provides tools for managing cachly cache instances, enabling AI assistants to create, list, monitor, and delete instances, perform cache operations, and use semantic search.
- Integrates with Keycloak for JWT-based authentication to cachly services, allowing AI assistants to securely manage cache instances and operations.
- Enables caching and semantic search for OpenAI projects through cachly instances, reducing LLM API calls and improving response times.
- Provides Redis-compatible cache operations — get/set/delete, key inspection, distributed locks, and streaming cache for LLM tokens — through cachly instances.
# 🧠 cachly AI Brain — MCP Server

Persistent memory for Claude Code, Cursor, GitHub Copilot, Windsurf, Cline & Zed.
Your AI remembers every lesson, every fix, every architecture decision — forever.
## The Problem
Every morning, you open your AI coding assistant. It doesn't remember yesterday.
You explain your architecture. You explain the deployment process. You explain the bug you fixed last week.
The average developer wastes 45 minutes/day re-establishing context.
## One Command. Fully Automatic.

```bash
npx @cachly-dev/mcp-server@latest setup
```

Run it once. It handles everything:
- Signs you in — one click in your browser, no password, no credit card
- Detects your editors — Claude Code, Cursor, Windsurf, VS Code, Copilot, Cline & Zed
- Writes the MCP config for every detected editor automatically
- Creates `CLAUDE.md` with Brain rules so your AI acts autonomously
- Installs a git hook that learns from every commit automatically
Restart your editor. From now on your AI arrives pre-briefed — every session.
## What happens after setup — everything is automatic
You never type another command. The Brain runs entirely in the background:
| Trigger | What the Brain does automatically |
|---|---|
| First tool call | Session starts, project gets indexed in background |
| Before every task | AI recalls relevant past lessons |
| During debugging | AI traces root causes through causal memory |
| Before deploys | AI predicts failure risks from past patterns |
| After every fix | AI stores the lesson with commands and file paths |
| Every git commit | Hook extracts lessons from commit message |
| Editor closes | Session summary saved for next time |
## With vs. Without cachly

| Situation | Without cachly | With cachly |
|---|---|---|
| Session start | "What's your architecture?" | "Ready. 23 lessons, last session: deployed API." |
| Known bug hits again | Re-researches from scratch | "You fixed this March 12, here's the exact command" |
| After holiday / handoff | Context dead | Fully briefed in < 10 seconds |
| New team member | Weeks to onboard | Instant full context via `memory_crystalize` |
| Pre-deploy check | Hope nothing breaks | Brain predicts failures before they happen |
## What makes cachly different

| Feature | What it does |
|---|---|
| Causal root cause analysis | Root cause analysis through memory: problem → chain → solution. No other system does this. |
| Memory garbage collection | Weekly garbage collector — detects contradictions, merges duplicates, expires stale lessons |
| Failure prediction | Predicts failures before they happen based on past patterns |
| Team Brain | Shared lessons across your whole team with author attribution |
| Ambient Git | git hook auto-extracts lessons from every commit. Zero extra calls. |
| Memory Crystals | Distills all lessons into a compact snapshot for instant session briefing |
| 11 languages | BM25+ search in EN, DE, FR, ES, IT, PT, ZH, JA, KO, AR, HE — no config |
The `causal_trace` moment:

```
causal_trace(problem="auth breaks after restart")
→ Root: k8s:namespace-terminating
→ Via: keycloak:jwks-race
→ Fix: PollUntilContextTimeout 3min ← used this March 12, worked
```

30 minutes of git blame in one call.
The `autopilot` command:
Run autopilot once and it generates a fully self-managing CLAUDE.md or copilot-instructions.md — tailored to your actual Brain content. Every AI (Claude, Cursor, Copilot, Windsurf, Gemini) gets the full ruleset for your project. No copy-paste, no manual writing. As the Brain learns more, re-run autopilot to upgrade the file.
## cachly vs. alternatives

| | cachly | mem0 | MemGPT / Letta | Plain CLAUDE.md |
|---|---|---|---|---|
| Persistent memory | ✅ | ✅ | ✅ | Manual |
| MCP server (no code changes) | ✅ | ✅ | ❌ | ✅ |
| Causal root cause analysis | ✅ | ❌ | ❌ | ❌ |
| Fully automatic (no explicit calls) | ✅ | ❌ | ❌ | ❌ |
| Failure prediction | ✅ | ❌ | ❌ | ❌ |
| Team knowledge sharing | ✅ | Paid | ❌ | ❌ |
| Git-ambient learning | ✅ | ❌ | ❌ | ❌ |
| 11-language search | ✅ | ❌ | ❌ | ❌ |
| GDPR / EU servers | ✅ | ❌ | ❌ | ✅ |
| Free tier forever | ✅ | Limited | ❌ | ✅ |
## MCP Tools (89 total)
### 🧠 Session & Memory (most used)

| Tool | What it does |
|---|---|
| `session_start` | Full briefing: last session summary, open failures, recent lessons, brain health |
| `session_end` | Save what you built, auto-extract lessons from summary + git log |
| `learn_from_attempts` | Store structured lessons after any fix, deploy, or discovery |
| `recall_best_solution` | Best known solution for a topic — with success/failure history |
| `remember_context` | Cache architecture findings, decisions, file summaries |
| `recall_context` | Get exact context by key (supports glob) |
| `smart_recall` | BM25+ full-text search across all brain data — 11 languages |
| `causal_trace` | Root cause analysis through memory |
| | Predict likely failures before they happen |
| | Deduplicate and expire stale lessons |
| `compact_recover` | Full context recovery after hitting context window limit |
### 👥 Team Brain

| Tool | What it does |
|---|---|
| `team_learn` / `team_recall` | Share lessons across the team with author attribution |
| | Merge conflicting lessons into one canonical version |
| | 6 specialist AI agents vote to resolve contradictory lessons |
| `memory_crystalize` | Distill all lessons into a Crystal for instant team context |
| `brain_doctor` | Health check: lesson count, IQ boost %, open failures |
| `autopilot` | Generate a self-managing `CLAUDE.md` or `copilot-instructions.md` |
| `global_learn` / `global_recall` | Cross-project universal lessons |
| | Share/import community knowledge |
### 🌍 Knowledge Commons (Global)

| Tool | What it does |
|---|---|
| | Contribute verified lesson to global Knowledge Commons |
| | Search community solutions by tech stack |
| | Contribute with cryptographic provenance certificate |
| | Context-weighted global search |
### ⚙️ Cache & Infrastructure

| Tool | What it does |
|---|---|
| | Manage Brain instances |
| `cache_get` / `cache_set` / `cache_delete` | Standard cache operations |
| `cache_mget` / `cache_mset` | Bulk pipeline (single round-trip) |
| `semantic_search` | Find cached entries by meaning |
| `index_project` | Index source files for semantic retrieval |
### 📋 Roadmap & Planning

| Tool | What it does |
|---|---|
| | Persistent project roadmap stored in Brain |
| | List items or get the single most important next action |
## FAQ

**Does my AI need to call `session_start` manually?**
No. Sessions start and end automatically on the first tool call and when the editor closes.

**What happens to memory if I switch projects?**
Memory is scoped per Brain instance. You can have one instance per project, or one shared instance across projects.

**Can my whole team share the same Brain?**
Yes. `team_learn` / `team_recall` share lessons with author attribution. `memory_crystalize` gives any new team member instant full context.

**What is a Memory Crystal?**
A compressed snapshot of all lessons distilled into a compact briefing. It is injected at every session start, so the AI arrives pre-briefed even with a cold context window.

**What is `causal_trace` and why is it unique?**
Given any error or problem, `causal_trace` walks the Causal Knowledge Graph (CKG) to find the root cause, the intermediate causes, and the exact fix that worked — including the date and commands used. No other memory system builds or queries a causal graph.

**How does cachly learn from git commits?**
The setup wizard installs a git post-commit hook. After each commit, the hook calls `cls_ingest` with the commit message and changed files. The Brain extracts lessons automatically — no `session_end` required.
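Conceptually, the hook does something like the following. This is a hypothetical sketch — the wizard ships its own hook, and `cls_ingest`'s argument shape is assumed here:

```typescript
// Hypothetical sketch of a post-commit hook: collect the last commit's
// message and changed files, then feed them to cls_ingest. The real
// hook and the tool's argument names may differ.
import { execSync } from "node:child_process";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// What did the commit say, and which files changed?
const message = execSync("git log -1 --pretty=%B").toString().trim();
const files = execSync("git diff-tree --no-commit-id --name-only -r HEAD")
  .toString().trim().split("\n");

const client = new Client({ name: "cachly-post-commit", version: "1.0.0" });
await client.connect(
  new StdioClientTransport({
    command: "npx",
    args: ["-y", "@cachly-dev/mcp-server@latest"],
  })
);

// Hand the commit to the Brain for lesson extraction.
await client.callTool({
  name: "cls_ingest",
  arguments: { commit_message: message, changed_files: files },
});
await client.close();
```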
**What happens if I hit the context window limit mid-session?**
Call `compact_recover`. It reconstructs full context from the Memory Crystal, recent sessions, and WIP registry entries — typically restoring full context in one tool call.

**Is my code sent to cachly servers?**
No code content is stored. cachly stores lesson text, commit messages, session summaries, and key-value context entries. All data lives on German servers and is GDPR-compliant.

**Does cachly work without an internet connection?**
No — cachly is a managed cloud service. The MCP server is a thin client; the Brain runs on cachly's infrastructure.
## Manual Setup (alternative to the wizard)

```json
{
  "mcpServers": {
    "cachly": {
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server@latest"]
    }
  }
}
```

On the first tool call your AI will prompt you to sign in — takes 10 seconds.
Some clients require an explicit `type` field in the server entry:

```json
{
  "mcpServers": {
    "cachly": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cachly-dev/mcp-server@latest"]
    }
  }
}
```

## Pricing
| Tier | RAM | Price | Best for |
|---|---|---|---|
| Free | 25 MB | €0/mo forever | Dev & side projects |
| Dev | 200 MB | €19/mo | Individual developers |
| Pro | 900 MB | €49/mo | Teams |
| Speed | 900 MB + Dragonfly | €79/mo | AI-heavy workloads |
| Business | 7 GB | €199/mo | Scale-ups |
✅ All plans: German servers · GDPR-compliant · 99.9% SLA · No credit card for Free
## Environment Variables
Set automatically by the setup wizard — only needed for manual configuration.
| Variable | Default | Description |
|---|---|---|
| | — | API token (set by wizard, or get from cachly.dev) |
| | — | Default instance UUID (optional if passed per-call) |
| | | Override for self-hosted |
| | unset | Set to |
## 🛠️ Ecosystem

| Package | What it does |
|---|---|
| `@cachly-dev/mcp-server` | ← you are here |
| | Cut LLM costs 60–90% in JS/TS apps |
| | Terminal CLI — manage instances and brain |
## Links

- 🌐 cachly.dev — Dashboard & free signup
- 📖 Docs — Full documentation
- 💬 GitHub Issues — Bug reports & feature requests
- ⭐ Star on GitHub — If cachly saves you time, a star means a lot!