# cachly-mcp-server
## Server Configuration

The environment variables used to configure the server.
| Name | Required | Description | Default |
|---|---|---|---|
| CACHLY_JWT | Yes | Your Keycloak JWT from cachly.dev/settings | |
| CACHLY_API_URL | No | Override for local dev | https://api.cachly.dev |
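With those variables, a typical MCP client configuration might look like the following. This is a sketch only: the `npx` launch command and the package name `cachly-mcp-server` are assumptions not confirmed by this listing, so check the project's own install instructions for the real values.

```json
{
  "mcpServers": {
    "cachly": {
      "command": "npx",
      "args": ["-y", "cachly-mcp-server"],
      "env": {
        "CACHLY_JWT": "<your JWT from cachly.dev/settings>",
        "CACHLY_API_URL": "https://api.cachly.dev"
      }
    }
  }
}
```

`CACHLY_API_URL` can be omitted entirely unless you are pointing the server at a local development API.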
## Capabilities

Features and capabilities supported by this server.
| Capability | Details |
|---|---|
| tools | {} |
## Tools

Functions exposed to the LLM to take actions.
| Name | Description |
|---|---|
| list_instances | List all your cachly cache instances with their status and connection details. |
| create_instance | Create a new managed Valkey/Redis cache instance on cachly.dev. Free tier provisions in ~30 seconds. Paid tiers return a Stripe checkout URL. Available tiers: free (30 MB), dev (256 MB, €8/mo), pro (1 GB, €25/mo), speed (1 GB Dragonfly + Semantic Cache, €39/mo), business (8 GB, €99/mo). |
| get_instance | Get details and the Redis connection string for a specific cache instance. |
| get_connection_string | Get the Redis/Valkey connection string (redis:// URL) for a running instance. Use this to configure your application or set environment variables. |
| delete_instance | Permanently delete a cache instance. Deprovisions the Kubernetes workload and removes all data. This action is irreversible. |
| cache_get | Get a value from a running cache instance by key. Returns the value (string or JSON) or null if the key does not exist. |
| cache_set | Set a key-value pair in a running cache instance. Value can be a string or a JSON-serialized object. Optionally set a TTL in seconds. |
| cache_delete | Delete one or more keys from a running cache instance. |
| cache_exists | Check whether one or more keys exist in the cache. Returns a count of existing keys. |
| cache_ttl | Get the remaining time-to-live (TTL) of a key in seconds. Returns -1 if no TTL, -2 if the key does not exist. |
| cache_keys | List keys in a cache instance matching an optional glob pattern (e.g. "user:*", "session:*"). Uses SCAN to avoid blocking the server. Returns at most |
| cache_stats | Get real-time stats for a cache instance: memory usage, hit/miss rate, commands/sec, connected clients, keyspace info, and uptime. |
| semantic_search | Find cached entries that are semantically similar to a natural-language query. Powered by pgvector HNSW index on cachly infrastructure — embeddings never leave Germany. Requires OPENAI_API_KEY (or compatible) and the Speed/Business tier with CACHLY_VECTOR_URL. Example: "find all cached responses about password reset" or "what did we answer about pricing?" |
| detect_namespace | Classify a prompt into one of 5 semantic namespaces using text heuristics. Overhead: <0.1 ms, no embedding required. Useful to understand which namespace cachly will use for a given prompt. Returns one of: cachly:sem:code, cachly:sem:translation, cachly:sem:summary, cachly:sem:qa, cachly:sem:creative. |
| cache_warmup | Pre-warm the semantic cache with a list of prompt/value pairs. For each entry: computes an embedding, checks if a similar entry already exists (similarity ≥ 0.98), and writes new entries to Valkey + pgvector index. Use this to seed FAQ responses, product descriptions, or known-good LLM answers before the first real user traffic. Requires OPENAI_API_KEY. |
| index_project | Index local source files into the cachly semantic cache so AI assistants can use semantic_search to find relevant files instead of re-reading the whole codebase every time. Walks a directory recursively, reads each matching file, and stores a summary + path as a semantic cache entry (prompt = file path + content excerpt, value = relative path). Requires an embedding provider (OPENAI_API_KEY or CACHLY_EMBED_PROVIDER + key). Run once, then re-run after major refactors. TTL=86400 (24h) keeps entries fresh. |
| cache_mset | Set multiple key-value pairs in a single pipeline round-trip. Supports per-key TTL – unlike native MSET. Uses one TCP round-trip for N keys via Redis pipeline. |
| cache_mget | Retrieve multiple keys in one round-trip using native Redis MGET. Returns values in the same order as the keys array; missing keys are null. |
| cache_lock_acquire | Acquire a distributed lock using Redis SET NX PX (Redlock-lite). Returns a fencing token on success. The lock auto-expires after ttl_ms to prevent deadlocks. Use cache_lock_release to free the lock early. |
| cache_lock_release | Release a previously acquired distributed lock. Uses a Lua script for atomic release – only deletes the key if the fencing token matches. |
| get_api_status | Check the cachly API health and your authentication status. Returns whether the JWT is valid, your user ID (sub claim), token expiry, and the auth provider (keycloak). Use this to debug connection issues or verify your CACHLY_JWT is correct. |
| remember_context | Save context information to the cache so you can recall it later without re-computing. Perfect for caching: codebase overviews, file summaries, project structure, frequently-accessed data, or "thinking" results like dependency analysis. The AI assistant can use this to avoid re-reading the entire codebase every time. Example: remember_context("project overview", "This is a Next.js app with...") then later: recall_context("project overview") |
| recall_context | Retrieve previously saved context from the cache. Returns the saved content or null if not found. Use this at the START of any task to check if you already have relevant context cached, before doing expensive operations like reading many files. Supports glob patterns: "file:*" matches all file summaries, "*arch*" matches architecture-related keys. |
| list_remembered | List all cached context entries for this project. Shows what knowledge the AI assistant has already cached, so you can decide whether to recall existing context or refresh it. Returns: key, category, size, TTL remaining, and a content preview. |
| forget_context | Delete one or more cached context entries. Use when context is stale or you want to force a fresh analysis. Supports glob patterns: "file:*" deletes all file summaries. |
| learn_from_attempts | Store a lesson learned from a failed or successful attempt. Call this AFTER completing any non-trivial task (deploy, debug, fix, architecture decision). The lesson will be recalled automatically in future sessions via recall_best_solution. Fields: topic (short slug like "deploy:web"), outcome ("success"|"failure"), what_worked (what solved it), what_failed (what did NOT work), context (extra details). Supports structured metadata: severity, file_paths (files involved), commands (working commands), tags. Deduplication: if a lesson for this topic already exists, it is updated instead of duplicated. Example: learn_from_attempts(topic="deploy:api", outcome="success", what_worked="nohup docker compose up -d --build", what_failed="docker compose up hangs on SSH timeout", severity="critical", commands=["nohup docker compose up -d --build"]) |
| recall_best_solution | Recall the best known solution for a topic from past lessons. Call this BEFORE attempting any task that might have been done before. Returns the most recent successful lesson for the topic, or a summary of attempts. Example: recall_best_solution(topic="deploy:web") → returns the working deploy command. |
| smart_recall | Semantically search cached context using natural language. Instead of exact key matching, finds context by meaning. Example: smart_recall("how does authentication work") → returns cached auth architecture summary. Falls back to remember_context keys if no semantic match is found. |
| session_start | Single-call session briefing. Call this at the START of every session INSTEAD of multiple separate smart_recall/recall_best_solution calls. Returns: last session summary, recent lessons sorted by recency, relevant lessons for your focus area, open failures (topics with only failure outcomes), and brain health stats (lesson count, context count). Also saves a session start marker so session_end can compute duration. |
| session_end | Save a session summary when you finish working. Records what was accomplished, files changed, and lesson count. The next session_start will show this summary as "Last session". Call this when ending a work session, before going idle, or before summarizing. |
| auto_learn_session | Auto-learn from a list of session observations WITHOUT explicit learn_from_attempts calls. Pass what happened (commands run, errors seen, solutions found) and the brain classifies and stores lessons automatically. Use at session_end to capture everything you did, even if you forgot to call learn_from_attempts. Returns a summary of what was auto-stored. |
| sync_file_changes | Associate recent file changes with brain knowledge. Pass a list of changed file paths (from |
| team_learn | Store a lesson in a shared team brain so all team members benefit. Like learn_from_attempts, but REQUIRES an author name for attribution. Shows up in team_recall with "by <author>" so the team knows who learned it. |
| team_recall | Recall lessons from a shared team brain, showing who learned what. Works on any shared instance (all team members using the same instance_id). Shows author, recency, and severity for each lesson. Use this to onboard new team members or find who knows about a topic. |
| brain_doctor | Check the health of your AI Brain and get actionable recommendations. Reports: lesson count, context entries, last session age, open failures, quality score. Returns a prioritized list of issues with fix instructions. |
| global_learn | Store a lesson that applies across ALL your projects (cross-project knowledge). Global lessons are stored with the prefix cachly:global:lesson: and recalled from any instance via global_recall. Use for tool preferences, personal workflows, platform quirks, and universal gotchas. Example: global_learn(topic="bash:macos-arrays", lesson="Arrays work differently on macOS bash 3.2") |
| global_recall | Recall cross-project lessons stored via global_learn. Returns all global lessons or those matching a topic filter. |
| publish_lesson | Publish a lesson to the Cachly Public Brain (anonymized community knowledge base). Published lessons can be imported by other developers via import_public_brain. PII is stripped automatically. Visible under the framework/category tag. |
| import_public_brain | Import community lessons from the Cachly Public Brain for a framework. Loads battle-tested, community-curated lessons into your brain instance. Available: nextjs, fastapi, go, docker, kubernetes, react, typescript, python, rust, laravel, rails, spring. |
| setup_ai_memory | One-shot setup of the cachly 3-layer AI Memory system for a project. Layer 1 — Storage: your cachly instance (Valkey, persistent across sessions). Layer 2 — Tools: learn_from_attempts + recall_best_solution + smart_recall (the memory API). Layer 3 — Autopilot: generates a copilot-instructions.md / .github/copilot-instructions.md that instructs any MCP-compatible AI to recall known solutions BEFORE each task and save lessons AFTER — fully automatic, zero manual effort. Returns the copilot-instructions.md content + provider-specific .mcp.json snippet. Optionally writes copilot-instructions.md directly to the project directory. |
| cache_stream_set | Cache a list of string chunks (e.g. LLM token stream) via Redis RPUSH. Each chunk is stored as a separate list element under cachly:stream:{key}. Replay with cache_stream_get. |
| cache_stream_get | Retrieve a previously cached stream as an ordered list of string chunks. Returns null on cache miss (key absent or empty list). Stored under cachly:stream:{key}. |
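Like any MCP server, these tools are invoked over JSON-RPC 2.0 via the standard `tools/call` method. The sketch below builds such a request for `cache_set`; the argument names (`key`, `value`, `ttl`) are assumptions inferred from the tool description above, not a confirmed input schema.

```python
import json

# Hypothetical JSON-RPC 2.0 payload an MCP client would send to invoke
# the cache_set tool. "tools/call" is the standard MCP method name;
# the argument names are assumptions based on the tool description.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "cache_set",
        "arguments": {
            "key": "user:42",            # key to write
            "value": '{"plan": "pro"}',  # string or JSON-serialized value
            "ttl": 3600,                 # optional TTL in seconds
        },
    },
}

payload = json.dumps(request)
print(payload)
```

In practice your MCP client (Claude Desktop, an IDE plugin, etc.) constructs this envelope for you; the sketch only shows what crosses the wire.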
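The lock tools above follow the classic Redis pattern: acquire with SET NX PX (set only if absent, with a millisecond expiry), release only if the stored fencing token matches. A minimal in-memory sketch of those semantics, purely for illustration (the real server runs this against Redis/Valkey, with the release done atomically in a Lua script):

```python
import time
import uuid

class LockSketch:
    """Toy in-memory model of cache_lock_acquire / cache_lock_release
    semantics. Not the server's implementation — just the SET NX PX +
    token-checked-delete logic it describes."""

    def __init__(self):
        self.store = {}  # key -> (fencing_token, expires_at)

    def acquire(self, key, ttl_ms):
        now = time.monotonic()
        entry = self.store.get(key)
        if entry is not None and entry[1] > now:
            return None  # lock still held: SET NX would fail
        token = uuid.uuid4().hex  # fencing token returned to the caller
        self.store[key] = (token, now + ttl_ms / 1000)
        return token

    def release(self, key, token):
        entry = self.store.get(key)
        if entry is not None and entry[0] == token:
            # Only the holder's token may delete the key — this compare-
            # and-delete is what the server's Lua script makes atomic.
            del self.store[key]
            return True
        return False

locks = LockSketch()
token = locks.acquire("deploy", ttl_ms=5000)
assert token is not None                                # first caller wins
assert locks.acquire("deploy", ttl_ms=5000) is None     # second caller blocked
assert locks.release("deploy", "wrong-token") is False  # stale holder rejected
assert locks.release("deploy", token) is True           # real holder releases
```

The auto-expiry (`ttl_ms`) is what prevents a crashed client from deadlocking everyone else, which is why the description recommends releasing early rather than relying on expiry.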
## Prompts

Interactive templates invoked by user choice.
| Name | Description |
|---|---|
| No prompts | |
## Resources

Contextual data attached and managed by the client.
| Name | Description |
|---|---|
| No resources | |
## MCP directory API

We provide all the information about MCP servers via our MCP API:

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/cachly-dev/cachly-mcp'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.