The BCBA server is an AI-driven MCP platform combining real-time web search, enterprise AI services, data transformation, and optional persistent session memory.
Search & Discovery
- Web search (`brave_web_search`): Real-time searches via the Brave Search API with pagination, filtering, and up to 20 results per request
- Local business search (`brave_local_search`): Find nearby businesses with addresses, ratings, phone numbers, and hours; automatically falls back to web search if no local results are found
- AI-grounded answers (`brave_answers`): Concise, direct answers grounded in live web results via Brave's AI Grounding
- Enterprise search: Domain-specific document retrieval via Vertex AI Discovery Engine, with a hybrid pipeline that combines and deduplicates web and curated results
Data Transformation
- Code-mode search variants (`brave_web_search_code_mode`, `brave_local_search_code_mode`): Run a search and immediately apply a custom JavaScript script (in a secure QuickJS sandbox) to extract only the needed fields, reducing context-window usage by 85–95%
- Universal transformer (`code_mode_transform`): Apply custom JavaScript extraction to the raw output of any MCP tool — useful for GitHub issues, DOM snapshots, transcripts, and more
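The extraction idea can be sketched in a few lines. This is a Python stand-in for the sandboxed JavaScript; `extract_fields` and the sample payload are illustrative, not the server's actual API:

```python
import json

def extract_fields(results, fields):
    """Project each search result down to a whitelist of fields,
    discarding the bulk of the payload before it reaches the model."""
    return [{f: r[f] for f in fields if f in r} for r in results]

raw = [{
    "title": "Example",
    "url": "https://example.com",
    "description": "x" * 500,            # verbose fields the agent rarely needs
    "meta": {"snippets": ["y" * 1000]},
}]
slim = extract_fields(raw, ["title", "url"])

saved = 1 - len(json.dumps(slim)) / len(json.dumps(raw))
print(f"context reduced by {saved:.0%}")
```

The savings come entirely from dropping fields before they enter the context window; the search itself is unchanged.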
AI Analysis & Orchestration
Research paper analysis (
gemini_research_paper_analysis): Deep academic analysis using Gemini 2.0 Flash — summaries, critiques, key findings, literature reviews, or comprehensive analysisMulti-model orchestration: Supports Google Gemini and Claude via Vertex AI infrastructure with secure Application Default Credentials (ADC)
Session Memory (optional, requires Supabase)
Save immutable session logs, update project state for continuity, progressively load prior context, search accumulated knowledge, and prune old memories
Integrations: Brave Search, Google Gemini, Vertex AI, Gmail, Chrome DevTools Protocol, and Supabase
- Provides real-time web and local search capabilities, including AI-powered answers, to enhance model context.
- Facilitates data extraction and automated pipeline processing through Gmail OAuth integration.
- Orchestrates various Google ecosystem services, including Gemini and Gmail, for cross-platform data retrieval.
- Leverages Vertex AI infrastructure, specifically Discovery Engine for enterprise search and managed generative model deployment.
- Enables deep research paper analysis and structured data synthesis using the Google Gemini API.
- Provides a session memory layer for progressive context loading, work ledgers, and persistent state handoffs via Supabase REST APIs.
🧠 Prism MCP — The Mind Palace for AI Agents
Your AI agent forgets everything between sessions. Prism fixes that.
One command. Persistent memory. Zero cloud dependencies.
```shell
npx -y prism-mcp-server
```

Works with Claude Desktop · Claude Code · Cursor · Windsurf · Cline · Gemini · Antigravity — any MCP client.
Why Prism?
Every time you start a new conversation with an AI coding assistant, it starts from scratch. You re-explain your architecture, re-describe your decisions, re-list your TODOs. Hours of context — gone.
Prism gives your agent a brain that persists. Save what matters at the end of each session. Load it back instantly on the next one. Your agent remembers what it did, what it learned, and what's left to do.
✨ What Makes Prism Different
🧠 Your Agent Learns From Mistakes
When you correct your agent, Prism tracks it. Corrections accumulate importance over time. High-importance lessons auto-surface as warnings in future sessions — and can even sync to your .cursorrules file for permanent enforcement. Your agent literally gets smarter the more you use it.
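The mechanism can be sketched like this (a toy model: the threshold of 3, the class, and its method names are illustrative assumptions, not Prism's internals):

```python
GRADUATION_THRESHOLD = 3  # assumed value, for illustration only

class CorrectionTracker:
    """Toy model of behavioral memory: repeated corrections gain
    importance until they graduate into permanent IDE rules."""

    def __init__(self):
        self.importance = {}

    def record(self, lesson):
        # Each time the user repeats a correction, it matters more.
        self.importance[lesson] = self.importance.get(lesson, 0) + 1

    def graduated(self):
        # Lessons important enough to sync to .cursorrules.
        return [lesson for lesson, count in self.importance.items()
                if count >= GRADUATION_THRESHOLD]

tracker = CorrectionTracker()
for _ in range(3):
    tracker.record("use pnpm, not npm")
tracker.record("prefer async/await")
print(tracker.graduated())  # only the repeated lesson graduates
```

A one-off remark stays low-importance; only lessons the user keeps repeating cross the threshold and surface as warnings.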
🕰️ Time Travel
Every save creates a versioned snapshot. Made a mistake? memory_checkout reverts your agent's memory to any previous state — like git revert for your agent's brain. Full version history with optimistic concurrency control.
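A minimal sketch of the versioning idea, assuming an in-memory store (the store layout and function signatures are illustrative, not Prism's actual schema):

```python
class StaleVersionError(Exception):
    """Raised when another writer bumped the version in between."""

def save_handoff(store, project, state, expected_version):
    # Optimistic concurrency: the write succeeds only if the version
    # we loaded is still the latest one.
    entry = store.setdefault(project, {"version": 0, "history": []})
    if entry["version"] != expected_version:
        raise StaleVersionError(
            f"expected v{expected_version}, found v{entry['version']}")
    entry["history"].append(state)   # every save is a snapshot
    entry["version"] += 1
    return entry["version"]

def memory_checkout(store, project, version):
    # Non-destructive revert: read any historical snapshot.
    return store[project]["history"][version - 1]

store = {}
v1 = save_handoff(store, "demo", {"todos": ["wire up auth"]}, expected_version=0)
v2 = save_handoff(store, "demo", {"todos": []}, expected_version=v1)
print(memory_checkout(store, "demo", 1))  # the v1 snapshot, untouched
```

Because every save appends rather than overwrites, reverting is just reading an earlier snapshot, and a stale `expected_version` is rejected instead of silently clobbering a concurrent write.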
🔮 Mind Palace Dashboard
A gorgeous glassmorphism UI at localhost:3000 that lets you see exactly what your agent is thinking:
- Current State & TODOs — the exact context injected into the LLM's prompt
- Interactive Knowledge Graph — force-directed neural graph with click-to-filter, node renaming, and surgical keyword deletion (v5.1)
- Deep Storage Manager — preview and execute vector purge operations with dry-run safety (v5.1)
- Session Ledger — full audit trail of every decision your agent has made
- Time Travel Timeline — browse and revert any historical handoff version
- Visual Memory Vault — browse VLM-captioned screenshots and auto-captured HTML states
- Hivemind Radar — real-time active agent roster with role, task, and heartbeat
- Morning Briefing — AI-synthesized action plan after 4+ hours away
- Brain Health — memory integrity scan with one-click auto-repair

🧬 10× Memory Compression
Powered by a pure TypeScript port of Google's TurboQuant (ICLR 2026), Prism compresses 768-dim embeddings from 3,072 bytes → ~400 bytes — enabling decades of session history on a standard laptop. No native modules. No vector database required.
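The arithmetic behind the headline number: 768 float32 values are 768 × 4 = 3,072 bytes, while 4-bit codes pack two per byte into 384 bytes, with a few extra bytes of dequantization metadata giving the ~400-byte figure. A toy scalar quantizer (not the actual TurboQuant algorithm) shows the packing:

```python
def quantize4(vec):
    """Scalar-quantize a float vector to 4-bit codes, two per byte.
    (Illustrative only; TurboQuant's real scheme is more involved.)"""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 15 or 1.0            # map the range onto codes 0..15
    codes = [round((v - lo) / scale) for v in vec]
    packed = bytearray()
    for i in range(0, len(codes), 2):
        packed.append((codes[i] << 4) | codes[i + 1])
    return bytes(packed), (lo, scale)        # metadata needed to dequantize

vec = [i / 768 for i in range(768)]          # stand-in for a 768-dim embedding
blob, meta = quantize4(vec)
print(len(vec) * 4, "->", len(blob))         # 3072 -> 384 bytes
```

The search side then compares full-precision queries against these compact codes, which is why no native vector database is required.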
🐝 Multi-Agent Hivemind
Multiple agents (dev, QA, PM) can work on the same project with role-isolated memory. Agents discover each other automatically, share context in real-time via Telepathy sync, and see a team roster during context loading.
🖼️ Visual Memory
Save UI screenshots, architecture diagrams, and bug states to a searchable vault. Images are auto-captioned by a VLM (Claude Vision / GPT-4V / Gemini) and become semantically searchable across sessions.
🔭 Full Observability
OpenTelemetry spans for every MCP tool call, LLM hop, and background worker. Route to Jaeger, Grafana, or any OTLP collector. Configure in the dashboard — zero code changes.
🔒 GDPR Compliant
Soft/hard delete (Art. 17), full ZIP export (Art. 20), API key redaction, per-project TTL retention, and audit trail. Enterprise-ready out of the box.
🎯 Use Cases
- Long-running feature work — Save state at end of day, restore full context next morning. No re-explaining.
- Multi-agent collaboration — Dev, QA, and PM agents share real-time context without stepping on each other's memory.
- Consulting / multi-project — Switch between client projects with progressive loading: quick (~50 tokens), standard (~200), or deep (~1000+).
- Visual debugging — Save UI screenshots to searchable memory. Find that CSS bug from last week by description.
- Team onboarding — A new team member's agent loads the full project history instantly.
- Behavior enforcement — Agent corrections auto-graduate into permanent .cursorrules / .clauderules rules.
- Offline / air-gapped — Full SQLite local mode + Ollama LLM adapter. Zero internet dependency.
- Morning Briefings — After 4+ hours away, Prism auto-synthesizes a 3-bullet action plan from your last sessions.
🆕 What's New in v5.1
🗑️ Deep Storage Mode — New `deep_storage_purge` tool reclaims ~90% of vector storage by dropping redundant float32 vectors. Safety guards: 7-day minimum age, dry-run preview, multi-tenant isolation.
🕸️ Knowledge Graph Editor — Click nodes to rename or delete keywords. Filter by project, date, or importance. Surgically groom your agent's semantic memory.
🔧 Auto-Load Reliability — Hardened hook-based patterns for Claude Code and Gemini/Antigravity to guarantee context loading on the first turn.
🧪 303 Tests — Zero regressions across 13 suites.
🚀 Quick Start
Add to your MCP client config (claude_desktop_config.json, .cursor/mcp.json, etc.):
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}
```

That's it. Restart your client. All 30+ tools are available. Dashboard at http://localhost:3000.
Optional API keys: `GOOGLE_API_KEY` for semantic search + Morning Briefings, `BRAVE_API_KEY` for web search. See Environment Variables.
📖 Setup Guides
Add to claude_desktop_config.json:
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}
```

Add to `.cursor/mcp.json` (project) or `~/.cursor/mcp.json` (global):
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}
```

Add to `~/.codeium/windsurf/mcp_config.json`:
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}
```

Add to your Continue config.json or Cline MCP settings:
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"],
      "env": {
        "PRISM_STORAGE": "local",
        "BRAVE_API_KEY": "your-brave-api-key"
      }
    }
  }
}
```

Claude Code supports `SessionStart` and `Stop` hooks that force the agent to load/save Prism context automatically.
1. Create the Hook Script
Save as ~/.claude/mcp_autoload_hook.py:
```python
#!/usr/bin/env python3
import json

def main():
    print(json.dumps({
        "continue": True,
        "suppressOutput": True,
        "systemMessage": (
            "## First Action\n"
            "Call `mcp__prism-mcp__session_load_context(project='my-project', level='deep')` "
            "before responding to the user. Do not generate any text before calling this tool."
        )
    }))

if __name__ == "__main__":
    main()
```

2. Configure settings.json
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/you/.claude/mcp_autoload_hook.py",
            "timeout": 10
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "python3 -c \"import json; print(json.dumps({'continue': True, 'suppressOutput': True, 'systemMessage': 'MANDATORY END WORKFLOW: 1) Call mcp__prism-mcp__session_save_ledger with project and summary. 2) Call mcp__prism-mcp__session_save_handoff with expected_version set to the loaded version.'}))\""
          }
        ]
      }
    ]
  },
  "permissions": {
    "allow": [
      "mcp__prism-mcp__session_load_context",
      "mcp__prism-mcp__session_save_ledger",
      "mcp__prism-mcp__session_save_handoff",
      "mcp__prism-mcp__knowledge_search",
      "mcp__prism-mcp__session_search_memory"
    ]
  }
}
```

Troubleshooting
- Hook not firing? Check `timeout` — if your script takes too long, Claude ignores it.
- "Tool not available"? This is a hallucination. Ensure `permissions.allow` exactly matches the double-underscore format.
Gemini-based agents (including Google's Antigravity IDE) require a fundamentally different approach to auto-load. This guide was refined over 14 iterations of real-world prompt engineering.
The Core Problem
When given verbose instructions with many constraints, Gemini models hallucinate that MCP tools "don't exist" or "aren't available" — even when they are. This is adversarial reasoning triggered by long rule lists, not a configuration issue.
The 2-Line Rule (Proven Fix)
Create ~/.gemini/GEMINI.md:
```markdown
## First Action
Call `mcp_prism-mcp_session_load_context(project="my-project", level="deep")` before responding.
```

Why this works:

- Gemini uses single underscores for MCP tools (`mcp_prism-mcp_...`) vs Claude's double underscores (`mcp__prism-mcp__...`)
- Keeping the instruction to 2 lines avoids triggering the model's adversarial "tool not found" reasoning
- Framing as a positive "First Action" directive outperforms negative constraint lists
Antigravity UI Caveat
Antigravity does not visually render MCP tool output blocks in the chat UI. The tool executes successfully, but the user sees nothing. Fix this by adding an echo rule:
```markdown
## Echo Context
After loading context, include in your text reply:
- Agent identity (role + name)
- Last session summary
- Open TODOs
- Session version number
```

This ensures the user sees their project context even though the raw MCP output is invisible.
Session End Workflow
Tell the agent: "Wrap up the session." It should execute:
- `session_save_ledger` — append an immutable work log (summary, decisions, files changed)
- `session_save_handoff` — upsert project state with `expected_version` for OCC

Tip: Include the session-end instructions in your `GEMINI.md` or ask the agent to save when you're done.
Key Findings from 14 Iterations
| Iteration | What We Tried | Result |
|---|---|---|
| 1–6 | Verbose "Banned Behaviors" blocks, negative constraints | ❌ Model hallucinated tools were unavailable |
| 7–9 | | ❌ Redundant configs caused race conditions |
| 10–11 | Emergency-style | ⚠️ Inconsistent — worked sometimes |
| 12–13 | Positive-only framing, progressively shorter | ⚠️ Better but still intermittent |
| 14 | 2-line "First Action" directive | ✅ Reliable across sessions |
Platform Gotchas
- `replace_file_content` silently fails on `~/.gemini/GEMINI.md` in some environments — use `write_to_file` with overwrite instead
- Multiple GEMINI.md locations can conflict: global (`~/.gemini/`), workspace, and User Rules in the Antigravity UI. Keep them synchronized
- Camoufox/browser tools called at startup spawn visible black windows — never call browser tools during greeting handlers
To sync memory across machines or teams:
```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"],
      "env": {
        "PRISM_STORAGE": "supabase",
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_KEY": "your-supabase-anon-key"
      }
    }
  }
}
```

See Supabase Setup for schema migration instructions.
```shell
git clone https://github.com/dcostenco/prism-mcp.git
cd prism-mcp && npm install && npm run build
```

Then add to your MCP config:

```json
{
  "mcpServers": {
    "prism-mcp": {
      "command": "node",
      "args": ["/path/to/prism-mcp/dist/server.js"],
      "env": {
        "BRAVE_API_KEY": "your-key",
        "GOOGLE_API_KEY": "your-gemini-key"
      }
    }
  }
}
```

How Prism Compares
Prism MCP vs MCP Memory · Mem0 · Mnemory · Basic Memory
Only Prism has all of these:
- ✅ Zero config — one `npx` command, no Qdrant/Postgres containers
- ✅ Time Travel — versioned snapshots with `memory_checkout`
- ✅ Behavioral memory — importance tracking, auto-decay, mistake learning
- ✅ Visual dashboard — Mind Palace at localhost:3000
- ✅ Multi-agent sync — role-isolated Hivemind with real-time Telepathy
- ✅ Visual memory — VLM-captioned screenshot vault
- ✅ Token budgeting — `max_tokens` param on context loading
- ✅ 10× vector compression — TurboQuant, no external vector DB
- ✅ GDPR compliance — soft/hard delete, ZIP export, TTL retention
- ✅ OpenTelemetry — full span tracing to Jaeger/Grafana
- ✅ LangChain adapters — `BaseRetriever` integration + LangGraph examples
- ✅ Morning Briefings — AI-synthesized action plans after breaks
- ✅ Auto-compaction — Gemini-powered rollups to prevent unbounded growth
- ✅ IDE rules sync — graduated insights → `.cursorrules` / `.clauderules`
- ✅ Air-gapped mode — SQLite + Ollama, zero internet needed

TL;DR: Prism is the only MCP memory server with time travel, behavioral learning, visual memory, multi-agent sync, and 10× compression — all from a single `npx` command.
🔧 Tool Reference
| Tool | Purpose |
|---|---|
| `session_save_ledger` | Append immutable session log (summary, TODOs, decisions) |
| `session_save_handoff` | Upsert latest project state with OCC version tracking |
| `session_load_context` | Progressive context loading (quick / standard / deep) |
| `knowledge_search` | Full-text keyword search across accumulated knowledge |
| | Prune outdated or incorrect memories (4 modes + dry_run) |
| | Set per-project TTL retention policy |
| `session_search_memory` | Vector similarity search across all sessions |
| | Auto-compact old entries via Gemini summarization |
| | GDPR-compliant deletion (soft/hard + Art. 17 reason) |
| | Full ZIP export (JSON + Markdown) for portability |
| | Brain integrity scan + auto-repair |
| `deep_storage_purge` | Reclaim ~90% vector storage (v5.1) |
| Tool | Purpose |
|---|---|
| | Record corrections, successes, failures, learnings |
| | Increase entry importance (+1) |
| | Decrease entry importance (-1) |
| | Sync graduated insights to `.cursorrules` / `.clauderules` |
| | Visual memory vault |
| Tool | Purpose |
|---|---|
| | Browse all historical versions of a project's handoff state |
| `memory_checkout` | Revert to any previous version (non-destructive) |
| Tool | Purpose |
|---|---|
| `brave_web_search` | Real-time internet search |
| `brave_local_search` | Location-based POI discovery |
| `brave_web_search_code_mode` | JS extraction over web search results |
| `brave_local_search_code_mode` | JS extraction over local search results |
| `code_mode_transform` | Universal post-processing with 8 built-in templates |
| `gemini_research_paper_analysis` | Academic paper analysis via Gemini |
| `brave_answers` | AI-grounded answers from Brave |
Requires `PRISM_ENABLE_HIVEMIND=true`.

| Tool | Purpose |
|---|---|
| | Announce yourself to the team |
| | Pulse every ~5 min to stay visible |
| | See all active teammates |
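The roster presumably amounts to filtering teammates by heartbeat age; a sketch, where the 5-minute TTL and the record shape are illustrative assumptions:

```python
HEARTBEAT_TTL = 300  # seconds (~5 min), assumed for illustration

def active_agents(roster, now):
    """Teammates whose last pulse is within the TTL count as active."""
    return {name: info for name, info in roster.items()
            if now - info["last_seen"] <= HEARTBEAT_TTL}

now = 1_000_000
roster = {
    "dev-1": {"role": "dev", "task": "refactor auth", "last_seen": now - 60},
    "qa-1":  {"role": "qa",  "task": "regression run", "last_seen": now - 900},
}
print(sorted(active_agents(roster, now)))  # ['dev-1'] (qa-1's pulse is stale)
```

An agent that stops pulsing simply ages out of the roster; no explicit deregistration is needed.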
Environment Variables
| Variable | Required | Description |
|---|---|---|
| `BRAVE_API_KEY` | No | Brave Search Pro API key |
| | No | Instance name for multi-server PID isolation |
| `GOOGLE_API_KEY` | No | Gemini — enables semantic search, Briefings, compaction |
| | No | Separate Brave Answers key |
| `SUPABASE_URL` | If cloud | Supabase project URL |
| `SUPABASE_KEY` | If cloud | Supabase anon/service key |
| | No | Multi-tenant user isolation |
| | No | Comma-separated ports |
| | No | Dashboard port (default: 3000) |
Architecture
`searchMemory()` flow:

```
Tier 0: FTS5 keywords   → full-text search (knowledge_search)
Tier 1: float32 (3072B) → sqlite-vec cosine similarity (native)
Tier 2: turbo4 (400B)   → JS asymmetricCosineSimilarity (fallback)

Tier 1 success → return results
Tier 1 fail    → Tier 2 success → return results
Tier 2 fail    → return []
```

Every `session_save_ledger` call generates both tiers automatically:

1. Gemini generates a float32 embedding (3,072 bytes)
2. TurboQuant compresses it to a turbo4 blob (~400 bytes)
3. A single atomic write stores both to the database
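The fallback chain is ordinary try-next logic; a sketch in which the tier functions are stubs (the real tiers are FTS5, sqlite-vec, and the TurboQuant JS path):

```python
def search_memory(query, tiers):
    """Try each tier in order; fall through on errors or empty results."""
    for name, search_fn in tiers:
        try:
            results = search_fn(query)
        except RuntimeError:
            continue                      # e.g. native extension unavailable
        if results:
            return name, results
    return None, []

def native_unavailable(query):
    raise RuntimeError("sqlite-vec extension not loaded")

tiers = [
    ("fts5", lambda q: []),               # Tier 0: no keyword hits
    ("float32", native_unavailable),      # Tier 1: native path fails
    ("turbo4", lambda q: [("session-42", 0.91)]),  # Tier 2: JS fallback
]
tier, hits = search_memory("auth bug", tiers)
print(tier, hits)  # turbo4 [('session-42', 0.91)]
```

Because both blobs are written at save time, the fallback tier is always populated; a machine without the native extension still gets vector search, just over the compressed codes.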
| Metric | Before v5.0 | After v5.0 |
|---|---|---|
| Storage per embedding | 3,072 bytes | ~400 bytes |
| Compression ratio | 1:1 | ~7.7:1 (4-bit) |
| Entries per GB | ~330K | ~2.5M |
| Level | What You Get | Size | When to Use |
|---|---|---|---|
| quick | Open TODOs + keywords | ~50 tokens | Fast check-in |
| standard | + summary + recent decisions + Git drift | ~200 tokens | Recommended |
| deep | + full logs (last 5 sessions) + cross-project knowledge | ~1000+ tokens | After a long break |
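The three levels behave like a budgeted prefix over prioritized sections; a sketch in which the budgets, section names, and token counts are illustrative:

```python
LEVELS = {"quick": 50, "standard": 200, "deep": 1000}  # approx token budgets

def load_context(sections, level="standard"):
    """Greedily include sections, highest priority first, until the
    level's token budget would be exceeded."""
    budget = LEVELS[level]
    included, used = [], 0
    for name, tokens in sections:
        if used + tokens > budget:
            break
        included.append(name)
        used += tokens
    return included

# Illustrative sections in priority order (token counts are made up).
sections = [("todos", 40), ("summary", 80), ("decisions", 70), ("full_logs", 600)]
print(load_context(sections, "quick"))     # ['todos']
print(load_context(sections, "standard"))  # ['todos', 'summary', 'decisions']
print(load_context(sections, "deep"))      # all four sections fit
```

Each level is a superset of the one below it, so upgrading from quick to deep never loses context, it only adds more.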
Prism resolves agent roles using a priority chain:
explicit tool argument → dashboard setting → "global" (default)

Set your role once in the Mind Palace Dashboard (⚙️ Settings → Agent Identity) and it auto-applies to every session.
Available roles: dev, qa, pm, lead, security, ux, global, or any custom string.
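The chain is just first-non-empty-wins; a sketch (the function name is illustrative, not Prism's API):

```python
def resolve_role(tool_arg=None, dashboard_setting=None):
    """An explicit tool argument beats the dashboard setting,
    which beats the "global" default."""
    for candidate in (tool_arg, dashboard_setting):
        if candidate:
            return candidate
    return "global"

print(resolve_role())                          # global
print(resolve_role(dashboard_setting="qa"))    # qa
print(resolve_role("dev", "qa"))               # dev
```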
```
src/
├── server.ts                      # MCP server core + tool routing
├── config.ts                      # Environment management
├── storage/
│   ├── interface.ts               # StorageBackend abstraction
│   ├── sqlite.ts                  # SQLite local (libSQL + F32_BLOB)
│   ├── supabase.ts                # Supabase cloud storage
│   └── configStorage.ts           # Boot config micro-DB
├── dashboard/
│   ├── server.ts                  # Dashboard HTTP server
│   └── ui.ts                      # Mind Palace glassmorphism UI
├── tools/
│   ├── definitions.ts             # Search & analysis schemas
│   ├── handlers.ts                # Search & analysis handlers
│   ├── sessionMemoryDefinitions.ts
│   └── sessionMemoryHandlers.ts
└── utils/
    ├── telemetry.ts               # OTel singleton
    ├── turboquant.ts              # TurboQuant math core
    ├── imageCaptioner.ts          # VLM auto-caption pipeline
    └── llm/adapters/              # Gemini, OpenAI, Anthropic, Ollama
```

1. Create a Supabase project at supabase.com
2. Run the migration SQL files from `supabase/migrations/` in order
3. Set `PRISM_STORAGE=supabase`, `SUPABASE_URL`, and `SUPABASE_KEY` in your MCP config
4. Prism auto-applies pending DDL migrations on startup via the `prism_apply_ddl` RPC
Prism includes Python adapters in examples/langgraph-agent/:
```python
from langchain.retrievers import EnsembleRetriever
from prism_retriever import PrismMemoryRetriever, PrismKnowledgeRetriever

# Hybrid search: 70% semantic, 30% keyword
retriever = EnsembleRetriever(
    retrievers=[PrismMemoryRetriever(...), PrismKnowledgeRetriever(...)],
    weights=[0.7, 0.3],
)
```

Includes a full 5-node LangGraph research agent with MCP bridge and persistent memory.
Version History
v5.0 — TurboQuant 10× embedding compression, three-tier search architecture
v4.6 — OpenTelemetry distributed tracing (Jaeger, Grafana)
v4.5 — VLM multimodal memory + GDPR Art. 20 ZIP export
v4.4 — Pluggable LLM adapters (OpenAI, Anthropic, Gemini, Ollama)
v4.3 — Knowledge Sync Rules (behavioral insights → IDE rules)
v4.2 — Project repo registry + universal auto-load
v4.1 — Auto-migration + multi-instance support
v4.0 — Behavioral memory (corrections, importance, auto-decay)
v3.1 — Memory lifecycle (TTL, auto-compaction, PKM export)
v3.0 — Agent Hivemind (role-scoped memory, Telepathy sync)
See CHANGELOG.md for full details.
🚀 Roadmap
Next (v5.2):
🔄 CRDT Handoff Merging — conflict-free concurrent multi-agent edits
⏰ Background Purge Scheduler — automated storage reclamation
📱 Mind Palace Mobile PWA — offline-first responsive dashboard
License
MIT
Keywords: MCP server, Model Context Protocol, Claude Desktop memory, persistent session memory, AI agent memory, local-first, SQLite MCP, Mind Palace, time travel, visual memory, VLM image captioning, OpenTelemetry, GDPR, agent telepathy, multi-agent sync, behavioral memory, cursorrules, Ollama MCP, Brave Search MCP, TurboQuant, progressive context loading, knowledge management, LangChain retriever, LangGraph agent