Uses Google Gemini to generate high-dimensional vector embeddings for semantic storage and retrieval of natural language memories.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Arca MCP search for my notes about the marketing strategy".
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Arca MCP
A Model Context Protocol (MCP) server providing semantic memory storage and retrieval via vector embeddings. Built with FastAPI + FastMCP, using LanceDB for vector storage and Google Gemini for embedding generation.
Features
Semantic Search — Store and retrieve memories using natural language queries powered by vector similarity search
Dual Access — MCP tools for AI agents + REST API for programmatic integrations
Multi-Tenant Isolation — Namespace-scoped operations via the X-Namespace HTTP header
Bucket Organization — Group memories into logical buckets for structured storage
Embedding Caching — Redis-backed cache for generated embeddings to minimize API calls
Bearer Token Auth — Constant-time token verification for secure access
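The constant-time verification can be done with Python's standard library; a minimal sketch (the function name and header parsing here are illustrative, not the repo's actual code):

```python
import hmac

def verify_bearer(header_value: str, expected_token: str) -> bool:
    """Validate an Authorization header without leaking timing information.

    hmac.compare_digest always scans the full inputs, so how long the
    comparison takes reveals nothing about where the token first differs.
    """
    prefix = "Bearer "
    if not header_value.startswith(prefix):
        return False
    presented = header_value[len(prefix):]
    return hmac.compare_digest(presented.encode(), expected_token.encode())

print(verify_bearer("Bearer s3cret", "s3cret"))  # True
```

A naive `presented == expected_token` comparison short-circuits at the first differing byte, which an attacker can measure; `compare_digest` avoids that.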
Prerequisites
Python 3.14+
UV package manager
Redis
Google API key (for Gemini embeddings)
Quick Start
# Clone the repository
git clone https://github.com/your-org/arca-mcp.git
cd arca-mcp
# Install dependencies
uv sync --locked
# Configure environment
cp .env.example .env
# Edit .env with your ARCA_GOOGLE_API_KEY and ARCA_APP_AUTH_KEY
# Run the server
python -m app

The server starts on http://0.0.0.0:4201 by default, with the MCP endpoint at /app/mcp and the REST API at /v1.
Configuration
All settings are configured via environment variables with the ARCA_ prefix, or through a .env file.
| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| | | `0.0.0.0` | Server bind address |
| | | `4201` | Server port |
| | | | Uvicorn worker count |
| `ARCA_APP_AUTH_KEY` | | required | Bearer token for MCP authentication |
| | | | MCP transport |
| | | | Enable debug mode |
| | | | Maximum log message length |
| `ARCA_GOOGLE_API_KEY` | | required | Google API key for Gemini embeddings |
| | | | Gemini embedding model name |
| | | | Embedding vector dimensionality |
| | | | LanceDB storage directory |
| `ARCA_REDIS_HOST` | | | Redis host |
| | | | Redis port |
| | | | Redis database number for cache |
| | | | Redis password (optional) |
| | | | Default cache TTL in seconds (1 hour) |
| | | | Long cache TTL in seconds (7 days, used for embeddings) |
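app/core/config.py implements this with Pydantic BaseSettings; conceptually, the ARCA_ prefix resolution behaves like the stdlib sketch below (the setting names, casting, and defaults are illustrative, not the repo's code):

```python
import os

def load_setting(name: str, default=None, cast=str):
    """Resolve a setting from an ARCA_-prefixed environment variable,
    falling back to the given default. A real .env file would be read
    into os.environ before this runs."""
    raw = os.environ.get(f"ARCA_{name.upper()}")
    if raw is None:
        if default is None:
            raise RuntimeError(f"ARCA_{name.upper()} is required")
        return default
    return cast(raw)

os.environ["ARCA_PORT"] = "4201"
port = load_setting("port", cast=int)          # read from the environment
debug = load_setting("debug", default=False)   # falls back to the default
```

Required settings (no default, e.g. the auth key) raise at startup rather than letting the server run unauthenticated.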
MCP Tools
All tools are mounted under the memory namespace. Operations are scoped to the namespace provided via the X-Namespace HTTP header (defaults to "default").
memory/add
Store content in memory with a vector embedding.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `content` | | yes | Content to store |
| `bucket` | | no | Bucket name (defaults to `default`) |
| | | no | UUIDs of nodes to link at creation time |
| | | no | Parallel relationship labels for the linked-node UUIDs |
Returns: { "status": "Memory added", "memory_id": "<uuid>" }
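Over MCP's streamable-http transport, clients invoke tools with a JSON-RPC `tools/call` request. A hypothetical payload for this tool (the exact registered tool name and argument keys depend on how FastMCP mounts it and are assumptions here):

```python
import json

# Hypothetical tools/call request. "memory_add" and the argument
# names are assumptions, not taken from the repo.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_add",
        "arguments": {
            "content": "User prefers dark mode",
            "bucket": "preferences",
        },
    },
}
print(json.dumps(request, indent=2))
```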
memory/get
Retrieve memories via semantic similarity search.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | | yes | Natural language search query |
| `bucket` | | no | Filter by bucket |
| `top_k` | | no | Number of results |
Returns: { "status": "Memory retrieved", "results": [...] }
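Under the hood, semantic retrieval ranks stored embeddings by vector similarity. A minimal sketch of the idea using cosine similarity in pure Python (illustrative only; the server delegates this to LanceDB's vector index):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, memories, k=4):
    """Return the contents of the k stored items most similar to the query."""
    scored = [(cosine(query_vec, vec), content) for vec, content in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [content for _, content in scored[:k]]

memories = [
    ([1.0, 0.0], "dark mode preferred"),
    ([0.0, 1.0], "meeting at 3pm"),
    ([0.9, 0.1], "likes dim themes"),
]
print(top_k([1.0, 0.0], memories, k=2))
```

A brute-force scan like this is O(n) per query; LanceDB exists precisely to make this lookup fast at scale.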
memory/delete
Delete a specific memory by its UUID.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `memory_id` | | yes | UUID of the memory to delete |
Returns: { "status": "Memory deleted" }
memory/clear
Clear all memories in a bucket.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `bucket` | | no | Bucket to clear (defaults to `default`) |
Returns: { "status": "Memories cleared" }
memory/list_buckets
List all buckets in the current namespace.
Parameters: None
Returns: { "buckets": ["default", "work", ...] }
memory/connect
Create a directed edge between two memory nodes.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| | | yes | UUID of the source node |
| | | yes | UUID of the target node |
| | | yes | Edge label |
Returns: { "status": "Memories connected" }
memory/disconnect
Remove one or all directed edges between two nodes.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| | | yes | UUID of the source node |
| | | yes | UUID of the target node |
| | | no | If provided, only remove this edge label; otherwise remove all edges |
Returns: { "status": "Memories disconnected" }
memory/traverse
Traverse the knowledge graph starting from a node.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| | | yes | UUID of the starting node |
| | | no | Filter traversal to this edge label |
| | | no | Number of hops |
Returns: { "status": "Graph traversed", "results": [...] } — each result includes a _depth field.
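A traversal like this is essentially a breadth-first walk with a hop limit and an optional edge-label filter. A sketch over an in-memory adjacency map (illustrative, not the server's LanceDB-backed implementation; node IDs and edge labels are made up):

```python
from collections import deque

def traverse(edges, start, edge_type=None, max_depth=2):
    """BFS from `start`, following directed edges up to max_depth hops.

    `edges` maps node -> list of (label, target). Each reached node is
    returned with a _depth field, mirroring the tool's results.
    """
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, depth = queue.popleft()
        if depth > 0:
            results.append({"memory_id": node, "_depth": depth})
        if depth == max_depth:
            continue  # hop limit reached; do not expand further
        for label, target in edges.get(node, []):
            if edge_type is not None and label != edge_type:
                continue  # edge-label filter
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return results

edges = {
    "a": [("relates_to", "b"), ("blocks", "c")],
    "b": [("relates_to", "d")],
}
print(traverse(edges, "a", edge_type="relates_to", max_depth=2))
```

BFS guarantees `_depth` is the minimum number of hops from the start node, which is why results come back ordered by increasing depth.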
REST API
All REST endpoints are under /v1, require an Authorization: Bearer <token> header, and accept an optional X-Namespace header (defaults to "default").
Interactive API docs are available at /docs when the server is running.
| Method | Path | Description |
| --- | --- | --- |
| POST | `/v1/memories` | Add a memory |
| POST | `/v1/memories/search` | Semantic similarity search |
| DELETE | `/v1/memories/{memory_id}` | Delete a specific memory |
| DELETE | `/v1/memories` | Clear memories in a bucket |
| GET | `/v1/buckets` | List all buckets |
| | | Create a directed edge between two nodes |
| | | Remove edges between two nodes |
| | | Traverse the knowledge graph from a node |
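Besides curl, any HTTP client works; the only Arca-specific parts are the two headers. A stdlib sketch of building such a request (the helper name is illustrative, and error handling is omitted):

```python
import json
import urllib.request

BASE = "http://localhost:4201"

def arca_request(method, path, token, namespace="default", body=None):
    """Build an authenticated request for the Arca REST API."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(f"{BASE}{path}", data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("X-Namespace", namespace)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req

# With the server running:
# resp = urllib.request.urlopen(
#     arca_request("POST", "/v1/memories", "my-token",
#                  namespace="my_project",
#                  body={"content": "User prefers dark mode"}))
```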
Examples
All examples assume the server is running at localhost:4201. Replace $TOKEN with your ARCA_APP_AUTH_KEY.
Add a memory
curl -X POST http://localhost:4201/v1/memories \
-H "Authorization: Bearer $TOKEN" \
-H "X-Namespace: my_project" \
-H "Content-Type: application/json" \
-d '{"content": "User prefers dark mode", "bucket": "preferences"}'

{
  "status": "Memory added",
  "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

Search memories
curl -X POST http://localhost:4201/v1/memories/search \
-H "Authorization: Bearer $TOKEN" \
-H "X-Namespace: my_project" \
-H "Content-Type: application/json" \
-d '{"query": "what theme does the user like?", "top_k": 3}'

{
  "status": "Memory retrieved",
  "results": [
    {
      "memory_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "content": "User prefers dark mode",
      "bucket": "preferences"
    }
  ]
}

Delete a memory
curl -X DELETE http://localhost:4201/v1/memories/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
-H "Authorization: Bearer $TOKEN" \
-H "X-Namespace: my_project"

{
  "status": "Memory deleted"
}

Clear a bucket
curl -X DELETE "http://localhost:4201/v1/memories?bucket=preferences" \
-H "Authorization: Bearer $TOKEN" \
-H "X-Namespace: my_project"

{
  "status": "Memories cleared"
}

List buckets
curl http://localhost:4201/v1/buckets \
-H "Authorization: Bearer $TOKEN" \
-H "X-Namespace: my_project"

{
  "buckets": ["default", "preferences", "work"]
}

Other Endpoints
| Method | Path | Description |
| --- | --- | --- |
| GET | `/` | Index |
| GET | | Health check — returns status, version, uptime, exec ID |
| GET | `/docs` | Interactive OpenAPI documentation |
| | `/app/mcp` | MCP streamable-http endpoint |
Docker
# Build
docker build -t arca-mcp .
# Run
docker run -p 4201:4201 \
-e ARCA_APP_AUTH_KEY=your-secret-key \
-e ARCA_GOOGLE_API_KEY=your-google-api-key \
-e ARCA_REDIS_HOST=host.docker.internal \
arca-mcp

The Docker image uses Python 3.14 slim with UV for dependency management.
MCP Client Configuration
Example .mcp.json for connecting an MCP client (e.g., Claude Code):
{
  "mcpServers": {
    "arca_memory": {
      "type": "http",
      "url": "http://localhost:4201/app/mcp",
      "headers": {
        "Authorization": "Bearer <your-auth-key>",
        "X-Namespace": "my_namespace"
      }
    }
  }
}

Architecture
┌─ /app/mcp → FastMCP Auth → MCP Tool Handler ─┐
Request → FastAPI ├→ Gemini Embedding (Redis cache) → LanceDB
└─ /v1/* → Bearer Auth → REST Router ───────┘
↑
X-Namespace header (multi-tenancy)

Module Layout
app/
├── __main__.py # Uvicorn entry point
├── main.py # FastAPI app, lifespan, MCP mount, REST router
├── api/
│ ├── deps.py # Shared dependencies (auth, namespace extraction)
│ └── memory.py # REST API router for memory operations
├── context/
│ └── memory.py # MCP tool definitions (add, get, delete, clear, list_buckets)
├── core/
│ ├── config.py # Pydantic BaseSettings with ARCA_ env prefix
│ ├── db.py # LanceDB async connection management
│ ├── cache.py # Redis cache wrapper
│ ├── ai.py # Google Gemini AI client
│ └── log.py # Loguru logging configuration
├── schema/
│ ├── memory.py # REST API request/response models
│ └── status.py # Response models (HealthCheckResponse, IndexResponse)
└── util/
├── embeds.py # Embedding generation with Redis caching
└── memory.py # Core memory CRUD against LanceDB (PyArrow schema)
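util/embeds.py caches generated embeddings in Redis under the long TTL; the pattern is the classic cache-aside lookup keyed by a hash of the content. A dict-backed sketch (the key scheme here is an assumption; the real code uses Redis and the Gemini client):

```python
import hashlib

cache = {}  # stands in for Redis; real code would SETEX with the long TTL

def embed_cached(text, embed_fn):
    """Return a cached embedding, calling embed_fn only on cache misses."""
    key = "embed:" + hashlib.sha256(text.encode()).hexdigest()
    if key in cache:
        return cache[key]
    vector = embed_fn(text)
    cache[key] = vector
    return vector

calls = []
def fake_embed(text):
    """Stand-in for the Gemini embedding call; records each invocation."""
    calls.append(text)
    return [0.1, 0.2, 0.3]

embed_cached("hello", fake_embed)
embed_cached("hello", fake_embed)  # second call is served from the cache
print(len(calls))  # embed_fn ran once
```

Hashing the content keeps keys fixed-length and avoids putting raw memory text into Redis key names.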