Powers the server's RAG capabilities by providing text embeddings, document summarization, and context synthesis using local LLMs.
Utilizes SQLite with FTS5 and vector storage to manage and search document memories through both semantic similarity and full-text search.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type `@` followed by the MCP server name and your instructions, e.g., "@MCP Context Hub search my memory for the project requirements discussed last week".
That's it! The server will respond to your query, and you can continue using it as needed.
MCP Context Hub
Local MCP server (Node.js + TypeScript) for context optimization, RAG memory, semantic cache, and sub-MCP proxy. Designed to run on a machine with GPU (RTX 3060 Ti) + Ollama, acting as a single MCP endpoint for Claude.
Architecture
Features
context_pack — Combines semantic + text search, deduplication, and LLM synthesis into a structured context bundle (summary, facts, next actions)
memory_search — Semantic similarity search over stored documents using vector embeddings
memory_upsert — Store documents with automatic chunking, embedding, and indexing
context_compress — Compress text into bullets, JSON, steps, or summary format to reduce token usage
proxy_call — Call tools on sub-MCP servers (e.g., filesystem) with optional post-processing (summarize, compress)
Requirements
Node.js >= 20
Ollama with the following models:
llama3.1:8b-instruct-q4_K_M (primary chat)
qwen2.5:7b-instruct-q4_K_M (fallback chat)
nomic-embed-text:v1.5 (embeddings, 768 dims)
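Assuming Ollama is already installed, the required models can be pulled ahead of time:

```shell
ollama pull llama3.1:8b-instruct-q4_K_M
ollama pull qwen2.5:7b-instruct-q4_K_M
ollama pull nomic-embed-text:v1.5
```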
Quick Start
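A minimal quick start, assuming the standard npm scripts of a TypeScript project (check package.json for the actual script names):

```shell
npm install            # install dependencies
cp .env.example .env   # configure environment variables
npm run build          # compile TypeScript
npm start              # launch the server
```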
Or use the setup script:
Usage
Health Check
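A sketch of a health check, assuming a conventional /health endpoint; 3000 is only a placeholder for the configured port:

```shell
curl http://localhost:3000/health
```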
MCP Protocol
The server uses Streamable HTTP transport at /mcp. Initialize a session first:
Then call tools using the mcp-session-id header from the response:
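A sketch of the two-step flow with curl, assuming a placeholder port of 3000 (add an `Authorization: Bearer <token>` header if authentication is configured; tool argument names are illustrative, not the project's documented schema):

```shell
# 1. Initialize a session; the response carries an mcp-session-id header
curl -i http://localhost:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl","version":"0.0.1"}}}'

# 2. Call a tool, passing the session id from the previous response
curl http://localhost:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'mcp-session-id: <session-id-from-initialize>' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"memory_search","arguments":{"query":"project requirements"}}}'
```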
Sub-MCP Proxy
Configure sub-MCP servers via the PROXY_SERVERS environment variable:
Then call tools on them via proxy_call:
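A sketch of both steps. The exact PROXY_SERVERS JSON schema and the proxy_call argument names are defined by this project, so the field names below ("command", "args", "server", "tool", "post") are assumptions:

```shell
# Illustrative PROXY_SERVERS value spawning a filesystem sub-MCP server
export PROXY_SERVERS='{"filesystem":{"command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","/data"]}}'

# Then, inside an MCP session, call a sub-server tool through proxy_call
# with optional post-processing (port and session id are placeholders)
curl http://localhost:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -H 'mcp-session-id: <session-id>' \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"proxy_call","arguments":{"server":"filesystem","tool":"read_file","args":{"path":"/data/notes.md"},"post":"summarize"}}}'
```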
Configuration
All settings via environment variables (see .env.example):
| Variable | Default | Description |
| --- | --- | --- |
|  |  | Bearer token for authentication |
|  |  | Comma-separated allowed IPs |
|  |  | Ollama API URL |
|  | llama3.1:8b-instruct-q4_K_M | Primary chat model |
|  | qwen2.5:7b-instruct-q4_K_M | Fallback chat model |
|  | nomic-embed-text:v1.5 | Embedding model |
|  |  | Server port |
|  |  | Server host |
|  |  | SQLite database path |
|  |  | Cache TTL (5 minutes) |
|  |  | Max cache entries |
|  |  | Log level (debug, info, warn, error) |
| PROXY_SERVERS |  | Sub-MCP server configs (JSON) |
Commands
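Typical commands, assuming standard npm scripts for a TypeScript project tested with Vitest (the names are assumptions; check package.json):

```shell
npm run dev     # run with live reload during development
npm run build   # compile TypeScript
npm test        # run the Vitest suite
```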
Project Structure
Tech Stack
Runtime: Node.js 20, TypeScript
MCP SDK: @modelcontextprotocol/sdk v1.26
HTTP: Express v5 + Streamable HTTP transport
Database: SQLite (better-sqlite3) with WAL mode, FTS5
Embeddings: Ollama nomic-embed-text:v1.5 (768 dimensions)
Chat: Ollama with automatic model fallback
Validation: Zod v4
Logging: Pino
Testing: Vitest
License
MIT