
# MCP Context Hub

Local MCP server (Node.js + TypeScript) for context optimization, RAG memory, semantic caching, and sub-MCP proxying. Designed to run on a GPU-equipped machine (e.g., an RTX 3060 Ti) with Ollama, acting as a single MCP endpoint for Claude.

## Architecture

```
        Claude (Remote)
               |
               |  HTTP POST/GET/DELETE + Bearer Token
               |
   +-----------v-----------+
   |    Express (:3100)    |
   |  Auth + IP Allowlist  |
   +-----------+-----------+
               |
   +-----------v-----------+
   |   McpServer (SDK v1)  |
   |                       |
   |  Tools:               |
   |    context_pack       |
   |    memory_search      |
   |    memory_upsert      |
   |    context_compress   |
   |    proxy_call         |
   +-+------+------+-----+-+
     |      |      |     |
     v      v      v     v
  Ollama  SQLite  Cache  ProxyMgr
  Client  Vector  LRU    (stdio
  (chat   Store   +TTL   sub-MCP)
  +embed  +FTS5
  +fallback)
```

## Features

- `context_pack` — Combines semantic + text search, deduplication, and LLM synthesis into a structured context bundle (summary, facts, next actions)
- `memory_search` — Semantic similarity search over stored documents using vector embeddings
- `memory_upsert` — Stores documents with automatic chunking, embedding, and indexing
- `context_compress` — Compresses text into bullet, JSON, step, or summary format to reduce token usage
- `proxy_call` — Calls tools on sub-MCP servers (e.g., filesystem) with optional post-processing (summarize, compress)
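`memory_search` ranks stored chunks by cosine similarity between the query embedding and each chunk's embedding. A minimal sketch of that scoring step, under the assumption of a brute-force scan (illustrative only, not the repo's actual `db/cosine.ts` code):

```typescript
// Cosine similarity between two embedding vectors (e.g., 768-dim
// nomic-embed-text outputs). Returns a value in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k over stored chunks, as a SQLite-backed store
// without a dedicated vector index might do it.
function topK(
  query: number[],
  chunks: { id: string; embedding: number[] }[],
  k: number
): { id: string; score: number }[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A brute-force scan like this stays fast enough for the modest corpus sizes a local memory store typically holds, which is why no separate vector index is needed.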

## Requirements

- Node.js >= 20
- Ollama with the following models:
  - `llama3.1:8b-instruct-q4_K_M` (primary chat)
  - `qwen2.5:7b-instruct-q4_K_M` (fallback chat)
  - `nomic-embed-text:v1.5` (embeddings, 768 dims)

## Quick Start

```bash
# 1. Clone and install
git clone https://github.com/DiegoNogueiraDev/mcp-context-hub.git
cd mcp-context-hub
npm install

# 2. Pull Ollama models
ollama pull llama3.1:8b-instruct-q4_K_M
ollama pull qwen2.5:7b-instruct-q4_K_M
ollama pull nomic-embed-text:v1.5

# 3. Configure environment
cp .env.example .env
# Edit .env and set MCP_AUTH_TOKEN to a secure random value

# 4. Start the server
npm run dev
```

Or use the setup script:

```bash
chmod +x scripts/setup.sh
./scripts/setup.sh
npm run dev
```

## Usage

### Health Check

```bash
curl http://localhost:3100/health
# {"status":"healthy","timestamp":"..."}
```

### MCP Protocol

The server uses the Streamable HTTP transport at `/mcp`. Initialize a session first:

```bash
# Initialize session
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer <your-token>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-03-26",
      "capabilities": {},
      "clientInfo": { "name": "my-client", "version": "1.0.0" }
    },
    "id": 1
  }'
```

Then call tools using the `mcp-session-id` header from the response:

```bash
# Store a document
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "memory_upsert",
      "arguments": {
        "document_id": "my-doc",
        "content": "Your document text here...",
        "scope": "project",
        "tags": ["example"]
      }
    },
    "id": 2
  }'

# Search memories
curl -X POST http://localhost:3100/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "mcp-session-id: <session-id>" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "memory_search",
      "arguments": { "query": "your search query", "top_k": 5 }
    },
    "id": 3
  }'
```
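The same requests can be built from Node.js. A hedged sketch of the payload and header construction (the helper names `jsonRpc` and `mcpHeaders` are illustrative, not part of this repo):

```typescript
// Illustrative JSON-RPC 2.0 payload builder for the /mcp endpoint.
function jsonRpc(method: string, params: Record<string, unknown>, id: number) {
  return { jsonrpc: "2.0", method, params, id };
}

// Headers for an authenticated, session-bound request. The session id
// comes from the initialize response; omit it for the initialize call.
function mcpHeaders(token: string, sessionId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
    Authorization: `Bearer ${token}`,
  };
  if (sessionId) headers["mcp-session-id"] = sessionId;
  return headers;
}

// Mirrors the memory_search curl example above.
const searchBody = JSON.stringify(
  jsonRpc("tools/call", {
    name: "memory_search",
    arguments: { query: "your search query", top_k: 5 },
  }, 3)
);
```

With Node 18+ these can be passed straight to the global `fetch` against `http://localhost:3100/mcp`; read the response as text, since the Streamable HTTP transport may answer as an SSE stream.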

### Sub-MCP Proxy

Configure sub-MCP servers via the `PROXY_SERVERS` environment variable:

```bash
PROXY_SERVERS='{"filesystem":{"command":"node","args":["node_modules/@modelcontextprotocol/server-filesystem/dist/index.js","/tmp"]}}' npm run dev
```

Then call tools on them via `proxy_call`:

```json
{
  "name": "proxy_call",
  "arguments": {
    "server": "filesystem",
    "tool": "read_file",
    "arguments": { "path": "/tmp/example.txt" },
    "post_process": "none"
  }
}
```
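Since `PROXY_SERVERS` is a JSON string, a config loader can parse and validate it before spawning any stdio child processes. A rough sketch of that step, under assumed config shapes (not the repo's actual `proxy-manager.ts`):

```typescript
// Assumed shape of one sub-MCP server entry in PROXY_SERVERS.
interface ProxyServerConfig {
  command: string;
  args?: string[];
}

// Parse the PROXY_SERVERS env var into named configs.
// Throws on malformed JSON or a missing `command` field.
function parseProxyServers(raw: string | undefined): Record<string, ProxyServerConfig> {
  if (!raw) return {}; // matches the documented default of {}
  const parsed = JSON.parse(raw) as Record<string, ProxyServerConfig>;
  for (const [name, cfg] of Object.entries(parsed)) {
    if (typeof cfg.command !== "string") {
      throw new Error(`proxy server "${name}" is missing a command`);
    }
  }
  return parsed;
}
```

At startup this would be called as `parseProxyServers(process.env.PROXY_SERVERS)`, and each resulting `command`/`args` pair handed to a stdio transport.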

## Configuration

All settings are supplied via environment variables (see `.env.example`):

| Variable | Default | Description |
| --- | --- | --- |
| `MCP_AUTH_TOKEN` | (none) | Bearer token for authentication |
| `MCP_ALLOWED_IPS` | `127.0.0.1,::1` | Comma-separated allowed IPs |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API URL |
| `PRIMARY_MODEL` | `llama3.1:8b-instruct-q4_K_M` | Primary chat model |
| `FALLBACK_MODEL` | `qwen2.5:7b-instruct-q4_K_M` | Fallback chat model |
| `EMBEDDING_MODEL` | `nomic-embed-text:v1.5` | Embedding model |
| `PORT` | `3100` | Server port |
| `HOST` | `0.0.0.0` | Server host |
| `DB_PATH` | `./data/context-hub.db` | SQLite database path |
| `CACHE_TTL_MS` | `300000` | Cache TTL (5 minutes) |
| `CACHE_MAX_ENTRIES` | `100` | Max cache entries |
| `LOG_LEVEL` | `info` | Log level (`debug`, `info`, `warn`, `error`) |
| `PROXY_SERVERS` | `{}` | Sub-MCP server configs (JSON) |

## Commands

```bash
npm run dev        # Start dev server (HTTP on :3100)
npm run dev:stdio  # Start in stdio mode (for local MCP testing)
npm run build      # Compile TypeScript
npm start          # Run compiled output
npm test           # Run tests (31 tests, 6 files)
npm run typecheck  # Type-check without emitting
npm run health     # Run health check script
```

## Project Structure

```
src/
  config.ts                  # Environment configuration
  index.ts                   # Entry point + graceful shutdown
  db/
    connection.ts            # SQLite singleton (WAL mode)
    migrations.ts            # Table definitions (documents, chunks, FTS5, audit)
    cosine.ts                # Cosine similarity + embedding serialization
  server/
    mcp-server.ts            # McpServer setup + tool registration
    transport.ts             # Express + Streamable HTTP transport
    session.ts               # Session management
    middleware/
      auth.ts                # Bearer token validation
      ip-allowlist.ts        # IP restriction
      audit.ts               # Tool call logging
  tools/
    schemas.ts               # Zod schemas for all tools
    context-pack.ts          # context_pack implementation
    memory-search.ts         # memory_search implementation
    memory-upsert.ts         # memory_upsert implementation
    context-compress.ts      # context_compress implementation
    proxy-call.ts            # proxy_call implementation
  services/
    ollama-client.ts         # Ollama API (chat + embed + fallback)
    sqlite-vector-store.ts   # Vector store (SQLite + brute-force cosine)
    text-search.ts           # FTS5 full-text search
    chunker.ts               # Recursive text splitter
    dedup.ts                 # Content hashing + Jaccard dedup
    semantic-cache.ts        # LRU + TTL in-memory cache
    proxy-manager.ts         # Sub-MCP stdio connections
  utils/
    logger.ts                # Pino structured logging
    metrics.ts               # In-memory call metrics
    retry.ts                 # Exponential backoff retry
    tokens.ts                # Token estimation
  types/
    index.ts                 # Type re-exports
    ollama.ts                # Ollama API types
    vector-store.ts          # VectorStore interface
tests/
  unit/                      # cosine, chunker, dedup, cache
  integration/               # sqlite vector store
  e2e/                       # Express server
```
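The semantic cache listed above combines an LRU eviction policy with a TTL, as the `CACHE_TTL_MS` and `CACHE_MAX_ENTRIES` settings suggest. A minimal in-memory sketch of that combination (illustrative only; the real `semantic-cache.ts` may differ, e.g., in how it keys entries):

```typescript
// LRU + TTL cache: entries expire after ttlMs, and the least recently
// used entry is evicted once maxEntries is exceeded. A Map preserves
// insertion order, so re-inserting a key on access tracks recency.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private maxEntries: number, private ttlMs: number) {}

  get(key: string, now = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: drop and miss
      return undefined;
    }
    // Refresh recency by moving the key to the end of the Map.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, now = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Evict the oldest (least recently used) key.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}
```

The optional `now` parameters exist purely so expiry and eviction can be exercised deterministically; production callers would rely on the `Date.now()` defaults.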

## Tech Stack

- **Runtime**: Node.js 20, TypeScript
- **MCP SDK**: `@modelcontextprotocol/sdk` v1.26
- **HTTP**: Express v5 + Streamable HTTP transport
- **Database**: SQLite (better-sqlite3) with WAL mode, FTS5
- **Embeddings**: Ollama `nomic-embed-text:v1.5` (768 dimensions)
- **Chat**: Ollama with automatic model fallback
- **Validation**: Zod v4
- **Logging**: Pino
- **Testing**: Vitest
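The automatic model fallback noted above can be sketched as a small wrapper that retries a failed primary-model call against the fallback model. This is an assumption about the mechanism, not the repo's actual `ollama-client.ts`; the `chat` callback stands in for a real Ollama API call, and the default model names are the repo's documented defaults:

```typescript
// Try the primary chat model first; on any failure (timeout, OOM,
// model not loaded), retry the same prompt with the fallback model.
async function chatWithFallback(
  chat: (model: string, prompt: string) => Promise<string>,
  prompt: string,
  primary = "llama3.1:8b-instruct-q4_K_M",
  fallback = "qwen2.5:7b-instruct-q4_K_M"
): Promise<{ model: string; reply: string }> {
  try {
    return { model: primary, reply: await chat(primary, prompt) };
  } catch {
    return { model: fallback, reply: await chat(fallback, prompt) };
  }
}
```

Reporting which model actually answered lets callers log degraded responses rather than silently mixing model outputs.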

## License

MIT

