MCP Context Hub
Local MCP server (Node.js + TypeScript) for context optimization, RAG memory, semantic caching, and sub-MCP proxying. Designed to run on a machine with a GPU (RTX 3060 Ti) and Ollama, acting as a single MCP endpoint for Claude.
Architecture
Claude (Remote)
|
HTTP POST/GET/DELETE + Bearer Token
|
+-----------v-----------+
| Express (:3100) |
| Auth + IP Allowlist |
+-----------+-----------+
|
+-----------v-----------+
| McpServer (SDK v1) |
| |
| Tools: |
| context_pack |
| memory_search |
| memory_upsert |
| context_compress |
| proxy_call |
+-+------+------+-----+-+
| | | |
v v v v
Ollama SQLite Cache ProxyMgr
Client Vector LRU (stdio
(chat Store +TTL sub-MCP)
+embed +FTS5
 +fallback)

Features
context_pack — Combines semantic + text search, deduplication, and LLM synthesis into a structured context bundle (summary, facts, next actions)
memory_search — Semantic similarity search over stored documents using vector embeddings
memory_upsert — Store documents with automatic chunking, embedding, and indexing
context_compress — Compress text into bullets, JSON, steps, or summary format to reduce token usage
proxy_call — Call tools on sub-MCP servers (e.g., filesystem) with optional post-processing (summarize, compress)
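memory_search ranks stored chunks by cosine similarity between the query embedding and each chunk embedding, scanned brute-force over the SQLite vector store. A standalone sketch of that ranking (names are hypothetical; the real helpers live in src/db/cosine.ts and src/services/sqlite-vector-store.ts):

```typescript
// Cosine similarity between two embedding vectors
// (e.g. the 768-dim outputs of nomic-embed-text:v1.5).
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k over stored chunk embeddings, as the
// SQLite vector store does conceptually.
function topK(
  query: number[],
  chunks: { id: string; emb: number[] }[],
  k: number,
) {
  return chunks
    .map((c) => ({ id: c.id, score: cosine(query, c.emb) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Brute-force scanning is a reasonable choice at this scale; an ANN index only starts paying off with much larger corpora.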
Requirements
Node.js >= 20
Ollama with the following models:
llama3.1:8b-instruct-q4_K_M (primary chat)
qwen2.5:7b-instruct-q4_K_M (fallback chat)
nomic-embed-text:v1.5 (embeddings, 768 dims)
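When the primary chat model fails, the server retries with exponential backoff and then falls back to the secondary model. A minimal sketch of that pattern (the real logic is in src/utils/retry.ts and the Ollama client; attempt counts and delays here are illustrative, not the project's actual values):

```typescript
// Retry an async operation with exponential backoff: 200ms, 400ms, 800ms, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}

// Try the primary chat model first; if all retries fail, fall back.
async function chatWithFallback(
  chat: (model: string, prompt: string) => Promise<string>,
  prompt: string,
): Promise<string> {
  try {
    return await withRetry(() => chat("llama3.1:8b-instruct-q4_K_M", prompt));
  } catch {
    return await withRetry(() => chat("qwen2.5:7b-instruct-q4_K_M", prompt));
  }
}
```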
Quick Start
# 1. Clone and install
git clone https://github.com/DiegoNogueiraDev/mcp-context-hub.git
cd mcp-context-hub
npm install
# 2. Pull Ollama models
ollama pull llama3.1:8b-instruct-q4_K_M
ollama pull qwen2.5:7b-instruct-q4_K_M
ollama pull nomic-embed-text:v1.5
# 3. Configure environment
cp .env.example .env
# Edit .env and set MCP_AUTH_TOKEN to a secure random value
# 4. Start the server
npm run dev

Or use the setup script:
chmod +x scripts/setup.sh
./scripts/setup.sh
npm run dev

Usage
Health Check
curl http://localhost:3100/health
# {"status":"healthy","timestamp":"..."}

MCP Protocol
The server uses Streamable HTTP transport at /mcp. Initialize a session first:
# Initialize session
curl -X POST http://localhost:3100/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-H "Authorization: Bearer <your-token>" \
-d '{
"jsonrpc": "2.0",
"method": "initialize",
"params": {
"protocolVersion": "2025-03-26",
"capabilities": {},
"clientInfo": { "name": "my-client", "version": "1.0.0" }
},
"id": 1
}'

Then call tools using the mcp-session-id header from the response:
# Store a document
curl -X POST http://localhost:3100/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-H "Authorization: Bearer <your-token>" \
-H "mcp-session-id: <session-id>" \
-d '{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "memory_upsert",
"arguments": {
"document_id": "my-doc",
"content": "Your document text here...",
"scope": "project",
"tags": ["example"]
}
},
"id": 2
}'
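Before embedding, memory_upsert splits the document into chunks with a recursive text splitter. A simplified sketch (the actual implementation is src/services/chunker.ts; the separators and the 512-character limit are illustrative, not the project's real settings):

```typescript
// Recursively split text: try coarse separators first (paragraphs),
// then finer ones (lines, sentences, words), and only hard-split
// when no separator brings a piece under the size limit.
function chunkText(
  text: string,
  maxLen = 512,
  separators = ["\n\n", "\n", ". ", " "],
): string[] {
  if (text.length <= maxLen) return text.trim() ? [text.trim()] : [];
  const [sep, ...rest] = separators;
  if (sep === undefined) {
    // No separator left: hard split into fixed-size windows.
    const out: string[] = [];
    for (let i = 0; i < text.length; i += maxLen) {
      out.push(text.slice(i, i + maxLen));
    }
    return out;
  }
  // Split on the coarsest separator, recursing into oversized pieces.
  const chunks: string[] = [];
  for (const piece of text.split(sep)) {
    chunks.push(...chunkText(piece, maxLen, rest));
  }
  return chunks.filter((c) => c.length > 0);
}
```

Each resulting chunk is then embedded and indexed individually (vector store plus FTS5).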
# Search memories
curl -X POST http://localhost:3100/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-H "Authorization: Bearer <your-token>" \
-H "mcp-session-id: <session-id>" \
-d '{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "memory_search",
"arguments": {
"query": "your search query",
"top_k": 5
}
},
"id": 3
}'

Sub-MCP Proxy
Configure sub-MCP servers via the PROXY_SERVERS environment variable:
PROXY_SERVERS='{"filesystem":{"command":"node","args":["node_modules/@modelcontextprotocol/server-filesystem/dist/index.js","/tmp"]}}' npm run dev

Then call tools on them via proxy_call:
{
"name": "proxy_call",
"arguments": {
"server": "filesystem",
"tool": "read_file",
"arguments": { "path": "/tmp/example.txt" },
"post_process": "none"
}
}

Configuration
All settings via environment variables (see .env.example):
Bearer token for authentication (MCP_AUTH_TOKEN)
Comma-separated allowed IPs
Ollama API URL
Primary chat model
Fallback chat model
Embedding model
Server port (default 3100)
Server host
SQLite database path
Cache TTL (default 5 minutes)
Max cache entries
Log level (debug, info, warn, error)
Sub-MCP server configs as JSON (PROXY_SERVERS)
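Tool responses are additionally served from an in-memory semantic cache with LRU eviction and TTL expiry (src/services/semantic-cache.ts). A minimal sketch of that cache policy, with illustrative capacity and TTL values:

```typescript
// LRU + TTL cache: entries expire after ttlMs, and once maxEntries
// is exceeded the least-recently-used entry is evicted. Relies on
// Map preserving insertion order.
class LruTtlCache<V> {
  private map = new Map<string, { value: V; expires: number }>();
  constructor(private maxEntries: number, private ttlMs: number) {}

  get(key: string): V | undefined {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) {
      // Expired: evict lazily on read.
      this.map.delete(key);
      return undefined;
    }
    // Re-insert to mark as most recently used.
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, { value, expires: Date.now() + this.ttlMs });
    if (this.map.size > this.maxEntries) {
      // First key in insertion order is the least recently used.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }
}
```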
Commands
npm run dev # Start dev server (HTTP on :3100)
npm run dev:stdio # Start in stdio mode (for local MCP testing)
npm run build # Compile TypeScript
npm start # Run compiled output
npm test # Run tests (31 tests, 6 files)
npm run typecheck # Type-check without emitting
npm run health # Run health check script

Project Structure
src/
config.ts # Environment configuration
index.ts # Entry point + graceful shutdown
db/
connection.ts # SQLite singleton (WAL mode)
migrations.ts # Table definitions (documents, chunks, FTS5, audit)
cosine.ts # Cosine similarity + embedding serialization
server/
mcp-server.ts # McpServer setup + tool registration
transport.ts # Express + Streamable HTTP transport
session.ts # Session management
middleware/
auth.ts # Bearer token validation
ip-allowlist.ts # IP restriction
audit.ts # Tool call logging
tools/
schemas.ts # Zod schemas for all tools
context-pack.ts # context_pack implementation
memory-search.ts # memory_search implementation
memory-upsert.ts # memory_upsert implementation
context-compress.ts # context_compress implementation
proxy-call.ts # proxy_call implementation
services/
ollama-client.ts # Ollama API (chat + embed + fallback)
sqlite-vector-store.ts # Vector store (SQLite + brute-force cosine)
text-search.ts # FTS5 full-text search
chunker.ts # Recursive text splitter
dedup.ts # Content hashing + Jaccard dedup
semantic-cache.ts # LRU + TTL in-memory cache
proxy-manager.ts # Sub-MCP stdio connections
utils/
logger.ts # Pino structured logging
metrics.ts # In-memory call metrics
retry.ts # Exponential backoff retry
tokens.ts # Token estimation
types/
index.ts # Type re-exports
ollama.ts # Ollama API types
vector-store.ts # VectorStore interface
tests/
unit/ # cosine, chunker, dedup, cache
integration/ # sqlite vector store
e2e/ # Express server

Tech Stack
Runtime: Node.js 20, TypeScript
MCP SDK: @modelcontextprotocol/sdk v1.26
HTTP: Express v5 + Streamable HTTP transport
Database: SQLite (better-sqlite3) with WAL mode and FTS5
Embeddings: Ollama nomic-embed-text:v1.5 (768 dimensions)
Chat: Ollama with automatic model fallback
Validation: Zod v4
Logging: Pino
Testing: Vitest
License
MIT