Nexus-MCP

The only MCP server with hybrid search + code graph + semantic memory — fully local.

Nexus-MCP is a unified, local-first code intelligence server built for the Model Context Protocol. It combines vector search, BM25 keyword search, and structural graph analysis into a single process — giving AI agents precise, token-efficient code understanding without cloud dependencies.


Why Nexus-MCP?

AI coding agents waste tokens. A lot of them. Every time an agent reads full files to find a function, grep-searches for keywords that miss semantic intent, or makes multiple tool calls across disconnected servers — tokens burn. Nexus-MCP fixes this.

Token Efficiency: The Numbers

| Scenario | Without Nexus | With Nexus | Savings |
|---|---|---|---|
| Find relevant code (agent reads 5–10 files manually) | 5,000–15,000 tokens | 500–2,000 tokens (summary mode) | 70–90% |
| Understand a symbol (grep + read file + read callers) | 3,000–8,000 tokens across 3–5 tool calls | 800–2,000 tokens in 1 explain call | 60–75% |
| Assess change impact (manual trace through codebase) | 10,000–20,000 tokens | 1,000–3,000 tokens via impact tool | 80–85% |
| Tool descriptions in context (2 separate MCP servers) | ~1,700 tokens (17 tools) | ~1,000 tokens (15 consolidated) | 40% |
| Search precision (keyword-only misses, needs retries) | 2–3 searches × 2,000 tokens | 1 hybrid search × 1,500 tokens | 60–75% |

Estimated savings per coding session: 15,000–40,000 tokens (30–60% reduction) compared to standalone agentic file browsing.
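The last two rows of the table can be checked by hand. The snippet below is a quick sanity check, assuming "savings" means (before - after) / before; the function name is illustrative and not part of Nexus-MCP.

```python
def savings_pct(before_tokens: float, after_tokens: float) -> float:
    """Percentage of tokens saved when usage drops from before to after."""
    return (before_tokens - after_tokens) / before_tokens * 100


# Tool descriptions: ~1,700 tokens (two servers) vs ~1,000 (consolidated).
tool_desc = savings_pct(1_700, 1_000)

# Search precision: 2-3 keyword searches at 2,000 tokens each
# vs one hybrid search at 1,500 tokens.
search_low = savings_pct(2 * 2_000, 1_500)
search_high = savings_pct(3 * 2_000, 1_500)

print(f"tool descriptions: {tool_desc:.0f}%")
print(f"search precision: {search_low:.1f}% to {search_high:.1f}%")
```

The results land close to the rounded figures in the table (about 41%, and 62.5% to 75%).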

Three Verbosity Levels

Every tool respects a token budget — agents request only the detail they need:

| Level | Budget | What's Returned | Use Case |
|---|---|---|---|
| summary | ~500 tokens | Counts, scores, file:line pointers | Quick lookups, triage |
| detailed | ~2,000 tokens | Signatures, types, line ranges, docstrings | Normal development |
| full | ~8,000 tokens | Full code snippets, relationships, metadata | Deep analysis |
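To make the idea of a per-level budget concrete, here is a minimal sketch of budget-based truncation. It is not Nexus-MCP's implementation: the 4-characters-per-token estimate and the function names are assumptions for illustration, and only the three budget figures come from the table above.

```python
# Budgets from the table above; everything else here is a hypothetical sketch.
BUDGETS = {"summary": 500, "detailed": 2_000, "full": 8_000}


def estimate_tokens(text: str) -> int:
    # Rough heuristic (an assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(results: list[str], verbosity: str = "summary") -> list[str]:
    """Keep results in ranked order until the level's budget is used up."""
    budget = BUDGETS[verbosity]
    kept, used = [], 0
    for item in results:
        cost = estimate_tokens(item)
        if used + cost > budget:
            break  # the next result would exceed the budget
        kept.append(item)
        used += cost
    return kept


# Ten ranked results of roughly 300 tokens each.
results = ["x" * 1_200] * 10
for level in BUDGETS:
    print(level, len(fit_to_budget(results, level)))
```

With these inputs, summary keeps one result, detailed keeps six, and full keeps all ten.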

vs. Standalone Agentic Development (No Code MCP)

Without a code intelligence server, AI agents must:

  • Read entire files to find one function (~500–2,000 tokens/file, often 5–10 files per query)

  • Grep for keywords that miss semantic intent ("auth" won't find "verify_credentials")

  • Manually trace call chains by reading file after file

  • Lose all context between sessions — no persistent memory

Nexus-MCP replaces this with targeted retrieval: semantic search returns the exact chunks needed, graph queries trace relationships instantly, and memory persists across sessions.

vs. Competitor MCP Servers

| Feature | Nexus-MCP | Sourcegraph MCP | Greptile MCP | GitHub MCP | tree-sitter MCP |
|---|---|---|---|---|---|
| Local / private | Yes | No (infra required) | No (cloud) | No (cloud) | Yes |
| Semantic search | Yes (embeddings) | No (keyword) | Yes (LLM-based) | No (keyword) | No |
| Keyword search | Yes (BM25) | Yes | N/A | Yes | No |
| Hybrid fusion | Yes (RRF) | No | No | No | No |
| Code graph | Yes (rustworkx) | Yes (SCIP) | No | No | No |
| Re-ranking | Yes (FlashRank) | No | N/A | No | No |
| Semantic memory | Yes (6 types) | No | No | No | No |
| Change impact | Yes | Partial | No | No | No |
| Token budgeting | Yes (3 levels) | No | No | No | No |
| Languages | 25+ | 30+ | Many | Many | Many |
| Cost | Free | $$$ | $40/mo | $10–39/mo | Free |
| API keys needed | No | Yes | Yes | Yes | No |

vs. AI Code Tools (Cursor, Copilot, Cody, etc.)

| Capability | Nexus-MCP | Cursor | Copilot @workspace | Sourcegraph Cody | Continue.dev | Aider |
|---|---|---|---|---|---|---|
| IDE-agnostic | Yes | No | No | No | No | Yes |
| MCP-native | Yes | Partial | No | No | Yes (client) | No |
| Fully local | Yes | Partial | No | Partial | Yes | Yes |
| Hybrid search | Yes | Unknown | Unknown | Keyword | Yes | No |
| Code graph | Yes | Unknown | Unknown | Yes (SCIP) | Basic | No |
| Semantic memory | Yes (persistent) | No | No | No | No | No |
| Token-budgeted responses | Yes | N/A | N/A | N/A | N/A | N/A |
| Open source | Yes (MIT) | No | No | Partial | Yes | Yes |
| Cost | Free | $20–40/mo | $10–39/mo | $0–49/mo | Free | Free |

Nexus-MCP's unique combination: No other tool delivers hybrid search + code graph + semantic memory + token budgeting + full privacy in a single MCP server.


Key Features

  • Hybrid search — Vector (semantic) + BM25 (keyword) + graph (structural) fused via Reciprocal Rank Fusion, then re-ranked with FlashRank

  • Code graph — Structural analysis via rustworkx: callers, callees, imports, inheritance, change impact

  • Dual parsing — tree-sitter (symbol extraction) + ast-grep (structural relationships), 25+ languages

  • Semantic memory — Persistent knowledge store with TTL expiration, 6 memory types, semantic recall

  • Explain & Impact — "What does this do?" and "What breaks if I change it?" in single tool calls

  • Token-budgeted responses — Three verbosity levels (summary/detailed/full) keep context windows lean

  • Multi-folder indexing — Index multiple directories in one call, processed folder-by-folder with shared engines

  • Incremental indexing — Only re-processes changed files; file watcher support

  • Multi-model embeddings — 2 models (jina-code default, bge-small-en), GPU/MPS auto-detection

  • Low memory — <350MB RAM target (ONNX Runtime ~50MB, mmap vectors, lazy model loading)

  • Fully local — Zero cloud dependencies, no API keys, all processing on your machine

  • 15 tools, one server — Consolidates what previously required 2 MCP servers (17 tools) into one
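The "incremental indexing" feature above relies on knowing which files changed since the last run. One common way to do that is to compare content hashes against a stored manifest. The sketch below illustrates that general technique only. Whether Nexus-MCP uses hashes, modification times, or something else is not stated here, and the function names are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def plan_reindex(previous: dict[str, str], current_files: dict[str, bytes]):
    """Compare the stored manifest (path -> hash) with the files on disk.

    Returns (changed_or_new_paths, removed_paths, new_manifest).
    """
    current = {path: content_hash(data) for path, data in current_files.items()}
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current


# First run: everything is new.
v1 = {"a.py": b"def a(): pass\n", "b.py": b"def b(): pass\n"}
_, _, manifest = plan_reindex({}, v1)

# Second run: a.py edited, b.py deleted, c.py added.
v2 = {"a.py": b"def a(): return 1\n", "c.py": b"def c(): pass\n"}
changed, removed, manifest = plan_reindex(manifest, v2)
print(changed, removed)
```

Only the paths in `changed` need to be re-parsed and re-embedded, and chunks belonging to `removed` paths can be dropped from the index.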

Prerequisites

  • Python 3.10+ (tested on 3.10, 3.11, 3.12)

  • pip (comes with Python)

Install

Option 1: From PyPI

pip install nexus-mcp-ci

With optional extras:

# With GPU (CUDA) support
pip install nexus-mcp-ci[gpu]

# With FlashRank reranker for better search quality
pip install nexus-mcp-ci[reranker]

# Both
pip install nexus-mcp-ci[gpu,reranker]

Option 2: From source (for development)

git clone https://github.com/jaggernaut007/Nexus-MCP.git
cd Nexus-MCP

# Setup script (creates venv, installs, verifies)
./setup.sh

# Or manual install with dev deps
pip install -e ".[dev]"

Note: The default embedding model (jina-code) requires ONNX Runtime. This is included automatically. If you see errors about missing ONNX/Optimum, run:

pip install "sentence-transformers[onnx]" "optimum[onnxruntime]>=1.19.0"

To use a lighter model that doesn't need trust_remote_code, set NEXUS_EMBEDDING_MODEL=bge-small-en.

See the full Installation Guide for all options, MCP client integration, and troubleshooting.

Run

nexus-mcp-ci

The server starts on stdio (the default MCP transport). Point your MCP client at the nexus-mcp-ci command.
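For readers curious what travels over stdio: MCP uses JSON-RPC 2.0, with one message per line, and tools are invoked with the `tools/call` method. The sketch below only builds such a message. The `query` argument name is an assumption (check the tool's schema via your client), and a real client must complete the MCP `initialize` handshake before calling tools, so in practice let your MCP client handle this.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the message stays on one line.
    return json.dumps(message) + "\n"


# "search" is one of the server's tools; the "query" key is hypothetical.
line = tool_call_request(1, "search", {"query": "how does auth work"})
print(line, end="")
```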

Add to Your MCP Client

Claude Code

# Basic setup
claude mcp add nexus-mcp-ci -- nexus-mcp-ci

# With a specific embedding model
claude mcp add nexus-mcp-ci -e NEXUS_EMBEDDING_MODEL=bge-small-en -- nexus-mcp-ci

Tip: If you installed in a virtual environment, use the full path so the MCP client finds the right Python:

claude mcp add nexus-mcp-ci -- /path/to/Nexus-MCP/.venv/bin/nexus-mcp-ci

Claude Desktop

Add to your config file (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "nexus-mcp-ci": {
      "command": "nexus-mcp-ci",
      "args": []
    }
  }
}
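If you already have other servers configured, the entry needs to be merged into the existing `mcpServers` object and not pasted over it. The helper below is a small sketch of that merge, including the optional `env` block that Claude Desktop accepts for per-server environment variables. The function name and the `other-server` entry are made up for illustration.

```python
import json
from typing import Optional


def add_nexus_server(config: dict, model: Optional[str] = None) -> dict:
    """Return a copy of a Claude Desktop config with the nexus-mcp-ci entry added."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    entry = {"command": "nexus-mcp-ci", "args": []}
    if model is not None:
        # Same effect as `-e NEXUS_EMBEDDING_MODEL=...` in the Claude Code command.
        entry["env"] = {"NEXUS_EMBEDDING_MODEL": model}
    servers["nexus-mcp-ci"] = entry
    updated["mcpServers"] = servers
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
merged = add_nexus_server(existing, model="bge-small-en")
print(json.dumps(merged, indent=2))
```

The printed JSON can be written back to the config file. Restart Claude Desktop afterwards so it picks up the new server.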

Cursor / Windsurf / Cline / Other MCP Clients

Add to your MCP client's server config:

{
  "nexus-mcp-ci": {
    "command": "nexus-mcp-ci",
    "transport": "stdio"
  }
}

See the full Installation Guide for client-specific instructions.

MCP Tools (15)

Core

| Tool | Description |
|---|---|
| status | Server status, indexing stats, memory usage |
| health | Readiness/liveness probe (uptime, engine availability) |
| index | Index a codebase (full, incremental, or multi-folder) |
| search | Hybrid code search with language/type filters and reranking |

Graph Analysis

| Tool | Description |
|---|---|
| find_symbol | Look up a symbol by name: definition, location, relationships |
| find_callers | Find all direct callers of a function |
| find_callees | Find all functions called by a given function |
| analyze | Code complexity, dependencies, smells, and quality metrics |
| impact | Transitive change impact analysis |
| explain | Combined graph + vector + analysis explanation of a symbol |
| overview | High-level project overview: files, languages, symbols, quality |
| architecture | Architectural analysis: layers, dependencies, entry points, hubs |
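The difference between find_callers (direct) and impact (transitive) is easiest to see on a tiny call graph. The sketch below uses plain dictionaries and a breadth-first walk as a stand-in; Nexus-MCP's graph engine is built on rustworkx, and the function and symbol names here are hypothetical.

```python
from collections import deque


def transitive_callers(calls: dict[str, set[str]], changed: str) -> set[str]:
    """Everything that directly or indirectly calls `changed`."""
    # Invert caller -> callees into callee -> callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in affected:  # also guards against cycles
                affected.add(caller)
                queue.append(caller)
    return affected


# Hypothetical call graph: caller -> set of callees.
calls = {
    "handle_login": {"verify_credentials", "create_session"},
    "verify_credentials": {"hash_password"},
    "reset_password": {"hash_password"},
    "create_session": set(),
}
print(sorted(transitive_callers(calls, "hash_password")))
```

Here `hash_password` has two direct callers, but changing it also reaches `handle_login` through `verify_credentials`, which is the kind of indirect dependency a transitive analysis is meant to surface.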

Memory

| Tool | Description |
|---|---|
| remember | Store a semantic memory with tags and TTL |
| recall | Search memories by semantic similarity |
| forget | Delete memories by ID, tags, or type |
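The remember/recall/forget lifecycle with TTL expiration can be pictured with a toy in-memory store. This is a sketch under simplifying assumptions, not the server's implementation: recall here matches on tags where the real tool uses semantic similarity, and the class, method signatures, and sample memories are all hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    tags: set
    expires_at: float  # absolute time in seconds; float("inf") means never


class MemoryStore:
    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be demonstrated
        self._items = {}
        self._next_id = 1

    def remember(self, text, tags=(), ttl_seconds=None):
        expires = float("inf") if ttl_seconds is None else self._clock() + ttl_seconds
        mem_id = self._next_id
        self._next_id += 1
        self._items[mem_id] = Memory(text, set(tags), expires)
        return mem_id

    def recall(self, tag):
        """Return unexpired memories carrying the tag (stand-in for semantic search)."""
        now = self._clock()
        return [m.text for m in self._items.values()
                if m.expires_at > now and tag in m.tags]

    def forget(self, mem_id):
        return self._items.pop(mem_id, None) is not None


# Fake clock so the example does not need to sleep.
now = [1000.0]
store = MemoryStore(clock=lambda: now[0])
auth_id = store.remember("auth uses JWT", tags=["auth"], ttl_seconds=60)
db_id = store.remember("db pool size is 10", tags=["db"])
print(store.recall("auth"))
```

Advancing the fake clock past the 60-second TTL makes the first memory disappear from recall results, while the second (no TTL) stays until it is explicitly forgotten.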

Configuration

All settings can be overridden via NEXUS_ environment variables:

| Variable | Default | Description |
|---|---|---|
| NEXUS_STORAGE_DIR | .nexus | Storage directory for indexes |
| NEXUS_EMBEDDING_MODEL | jina-code | Embedding model (jina-code, bge-small-en) |
| NEXUS_EMBEDDING_DEVICE | auto | Device for embeddings: auto (CUDA > MPS > CPU), cuda, mps, cpu |
| NEXUS_MAX_FILE_SIZE_MB | 10 | Skip files larger than this |
| NEXUS_CHUNK_MAX_CHARS | 4000 | Max code snippet size per chunk |
| NEXUS_MAX_MEMORY_MB | 350 | Memory budget |
| NEXUS_SEARCH_MODE | hybrid | Search mode: hybrid, vector, or bm25 |
| NEXUS_FUSION_WEIGHT_VECTOR | 0.5 | Vector engine weight in RRF |
| NEXUS_FUSION_WEIGHT_BM25 | 0.3 | BM25 engine weight in RRF |
| NEXUS_FUSION_WEIGHT_GRAPH | 0.2 | Graph engine weight in RRF |
| NEXUS_LOG_LEVEL | INFO | Logging level |
| NEXUS_LOG_FORMAT | text | Log format: text or json |
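The "defaults overridden by NEXUS_ environment variables" pattern can be sketched as follows. This is an illustration of the pattern using a subset of the variables above, not the project's actual settings code; the `NexusSettings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NexusSettings:
    # Defaults copied from the table above (subset only).
    storage_dir: str = ".nexus"
    embedding_model: str = "jina-code"
    search_mode: str = "hybrid"
    max_memory_mb: int = 350
    fusion_weight_vector: float = 0.5
    fusion_weight_bm25: float = 0.3
    fusion_weight_graph: float = 0.2


def load_settings(env=None) -> NexusSettings:
    """Build settings from defaults, overridden by NEXUS_<FIELD> variables."""
    env = os.environ if env is None else env
    overrides = {}
    for f in fields(NexusSettings):
        raw = env.get(f"NEXUS_{f.name.upper()}")
        if raw is not None:
            # Coerce the string to the type of the field's default value.
            overrides[f.name] = type(f.default)(raw)
    return NexusSettings(**overrides)


settings = load_settings({"NEXUS_SEARCH_MODE": "bm25", "NEXUS_MAX_MEMORY_MB": "512"})
print(settings)
```

In a shell, the equivalent is exporting the variable before starting the server, for example `NEXUS_SEARCH_MODE=bm25 nexus-mcp-ci`.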

Self-Test Demo

Verify your installation by running the end-to-end demo that exercises all 15 tools:

python self_test/demo_mcp.py                  # Uses built-in sample project
python self_test/demo_mcp.py /path/to/project  # Or test against your own codebase

See self_test/README.md for details.

Development

pip install -e ".[dev]"     # Install with dev deps
pytest -v                   # Run tests (441 tests)
pytest -m "not slow"        # Skip performance benchmarks
ruff check .                # Lint
nexus-mcp-ci                # Run server

How It Works

search("how does auth work")
  |
  |-- vector_engine.search(query, n=30)    -- semantic similarity (embeddings)
  |-- bm25_engine.search(query, n=30)      -- keyword matching (exact terms)
  |-- graph_engine.boost(query, n=30)      -- structural relevance (callers/callees)
  |                                            |
  |              Reciprocal Rank Fusion (weights: 0.5 / 0.3 / 0.2)
  |                                            |
  |                        FlashRank re-ranking (top 20)
  |                                            |
  |                      Token budget truncation (summary/detailed/full)
  |                                            |
  v
  Top-N results, formatted to verbosity level
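The fusion step in the diagram can be sketched in a few lines. Reciprocal Rank Fusion scores each result by summing 1 / (k + rank) over the engines that returned it. The version below multiplies each term by the engine weight, which is one plausible reading of "weights: 0.5 / 0.3 / 0.2" and not a confirmed detail of Nexus-MCP. The constant k = 60 is the value commonly used with RRF and is likewise an assumption, as are the chunk IDs.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Fuse per-engine rankings into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for engine, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weights[engine] / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical ranked chunk IDs from each engine for "how does auth work".
rankings = {
    "vector": ["auth.py:verify_credentials", "auth.py:login", "db.py:connect"],
    "bm25": ["auth.py:login", "auth.py:verify_credentials"],
    "graph": ["auth.py:login", "session.py:create"],
}
weights = {"vector": 0.5, "bm25": 0.3, "graph": 0.2}

fused = weighted_rrf(rankings, weights)
for doc_id, score in fused:
    print(f"{score:.4f}  {doc_id}")
```

`auth.py:login` comes out on top even though the vector engine ranked it second, because all three engines returned it. That agreement across engines is what rank fusion rewards, and it works without having to normalize the engines' raw scores against each other.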

Architecture

| Component | Technology | Why |
|---|---|---|
| Vector store | LanceDB | Disk-backed, mmap, ~20–50MB overhead, native FTS |
| Embeddings | ONNX Runtime + jina-code (default) | ~50MB vs PyTorch ~500MB, GPU/MPS auto-detection, 2 models supported |
| Graph engine | rustworkx | Rust-backed, O(1) node/edge lookup, PageRank, centrality |
| Symbol parser | tree-sitter | 25+ languages, AST-level symbol extraction |
| Graph parser | ast-grep | Structural pattern matching for calls/imports/inheritance |
| Chunking | Symbol-based | One chunk per function/class, deterministic IDs |
| Re-ranker | FlashRank (optional) | 4MB ONNX model, <10ms for top-20 |
| Persistence | SQLite + LanceDB | Graph in SQLite, vectors in Lance, zero-config |
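"Deterministic IDs" in the chunking row means the same symbol always maps to the same chunk ID, so re-indexing can update a chunk in place and avoid creating a duplicate. One simple way to get that property is to hash a stable key. The key layout, hash choice, and function name below are assumptions for illustration, not the scheme Nexus-MCP actually uses.

```python
import hashlib


def chunk_id(file_path: str, kind: str, symbol: str) -> str:
    """Derive a stable ID from where a symbol lives and what it is."""
    key = f"{file_path}::{kind}::{symbol}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


# Hypothetical symbols.
first = chunk_id("src/auth.py", "function", "verify_credentials")
again = chunk_id("src/auth.py", "function", "verify_credentials")
other = chunk_id("src/auth.py", "function", "login")
print(first == again, first == other)
```

Because the ID depends only on the key and not on indexing order or time, two runs over an unchanged file produce identical IDs.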

Documentation

  • Installation Guide — Prerequisites, install steps, MCP client integration, troubleshooting

  • Architecture — System design, data flow, components, memory budget

  • Usage Guide — Tool reference, configuration, best practices

  • Developer Guide — Setup, testing, contributing, adding tools/engines

  • ADRs — 11 Architecture Decision Records

  • Research Notes — Deep dives on libraries and technology choices

Acknowledgments

Nexus-MCP consolidates and extends two earlier projects:

  • CodeGrok MCP by rdondeti (Ravitez Dondeti) — Semantic code search with tree-sitter parsing, embedding service, parallel indexing, and memory retrieval. Core models, symbol extraction, and the embedding pipeline were ported from CodeGrok. Originally licensed under MIT.

  • code-graph-mcp by entrepeneur4lyf — Code graph analysis with ast-grep structural parsing, rustworkx graph engine, and complexity analysis. Graph models, relationship extraction, and code analysis were ported from code-graph-mcp.

Individual source files retain "Ported from" attribution in their module docstrings. See ADR-001 for the rationale behind the consolidation.

License

MIT — see LICENSE for details.
