mcp-reposkein by reposkein

🔭 Live demo: RepoSkein's own graph, rendered as an interactive 3D constellation in your browser.

Introduction

RepoSkein gives your AI coding agent a map of your codebase — so it navigates structure instead of grepping and guessing.

It uses Tree-sitter to build a deterministic Code Property Graph of your repo — files, classes, functions, imports, and call edges — and serves it to any MCP-capable agent (Claude Code, Cursor, Codex, …). As the agent works, it writes short natural-language summaries onto graph nodes; those summaries are versioned in git alongside the code, so an agent's understanding becomes shared team memory that the next agent — or teammate — starts from.

Who it's for: developers using AI coding agents on real, large, or nested/polyglot codebases, who are tired of the agent burning its context window on grep; and teams who want that hard-won understanding to persist and be shared rather than re-derived every session.

⚡ Zero-infra — no database, no Docker. The graph lives in committed .reposkein/*.jsonl files.
🔒 Deterministic — same code → byte-identical graph. No LLM in the construction path.
🌐 7 languages — Python, TypeScript, JavaScript, Rust, Go, Java, C#.
🧩 Local-first & git-native — the graph and its summaries travel with your code.

Your agent asks	RepoSkein answers — directly from the graph
"Who calls `charge()`?"	the exact callers, with one-line summaries
"What breaks if I change this?"	the impacted callers + the tests that cover them
"Where do I even start?"	ranked entry-point functions by meaning, not filename
"What usually changes with this file?"	co-change history from git

In a deterministic, no-LLM benchmark, RepoSkein surfaces the right functions using a mean of ~8.4× fewer context tokens than a grep-based agent on structural queries.

Prerequisites

Node.js 18+ — to run npx @reposkein/mcp (the indexer binary is fetched automatically).
An MCP-capable agent — Claude Code, Cursor, Codex, Zed, etc.
A git repository to index (RepoSkein installs git hooks and commits the graph).
Optional: Docker (only for the embeddings server or the Neo4j backend); Rust (only to build from source).

Installation

In the repo you want your agent to understand:

npx @reposkein/mcp init

This downloads the indexer for your platform, installs git hooks + the navigation skill, builds the initial code graph, and prints an MCP config block. Then:

Add the printed config to your agent (e.g. Claude Code's .mcp.json):

{
  "mcpServers": {
    "reposkein": {
      "command": "reposkein-mcp",
      "env": { "REPOSKEIN_REPO_PATH": "/path/to/your/repo" }
    }
  }
}

Verify and commit the graph (init already built it):

reposkein-mcp doctor .         # ✓ binary  ✓ indexed (N nodes)  ✓ ready
git add .reposkein && git commit -m "add RepoSkein code graph"

Re-index after big changes with reposkein-mcp index . (or the agent's reindex_file tool).

Ask your agent "what calls this function?" or "what breaks if I change X?" — it answers from the graph.
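If your agent already has an .mcp.json with other servers in it, the printed block has to be merged rather than pasted over the file. A minimal sketch of that merge follows, assuming the config shape shown above; the helper name add_reposkein_server is hypothetical and not part of RepoSkein.

```python
import json
from pathlib import Path


def add_reposkein_server(config_path: Path, repo_path: str) -> dict:
    """Merge a reposkein entry into an MCP config file, keeping other servers.

    Hypothetical helper for illustration; the entry mirrors the config
    block printed by `npx @reposkein/mcp init`.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    # Create the "mcpServers" map if it is missing, then add or replace
    # only the "reposkein" key so existing servers are left untouched.
    servers = config.setdefault("mcpServers", {})
    servers["reposkein"] = {
        "command": "reposkein-mcp",
        "env": {"REPOSKEIN_REPO_PATH": repo_path},
    }

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_reposkein_server(Path(".mcp.json"), "/path/to/your/repo") from the repo root would write the same entry shown above while preserving any servers that were already configured.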

Prefer to let your agent set it up? Install the skills and tell it to run the reposkein-setup skill — it installs, indexes, and verifies everything:
npx skills add reposkein/reposkein --all

Platforms: prebuilt binaries for macOS (Apple Silicon), Linux (x64/arm64), and Windows (x64). Elsewhere, point REPOSKEIN_INDEXER_BIN at a from-source build.

Let your agent install it for you

For complex setups — multi-repo workspaces, Neo4j backend, the local embedding server, or wiring up agents besides Claude Code (OpenCode, Cursor, Codex, Continue, Cline, …) — paste this into any MCP-capable agent and it'll walk you through:

Install RepoSkein in this workspace. Read docs/INSTALL.md (or https://github.com/reposkein/reposkein/blob/main/docs/INSTALL.md), walk me through the question tree in §1, then execute §2 onward. If anything fails, troubleshoot via §9 — don't silently skip steps. Confirm with reposkein-mcp doctor . per repo and a semantic_find smoke test before claiming done.

docs/INSTALL.md is written for agents: it covers the decision tree (one repo vs workspace, JSONL vs Neo4j, lexical vs cloud vs local embeddings, which agent CLIs to wire), per-agent config schemas (.mcp.json, opencode.json, .cursor/mcp.json, Continue, Codex, Cline, …), the Apple-Silicon mps native embed-server recipe, and a troubleshooting table.

Usage — working with your agent

You ask in plain language; the bundled reposkein-graph-rag skill drives the tools. The natural loop:

Find where to start — semantic_find("jwt auth validation") ranks the right functions by meaning, no symbol name needed. → "where's the rate limiter?"
Understand it — get_context_profile returns the node's callers + callees as ready-to-read prose (hops: 2 widens, federated: true spans nested repos).
Before you change it — impact lists transitive callers (what could break) split from the tests that cover it (what to run). → "what breaks if I change charge()?"
What moves with it — get_temporal_context surfaces files that historically change together, plus churn and ownership. → "what usually changes with the auth config?"
Record what you learned — write_semantic_summary attaches a 1–3 sentence note to the node, committed to git for the next agent/teammate.
After editing — reindex_file refreshes the graph for the changed file.

You:    Refactor how we validate auth tokens — what's involved?

Agent:  → semantic_find("validate auth token")        → resolves auth.validate_jwt
        → get_context_profile(auth.validate_jwt)       → 3 callers, 2 callees (+ summaries)
        → impact(auth.validate_jwt)                    → 6 impacted functions, 4 covering tests
        → get_temporal_context("src/auth/jwt.py")      → co-changes with config/keys.py (conf 0.8)

Agent:  "validate_jwt is called by the login, refresh, and middleware paths;
         changing its signature touches 6 functions and 4 tests (test_jwt.py,
         test_middleware.py, …). Heads up: jwt.py historically changes together
         with config/keys.py — you'll likely need to update both."
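To make the impact step concrete, here is a rough sketch of the idea it describes: walk the call graph backwards from a function, collect every transitive caller, and split the result into impacted code and covering tests. It is an illustration only, not RepoSkein's implementation; the caller map, the node names, and the "tests." prefix used to recognize tests are all assumptions.

```python
from collections import deque


def impact(callers, target, is_test):
    """Return (impacted_code, covering_tests) for `target`.

    `callers` maps a function to the set of functions that call it
    (a reverse call graph). `is_test` decides which nodes count as tests;
    how RepoSkein classifies tests is not specified here.
    """
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            # Skip nodes already visited and the target itself (cycles).
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    tests = {name for name in seen if is_test(name)}
    return seen - tests, tests


# Made-up reverse call graph for illustration.
CALLERS = {
    "auth.validate_jwt": {"api.login", "api.refresh", "tests.test_jwt.test_valid"},
    "api.login": {"app.routes", "tests.test_api.test_login"},
}

impacted, covering = impact(CALLERS, "auth.validate_jwt", lambda n: n.startswith("tests."))
print(sorted(impacted))  # ['api.login', 'api.refresh', 'app.routes']
print(sorted(covering))  # ['tests.test_api.test_login', 'tests.test_jwt.test_valid']
```

The split matters in practice: the first set is what could break, the second is what to run.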

🎥 A short screen recording is on the roadmap — see Documentation.

Agent skills

RepoSkein ships two cross-agent Agent Skills — npx skills add reposkein/reposkein --all installs both into Claude Code, Cursor, Codex, and 70+ agents:

reposkein-setup — installs RepoSkein in a repo and verifies it's running (binary → index → MCP reachability). Ask your agent to run it.
reposkein-graph-rag — teaches your agent when to use each tool (the loop above). reposkein-mcp init installs it automatically for Claude Code.

Supported languages

Language	Definitions	Imports → edges	Cross-file calls
Python	functions, classes, methods, nested defs, vars	✅ relative / absolute / aliased	import-resolved (`exact`)
TypeScript / TSX	classes, interfaces, enums, methods, arrows	✅ named / default / aliased / `* as ns`	import-resolved (`exact`)
JavaScript / JSX	(via the TS grammar)	✅ ES imports (no CommonJS yet)	import-resolved (`exact`)
Rust	fns, structs, traits, enums, `impl` methods	✅ `use` (groups, aliases, globs, `pub use` chains; workspace-aware)	import-resolved (`exact`)
Go	funcs, methods (`Type.method`), structs, interfaces	not yet (cross-package planned)	same-package (same-dir); cross-package by name
Java	classes, records, interfaces, enums, methods, constructors, fields	✅ package-path (no wildcard/static yet)	import-resolved (`exact`)
C#	classes, structs, records, interfaces, enums, methods, properties	not yet (cross-namespace planned)	same-dir; cross-namespace by name

What resolves, honestly. Every edge carries a resolution (exact / name_match / ambiguous) plus a confidence, so your agent knows what to trust.

Calls. Same-file calls, self/this methods, and import-followed free-function calls resolve exact. Python module-alias calls (import foo as f; f.bar()) resolve exact to the target module's function.

Heritage. Cross-file INHERITS/IMPLEMENTS edges are resolved repo-wide: import-followed bases resolve exact (confidence 1.0); unique same-directory or repo-wide bases resolve name_match (0.8/0.7); ambiguous bases are skipped to avoid false hierarchy edges. Bases that live in a federated child repo are stitched into cross-repo heritage edges at load time. Go's struct/interface embedding (type Dog struct { Animal }) is captured as INHERITS.

Constructors. Constructors emit a distinct INSTANTIATES edge (new Foo() in TS/Java/C#, Foo { .. } and Foo::new() in Rust, Foo{} / &Foo{} composite literals in Go, and Python Foo() whose name resolves to a class) so an agent can ask who creates instances of this type. These edges are resolved against the type index and skipped when ambiguous.

Types. The graph is type-free by design (deterministic, no compiler in the loop), but it does track types where it can do so soundly from source alone: when a local is assigned a constructor (x = Foo(); x.bar()), that x.bar() resolves exact to Foo.bar (intraprocedural receiver typing). Method calls on receivers it can't trace that way (parameters, fields, return values) resolve by name (≤ name_match), and overloaded calls are flagged ambiguous. Go and C# don't emit import edges yet, so their cross-package/namespace calls resolve by name (same-package/-directory calls do resolve).

These limits are inherent to the zero-infra, type-free design; a deeper optional type-aware layer (SCIP) is gated on benchmark evidence. Adding a language is a well-trodden path; contributions welcome.
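As an illustration of how a consumer might use the resolution and confidence carried on each edge, the sketch below keeps only edges it is willing to trust. The JSONL field names (type, src, dst, resolution, confidence) are hypothetical, since the on-disk schema is not described here, and the sample records, including the 0.3 confidence on the ambiguous edge, are made up.

```python
import json

# Made-up edge records; the field names are assumptions, not the real schema.
SAMPLE = """\
{"type": "CALLS", "src": "api.login", "dst": "auth.validate_jwt", "resolution": "exact", "confidence": 1.0}
{"type": "CALLS", "src": "api.refresh", "dst": "auth.validate_jwt", "resolution": "name_match", "confidence": 0.7}
{"type": "CALLS", "src": "jobs.sweep", "dst": "auth.validate_jwt", "resolution": "ambiguous", "confidence": 0.3}
"""


def trusted_edges(lines, min_confidence=0.8):
    """Keep edges that are not ambiguous and meet the confidence floor."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        edge = json.loads(line)
        if edge["resolution"] != "ambiguous" and edge["confidence"] >= min_confidence:
            kept.append(edge)
    return kept


strict = trusted_edges(SAMPLE.splitlines())
relaxed = trusted_edges(SAMPLE.splitlines(), min_confidence=0.7)
print([e["src"] for e in strict])   # ['api.login']
print([e["src"] for e in relaxed])  # ['api.login', 'api.refresh']
```

A stricter floor suits automated refactors; a looser one suits exploratory questions where a name_match hit is still a useful lead.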

How it works

 Your agent (Claude Code / Cursor / …)   ── guided by the reposkein skill
        │  MCP
        ▼
 @reposkein/mcp        semantic_find · get_context_profile · impact · get_temporal_context
   (TypeScript)        read_cypher · write_semantic_summary · init_cpg_skeleton · reindex_file
                       CLI: init · doctor · index · view
        │ reads
        ▼
 .reposkein/*.jsonl   ← the code graph, committed to git (zero-infra, in-memory store)
        ▲ writes
        │
 reposkein-indexer    Tree-sitter parse → stable IDs → canonical JSONL
   (Rust)             + git hooks & a 3-way merge driver for conflict-free summaries

Structure is static. The skeleton comes only from parsing — identical code produces a byte-identical graph (a CI-tested invariant), independent of who runs it.
Meaning is just-in-time. Summaries are written as the agent visits nodes; they're content-hash-stamped (so they flag stale when code changes) and committed to git.
Local-first. The committed JSONL is the source of truth; the optional Neo4j backend is a reconstructable projection most users never need.
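The stale-summary idea can be sketched in a few lines: stamp each summary with a hash of the code it describes, then compare that stamp against a hash of the current code. This is a sketch under assumptions, not the project's code: SHA-256 and the record layout (text, content_hash) are chosen here for illustration, and the hash RepoSkein actually uses is not stated.

```python
import hashlib


def content_hash(source: str) -> str:
    """Hash the node's source text (SHA-256 is an assumption)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def stamp_summary(text: str, source: str) -> dict:
    """Attach the hash of the code the summary was written against."""
    return {"text": text, "content_hash": content_hash(source)}


def is_stale(summary: dict, current_source: str) -> bool:
    """A summary is stale once the code no longer matches its stamp."""
    return summary["content_hash"] != content_hash(current_source)


original = "def validate_jwt(token): ..."
summary = stamp_summary("Validates a JWT and returns its claims.", original)

print(is_stale(summary, original))                           # False
print(is_stale(summary, "def validate_jwt(token, aud): ...")) # True
```

Because the stamp depends only on the code text, the check is deterministic and needs no timestamps or external state.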

Cross-repo federation

Got nested repositories (a monorepo of indexed repos)? RepoSkein discovers them, links them with FEDERATES_TO, and stitches cross-repo call, import, and heritage edges (INHERITS/IMPLEMENTS to a base in a child repo) at load time. Pass federated: true to traverse across repo boundaries. Federation edges are derived at load (never committed), so each repo stays independently deterministic.

Visualize the graph — the constellation viewer

reposkein-mcp view .          # opens http://127.0.0.1:<port> in your browser

view starts a local, read-only, zero-infra web app (React + three.js, bound to 127.0.0.1) that renders the committed .reposkein graph as an interactive 3D astronomy-style constellation. There's no Neo4j and no external service — it reads the committed JSONL directly and never mutates it. Try the live demo → (RepoSkein viewing its own multi-language graph).

The map is deterministic: a seeded force layout means the same graph always lays out the same way (cached in IndexedDB for instant reloads), and the layout is render-time only — it never touches the committed JSONL. Levels of detail map onto an astronomy metaphor — Repository → Directory → File → Symbol become galaxy → constellation → solar-system → star — so you zoom or click to expand a cluster (a brief supernova animation) and click a star to inspect it. Federation galaxies and agent-written summaries render when present.

Legible — per-edge-type colors + legend, importance-sized stars, adaptive labels, breadcrumb, per-language galaxy coloring, depth fog / bloom / nebula halos.
Edges encode resolution — color = edge type (CALLS/IMPORTS/INHERITS/IMPLEMENTS/INSTANTIATES), opacity = confidence (exact/name_match/ambiguous), and flow particles show call direction.
Analytical — one-click lenses (call graph / type hierarchy / imports / tests), an impact overlay (transitive callers + covering tests), a confidence-audit mode (see where the type-free resolver guesses), and a temporal-coupling overlay (git co-change).
Explorable — ranked search-to-fly, N-hop neighborhood focus, source peek in the detail panel (a path-guarded read-only file slice + an "Open in editor" vscode:// link), keyboard nav (/ search, f frame-all, arrows to hop neighbors, Esc back), a minimap, and PNG screenshot export.
Guided tour — a cinematic, deterministically-derived flythrough (overview → largest modules → busiest hub → type hierarchy → entry point) with captions.

reposkein-mcp view --export ./site .   # write a self-contained static site

--export bakes the graph into graph-data.js (as window.__REPOSKEIN_GRAPH__) and emits a self-contained static site — it works from file:// or any static host with no server, which is exactly how the live demo above is published. Handy for sharing a snapshot, embedding in docs, or a project landing page.

MCP tools

Tool	What it does
`semantic_find`	find where to start — rank functions/classes by meaning (lexical BM25F; optional embeddings)
`get_context_profile`	resolve a function/class → its caller/callee neighborhood as ready-to-read prose
`impact`	transitive callers split into impacted code vs covering tests
`get_temporal_context`	git-derived co-change, churn, and ownership for a file
`read_cypher`	read-only graph queries (writes rejected, results capped)
`write_semantic_summary`	attach a hash-stamped summary to a node
`init_cpg_skeleton`	build/rebuild the graph
`reindex_file`	refresh after editing a file

The reposkein-mcp CLI adds init (set up a repo), doctor (health check), index (rebuild the graph), and view (the constellation viewer; --export <dir> writes a self-contained static site).
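For intuition about the co-change signal that get_temporal_context reports, here is one simple way such a score could be computed from commit history. The definition used below (the share of commits touching a file that also touch another file) is an assumption made for this sketch; the tool's actual formula is not documented here, and the commit data is made up.

```python
from collections import Counter


def co_change(commits, path):
    """Map each other file to the fraction of `path`'s commits it shares.

    `commits` is a list of sets, one set of changed file paths per commit.
    """
    touching = [files for files in commits if path in files]
    if not touching:
        return {}
    counts = Counter(
        other for files in touching for other in files if other != path
    )
    return {other: n / len(touching) for other, n in counts.items()}


# Made-up history: jwt.py changes five times, four of them with keys.py.
HISTORY = (
    [{"src/auth/jwt.py", "config/keys.py"}] * 4
    + [{"src/auth/jwt.py", "README.md"}]
    + [{"docs/intro.md"}]
)

scores = co_change(HISTORY, "src/auth/jwt.py")
print(scores["config/keys.py"])  # 0.8
print(scores["README.md"])       # 0.2
```

The per-commit file sets could be gathered from the output of git log --name-only; parsing that output is left out to keep the sketch short.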

Optional: semantic embeddings

By default semantic_find is deterministic and lexical (BM25F — zero-infra, no keys). You can opt into a hybrid tier (lexical + embedding cosine, fused via RRF) for fuzzier queries. It's default-off, vectors are cached in .reposkein/local/embeddings/ (gitignored, never committed), and it falls back to lexical automatically on any error. Set env vars on the MCP server and pick one:

A) Voyage AI — cloud, easiest, best for code

Get a key, then:

REPOSKEIN_EMBED_PROVIDER=voyage
VOYAGE_API_KEY=pa-...
# optional: REPOSKEIN_EMBED_MODEL=voyage-code-3   # default — code-specialized

Sends document strings (qualified names, signatures, summaries) to Voyage's API. Use B or C if your code must not leave your machine.

B) Ollama — local, off-the-shelf, no key

ollama pull nomic-embed-text     # 768-dim (or mxbai-embed-large=1024, bge-m3=1024)

REPOSKEIN_EMBED_PROVIDER=http
REPOSKEIN_EMBED_URL=http://127.0.0.1:11434/v1/embeddings
REPOSKEIN_EMBED_MODEL=nomic-embed-text
REPOSKEIN_EMBED_DIMS=768          # must match the model

C) Voyage's open model, self-hosted — offline + Voyage quality

voyage-4-nano (Apache-2.0) is a custom Qwen3-based model Ollama can't run, so RepoSkein ships a prebuilt server. The image is published to GHCR — public and multi-arch (amd64/arm64) — so there's nothing to build:

docker run -p 8080:8080 -v reposkein-hf:/root/.cache/huggingface \
  ghcr.io/reposkein/reposkein-embed          # auto-picks your architecture; first run downloads the model

REPOSKEIN_EMBED_PROVIDER=http
REPOSKEIN_EMBED_URL=http://127.0.0.1:8080/v1/embeddings
REPOSKEIN_EMBED_MODEL=voyage-4-nano
REPOSKEIN_EMBED_DIMS=1024         # must equal the server's EMBED_DIMS

Everything stays on your machine. The image is CPU-only, so it needs no NVIDIA GPU and runs on Apple Silicon / ARM unified-memory machines, x64 Linux, and Windows (CI builds and smoke-tests both architectures). Docker can't use Apple's Metal/MPS; for that, run the server natively with EMBED_DEVICE=mps. Full details (root docker compose up, GPU, other models): embed-server/README.md.

REPOSKEIN_EMBED_DIMS on the client must match the model's output dimension, or cosine scoring is skipped.
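The hybrid tier described in this section (a lexical ranking fused with a cosine ranking via Reciprocal Rank Fusion, falling back to lexical when dimensions disagree) can be sketched as follows. This is an illustration of the general technique rather than RepoSkein's code: the function names are hypothetical, and k = 60 is the constant commonly used with RRF, not a value confirmed for this project.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by name to stay deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_rank(lexical, query_vec, doc_vecs, expected_dims):
    """Fuse lexical and cosine rankings; fall back to lexical on a dims mismatch."""
    vectors_ok = len(query_vec) == expected_dims and all(
        len(v) == expected_dims for v in doc_vecs.values()
    )
    if not vectors_ok:
        return list(lexical)  # cosine scoring skipped
    by_cosine = sorted(doc_vecs, key=lambda d: (-cosine(query_vec, doc_vecs[d]), d))
    return rrf([lexical, by_cosine])


LEXICAL = ["a", "b", "c"]
VECTORS = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

print(hybrid_rank(LEXICAL, [0.0, 1.0], VECTORS, expected_dims=2))  # ['b', 'a', 'c']
print(hybrid_rank(LEXICAL, [0.0, 1.0], VECTORS, expected_dims=3))  # ['a', 'b', 'c']
```

RRF only looks at ranks, never raw scores, which is why a BM25F score and a cosine similarity can be combined without normalizing either one.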

Optional: Neo4j backend

The zero-infra JSONL store is the default. Neo4j is an optional projection for very large graphs and raw Cypher at scale:

docker compose --profile neo4j up -d          # from the repo root
NEO4J_PASSWORD=reposkeintest reposkein-indexer load .

Then set REPOSKEIN_STORE=neo4j + the NEO4J_* env vars on the MCP server. (REPOSKEIN_STORE=auto, the default, uses JSONL when present and falls back to Neo4j only if configured.)
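The selection rule in the parenthetical above can be written out as a small decision function. This is a sketch of the described behavior, not the server's code: the function name is hypothetical, "configured" is approximated as "any NEO4J_* variable is set", and returning None when neither store is available is a choice made for this example.

```python
def choose_store(env, jsonl_present):
    """Pick a graph store from REPOSKEIN_STORE and what is available.

    `env` is a mapping of environment variables; `jsonl_present` says
    whether committed .reposkein/*.jsonl files exist for the repo.
    """
    mode = env.get("REPOSKEIN_STORE", "auto")  # auto is the documented default
    if mode == "neo4j":
        return "neo4j"
    if mode == "auto":
        if jsonl_present:
            return "jsonl"  # JSONL wins whenever it is present
        if any(key.startswith("NEO4J_") for key in env):
            return "neo4j"  # fall back only if Neo4j is configured
        return None  # nothing usable; handling this case is an assumption
    raise ValueError(f"unrecognized REPOSKEIN_STORE value in this sketch: {mode!r}")


print(choose_store({}, jsonl_present=True))                           # jsonl
print(choose_store({"NEO4J_PASSWORD": "secret"}, jsonl_present=False)) # neo4j
print(choose_store({"REPOSKEIN_STORE": "neo4j"}, jsonl_present=True))  # neo4j
```

Only the values neo4j and auto are named in this section, so the sketch rejects anything else rather than guessing at other accepted values.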

Benchmarks

Two tracks, both under mcp/bench/:

Track 1, retrieval efficiency (deterministic, no LLM): RepoSkein vs a grep agent on hand-labeled tasks yields a mean of ~8.4× fewer context tokens on structural queries, at F0.5 = 1.00 vs 0.11–0.71 for grep. Details: mcp/bench/README.md.
Track 2, end-task (SWE-bench-Verified): a minimal agent loop where the only difference is the navigation toolset (RepoSkein vs grep), graded on resolve rate, tokens, and turns. Built and unit-tested; the API+Docker run is opt-in. Details: mcp/bench/track2/README.md.

Build from source

Requirements: Rust (stable), Node 24. Docker only for the optional Neo4j backend.

cd indexer && cargo build --release        # → indexer/target/release/reposkein-indexer
cd ../mcp  && npm install && npm run build

Wire it into your agent with command: node, args: [".../mcp/dist/index.js"], env REPOSKEIN_REPO_PATH + REPOSKEIN_INDEXER_BIN. Tests: cd indexer && cargo test && cargo clippy --all-targets -- -D warnings; cd mcp && npm test.

Repository layout

indexer/      Rust workspace: core, lang-{python,ts,rust,go,java,csharp}, lang-common, neo4j-io, cli
mcp/          @reposkein/mcp — the TypeScript MCP server (tools + graph-store backends)
mcp/bench/    benchmarks: retrieval efficiency (Track 1) + end-task SWE-bench harness (Track 2)
skills/       reposkein-graph-rag + reposkein-setup — cross-agent skills (skills.sh)
embed-server/ one-command local embedding server (voyage-4-nano) for hybrid semantic_find
viz/          @reposkein/viz — the 3D constellation viewer SPA (served by `reposkein-mcp view`)

Documentation

Doc	What's in it
`mcp/README.md`	the `@reposkein/mcp` package — tools, config, env vars
`viz/README.md`	the `@reposkein/viz` constellation viewer — architecture, dev/build
`embed-server/README.md`	the local embedding server — Docker/GHCR, platforms, GPU
`mcp/bench/README.md`	Track 1 retrieval benchmark — method + results
`mcp/bench/track2/README.md`	Track 2 end-task (SWE-bench) harness
`CHANGELOG.md`	release history (Keep a Changelog)
`skills/`	the two cross-agent skills

Contributing

Contributions are welcome — bug fixes, new languages, docs. See CONTRIBUTING.md for the dev setup, the determinism invariants you must preserve, and the step-by-step recipe for adding a new language (it's a well-trodden path — Go, Java, and C# were each added the same way). RepoSkein uses Conventional Commits and keeps CI green (determinism gates + clippy + tests).
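When working on the indexer, one informal way to check the byte-identical invariant locally is to fingerprint the committed graph files before and after a rebuild and compare the two values. The sketch below is not the project's CI gate; the helper name graph_digest is hypothetical, and it assumes the graph lives in .reposkein/*.jsonl as described earlier.

```python
import hashlib
from pathlib import Path


def graph_digest(repo: Path) -> str:
    """SHA-256 over the names and bytes of .reposkein/*.jsonl, in sorted order.

    Only top-level JSONL files are read, so the gitignored
    .reposkein/local/ cache does not affect the result.
    """
    digest = hashlib.sha256()
    for path in sorted((repo / ".reposkein").glob("*.jsonl")):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")  # separator between name and content
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between files
    return digest.hexdigest()
```

A typical use would be to record graph_digest(Path(".")), run reposkein-mcp index . on unchanged code, and confirm the digest is the same afterwards.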

Acknowledgements

Tree-sitter — the parsers behind every language extractor.
Model Context Protocol — the agent integration standard.
Voyage AI — voyage-code-3 and the open-weight voyage-4-nano powering the optional embeddings tier.
Discovery via Glama, skills.sh, mcpservers.org, and the awesome-mcp community lists.
README header by capsule-render + readme-typing-svg.

Contact

🐛 Bugs / features: open an issue
💬 Questions / ideas: GitHub Discussions

License

Apache-2.0.
