mcp-reposkein

🔭 Live demo: RepoSkein's own graph, rendered as an interactive 3D constellation in your browser.

Introduction

RepoSkein gives your AI coding agent a map of your codebase, so it navigates structure instead of grepping and guessing.

It uses Tree-sitter to build a deterministic Code Property Graph of your repo (files, classes, functions, imports, and call edges) and serves it to any MCP-capable agent (Claude Code, Cursor, Codex, …). As the agent works, it writes short natural-language summaries onto graph nodes; those summaries are versioned in git alongside the code, so an agent's understanding becomes shared team memory that the next agent, or teammate, starts from.

Who it's for: developers using AI coding agents on real, large, or nested/polyglot codebases, who are tired of the agent burning its context window on grep; and teams who want that hard-won understanding to persist and be shared rather than re-derived every session.

  • ⚡ Zero-infra: no database, no Docker. The graph lives in committed .reposkein/*.jsonl files.

  • 🔒 Deterministic: same code → byte-identical graph. No LLM in the construction path.

  • 🌐 7 languages: Python, TypeScript, JavaScript, Rust, Go, Java, C#.

  • 🧩 Local-first & git-native: the graph and its summaries travel with your code.

| Your agent asks | RepoSkein answers, directly from the graph |
| --- | --- |
| "Who calls charge()?" | the exact callers, with one-line summaries |
| "What breaks if I change this?" | the impacted callers + the tests that cover them |
| "Where do I even start?" | ranked entry-point functions by meaning, not filename |
| "What usually changes with this file?" | co-change history from git |

In a deterministic, no-LLM benchmark, RepoSkein surfaces the right functions with a mean ~8.4× fewer context tokens than a grep-based agent on structural queries.

Prerequisites

  • Node.js 18+: to run npx @reposkein/mcp (the indexer binary is fetched automatically).

  • An MCP-capable agent: Claude Code, Cursor, Codex, Zed, etc.

  • A git repository to index (RepoSkein installs git hooks and commits the graph).

  • Optional: Docker (only for the embeddings server or the Neo4j backend); Rust (only to build from source).

Installation

In the repo you want your agent to understand:

npx @reposkein/mcp init

This downloads the indexer for your platform, installs git hooks + the navigation skill, builds the initial code graph, and prints an MCP config block. Then:

  1. Add the printed config to your agent (e.g. Claude Code's .mcp.json):

    {
      "mcpServers": {
        "reposkein": {
          "command": "reposkein-mcp",
          "env": { "REPOSKEIN_REPO_PATH": "/path/to/your/repo" }
        }
      }
    }
  2. Verify and commit the graph (init already built it):

    reposkein-mcp doctor .         # ✓ binary  ✓ indexed (N nodes)  ✓ ready
    git add .reposkein && git commit -m "add RepoSkein code graph"

    Re-index after big changes with reposkein-mcp index . (or the agent's reindex_file tool).

  3. Ask your agent "what calls this function?" or "what breaks if I change X?" and it answers from the graph.
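If your agent already has an MCP config with other servers, the printed block has to be merged rather than pasted over it. The following is a minimal sketch of that merge, assuming the config is plain JSON shaped like the block in step 1; the function name `merge_reposkein_server` is hypothetical and not part of RepoSkein.

```python
import json


def merge_reposkein_server(config: dict, repo_path: str) -> dict:
    """Return a copy of an MCP config with a reposkein entry added.

    Other entries under "mcpServers" are left untouched. The entry shape
    mirrors the config block printed by `npx @reposkein/mcp init`.
    """
    merged = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    servers = merged.setdefault("mcpServers", {})
    servers["reposkein"] = {
        "command": "reposkein-mcp",
        "env": {"REPOSKEIN_REPO_PATH": repo_path},
    }
    return merged


existing = {"mcpServers": {"other": {"command": "other-mcp"}}}
updated = merge_reposkein_server(existing, "/path/to/your/repo")
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original config intact, so the result can be compared against it before anything is written to disk.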

Prefer to let your agent set it up? Install the skills and tell it to run the reposkein-setup skill, which installs, indexes, and verifies everything:

npx skills add reposkein/reposkein --all

Platforms: prebuilt binaries for macOS (Apple Silicon), Linux (x64/arm64), and Windows (x64). Elsewhere, point REPOSKEIN_INDEXER_BIN at a from-source build.

Usage: working with your agent

You ask in plain language; the bundled reposkein-graph-rag skill drives the tools. The natural loop:

  1. Find where to start: semantic_find("jwt auth validation") ranks the right functions by meaning, no symbol name needed. → "where's the rate limiter?"

  2. Understand it: get_context_profile returns the node's callers + callees as ready-to-read prose (hops: 2 widens, federated: true spans nested repos).

  3. Before you change it: impact lists transitive callers (what could break) split from the tests that cover it (what to run). → "what breaks if I change charge()?"

  4. What moves with it: get_temporal_context surfaces files that historically change together, plus churn and ownership. → "what usually changes with the auth config?"

  5. Record what you learned: write_semantic_summary attaches a 1–3 sentence note to the node, committed to git for the next agent/teammate.

  6. After editing: reindex_file refreshes the graph for the changed file.

You:    Refactor how we validate auth tokens: what's involved?

Agent:  → semantic_find("validate auth token")        → resolves auth.validate_jwt
        → get_context_profile(auth.validate_jwt)       → 3 callers, 2 callees (+ summaries)
        → impact(auth.validate_jwt)                    → 6 impacted functions, 4 covering tests
        → get_temporal_context("src/auth/jwt.py")      → co-changes with config/keys.py (conf 0.8)

Agent:  "validate_jwt is called by the login, refresh, and middleware paths;
         changing its signature touches 6 functions and 4 tests (test_jwt.py,
         test_middleware.py, …). Heads up: jwt.py historically changes together
         with config/keys.py, so you'll likely need to update both."
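To make the impact step concrete, here is a rough sketch of the idea behind it: walk the call graph backwards from the changed function, then split the transitive callers into code and tests. This is an illustration under stated assumptions, not RepoSkein's implementation; the toy graph, the function names, and the rule that tests are recognised by a `test_` prefix are all hypothetical.

```python
from collections import deque


def impact(calls: dict[str, set[str]], target: str) -> tuple[set[str], set[str]]:
    """Return (impacted code, covering tests) among transitive callers of target.

    `calls` maps each caller to the set of functions it calls.
    """
    # Invert the call edges so we can walk from a callee to its callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first search over reverse edges collects transitive callers.
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)

    # Assumption for this sketch: a test is any function named test_*.
    tests = {n for n in seen if n.rsplit(".", 1)[-1].startswith("test_")}
    return seen - tests, tests


calls = {
    "login": {"validate_jwt"},
    "refresh": {"validate_jwt"},
    "middleware": {"login"},
    "test_jwt.test_valid": {"validate_jwt"},
    "test_middleware.test_flow": {"middleware"},
}
code, tests = impact(calls, "validate_jwt")
print(sorted(code))   # functions that could break
print(sorted(tests))  # tests worth running
```

Keeping the two sets separate mirrors the split described in step 3: one set tells you what could break, the other what to run.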

🎥 A short screen recording is on the roadmap; see Documentation.

Agent skills

RepoSkein ships two cross-agent Agent Skills. Running npx skills add reposkein/reposkein --all installs both into Claude Code, Cursor, Codex, and 70+ agents:

  • reposkein-setup: installs RepoSkein in a repo and verifies it's running (binary → index → MCP reachability). Ask your agent to run it.

  • reposkein-graph-rag: teaches your agent when to use each tool (the loop above). reposkein-mcp init installs it automatically for Claude Code.

Supported languages

| Language | Definitions | Imports → edges | Cross-file calls |
| --- | --- | --- | --- |
| Python | functions, classes, methods, nested defs, vars | ✅ relative / absolute / aliased | import-resolved (exact) |
| TypeScript / TSX | classes, interfaces, enums, methods, arrows | ✅ named / default / aliased / * as ns | import-resolved (exact) |
| JavaScript / JSX | (via the TS grammar) | ✅ ES imports (no CommonJS yet) | import-resolved (exact) |
| Rust | fns, structs, traits, enums, impl methods | ✅ use (groups, aliases, globs, pub use chains; workspace-aware) | import-resolved (exact) |
| Go | funcs, methods (Type.method), structs, interfaces | not yet (cross-package planned) | same-package (same-dir); cross-package by name |
| Java | classes, records, interfaces, enums, methods, constructors, fields | ✅ package-path (no wildcard/static yet) | import-resolved (exact) |
| C# | classes, structs, records, interfaces, enums, methods, properties | not yet (cross-namespace planned) | same-dir; cross-namespace by name |

What resolves, honestly. Every edge carries a resolution (exact / name_match / ambiguous) + confidence, so your agent knows what to trust. Same-file calls, self/this methods, and import-followed free-function calls resolve exact. Python module-alias calls (import foo as f; f.bar()) resolve exact to the target module's function.

Cross-file INHERITS/IMPLEMENTS edges are resolved repo-wide: import-followed bases resolve exact (confidence 1.0); unique same-directory or repo-wide bases resolve name_match (0.8/0.7); ambiguous bases are skipped to avoid false hierarchy edges; and bases that live in a federated child repo are stitched into cross-repo heritage edges at load time. Go's struct/interface embedding (type Dog struct { Animal }) is captured as INHERITS.

Constructors emit a distinct INSTANTIATES edge (new Foo() in TS/Java/C#, Foo { .. } and Foo::new() in Rust, Foo{} / &Foo{} composite literals in Go, and Python Foo() whose name resolves to a class) so an agent can ask who creates instances of this type. These edges are resolved against the type index and skipped when ambiguous.

The graph is type-free by design (deterministic, no compiler in the loop), but it does track types where it can do so soundly from source alone: when a local is assigned a constructor (x = Foo(); x.bar()), that x.bar() resolves exact to Foo.bar (intraprocedural receiver typing). Method calls on receivers it can't trace that way (parameters, fields, return values) resolve by name (≤ name_match), and overloaded calls are flagged ambiguous. Go and C# don't emit import edges yet, so their cross-package/namespace calls resolve by name (same-package/-directory calls do resolve).

These limits are inherent to the zero-infra, type-free design; a deeper optional type-aware layer (SCIP) is gated on benchmark evidence. Adding a language is a well-trodden path, and contributions are welcome.
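The heritage rules above read naturally as a small decision ladder. The sketch below is one way to express it, assuming that 0.8 applies to a unique same-directory match and 0.7 to a unique repo-wide match (the text gives the pair without saying which is which), and that an ambiguous same-directory match is skipped rather than retried repo-wide. `Resolution` and `resolve_base` are hypothetical names, not RepoSkein's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resolution:
    target: str        # node id of the resolved base type
    kind: str          # "exact" or "name_match"
    confidence: float


def resolve_base(
    name: str,
    imported: dict[str, str],
    same_dir: dict[str, list[str]],
    repo_wide: dict[str, list[str]],
) -> Optional[Resolution]:
    """Resolve a base-class name, or return None when it is ambiguous/unknown."""
    # 1. A base reached by following an import is exact.
    if name in imported:
        return Resolution(imported[name], "exact", 1.0)
    # 2. A unique same-directory candidate is a name match (assumed 0.8).
    local = same_dir.get(name, [])
    if len(local) == 1:
        return Resolution(local[0], "name_match", 0.8)
    if len(local) > 1:
        return None  # ambiguous: emit no edge rather than a wrong one
    # 3. A unique repo-wide candidate is a weaker name match (assumed 0.7).
    wide = repo_wide.get(name, [])
    if len(wide) == 1:
        return Resolution(wide[0], "name_match", 0.7)
    return None


print(resolve_base("Animal", {"Animal": "zoo/base.py::Animal"}, {}, {}))
print(resolve_base("Animal", {}, {}, {"Animal": ["a.py::Animal", "b.py::Animal"]}))
```

Returning `None` for ambiguous names is what keeps false hierarchy edges out of the graph: a missing edge is easier for an agent to reason about than a wrong one.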

How it works

 Your agent (Claude Code / Cursor / …)   ── guided by the reposkein skill
        │  MCP
        ▼
 @reposkein/mcp        semantic_find · get_context_profile · impact · get_temporal_context
   (TypeScript)        read_cypher · write_semantic_summary · init_cpg_skeleton · reindex_file
                       CLI: init · doctor · index · view
        │ reads
        ▼
 .reposkein/*.jsonl   ← the code graph, committed to git (zero-infra, in-memory store)
        ▲ writes
        │
 reposkein-indexer    Tree-sitter parse → stable IDs → canonical JSONL
   (Rust)             + git hooks & a 3-way merge driver for conflict-free summaries

  • Structure is static. The skeleton comes only from parsing: identical code produces a byte-identical graph (a CI-tested invariant), independent of who runs it.

  • Meaning is just-in-time. Summaries are written as the agent visits nodes; they're content-hash-stamped (so they flag stale when code changes) and committed to git.

  • Local-first. The committed JSONL is the source of truth; the optional Neo4j backend is a reconstructable projection most users never need.
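Two of the properties above, byte-identical output and hash-stamped summaries, can be sketched in a few lines. This is an illustration of the general technique only: the record fields, the choice of SHA-256, and the helper names are assumptions made for the example, not RepoSkein's actual schema.

```python
import hashlib
import json


def canonical_jsonl(nodes: list[dict]) -> str:
    """Serialize records so that equal graphs give byte-identical text."""
    # Sorted keys and fixed separators remove per-record variation;
    # sorting the lines removes dependence on discovery order.
    lines = [json.dumps(n, sort_keys=True, separators=(",", ":")) for n in nodes]
    return "\n".join(sorted(lines)) + "\n"


def content_hash(source: str) -> str:
    """Hash the source text a summary was written against (SHA-256 assumed)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def stamp_summary(text: str, source: str) -> dict:
    """Attach the hash of the current source to a summary."""
    return {"text": text, "content_hash": content_hash(source)}


def is_stale(summary: dict, current_source: str) -> bool:
    """A summary is stale once the code it describes has changed."""
    return summary["content_hash"] != content_hash(current_source)


a = [{"id": "f1", "kind": "function"}, {"kind": "class", "id": "c1"}]
b = [{"id": "c1", "kind": "class"}, {"kind": "function", "id": "f1"}]
print(canonical_jsonl(a) == canonical_jsonl(b))

summary = stamp_summary("Validates a JWT.", "def validate_jwt(t): ...")
print(is_stale(summary, "def validate_jwt(t, leeway=0): ..."))
```

Because staleness is a pure comparison of hashes, it can be checked at read time without re-running any analysis.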

Cross-repo federation

Got nested repositories (a monorepo of indexed repos)? RepoSkein discovers them, links them with FEDERATES_TO, and stitches cross-repo call, import, and heritage edges (INHERITS/IMPLEMENTS to a base in a child repo) at load time. Pass federated: true to traverse across repo boundaries. Federation edges are derived at load (never committed), so each repo stays independently deterministic.

Visualize the graph: the constellation viewer

reposkein-mcp view .          # opens http://127.0.0.1:<port> in your browser

view starts a local, read-only, zero-infra web app (React + three.js, bound to 127.0.0.1) that renders the committed .reposkein graph as an interactive 3D astronomy-style constellation. There's no Neo4j and no external service: it reads the committed JSONL directly and never mutates it. Try the live demo → (RepoSkein viewing its own multi-language graph).

The map is deterministic: a seeded force layout means the same graph always lays out the same way (cached in IndexedDB for instant reloads), and the layout is render-time only, so it never touches the committed JSONL. Levels of detail map onto an astronomy metaphor (Repository → Directory → File → Symbol become galaxy → constellation → solar-system → star), so you zoom or click to expand a cluster (a brief supernova animation) and click a star to inspect it. Federation galaxies and agent-written summaries render when present.
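The core of a seeded layout is that every random choice comes from a generator with a fixed seed, applied to nodes in a fixed order. The viewer itself is a React + three.js app, so the Python below is only a sketch of the principle; the function name and the seed value are hypothetical.

```python
import random


def seeded_positions(
    node_ids: list[str], seed: int = 42
) -> dict[str, tuple[float, float, float]]:
    """Assign reproducible starting coordinates for a force layout."""
    rng = random.Random(seed)  # private generator: no global state involved
    positions = {}
    # Sorting makes the result independent of the order nodes arrive in.
    for node_id in sorted(node_ids):
        positions[node_id] = (
            rng.uniform(-1.0, 1.0),
            rng.uniform(-1.0, 1.0),
            rng.uniform(-1.0, 1.0),
        )
    return positions


first = seeded_positions(["auth.validate_jwt", "auth.login", "config.keys"])
second = seeded_positions(["config.keys", "auth.login", "auth.validate_jwt"])
print(first == second)
```

If the force simulation that follows is itself deterministic, identical starting positions lead to an identical final layout, which is also what makes caching the result safe.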

  • Legible: per-edge-type colors + legend, importance-sized stars, adaptive labels, breadcrumb, per-language galaxy coloring, depth fog / bloom / nebula halos.

  • Edges encode resolution: color = edge type (CALLS/IMPORTS/INHERITS/IMPLEMENTS/INSTANTIATES), opacity = confidence (exact/name_match/ambiguous), and flow particles show call direction.

  • Analytical: one-click lenses (call graph / type hierarchy / imports / tests), an impact overlay (transitive callers + covering tests), a confidence-audit mode (see where the type-free resolver guesses), and a temporal-coupling overlay (git co-change).

  • Explorable: ranked search-to-fly, N-hop neighborhood focus, source peek in the detail panel (a path-guarded read-only file slice + an "Open in editor" vscode:// link), keyboard nav (/ search, f frame-all, arrows to hop neighbors, Esc back), a minimap, and PNG screenshot export.

  • Guided tour: a cinematic, deterministically-derived flythrough (overview → largest modules → busiest hub → type hierarchy → entry point) with captions.

reposkein-mcp view --export ./site .   # write a self-contained static site

--export bakes the graph into graph-data.js (as window.__REPOSKEIN_GRAPH__) and emits a self-contained static site. It works from file:// or any static host with no server, which is exactly how the live demo above is published. Handy for sharing a snapshot, embedding in docs, or a project landing page.
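Baking data into a script file is what lets the site load from file://, where fetching a separate JSON file is typically blocked by the browser. The sketch below shows the general idea, assuming the global holds a flat list of JSONL records; the real payload shape is not documented here, and `render_graph_data_js` is a hypothetical helper.

```python
import json


def render_graph_data_js(jsonl_text: str) -> str:
    """Turn JSONL graph records into a script that sets a browser global."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    # json.dumps escapes non-ASCII by default, so the output is also a
    # valid JavaScript literal.
    payload = json.dumps(records, sort_keys=True)
    return f"window.__REPOSKEIN_GRAPH__ = {payload};\n"


jsonl = '{"id": "f1", "kind": "function"}\n{"id": "c1", "kind": "class"}\n'
script = render_graph_data_js(jsonl)
print(script)
```

A page can then load the data with an ordinary script tag and read it from the global, with no server in between.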

MCP tools

| Tool | What it does |
| --- | --- |
| semantic_find | find where to start: rank functions/classes by meaning (lexical BM25F; optional embeddings) |
| get_context_profile | resolve a function/class → its caller/callee neighborhood as ready-to-read prose |
| impact | transitive callers split into impacted code vs covering tests |
| get_temporal_context | git-derived co-change, churn, and ownership for a file |
| read_cypher | read-only graph queries (writes rejected, results capped) |
| write_semantic_summary | attach a hash-stamped summary to a node |
| init_cpg_skeleton | build/rebuild the graph |
| reindex_file | refresh after editing a file |

The reposkein-mcp CLI adds init (set up a repo), doctor (health check), index (rebuild the graph), and view (the constellation viewer; --export <dir> writes a self-contained static site).
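For a feel of what a co-change figure such as "conf 0.8" can mean, here is one common definition: of the commits that touched a file, the fraction that also touched another file. Whether get_temporal_context uses exactly this formula is an assumption; the commit data and the function name `co_change` are made up for the example.

```python
from collections import Counter


def co_change(commits: list[set[str]], path: str) -> dict[str, float]:
    """For each other file, the share of `path`'s commits that also touched it."""
    touching = [files for files in commits if path in files]
    if not touching:
        return {}
    counts = Counter(f for files in touching for f in files if f != path)
    return {f: n / len(touching) for f, n in counts.items()}


# Hypothetical history: each set is the files changed by one commit.
history = [
    {"src/auth/jwt.py", "config/keys.py"},
    {"src/auth/jwt.py", "config/keys.py"},
    {"src/auth/jwt.py", "config/keys.py", "README.md"},
    {"src/auth/jwt.py", "config/keys.py"},
    {"src/auth/jwt.py"},
    {"README.md"},
]
print(co_change(history, "src/auth/jwt.py"))
```

The per-commit file sets could be gathered from `git log --name-only`; the measure is directional, so the confidence from jwt.py to keys.py can differ from the reverse.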

Optional: semantic embeddings

By default semantic_find is deterministic and lexical (BM25F: zero-infra, no keys). You can opt into a hybrid tier (lexical + embedding cosine, fused via RRF) for fuzzier queries. It's default-off, vectors are cached in .reposkein/local/embeddings/ (gitignored, never committed), and it falls back to lexical automatically on any error. Set env vars on the MCP server and pick one:

A) Voyage AI: cloud, easiest, best for code
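Reciprocal rank fusion (RRF) combines rankings using only positions, so lexical and cosine scores never need to be put on the same scale. Below is a minimal sketch of standard RRF; the constant k = 60 is the value commonly used in the literature and is an assumption here, as is the tie-break by name.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name to stay deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


lexical = ["auth.validate_jwt", "auth.login", "config.load_keys"]
embedding = ["auth.login", "config.load_keys", "auth.validate_jwt"]
print(rrf_fuse([lexical, embedding]))
# ['auth.login', 'auth.validate_jwt', 'config.load_keys']
```

An item that ranks well in both lists beats one that tops a single list, which is the behaviour you want when one tier is noisy.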

Get a key, then:

REPOSKEIN_EMBED_PROVIDER=voyage
VOYAGE_API_KEY=pa-...
# optional: REPOSKEIN_EMBED_MODEL=voyage-code-3   # default, code-specialized

Sends document strings (qualified names, signatures, summaries) to Voyage's API. Use B or C if you can't egress code.

B) Ollama: local, off-the-shelf, no key

ollama pull nomic-embed-text     # 768-dim (or mxbai-embed-large=1024, bge-m3=1024)
REPOSKEIN_EMBED_PROVIDER=http
REPOSKEIN_EMBED_URL=http://127.0.0.1:11434/v1/embeddings
REPOSKEIN_EMBED_MODEL=nomic-embed-text
REPOSKEIN_EMBED_DIMS=768          # must match the model

C) Voyage's open model, self-hosted: offline + Voyage quality

voyage-4-nano (Apache-2.0) is a custom Qwen3-based model Ollama can't run, so RepoSkein ships a prebuilt server. The image is published to GHCR, public and multi-arch (amd64/arm64), so there's nothing to build:

docker run -p 8080:8080 -v reposkein-hf:/root/.cache/huggingface \
  ghcr.io/reposkein/reposkein-embed          # auto-picks your architecture; first run downloads the model
REPOSKEIN_EMBED_PROVIDER=http
REPOSKEIN_EMBED_URL=http://127.0.0.1:8080/v1/embeddings
REPOSKEIN_EMBED_MODEL=voyage-4-nano
REPOSKEIN_EMBED_DIMS=1024         # must equal the server's EMBED_DIMS

Everything stays on your machine. The image is CPU-only and runs with no NVIDIA GPU on Apple Silicon / ARM unified-memory, x64 Linux, and Windows (CI builds + smoke-tests both arches). Docker can't use Apple's Metal/MPS; for that, run the server natively with EMBED_DEVICE=mps. Full details (root docker compose up, GPU, other models): embed-server/README.md.

REPOSKEIN_EMBED_DIMS on the client must match the model's output dimension, or cosine scoring is skipped.
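The dimension rule exists because cosine similarity is only defined for vectors of equal length. A small sketch of a guard in that spirit follows; returning `None` to mean "skip cosine scoring" is an assumption made for the example, and `cosine_or_skip` is a hypothetical name.

```python
import math
from typing import Optional


def cosine_or_skip(
    query: list[float], doc: list[float], expected_dims: int
) -> Optional[float]:
    """Cosine similarity, or None when a vector has the wrong dimension."""
    if len(query) != expected_dims or len(doc) != expected_dims:
        return None  # mismatch: the caller falls back to lexical ranking
    dot = sum(q * d for q, d in zip(query, doc))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(d * d for d in doc))
    return dot / norm if norm else None


print(cosine_or_skip([1.0, 0.0], [1.0, 0.0], expected_dims=2))       # 1.0
print(cosine_or_skip([1.0, 0.0], [1.0, 0.0, 0.0], expected_dims=2))  # None
```

Checking lengths up front matters because `zip` would otherwise silently truncate the longer vector and return a plausible but meaningless score.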

Optional: Neo4j backend

The zero-infra JSONL store is the default. Neo4j is an optional projection for very large graphs and raw Cypher at scale:

docker compose --profile neo4j up -d          # from the repo root
NEO4J_PASSWORD=reposkeintest reposkein-indexer load .

Then set REPOSKEIN_STORE=neo4j + the NEO4J_* env vars on the MCP server. (REPOSKEIN_STORE=auto, the default, uses JSONL when present and falls back to Neo4j only if configured.)
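The selection rule in the parenthesis can be written as a short function. This sketch covers only the two values named above (neo4j and auto); the returned labels and the function name are hypothetical, and the real server may accept other settings.

```python
def choose_store(setting: str, jsonl_present: bool, neo4j_configured: bool) -> str:
    """Pick a graph backend from the REPOSKEIN_STORE setting."""
    if setting == "neo4j":
        return "neo4j"  # explicit opt-in always wins
    if setting != "auto":
        raise ValueError(f"unsupported setting in this sketch: {setting!r}")
    if jsonl_present:
        return "jsonl"  # committed JSONL is preferred whenever it exists
    if neo4j_configured:
        return "neo4j"  # fallback only when Neo4j has been configured
    raise RuntimeError("no graph store available: index the repo first")


print(choose_store("auto", jsonl_present=True, neo4j_configured=True))
print(choose_store("auto", jsonl_present=False, neo4j_configured=True))
```

Note that under auto a configured Neo4j instance is ignored while the JSONL files exist, which matches the local-first default.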

Benchmarks

Two tracks, both under mcp/bench/:

  • Track 1, retrieval efficiency (deterministic, no LLM): RepoSkein vs a grep agent on hand-labeled tasks → mean ~8.4× fewer context tokens on structural queries, at F0.5 = 1.00 vs grep 0.11–0.71. Details: mcp/bench/README.md.

  • Track 2, end-task (SWE-bench-Verified): a minimal agent loop where the only difference is the navigation toolset (RepoSkein vs grep), graded on resolve-rate + tokens + turns. Built + unit-tested; the API+Docker run is opt-in.
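For readers unfamiliar with the metric in Track 1, F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The function below is the standard formula; the sample precision and recall values are illustrative only and are not taken from the benchmark.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


print(f_beta(1.0, 1.0))  # 1.0: every returned function is right, none missed
print(f_beta(0.5, 1.0))  # full recall cannot make up for low precision
```

A precision-weighted score suits this comparison because every irrelevant result costs context tokens, which is the quantity the track measures.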

Build from source

Requirements: Rust (stable), Node 24. Docker only for the optional Neo4j backend.

cd indexer && cargo build --release        # → indexer/target/release/reposkein-indexer
cd ../mcp  && npm install && npm run build

Wire it into your agent with command: node, args: [".../mcp/dist/index.js"], env REPOSKEIN_REPO_PATH + REPOSKEIN_INDEXER_BIN. Tests: cd indexer && cargo test && cargo clippy --all-targets -- -D warnings; cd mcp && npm test.

Repository layout

indexer/      Rust workspace: core, lang-{python,ts,rust,go,java,csharp}, lang-common, neo4j-io, cli
mcp/          @reposkein/mcp: the TypeScript MCP server (tools + graph-store backends)
mcp/bench/    benchmarks: retrieval efficiency (Track 1) + end-task SWE-bench harness (Track 2)
skills/       reposkein-graph-rag + reposkein-setup: cross-agent skills (skills.sh)
embed-server/ one-command local embedding server (voyage-4-nano) for hybrid semantic_find
viz/          @reposkein/viz: the 3D constellation viewer SPA (served by `reposkein-mcp view`)

Documentation

| Doc | What's in it |
| --- | --- |
| mcp/README.md | the @reposkein/mcp package: tools, config, env vars |
| viz/README.md | the @reposkein/viz constellation viewer: architecture, dev/build |
| embed-server/README.md | the local embedding server: Docker/GHCR, platforms, GPU |
| mcp/bench/README.md | Track 1 retrieval benchmark: method + results |
| mcp/bench/track2/README.md | Track 2 end-task (SWE-bench) harness |
| CHANGELOG.md | release history (Keep a Changelog) |
| skills/ | the two cross-agent skills |

Contributing

Contributions are welcome: bug fixes, new languages, docs. See CONTRIBUTING.md for the dev setup, the determinism invariants you must preserve, and the step-by-step recipe for adding a new language (it's a well-trodden path: Go, Java, and C# were each added the same way). RepoSkein uses Conventional Commits and keeps CI green (determinism gates + clippy + tests).

Acknowledgements

Contact

License

Apache-2.0.
