tree-sitter-analyzer

by aimasteracc

🌳 Tree-sitter Analyzer

English | 日本語 | 简体中文

The MCP code-intelligence server for AI agents: fewer tokens, fewer tool calls, 100 % local. Pre-indexed AST cache + 62 MCP tools + 13 curated agent skills + TOON-compressed output. Beats CodeGraph on the 6-repo head-to-head median (−11 % cost vs CodeGraph's −4 %), with a strict CLI superset. Now with BM25-ranked symbol search across all 62 tools, so results are sorted by relevance, not file path.



Get Started

One-line install for Claude Code:

claude mcp add tree-sitter-analyzer \
  --env TREE_SITTER_PROJECT_ROOT="$PWD" \
  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

Restart your agent, then say: "Set the project root to my repo and run codegraph_status."

Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code): see Supported Agents.


Why Tree-sitter Analyzer

  • Token-efficient by default. Every MCP response uses TOON — a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.

  • Verdict envelopes. Every response carries verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND, so orchestrators branch on outcomes without re-prompting.

  • Project health grading (A–F). No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / structure / git-hotspots in one call.

  • 13 curated workflows (Skills). Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.

  • 5 layers of safety. safe_to_edit + modification_guard + constraint DSL + change_impact + verdict envelopes — designed so agents know before they touch.

  • Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks. See below.


Benchmark Results

Headless Claude Code (Haiku 4.5) was asked one architecture question per repo across 3 arms: no MCP (baseline), CodeGraph MCP, and Tree-sitter Analyzer MCP (TSA). One run per arm, so the numbers are indicative, not statistically settled.

| Codebase | Lang / files | Baseline | CodeGraph | TSA | Winner |
|---|---|---|---|---|---|
| Gin | Go / 99 | $0.164 | $0.094 (−43 %) | $0.080 (−51 %) | TSA ⭐ |
| Alamofire | Swift / 98 | $0.201 | $0.219 (+9 %) | $0.147 (−27 %) | TSA ⭐ |
| Excalidraw | TS / 603 | $0.204 | $0.179 (−12 %) | $0.212 (+4 %) | CodeGraph |
| Django | Py / 2 910 | $0.162 | $0.106 (−35 %) | $0.205 (+27 %) | CodeGraph |
| Tokio | Rust / 778 | $0.214 | $0.285 (+33 %) | $0.303 (+42 %) | both lose |
| OkHttp | Java / 596 | $0.169 | $0.200 (+18 %) | $0.178 (+5 %) | both lose |
| Median Δ vs baseline | | | −4 % | −11 % | TSA |

TSA wins outright on 2 of 6 repos, has the larger median cost saving (−11 % vs −4 %), and matches CodeGraph's reported direction on every repo where the indexer-class tools should help.
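The per-repo percentages in the table are each arm's cost relative to the no-MCP baseline. A small sketch that recomputes them from the dollar figures; the `winner` rule is inferred from the Winner column and is an assumption, not the project's scoring script.

```python
# (baseline, CodeGraph, TSA) cost in USD per repo, copied from the benchmark table.
COSTS = {
    "Gin": (0.164, 0.094, 0.080),
    "Alamofire": (0.201, 0.219, 0.147),
    "Excalidraw": (0.204, 0.179, 0.212),
    "Django": (0.162, 0.106, 0.205),
    "Tokio": (0.214, 0.285, 0.303),
    "OkHttp": (0.169, 0.200, 0.178),
}


def delta_pct(cost, baseline):
    """Relative cost change vs baseline, rounded to a whole percent."""
    return round((cost - baseline) / baseline * 100)


def winner(baseline, codegraph, tsa):
    """Assumed rule: an arm can only win if it is cheaper than the baseline."""
    if codegraph >= baseline and tsa >= baseline:
        return "both lose"
    return "TSA" if tsa < codegraph else "CodeGraph"


for repo, (base, cg, tsa) in COSTS.items():
    cg_delta = delta_pct(cg, base)
    tsa_delta = delta_pct(tsa, base)
    print(f"{repo:<11} CodeGraph {cg_delta:+d} %  TSA {tsa_delta:+d} %  {winner(base, cg, tsa)}")
```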

Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.md for raw envelopes + reproducer scripts.

Post-benchmark improvements (2026-05-30): (1) BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank, a 133× speedup in semantic search. (2) Min-max BM25 normalization: relevance_score now properly differentiates strong matches (1.0) from weak ones (0.0) across all search paths. (3) semantic().sort(by='confidence') now works end-to-end. These improvements were not in the benchmark run; repos with large symbol counts (Django, Excalidraw) should see improved token efficiency in re-runs.
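Improvements (1) and (2) can be pictured with a toy two-stage search. This is a sketch under stated assumptions, not the project's code: the lexical scores stand in for BM25 and are treated as "higher is better", the two-dimensional embeddings are made up, and `two_stage_search` and `min_max` are hypothetical names.

```python
import math


def min_max(scores):
    """Rescale scores to [0, 1]: the best score maps to 1.0, the weakest to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # assumption: identical scores all count as best matches
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(candidates, query_vec, keep):
    """candidates: list of (symbol, lexical_score, embedding) tuples."""
    # Stage 1: cheap lexical pre-filter keeps only the `keep` best-scoring symbols.
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:keep]
    # Stage 2: the more expensive cosine comparison runs on the shortlist only.
    reranked = sorted(shortlist, key=lambda c: cosine(c[2], query_vec), reverse=True)
    return [symbol for symbol, _, _ in reranked]


# Made-up candidates: (symbol, lexical score, embedding).
candidates = [
    ("parse_file", 7.2, [0.9, 0.1]),
    ("parse_tree", 6.8, [0.7, 0.7]),
    ("render_html", 0.4, [0.1, 0.9]),
    ("file_parser", 5.1, [1.0, 0.0]),
]

ranked = two_stage_search(candidates, query_vec=[1.0, 0.0], keep=3)
print(ranked)
print(min_max([7.2, 6.8, 5.1]))
```

The speedup comes from stage 2 touching only the shortlist; the normalization makes scores comparable across queries whose raw score ranges differ.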


Key Features

Pre-indexed code intelligence (CodeGraph parity + superset)

| Capability | TSA tool | Status |
|---|---|---|
| Symbol search (FTS5 + BM25 ranked) | codegraph_symbol_search | ahead: results sorted by relevance score, not file path |
| Go-to-def / find-refs / call hierarchy in one call | codegraph_navigate | PRIMARY entry point |
| Bulk-fetch N related symbols + relationship map | codegraph_explore | parity |
| Function-level blast radius + risk score | codegraph_impact | parity + risk score |
| Who-calls-X / what-X-calls | codegraph_callers / codegraph_callees | parity |
| Index health at-a-glance (+ edge count) | codegraph_status | ahead: reports total_edges for graph density signal |
| Pre-built call graph cache | codegraph_autoindex / codegraph_full_index / codegraph_incremental_sync | parity |
| Tests affected by a change (CLI) | --affected FILE... | parity |
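For readers unfamiliar with FTS5, here is a minimal sketch of BM25-ranked symbol lookup using Python's built-in sqlite3 module. The `symbols` table and its columns are hypothetical (the real index.db schema is not documented here), and the example assumes your SQLite build includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute("CREATE VIRTUAL TABLE symbols USING fts5(name, path)")
conn.executemany(
    "INSERT INTO symbols (name, path) VALUES (?, ?)",
    [
        ("parse_file", "src/parser.py"),
        ("parse_tree", "src/tree.py"),
        ("render_html", "src/render.py"),
    ],
)


def search(term):
    # bm25() returns lower (more negative) values for better matches,
    # so ascending order puts the most relevant row first.
    return conn.execute(
        "SELECT name, path, bm25(symbols) FROM symbols "
        "WHERE symbols MATCH ? ORDER BY bm25(symbols)",
        (term,),
    ).fetchall()


hits = search("parse")
for name, path, score in hits:
    print(f"{score:9.4f}  {name}  {path}")
```

The default tokenizer splits on underscores and slashes, which is why the query "parse" finds parse_file and parse_tree but not src/parser.py.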

Tree-sitter Analyzer exclusive

| Capability | TSA tool | Note |
|---|---|---|
| BM25-ranked symbol search | all search tools | relevance_score on every result (min-max normalized: best=1.0, weakest=0.0); sort(by='confidence') in DSL |
| Semantic search (133× faster) | codegraph_query semantic() | BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank |
| Project A–F health grading | check_project_health | 7 dimensions (size/complexity/deps/coverage/duplication/structure/git-hotspot); no competitor offers this |
| TOON output | every tool, output_format: "toon" (default) | 50-70 % token saving |
| Verdict envelopes | every tool | SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND |
| Safe-to-edit gate | safe_to_edit + modification_guard | refuses high-risk edits before they happen |
| Architectural constraint DSL | check_constraints | "module A cannot import B" → enforced |
| Code health (file-level) | check_file_health | block/long-method/smell detection |
| Class hierarchy | codegraph_class_hierarchy | type-inheritance tree |
| Dependency matrix | codegraph_dependency_matrix | module-coupling matrix |
| Dead code | codegraph_dead_code | transitive unreachable analysis |
| Complexity heatmap | codegraph_complexity_heatmap | per-fn cyclomatic + project view |
| AST-structural clone detection | codegraph_similarity | beyond text similarity |
| Mermaid call-graph export | codegraph_visualize | paste-ready in docs |
| UML Mermaid export | codegraph_uml | class / package / component / sequence diagrams |
| PR review | codegraph_pr_review | AST-diff + semantic classify + blast radius |
| agent_summary | every response | next-step hint baked into the envelope |
| Synapse cross-file resolver | internal | import-aware, beats regex guessing |
| Temporal activation | symbol_lineage | per-symbol git-modification frequency |
| One-shot file orientation | smart_context | health + exports + deps + edit-risk in one call (replaces 3-4 calls) |
| Architectural decision journal | decision_journal | persists reasoning across sessions; no competitor exposes this |
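Because every response carries a verdict, an orchestrator can branch on it without another model call. A minimal sketch, assuming the envelope is a dict with the verdict, agent_summary and data fields named in this README; the action names and the `next_action` helper are hypothetical.

```python
from enum import Enum


class Verdict(str, Enum):
    SAFE = "SAFE"
    CAUTION = "CAUTION"
    UNSAFE = "UNSAFE"
    INFO = "INFO"
    WARN = "WARN"
    ERROR = "ERROR"
    NOT_FOUND = "NOT_FOUND"


# Hypothetical orchestrator policy: the action names are illustrative only.
ACTIONS = {
    Verdict.SAFE: "proceed",
    Verdict.INFO: "proceed",
    Verdict.CAUTION: "ask_for_review",
    Verdict.WARN: "ask_for_review",
    Verdict.UNSAFE: "abort_edit",
    Verdict.ERROR: "report_failure",
    Verdict.NOT_FOUND: "refine_query",
}


def next_action(envelope):
    """Map an envelope's verdict to the orchestrator's next step."""
    verdict = Verdict(envelope["verdict"])  # raises ValueError on an unknown verdict
    return ACTIONS[verdict]


# Made-up envelope with the three fields the README names.
envelope = {"verdict": "UNSAFE", "agent_summary": "12 callers would break", "data": {}}
print(next_action(envelope))
```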

Skills (13 curated workflows)

CodeGraph has zero skills. We ship 13 under .claude/skills/tsa-*/:

tsa-landing, tsa-find, tsa-graph, tsa-structure, tsa-deps, tsa-index, tsa-health-watch, tsa-edit-safety, tsa-edit-then-verify, tsa-constraints, tsa-pr-review, tsa-refactor-queue, tsa-temporal.

Each skill ships an allowed-tools subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 62 tools on every question.

255 CLI flags

Strict superset of CodeGraph's 15-command CLI. Highlights:

tree-sitter-analyzer --table full <file>          # method/signature/complexity table
tree-sitter-analyzer --partial-read --start-line N --end-line M <file>
tree-sitter-analyzer --project-health             # A-F grade across the project
tree-sitter-analyzer --callers <symbol>           # who-calls
tree-sitter-analyzer --codegraph-impact <fn>      # blast radius + risk
tree-sitter-analyzer --affected <file...>         # tests transitively affected
tree-sitter-analyzer --dead-code                  # transitive unreachable
tree-sitter-analyzer --check-constraints          # architectural rules
tree-sitter-analyzer --safe-to-edit <file>        # refuse if risky
tree-sitter-analyzer --uml class                  # Mermaid UML class diagram

See docs/CODEMAPS/cli.md for the full surface.
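Conceptually, flags such as --affected and --codegraph-impact answer "what depends on this, directly or transitively?". The sketch below shows that idea as a breadth-first walk over reversed import edges. It is an illustration only: the `imports` map, the file names and the `transitive_dependents` helper are hypothetical, and the tool's own graph and traversal may work differently.

```python
from collections import deque


def transitive_dependents(imports, changed):
    """imports maps each file to the files it imports; return every file that depends on `changed`."""
    # Reverse the edges: for each file, who imports it?
    reverse = {}
    for src, targets in imports.items():
        for target in targets:
            reverse.setdefault(target, set()).add(src)

    seen, queue = set(), deque(changed)
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            if dependent not in seen:  # also guards against import cycles
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical project layout.
imports = {
    "tests/test_api.py": ["app/api.py"],
    "tests/test_utils.py": ["app/utils.py"],
    "app/api.py": ["app/db.py"],
    "app/db.py": [],
    "app/utils.py": [],
}

affected = transitive_dependents(imports, ["app/db.py"])
affected_tests = sorted(f for f in affected if f.startswith("tests/"))
print(affected_tests)
```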


Quick Start

1. Install dependencies

# uv (required)
curl -LsSf https://astral.sh/uv/install.sh | sh        # macOS / Linux
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"  # Windows

# fd + ripgrep (required for search)
brew install fd ripgrep                                # macOS
winget install sharkdp.fd BurntSushi.ripgrep.MSVC      # Windows

2. Install Tree-sitter Analyzer

uv add "tree-sitter-analyzer[all,mcp]"

3. Hook it into your agent

See Supported Agents. Most clients want this MCP server entry:

{
  "mcpServers": {
    "tree-sitter-analyzer": {
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
    }
  }
}

After restart: "Set the project root to my repo and call codegraph_status."
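If you generate the entry from a script, resolving the path first avoids a relative project root. A minimal sketch that builds the same JSON as above; the `mcp_server_entry` helper is hypothetical and not part of the package.

```python
import json
from pathlib import Path


def mcp_server_entry(project_root):
    """Build the mcpServers entry with an absolute TREE_SITTER_PROJECT_ROOT."""
    root = Path(project_root).expanduser().resolve()  # resolve() always yields an absolute path
    return {
        "mcpServers": {
            "tree-sitter-analyzer": {
                "command": "uvx",
                "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
                "env": {"TREE_SITTER_PROJECT_ROOT": str(root)},
            }
        }
    }


config = mcp_server_entry(".")
print(json.dumps(config, indent=2))
```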


How It Works

Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)
                                         ↓
        codegraph_navigate / codegraph_explore / codegraph_callers / ...
                                         ↓
                            TOON-compressed envelope
                            (verdict + agent_summary + data)
                                         ↓
                              MCP client / CLI consumer

The index is built lazily on the first query and refreshed on file change via a content-hash diff (codegraph_incremental_sync). All 62 tools read from the same .ast-cache/, so a query and its follow-up share work.
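A content-hash diff can be sketched in a few lines: hash every file, compare against the previous snapshot, and re-parse only what differs. This is an illustration of the general technique, not the code behind codegraph_incremental_sync; `snapshot`, `diff` and the choice of SHA-256 are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root, pattern="*.py"):
    """Map each matching file (relative path) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob(pattern)
        if p.is_file()
    }


def diff(old, new):
    """Compare two snapshots; only these files would need re-indexing."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("def a(): return 1\n")
    (root / "b.py").write_text("def b(): return 2\n")
    before = snapshot(root)

    (root / "a.py").write_text("def a(): return 42\n")  # edited
    (root / "b.py").unlink()                            # deleted
    (root / "c.py").write_text("def c(): return 3\n")   # new
    delta = diff(before, snapshot(root))

print(delta)
```

Hashing content rather than comparing modification times means a file that was touched but not changed is skipped.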


Supported Agents

Claude Code:

claude mcp add tree-sitter-analyzer \
  --env TREE_SITTER_PROJECT_ROOT="$PWD" \
  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

Verify with claude mcp list. The 13 tsa-* skills are auto-discovered from .claude/skills/.

Claude Desktop: edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\, Linux: ~/.config/Claude/):

{
  "mcpServers": {
    "tree-sitter-analyzer": {
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
    }
  }
}

VS Code (GitHub Copilot): create .vscode/mcp.json (note: servers, not mcpServers):

{
  "servers": {
    "tree-sitter-analyzer": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }
    }
  }
}

Cursor, Cline, Continue, and Roo Code all read the same mcpServers schema as Claude Desktop. Cursor: Settings → MCP. Cline: MCP panel → Edit settings. Continue: ~/.continue/config.json under experimental.modelContextProtocolServers. Roo Code: MCP panel → Edit MCP Settings.

⚠️ TREE_SITTER_PROJECT_ROOT must be an absolute path. The server enforces a security boundary against escapes via SecurityBoundaryManager.
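A boundary check of this kind usually resolves both paths and verifies that the target stays under the root. The sketch below shows that general technique with pathlib; it is not the SecurityBoundaryManager implementation, and `is_inside_root` is a hypothetical helper.

```python
from pathlib import Path


def is_inside_root(root, candidate):
    """Return True only if `candidate` resolves to the root itself or a path beneath it."""
    root_path = Path(root).resolve()
    # Joining with an absolute candidate discards the root, so absolute
    # paths outside the project are rejected by the check below.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents


# Illustrative root; the paths do not need to exist for the check to run.
project = "/tmp/project"
print(is_inside_root(project, "src/main.py"))
print(is_inside_root(project, "../secrets.txt"))
print(is_inside_root(project, "/etc/passwd"))
```

Resolving before comparing matters: a plain string-prefix test would accept a path such as src/../../secrets.txt.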


Supported Languages

21 language plugins: 13 fully wired into the indexer (full symbol + call graph), 5 data/markup languages reachable via the single-file CLI path, and 3 scaffolds (plugin exists, indexer wiring pending). The 2026-05-24 patch unblocked Swift / Kotlin / Ruby / PHP / C#, which had been silently skipped for months.

| Tier | Languages |
|---|---|
| Full index + symbol + call graph | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |
| Single-file analysis (CLI) | HTML · CSS · Markdown · SQL · YAML |
| Scaffold (plugin exists, indexer wiring pending) | bash · scala · json |

CodeGraph supports a similar set; the only popular code languages neither tool ships yet are Dart, Vue, Svelte, Lua (next-sprint backlog).


Configuration

Mostly nothing. The defaults are designed so you can hook it into your agent and forget:

  • Output format: TOON. Override per-call with output_format: "json".

  • Project root: TREE_SITTER_PROJECT_ROOT (env var, MCP) or --project-root (CLI).

  • Cache location: <project>/.ast-cache/. Safe to delete — auto-rebuilds.

  • Optional: TREE_SITTER_OUTPUT_PATH for large-output write target.


Quality & Testing

| Metric | Value |
|---|---|
| Tests passed | 18,702 ✅ |
| Coverage | see the Coverage badge |
| Type safety | 100 % mypy |
| Platforms | macOS · Linux · Windows |
| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |

uv run pytest -q                                # full suite
uv run python check_quality.py --new-code-only  # quality gate

Troubleshooting

| Symptom | Fix |
|---|---|
| unsupported language on .swift / .kt / .rb / .php / .cs | Update to ≥ 1.12.x; the 5-language gap was patched in commit 50e99a8f. |
| MCP server doesn't appear in client | TREE_SITTER_PROJECT_ROOT must be absolute; restart the client after config edit. |
| database is locked | Stop any other process holding .ast-cache/index.db; if persistent, rm -rf .ast-cache && tree-sitter-analyzer --autoindex. |
| Slow first call | First call builds the index. Subsequent calls are sub-second. Run --full-index upfront to amortise. |
| Agent picks the wrong tool | Use a tsa-* skill (/tsa-graph, /tsa-find, ...); each skill restricts the visible tool set to one workflow. |


Development

git clone https://github.com/aimasteracc/tree-sitter-analyzer.git
cd tree-sitter-analyzer
uv sync --extra all --extra mcp
uv run pytest -q

See docs/CONTRIBUTING.md for the development guide.


Contributing & License

  • ⭐ A GitHub star helps surface this tool to other AI-agent users.

  • 💖 Sponsor — supports continued MCP / Skills development.

  • Lead sponsor: @o93.

  • MIT licensed — see LICENSE.

  • Release history: CHANGELOG.md.
