How do I use Harpyja?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Harpyja where is the retry/backoff logic for the payment gateway?" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

Harpyja

by DCSTOLF

Overview Schema Related Servers Score Discussions

Python

Local

Harpyja

⚠️ Experimental — not production-ready
This project is entirely experimental. It is a research/work-in-progress prototype: APIs, schemas, defaults, and behavior change without notice; the documented hardware footprint is not validated (see the caveat below); and real-data evaluation is ongoing and indicative_only. Do not depend on it for anything you can't afford to have break. Use at your own risk.

A precision code-retrieval MCP server for coding agents working in large, legacy, and air-gapped codebases.

Harpyja is a Model Context Protocol server with one job: given a natural-language query, find the exact files and line ranges a coding agent needs across millions of lines of code — without the agent burning its own context window on blind searches.

It is named after Harpia harpyja, the harpy eagle: an apex hunter that locks onto a target in dense canopy and does not miss.

agent ──"where is the retry/backoff logic for the payment gateway?"──▶ Harpyja
                                                                          │
                                       ┌──────────────────────────────────┘
                                       ▼
                          Tier 0  deterministic (AST + ripgrep)
                          Tier 1  Scout      (native tool-calling explorer loop)
                          Tier 2  Deep        (recursive LM, on escalation)
                                                                          │
agent ◀──  src/billing/gateway.py:212-241  ◀──────────────────────────────┘
           tests/test_gateway.py:88-103

Related MCP server: Satori

Why

Coding agents are good at editing code and bad at finding it in repositories that are too big to fit in context. The usual failure mode is the agent spending half its context window grepping around, then reasoning over a polluted history. This is worse in the codebases that need it most: decades of patched proprietary code, no surviving institutional knowledge, no clean README, and nothing that ever entered an LLM's training set.

Harpyja externalizes retrieval into a dedicated subsystem that does it cheaply and precisely, then hands back compact citations. Not every problem needs a million-token context window. Sometimes you need a small, brutally specialized subsystem that does one thing extremely well.

What it is (and isn't)

It is a read-only locator. It returns file:line citations and short rationales.
It is not an editor, a RAG chatbot, or a code generator. It never modifies your repository.
It runs offline. Everything — model, search, parsing — stays on the local machine. No telemetry, no external calls. Suitable for fully air-gapped environments and proprietary code that must stay proprietary.
It fits a modest box. Everything runs against a local OpenAI-compatible endpoint (llama.cpp or Ollama). The default model is hf.co/Qwen/Qwen3-8B-GGUF:latest, which serves both the Scout explorer loop and the Deep tier; any OpenAI-compatible tool-calling model works and is swappable from the eval CLI (--scout-model / --deep-model), harpyja.toml, or HARPYJA_*.
⚠️ Footprint not yet validated. An 8B-class model co-loaded with the Deep model and the Deno/Pyodide sandbox under mode=auto exceeds a small GPU, so treat "8 GB" as an aspirational target rather than a validated minimum. The verification-gate judge scores citations with lm_model by default (verify_method=instruct_model), keeping the finder and the scorer as separate concerns.

How it works

Harpyja is a three-tier locator with cost-based escalation:

Tier	Engine	Role	Speed
0	Tree-sitter symbol index + ripgrep	Deterministic prefilter and exact-symbol lookups	instant
1	Scout — a native tool-calling explorer loop (read-only `grep`/`glob`/`read_span` → `submit_citations`)	The default. Handles most "where is X" queries	fast
2	Deep — a Recursive Language Model (`dspy.RLM`) over bounded host tools	Escalation path for broad/trace/audit queries	slower, thorough

The Orchestrator runs the cheapest tier that can answer, verifies the result by reading the cited lines back, and only escalates when verification fails or the query shape demands it. See ARCHITECTURE.md for the full design and SPEC.md for the contracts.

Tier 1 Scout is a native explorer loop Harpyja owns end-to-end: a general tool-calling model driven over three read-only tools (grep/glob/read_span) to a submit_citations result, behind a stable, swappable backend seam. Tier 2 reimplements the dspy.RLM approach demonstrated by megacode, which serves only as reference and inspiration (not a dependency). Around both, Harpyja adds the language-agnostic indexing, symbol layer, routing, verification, and MCP surface that turn them into a reusable locator.

Note: Earlier versions ran Scout on Microsoft FastContext (a fine-tuned 4B finder wrapped as a pinned dependency); it was retired when its upstream model became unobtainable, and replaced by the self-contained explorer loop above — no external finder dependency, model-agnostic over whatever the local endpoint serves.

MCP tools

Harpyja exposes a deliberately tiny surface:

harpyja_locate(query, repo_path, mode="auto", max_results=8) → ranked file:line citations with rationales.
harpyja_read(path, start, end) → a bounded code snippet (for remote/air-gapped repos the agent can't read directly).
harpyja_index(repo_path, refresh=false) → build/refresh the manifest and symbol index ahead of time.

mode is one of auto | fast | deep. In auto, the Orchestrator decides which tiers to run.

Supported languages (symbol layer)

Tree-sitter symbol extraction ships for Go, Rust, Python, JavaScript/TypeScript, C#, Java, and C/C++. Any other language — or a file that fails to parse — degrades gracefully to ripgrep, so Harpyja never goes blind on an unknown file type.

Requirements

Python 3.12+
ripgrep (rg) on PATH
Deno (the dspy.RLM sandbox runs on Deno/Pyodide WASM — installed once, runs locally)
A local OpenAI-compatible model endpoint: llama.cpp (llama-server) or Ollama
Optional: a CUDA/Metal GPU (the default profile targets a modest local GPU; an 8B-class model serving Scout + Deep with the WASM sandbox co-loaded under mode=auto means 8 GB is not a validated minimum yet)

Install

git clone <your-fork>/harpyja
cd harpyja
uv sync            # or: pip install -e .

Serve a model (pick one)

Ollama

ollama serve
ollama pull <4b-instruct-model>
export HARPYJA_LM_API_BASE="http://localhost:11434/v1"
export HARPYJA_LM_MODEL="<4b-instruct-model>"

llama.cpp

llama-server -m ./models/<model>.gguf --port 8000 --ctx-size 8192
export HARPYJA_LM_API_BASE="http://localhost:8000/v1"
export HARPYJA_LM_MODEL="local"

Wire it into your agent

Claude Code (.mcp.json in your project, or claude mcp add):

{
  "mcpServers": {
    "harpyja": {
      "command": "uv",
      "args": ["run", "harpyja", "serve", "--stdio"],
      "env": { "HARPYJA_LM_API_BASE": "http://localhost:11434/v1" }
    }
  }
}

Codex (~/.codex/config.toml):

[mcp_servers.harpyja]
command = "uv"
args = ["run", "harpyja", "serve", "--stdio"]
env = { HARPYJA_LM_API_BASE = "http://localhost:11434/v1" }

Both speak MCP over stdio. Harpyja also supports streamable HTTP (harpyja serve --http --port 9000) for shared or containerized deployments.

Quick start

# One-time (optional) index for faster first query
uv run harpyja index --repo ~/dev/legacy-monolith

# Ask from the CLI (same path the MCP tool uses)
uv run harpyja locate --repo ~/dev/legacy-monolith \
  --query "where do we validate inbound webhook signatures?" \
  --mode auto

Then, inside Claude Code or Codex, just ask naturally — the agent will call harpyja_locate on its own.

Configuration

Settings load from harpyja.toml (project root) with environment-variable overrides (HARPYJA_*). See SPEC.md for the full table. Common knobs: model endpoints, escalation thresholds, per-tier token budgets, language toggles, search bounds, and the outbound model-call timeout (lm_http_timeout_s, default 120 s — bounds each Gateway HTTP call so a stalled local endpoint degrades instead of hanging).

Project status

Early. The tiers are designed to land incrementally — see IMPLEMENTATION_PLAN.md. Harpyja stays useful at every wave: even Wave 1 (deterministic AST + ripgrep, no model) is a working locator.

Note: The Scout tier sits behind a stable backend seam, so its finder model or runtime can be swapped without touching the orchestrator, verification gate, or the rest of the stack.

License

MIT. Builds on MIT/permissively-licensed upstreams (DSPy, tree-sitter, ripgrep).

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

–Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

Latest Blog Posts

Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly
Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
OpenAI
open source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DCSTOLF/harpyja'

If you have feedback or need assistance with the MCP directory API, please join our Discord server