# CodeGraph agentic tools: 4-tier prompt system

This document explains how CodeGraph’s built-in agent chooses prompt “tiers” (verbosity/strategy) based on your configured LLM context window, and why this makes small local models viable.

## What “tier” means in CodeGraph

When you call an agentic MCP tool (e.g. `agentic_code_search`), CodeGraph runs a server-side agent that:

1. Uses graph tools against SurrealDB (semantic search, dependency tracing, call-chain tracing, hotspots, etc.)
2. Synthesizes a final answer (and structured pinpoint references) using your configured LLM

CodeGraph picks a context tier based on `llm.context_window` (or `CODEGRAPH_CONTEXT_WINDOW`) and then selects:

- A tier-appropriate system prompt (Terse/Balanced/Detailed/Exploratory)
- A recommended tool/step budget per analysis type
- Retrieval and over-retrieval limits to avoid “too much context” and MCP output caps

## The 4 tiers (Small / Medium / Large / Massive)

Tier detection is based purely on the configured context window:

- **Small**: `0..=50_000`
- **Medium**: `50_001..=150_000`
- **Large**: `150_001..=500_000`
- **Massive**: `>500_000` (e.g. multi-hundred-K to 2M context window models)

Where CodeGraph reads it from:

- `CODEGRAPH_CONTEXT_WINDOW` env var (highest priority), else
- `llm.context_window` from config loaded via `ConfigManager::load()`

Both the thresholds and this precedence order are illustrated in the Rust sketches at the end of this section.

## What changes per tier

### Prompt verbosity

Each analysis type has 4 prompt variants, selected via:

- Small → **Terse**
- Medium → **Balanced**
- Large → **Detailed**
- Massive → **Exploratory**

This selection happens in `crates/codegraph-mcp-server/src/prompt_selector.rs`.

### Recommended max steps

The agent uses a base max step count by tier:

- Small: 5
- Medium: 10
- Large: 15
- Massive: 20

Then it applies an analysis-type multiplier (e.g. architecture analysis tends to get a larger budget than code search).

### Retrieval limits (and MCP-safe output)

CodeGraph also scales how much it retrieves:

- Base max results:
  - Small: 10
  - Medium: 25
  - Large: 50
  - Massive: 100
- Over-retrieval multipliers:
  - Local search: 5 / 8 / 10 / 15 (Small→Massive)
  - Cloud+rerank: 3 / 4 / 5 / 8 (Small→Massive)

Separately, MCP responses are capped to stay under common client limits: CodeGraph uses a safe ceiling of **44,200 output tokens** for tool responses, even if your model can generate more.
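To make the mechanics above concrete, here is a minimal Rust sketch of tier detection and the env-var/config precedence. The `ContextTier` enum, the function names, and the fallback default are illustrative assumptions; only the numeric thresholds and the `CODEGRAPH_CONTEXT_WINDOW`-over-config precedence come from this document.

```rust
/// Hypothetical sketch of tier detection; names are illustrative,
/// only the numeric ranges mirror the tiers documented above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContextTier {
    Small,
    Medium,
    Large,
    Massive,
}

fn detect_tier(context_window: u64) -> ContextTier {
    match context_window {
        0..=50_000 => ContextTier::Small,
        50_001..=150_000 => ContextTier::Medium,
        150_001..=500_000 => ContextTier::Large,
        _ => ContextTier::Massive,
    }
}

/// The env var takes priority over `llm.context_window` from config.
fn configured_context_window(config_value: Option<u64>) -> u64 {
    std::env::var("CODEGRAPH_CONTEXT_WINDOW")
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .or(config_value)
        .unwrap_or(50_000) // assumed fallback; the real default may differ
}
```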
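The per-tier prompt variant and base step budget are then simple lookups. Again a sketch reusing `ContextTier` from above: the real selection lives in `crates/codegraph-mcp-server/src/prompt_selector.rs`, and the multiplier shown in the trailing comment is a made-up illustration of the analysis-type scaling, not a documented value.

```rust
/// Tier → prompt variant, per the table above (function names are assumed).
fn prompt_variant(tier: ContextTier) -> &'static str {
    match tier {
        ContextTier::Small => "terse",
        ContextTier::Medium => "balanced",
        ContextTier::Large => "detailed",
        ContextTier::Massive => "exploratory",
    }
}

/// Tier → base max steps, before the analysis-type multiplier is applied.
fn base_max_steps(tier: ContextTier) -> u32 {
    match tier {
        ContextTier::Small => 5,
        ContextTier::Medium => 10,
        ContextTier::Large => 15,
        ContextTier::Massive => 20,
    }
}

// Illustration of the multiplier idea (the 1.5 factor is hypothetical):
// let steps = (base_max_steps(tier) as f32 * 1.5).round() as u32; // architecture analysis
```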
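Retrieval scales the same way. The sketch below encodes the base result counts, the two over-retrieval columns, and the fixed output ceiling; the function names, the `use_cloud_rerank` flag, and the way `main` combines base × multiplier are assumptions.

```rust
/// Tier → base max results for retrieval.
fn base_max_results(tier: ContextTier) -> usize {
    match tier {
        ContextTier::Small => 10,
        ContextTier::Medium => 25,
        ContextTier::Large => 50,
        ContextTier::Massive => 100,
    }
}

/// Over-retrieval multiplier; `use_cloud_rerank` picks between the
/// local-search and cloud+rerank columns documented above.
fn over_retrieval_factor(tier: ContextTier, use_cloud_rerank: bool) -> usize {
    match (use_cloud_rerank, tier) {
        (false, ContextTier::Small) => 5,
        (false, ContextTier::Medium) => 8,
        (false, ContextTier::Large) => 10,
        (false, ContextTier::Massive) => 15,
        (true, ContextTier::Small) => 3,
        (true, ContextTier::Medium) => 4,
        (true, ContextTier::Large) => 5,
        (true, ContextTier::Massive) => 8,
    }
}

/// MCP tool responses are capped below common client limits,
/// regardless of what the model could generate.
const MCP_OUTPUT_TOKEN_CEILING: usize = 44_200;

fn main() {
    let window = configured_context_window(Some(32_000));
    let tier = detect_tier(window);
    // Assumed combination: candidates fetched ≈ base × over-retrieval factor.
    println!(
        "tier={:?} prompt={} steps={} candidates={} cap={}",
        tier,
        prompt_variant(tier),
        base_max_steps(tier),
        base_max_results(tier) * over_retrieval_factor(tier, false),
        MCP_OUTPUT_TOKEN_CEILING
    );
}
```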
## Why small local models can still work well

With “vanilla” vector search, a client-side code agent typically has to:

- guess what to search for,
- run multiple searches,
- fetch large blobs of code,
- spend tokens to stitch and reason over results,
- repeat until it “finds the right area”.

CodeGraph shifts much of that exploration cost into:

- a graph database (structural relationships), and
- an agent that can chain purpose-built graph tools.

So even if your configured model is small (Small/Medium tier), the agent often only needs:

1. a few targeted tool calls to pull the right snippets and relationships, and
2. a short synthesis step to explain the result.

The result is less “token burn” on exploration and more remaining context budget for actually implementing changes in your external code agent.

## Why massive-context models still matter

Massive-tier models (context windows from hundreds of thousands of tokens up to ~2M) are genuinely helpful when you want:

- deeper multi-perspective architectural reasoning,
- broad “whole codebase” review narratives,
- more exhaustive call-chain exploration with multiple alternative hypotheses.

CodeGraph’s exploratory tier prompts and higher retrieval/step budgets are designed to take advantage of those models without forcing smaller models into failure modes (too much retrieved context, too many steps, or huge outputs).

## Practical configuration tips

1. If you use agentic tools, set `llm.enabled = true` and a working provider in `./.codegraph.toml` or `~/.codegraph/config.toml`.
2. Set `llm.context_window` to match your actual model, or override it with `CODEGRAPH_CONTEXT_WINDOW`.
3. If you hit MCP client output limits, reduce `llm.context_window` or ask tools for fewer results (smaller limits) rather than trying to raise output size.

For provider setup examples, see `docs/AI_PROVIDERS.md`.
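As a rough illustration of tips 1–2, a minimal `./.codegraph.toml` could look like the sketch below. Only `llm.enabled` and `llm.context_window` are named in this document; the provider and model keys are assumptions, so check `docs/AI_PROVIDERS.md` for the real provider settings.

```toml
# Hypothetical minimal config; the provider/model keys are assumed,
# only llm.enabled and llm.context_window are documented above.
[llm]
enabled = true
provider = "ollama"     # assumption: pick a provider per docs/AI_PROVIDERS.md
model = "qwen2.5-coder" # assumption: example local model
context_window = 32_000 # Small tier → terse prompts, 5 base steps, 10 base results
```

Launching with `CODEGRAPH_CONTEXT_WINDOW=128000` would override the file and move the agent into the Medium tier (balanced prompts, a 10-step base budget) without editing the config.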
