Tea Rags MCP

index.md•4.61 KiB

---
title: "Deep Codebase Analysis"
sidebar_position: 1
---

import AiQuery from '@site/src/components/AiQuery';

# Deep Codebase Analysis

TeaRAGs exposes git-derived signals at **two granularity levels**: file and chunk (function). Understanding when to use which level is the key to meaningful analysis.

This page covers **metric interpretation, threshold tables, and decision frameworks**: what the numbers mean and how to read them. For which tools and presets to use for each task, see [Search Strategies](/agent-integration/search-strategies). For how agents should use these signals during code generation, see [Agentic Data-Driven Engineering](/agent-integration/agentic-data-driven-engineering).

## File-Level vs Chunk-Level Metrics: When to Use Each

Every indexed chunk carries both file-level and chunk-level git metrics. They measure different things and answer different questions.

### File-level metrics

File-level metrics (`commitCount`, `relativeChurn`, `bugFixRate`, `ageDays`, `dominantAuthor`) describe the **file as a whole**. All chunks within the same file share identical file-level values.

**Use file-level metrics when:**

- **Scanning for general hotspots**: "which files change most?" is a coarse but fast signal. A file with `commitCount >= 20` is worth investigating further.
- **Ownership analysis**: `dominantAuthor` and `contributorCount` are inherently file-scoped. Git tracks commits per file, not per function.
- **Relative churn assessment**: `relativeChurn` (lines changed / file size) is the strongest single defect predictor according to [Nagappan & Ball (2005)](/knowledge-base/code-churn-research#why-relative-churn-beats-absolute-churn). It normalizes for file size, so a 50-line file with 100 lines changed (`relativeChurn = 2.0`) ranks higher than a 2000-line file with the same changes (`relativeChurn = 0.05`).
- **Task traceability**: `taskIds` are extracted from commit messages at file level.
- **Legacy code discovery**: `ageDays` at file level tells you when the file was last touched, regardless of which function inside it changed.

**Limitations:** A 500-line file with 30 commits may have one function that absorbed 28 of them. File-level `commitCount = 30` makes the whole file look churny, but only one function is the problem. You need chunk-level metrics to see this.

### Chunk-level metrics

Chunk-level metrics (`chunkCommitCount`, `chunkChurnRatio`, `chunkBugFixRate`, `chunkAgeDays`) describe a **specific function, method, or code block** within a file. They are computed by mapping diff hunks to chunk line ranges.

**Use chunk-level metrics when:**

- **Pinpointing the exact problem**: `chunkCommitCount` tells you which function inside a churny file is actually causing the churn. A file with `commitCount = 25` might have one function with `chunkCommitCount = 22` and another with `chunkCommitCount = 1`.
- **Refactoring prioritization**: a `chunkChurnRatio` (chunk commits / file commits) close to 1.0 means this one function is responsible for nearly all of the file's churn. That function is the refactoring target, not the file.
- **Function-level bug density**: a `chunkBugFixRate` of 60% means most commits to this specific function were bug fixes. The file-level `bugFixRate` might be only 30% because other functions dilute the signal.
- **Stable code inside unstable files**: `chunkAgeDays = 180` inside a file with `ageDays = 2` means this function hasn't been touched in 6 months, even though the file was modified two days ago. The function is stable and can serve as a reliable template.

**Limitations:** Chunk-level metrics require the `GIT_CHUNK_ENABLED=true` setting (on by default) and only cover commits within the `GIT_CHUNK_MAX_AGE_MONTHS` window (default: 6 months). Older commits fall back to file-level data.

### Decision guide

| Question | Use | Key metric |
|----------|-----|------------|
| Which files change most? | File | `commitCount`, `relativeChurn` |
| Which *function* changes most? | Chunk | `chunkCommitCount`, `chunkChurnRatio` |
| Is this file a defect predictor? | File | `relativeChurn` ([Nagappan](/knowledge-base/code-churn-research#why-relative-churn-beats-absolute-churn): 89% accuracy) |
| Is this *function* buggy? | Chunk | `chunkBugFixRate` |
| Who owns this area? | File | `dominantAuthor`, `dominantAuthorPct` |
| Who last touched this function? | Chunk | `chunkAgeDays`, `chunkContributorCount` |
| Is the churn healthy or pathological? | Both | Compare `commitCount` vs `bugFixRate`: high commits + low bugfix = healthy iteration; high commits + high bugfix = pathological |
| What should I refactor first? | Chunk | `chunkChurnRatio` + `chunkBugFixRate` + chunk size |
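
The ratios and the healthy-vs-pathological comparison described on this page can be sketched in a few lines of Python. This is an illustrative sketch only, not TeaRAGs code: the `Chunk` class, the function names, and the 50% bug-fix cutoff are hypothetical. The only threshold taken from this page is `commitCount >= 20` as the point where a file is worth investigating.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical container for one chunk's git signals (not a TeaRAGs type)."""
    name: str
    chunk_commit_count: int
    chunk_bug_fix_rate: float  # percent, 0-100


def relative_churn(lines_changed: int, file_size_lines: int) -> float:
    """relativeChurn as described above: lines changed / file size."""
    return lines_changed / file_size_lines


def chunk_churn_ratio(chunk_commits: int, file_commits: int) -> float:
    """chunkChurnRatio as described above: chunk commits / file commits."""
    return chunk_commits / file_commits if file_commits else 0.0


def classify_churn(commit_count: int, bug_fix_rate: float,
                   min_commits: int = 20,
                   bug_fix_cutoff: float = 50.0) -> str:
    """Apply the 'healthy or pathological' row of the decision guide.

    min_commits mirrors the commitCount >= 20 hint on this page;
    bug_fix_cutoff (percent) is an assumed value for illustration.
    """
    if commit_count < min_commits:
        return "low churn"
    if bug_fix_rate >= bug_fix_cutoff:
        return "pathological"
    return "healthy iteration"


def refactoring_target(chunks: list[Chunk], file_commits: int) -> Chunk:
    """Pick the chunk that accounts for the largest share of file commits."""
    return max(chunks,
               key=lambda c: chunk_churn_ratio(c.chunk_commit_count, file_commits))


if __name__ == "__main__":
    # The worked numbers from the text: same 100 changed lines, different file sizes.
    print(relative_churn(100, 50))    # 50-line file
    print(relative_churn(100, 2000))  # 2000-line file
    print(classify_churn(commit_count=25, bug_fix_rate=60.0))
```

With the example from the chunk-level section (a file with `commitCount = 25` and one function with `chunkCommitCount = 22`), `chunk_churn_ratio(22, 25)` comes out at 0.88, which is why that function, rather than the whole file, would be the refactoring candidate.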
