Tea Rags MCP

TeaRAGs-MCP
website
docs
agent-integration

common-mistakes.md•18 KiB

--- title: Common Mistakes sidebar_position: 5 --- import AiQuery from '@site/src/components/AiQuery'; # Common Mistakes Mistakes that developers and AI agents make when using semantic code search -- from naive RAG pitfalls that apply to any retrieval system, to TeaRAGs-specific errors that waste enrichment signals you already have. ## 1. Not Configuring Your Agent to Use TeaRAGs **The mistake:** TeaRAGs is installed and indexing works, but the agent's system prompt (`CLAUDE.md`, `.cursorrules`, custom system prompt) doesn't mention it. The agent defaults to its built-in file exploration -- `grep`, `find`, reading files one by one -- and never calls the MCP tools. **Why it matters:** Without explicit instructions, agents treat TeaRAGs as just another tool they *might* use. In practice, they almost never reach for it on their own. You're paying the cost of indexing and embedding without getting any of the benefits. **The fix:** Add a search strategy section to your agent configuration: ```markdown ## Search Strategy Use tea-rags MCP server for ALL code search. Do not grep through files manually. Before generating code: 1. Find a stable template: search_code with rerank="stable" - Only use results with bugFixRate < 25% 2. Check target area risk: semantic_search with rerank="hotspots", metaOnly=true 3. Match domain owner: semantic_search with rerank="ownership", metaOnly=true 4. Verify identifiers: use ripgrep to confirm function names exist Never: - Copy code from results with bugFixRate > 50% - Modify single-owner code without flagging the owner ``` See [Activating in Your Agent](/agent-integration/agentic-data-driven-engineering/activating) for complete configuration examples for Claude Code, Cursor, and custom agents. :::warning This is the single most common reason TeaRAGs delivers no value. The tool is available, the index is built, but the agent simply doesn't know to use it. ::: ## 2. Using TeaRAGs as Plain Semantic Search **The mistake:** Treating TeaRAGs as a fancy grep -- searching with `rerank: "relevance"` every time and ignoring the 19 git-derived signals in results. The agent retrieves code, injects it into context, and generates -- without ever looking at `bugFixRate`, `commitCount`, `dominantAuthor`, or any other enrichment signal. **Why it matters:** You get the same results you'd get from any vector search tool. The trajectory enrichment -- churn, stability, authorship, bug-fix rates -- is computed during indexing but never used. The agent copies the first match, which might be a prototype someone abandoned, a pattern that was reverted three times, or a function that breaks every sprint. **What to do instead:** | Task | Preset | Why | |------|--------|-----| | Finding templates to copy | `stable` | Low churn + old age = battle-tested | | Investigating bugs | `recent` then `hotspots` | Recent changes first, then historically fragile code | | Reviewing changes | `codeReview` | Boosts recent burst activity | | Finding refactoring candidates | `refactoring` | Large + churny + high bug-fix rate | | Understanding ownership | `ownership` | Surfaces knowledge silos | See [Mental Model](/agent-integration/mental-model) for the thinking shift from similarity-only to trajectory-aware retrieval. ## 3. Copying the First Search Hit as a Template **The mistake:** The agent finds code that's semantically similar to what it needs and immediately copies it as a template -- without checking whether that code is stable, well-owned, or has a history of bugs. **Why it matters:** Similarity says nothing about quality. A function with 60% `bugFixRate` "looks right" semantically but is the worst possible example to copy. It will introduce the same structural problems that caused 60% of its commits to be bug fixes. **Quality criteria for a good template:** | Signal | Good | Mediocre | Avoid | |--------|------|----------|-------| | `chunkBugFixRate` | 0-15% | 15-35% | > 40% | | `chunkCommitCount` | 1-3 | 4-7 | > 8 | | `chunkAgeDays` | > 60 | 30-60 | < 14 | | `churnVolatility` | < 5 | 5-15 | > 20 | **The fix:** Always use `rerank: "stable"` when searching for templates. If the best match has `bugFixRate > 40%`, find an alternative. See [Template Selection](/agent-integration/agentic-data-driven-engineering#1-template-selection--what-to-copy) for the complete workflow. ## 4. Context Stuffing -- Retrieving Too Many Results **The mistake:** Setting `limit: 50` or higher on every search, dumping all results into the LLM context hoping "more is better." The agent's context fills with marginally relevant code chunks. **Why it matters:** Research confirms that LLM performance degrades significantly when processing inputs beyond ~50% of context length ([Stanford, 2023](https://arxiv.org/abs/2307.03172)). A 2024 study by Chroma found that accuracy drops from 70-75% to 55-60% with just 20 retrieved documents. This is called **context rot** -- progressive decay in accuracy as prompts grow longer. The problem is compounded by the **lost-in-the-middle** effect: relevant information buried in the middle of many chunks gets lower attention from the LLM than information at the beginning or end, creating a U-shaped performance curve. **The fix:** - Use `limit: 5-10` for code generation tasks - Use `limit: 15-20` with `metaOnly: true` for analytics and reporting (no code content, just metadata) - Use `limit: 20-30` only for broad discovery where you'll filter results programmatically - Prefer tight `pathPattern` filters to narrow the candidate set *before* retrieval :::tip `metaOnly: true` returns file paths, git metrics, and chunk metadata *without* the code content. This gives you 10-50x less context while still providing the signals you need for analytical queries. ::: ## 5. Using the Wrong Tool for the Job **The mistake:** Using TeaRAGs for everything -- exact string matching, finding TODOs, listing class methods, verifying function signatures. Or the inverse: using only ripgrep and ignoring semantic search entirely. **Why it matters:** Each tool has a specific strength: | Tool | Strength | Weakness | |------|----------|----------| | **TeaRAGs** | Finding code by meaning/intent | Can't find exact strings or analyze structure | | **tree-sitter** | Classes, methods, signatures, inheritance | Can't search by meaning or find text | | **ripgrep** | Exact strings, TODOs, flags, config keys | Can't understand intent or code structure | **Common anti-patterns:** | What you're doing | Wrong tool | Right tool | |-------------------|-----------|------------| | "How does authentication work?" | ripgrep | TeaRAGs `search_code` | | "Find all callers of `processPayment()`" | TeaRAGs | ripgrep | | "What methods does this class have?" | TeaRAGs | tree-sitter | | "Where are TODOs in the codebase?" | TeaRAGs | ripgrep | | "Understanding unfamiliar code" | ripgrep | TeaRAGs then tree-sitter | **The fix:** Follow the cascade: **intent** (TeaRAGs) -> **structure** (tree-sitter) -> **exact match** (ripgrep). See [Combining with Other Search Tools](/agent-integration/search-strategies/multi-tool-cascade). ## 6. Not Verifying Search Results with Exact-Match Tools **The mistake:** The agent generates code based on semantic search results without verifying that the function names, imports, and types it references actually exist in the codebase. **Why it matters:** Semantic search returns code by *meaning*, not by *literals*. A search for "authentication logic" returns code about login, sessions, and tokens -- but the actual export might be named `validateCredentials()`, not `authenticateUser()`. The agent generates an import for a non-existent function, and the code fails to compile. **Example failure:** ```text 1. Semantic search: "authentication logic" -> Returns src/auth/middleware.ts (high similarity) 2. Agent generates: import { authenticateUser } from './auth/middleware' 3. Reality: The export is named validateCredentials(), not authenticateUser() 4. Result: Compilation error from non-existent import ``` **The fix:** After generating code, verify every referenced identifier: 1. **Function names** -- ripgrep for each function used in generated code 2. **Imports** -- ripgrep for actual module paths and export names 3. **Types** -- ripgrep for interfaces and class names referenced 4. **Structure** -- tree-sitter to confirm method signatures match See [Exact-Match Verification](/agent-integration/agentic-data-driven-engineering/generation-modes#exact-match-verification) for the complete verification workflow. ## 7. Using Only File-Level Metrics **The mistake:** Looking at file-level `commitCount` and `bugFixRate` to assess code quality, missing the function-level granularity that chunk metrics provide. **Why it matters:** A 500-line file with `commitCount = 30` and `bugFixRate = 35%` looks like a hotspot. But inside it: - `processPayment()` -- `chunkCommitCount = 22`, `chunkBugFixRate = 55%` -- **the actual hotspot** - `validateCard()` -- `chunkCommitCount = 4`, `chunkBugFixRate = 25%` -- normal - `formatReceipt()` -- `chunkCommitCount = 1`, `chunkBugFixRate = 0%` -- **stable, good template** Without chunk-level metrics, the agent either avoids the whole file (missing the stable `formatReceipt`) or treats the whole file as equally risky (missing the concentrated problem in `processPayment`). **The fix:** All reranking presets automatically prefer chunk-level data when available. For custom weights, use `chunkChurn` and `chunkRelativeChurn` instead of `churn` and `relativeChurnNorm`. See [File-Level vs Chunk-Level](/agent-integration/deep-codebase-analysis#file-level-vs-chunk-level-metrics-when-to-use-each). ## 8. Overlapping Signals in Custom Reranks **The mistake:** Building custom rerank weights with signals that measure the same underlying thing: ```json { "custom": { "churn": 0.25, "chunkChurn": 0.25, "density": 0.25, "burstActivity": 0.25 } } ``` **Why it matters:** `churn`, `chunkChurn`, `density`, and `burstActivity` are all churn variants. This custom rerank is effectively 100% churn with no other signal -- the four weights don't add unique information, they just triple-count the same thing. **Signal overlap reference:** | Signal group | Members | Pick one | |-------------|---------|----------| | **Churn frequency** | `churn`, `chunkChurn`, `density`, `burstActivity` | `chunkChurn` for function-level | | **Churn magnitude** | `relativeChurnNorm`, `chunkRelativeChurn` | `chunkRelativeChurn` for function-level | | **Age/freshness** | `age`, `recency` | `recency` for recent code, `age` for old | | **Ownership** | `ownership`, `knowledgeSilo` | `knowledgeSilo` for binary silo detection | **The fix:** Use 3-5 *orthogonal* signals that each add unique information: ```json { "custom": { "chunkChurn": 0.25, "bugFix": 0.3, "imports": 0.25, "volatility": 0.2 } } ``` See [Custom Rerank Strategies](/agent-integration/search-strategies/custom-reranking). ## 9. Ignoring Git Enrichment Entirely **The mistake:** Running TeaRAGs with `CODE_ENABLE_GIT_METADATA=false` (the default) and never enabling it. **Why it matters:** Without git enrichment, all reranking presets except `relevance` silently degrade to similarity-only scoring. The agent asks for `hotspots` or `techDebt` or `ownership`, but gets plain cosine similarity results. There's no error message -- the presets just don't work. This also means: - No `bugFixRate` -- can't identify code that keeps breaking - No `commitCount` -- can't distinguish stable from churny code - No `dominantAuthor` -- can't identify knowledge silos - No `ageDays` -- can't find legacy code - No `taskIds` -- can't trace code to tickets **The fix:** Enable git enrichment during indexing: ```bash claude mcp add tea-rags -s user -- node /path/to/tea-rags-mcp/build/index.js \ -e CODE_ENABLE_GIT_METADATA=true ``` Then reindex. Git enrichment runs concurrently with embedding and doesn't increase indexing time. ## 10. Single-Shot Search Instead of Iterative Refinement **The mistake:** Running one search, taking the results, and moving on. If the first search doesn't find what the agent needs, it gives up or starts reading files randomly. **Why it matters:** Semantic search is a discovery tool, not an answer engine. The first search narrows the candidate zone. Subsequent searches with different presets, tighter filters, or refined queries progressively focus the results. **What iterative refinement looks like:** | Step | Action | Purpose | |------|--------|---------| | 1 | `search_code` with `rerank: "relevance"` | **Discover** -- find the target area | | 2 | `semantic_search` with `rerank: "hotspots"`, `metaOnly: true` | **Analyze** -- assess risk of the area | | 3 | `semantic_search` with `rerank: "ownership"`, `metaOnly: true` | **Assess** -- identify who owns the code | | 4 | Read specific files from results | **Act** -- make informed changes | | 5 | `semantic_search` with `rerank: "impactAnalysis"` | **Verify** -- confirm blast radius | See [Agentic Flow Template](/agent-integration/search-strategies/preset-mapping#agentic-flow-template) for the general pattern. ## 11. Hardcoding a Single Preset **The mistake:** Setting `rerank: "hotspots"` (or any single preset) in the agent configuration and using it for every search, regardless of the task. **Why it matters:** Different subtasks need different presets. Using `hotspots` for onboarding points newcomers at the most confusing, unstable code. Using `relevance` for everything ignores the valuable git signals. Using `stable` for bug investigation hides the recently changed code that likely caused the issue. **Preset selection guide:** | Task | Correct preset | Wrong preset | |------|---------------|--------------| | Bug investigation | `recent` then `hotspots` | `stable` (hides recent changes) | | Finding templates | `stable` | `hotspots` (returns worst code) | | Onboarding | `onboarding` | `hotspots` (confusing, unrepresentative) | | Security audit | `securityAudit` | `relevance` (ignores age and path risk) | | Code review | `codeReview` | `ownership` (wrong signal for reviewing changes) | **The fix:** Select preset based on the current step of the workflow, not the overall task. See [Agent Task to Preset Mapping](/agent-integration/search-strategies/preset-mapping). ## 12. Modifying Legacy Code Without Risk Assessment **The mistake:** The agent finds the code it needs to modify, makes the change, and moves on -- without checking churn history, bug-fix rate, or ownership. **Why it matters:** Code with `ageDays > 90` and `bugFixRate > 50%` has been rewritten multiple times and keeps breaking. Any new patch has a high probability of introducing another bug. Code with `dominantAuthorPct > 90%` is a knowledge silo -- modifying it without consulting the owner risks breaking undocumented assumptions. **Risk indicators:** | Signal | Threshold | Risk | |--------|-----------|------| | `bugFixRate > 40%` + `ageDays > 60` | Legacy fragile | Use wrapper pattern + feature flag | | `dominantAuthorPct > 85%` | Knowledge silo | Request review from the owner | | `relativeChurn > 5.0` | Rewritten multiple times | Propose a rewrite, don't patch | | `churnVolatility > 30` + `bugFixRate > 40%` | Pathological churn | Needs redesign, not more patches | **The fix:** Run a danger zone check before any modification: ```json semantic_search({ "query": "target area", "rerank": "hotspots", "metaOnly": true, "limit": 10 }) ``` See [Danger Zone Check](/agent-integration/agentic-data-driven-engineering/generation-modes#danger-zone-check-step-2) and [Generation Mode Switching](/agent-integration/agentic-data-driven-engineering/generation-modes). ## Naive RAG vs TeaRAGs Many of the mistakes above stem from patterns that work for document-oriented RAG but fail for code search. Here's why: | Naive RAG assumption | Why it fails for code | TeaRAGs approach | |---------------------|----------------------|-----------------| | "Similar = relevant" | A prototype and a production implementation look similar but differ in quality | Trajectory signals distinguish stable from volatile code | | "More context = better" | LLM performance degrades with 20+ chunks ([context rot](https://arxiv.org/abs/2307.03172)) | `metaOnly`, tight `limit`, focused `pathPattern` | | "Any match will do" | First hit might have 60% bug-fix rate | `stable` preset finds battle-tested code | | "Flat ranked list" | Position 1 isn't always the best template | Reranking by quality signals, not just similarity | | "Code is text" | Code has structure, ownership, evolution history | 19 git-derived signals at chunk level | | "Search once, done" | Real engineering requires iterative refinement | Multi-step workflows with preset switching | | "Retrieval = answers" | Semantic search is a candidate zone generator | Verification step with ripgrep and tree-sitter | For the academic critique and established counter-arguments, see [Semantic Search: Criticism and Responses](/knowledge-base/semantic-search-criticism). ## Quick Checklist Before shipping an agent workflow that uses TeaRAGs: - [ ] Agent configuration (`CLAUDE.md` / `.cursorrules`) explicitly instructs the agent to use TeaRAGs - [ ] `CODE_ENABLE_GIT_METADATA=true` is set during indexing - [ ] Agent uses different rerank presets for different subtasks (not hardcoded) - [ ] Templates are selected by quality signals, not just similarity - [ ] Generated code is verified with ripgrep / tree-sitter before completion - [ ] `metaOnly: true` is used for analytics queries - [ ] `limit` is set to 5-10 for code generation, 15-20 for analytics - [ ] Agent checks risk signals before modifying existing code ## See Also - [Mental Model](/agent-integration/mental-model) -- the thinking shift from similarity-only to trajectory-aware retrieval - [Search Strategies](/agent-integration/search-strategies) -- multi-step workflows with preset selection - [Deep Codebase Analysis](/agent-integration/deep-codebase-analysis) -- metric interpretation, custom reranks - [Agentic Data-Driven Engineering](/agent-integration/agentic-data-driven-engineering) -- generation modes, danger zone checks, verification - [Semantic Search: Criticism and Responses](/knowledge-base/semantic-search-criticism) -- academic critique and counter-arguments

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/artk0de/TeaRAGs-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

common-mistakes.md•18 KiB