# v0.3.4.post1 - Token Usage Optimization Plan
## Problem Statement
When analyzing GitHub issues, Claude does not use the dedicated `github_analyze_issue` MCP tool; instead it performs manual searches that bypass all optimizations. This results in:
1. **Excessive token usage** - multiple manual searches, each with full context
2. **No progressive context** - the GitHub integration doesn't use the v0.3.2 progressive-context features
3. **Verbose responses** - full search results are included even though the config is set to `"summary"`
4. **Tool selection confusion** - Claude doesn't recognize when to use specialized tools
## Root Causes
### 1. Claude Not Using the Right Tools
- When asked to "analyze issue #123", Claude performs manual `search_code` calls
- This bypasses the optimized `github_analyze_issue` tool entirely
- Each manual search returns 10 results with full context expansion
### 2. GitHub Integration Missing Progressive Context
```python
# Current implementation - no progressive parameters
results = self.search_functions["search_code"](
    query=query_text,
    n_results=self.search_limit,                     # 10 results
    include_context=self.context_expansion,          # always true
    include_dependencies=self.include_dependencies   # always true
)
```
### 3. Configuration Issues
- `search_limit: 10` returns more results than the analysis needs (reduced to 5 below)
- `response_verbosity: "summary"` is not enforced by the verbosity check (fixed in 3B below)
- `include_raw_search_results: false` gets overridden
## Proposed Solutions
### 1. Enhanced MCP Tool Definitions
#### A. Clear Usage Triggers in Docstrings
```python
from typing import Any, Dict

@mcp.tool()
def github_analyze_issue(issue_number: int) -> Dict[str, Any]:
    """
    Perform a comprehensive analysis of a GitHub issue using RAG search.

    WHEN TO USE THIS TOOL:
    - User asks to "analyze issue #X"
    - User asks "what is issue #X about?"
    - User asks for "issue analysis", "investigate issue", "understand issue"
    - User wants to know what code/files are related to an issue
    - ALWAYS use this instead of manual search when analyzing GitHub issues

    This tool automatically:
    - Fetches issue details and comments
    - Extracts errors, code references, and keywords
    - Performs optimized RAG searches with progressive context
    - Returns a summarized analysis with recommendations

    Args:
        issue_number: Issue number to analyze

    Returns:
        Analysis results with search results and recommendations
    """
    ...
```
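Assuming FastMCP-style registration (as the `@mcp.tool()` decorator suggests), the docstring becomes the tool description the client passes to the model, so these usage triggers sit exactly where tool selection happens; no client-side changes are needed.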
#### B. Tool Selection Guide
```python
"""
TOOL SELECTION GUIDE FOR CLAUDE:
GitHub Issue Operations:
- Analyzing/Understanding issues → github_analyze_issue (NOT manual search)
- Just getting issue details → github_get_issue
- Creating comments → github_add_comment
- Suggesting fixes → github_suggest_fix
Search Operations:
- General search → search (uses progressive context)
- Code-specific → search_code (includes language filters)
- Documentation → search_docs (optimized for markdown)
- Config files → search_config (handles JSON/YAML/XML)
IMPORTANT: Always prefer specialized tools over general search
"""
```
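For the guide to reach the model at all, it has to travel with the server. A minimal sketch, assuming the server is built on the MCP Python SDK's `FastMCP`, which accepts an `instructions` string surfaced to clients during initialization (the server name here is illustrative):

```python
from mcp.server.fastmcp import FastMCP

# The guide text from above, stored once so tools and docs stay in sync.
TOOL_SELECTION_GUIDE = """
TOOL SELECTION GUIDE FOR CLAUDE:
...
IMPORTANT: Always prefer specialized tools over general search
"""

# `instructions` is sent to MCP clients at initialization, which is how the
# selection guide actually reaches the model.
mcp = FastMCP(name="code-rag-server", instructions=TOOL_SELECTION_GUIDE)
```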
### 2. Add Progressive Context to GitHub Integration
#### A. Update `_perform_rag_searches` in `issue_analyzer.py`
```python
def _perform_rag_searches(self, extracted_info: Dict[str, Any]) -> Dict[str, Any]:
    # Determine context level based on issue type
    issue_type = extracted_info.get("issue_type", "unknown")
    context_level = {
        "bug": "method",          # bugs need implementation details
        "feature": "class",       # features need structure overview
        "documentation": "file",  # docs need high-level understanding
        "unknown": "class",       # default to class level
    }.get(issue_type, "class")

    # ... build `queries` from extracted_info as before ...

    # Use progressive context in searches
    for query_type, query_text in queries:
        try:
            if query_type in ["function", "class", "error"]:
                results = self.search_functions["search_code"](
                    query=query_text,
                    n_results=self.search_limit,
                    include_context=self.context_expansion,
                    include_dependencies=self.include_dependencies,
                    # New progressive context parameters
                    progressive_mode=True,
                    context_level=context_level,
                    include_expansion_options=True,
                    semantic_cache=True,
                )
            # ... remaining query types and result handling unchanged ...
        except Exception:
            continue  # skip a failed search rather than aborting the analysis
```
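One design risk: older `search_code` backends won't accept the new keyword arguments and will raise `TypeError`. A hedged compatibility sketch (the helper name is hypothetical) that forwards only the parameters the target function actually declares:

```python
import inspect
from typing import Any, Callable

def call_with_supported_kwargs(fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Forward only the kwargs that `fn` declares, so the new progressive
    parameters degrade gracefully against older search backends."""
    params = inspect.signature(fn).parameters
    # If fn accepts **kwargs itself, pass everything through unchanged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(**kwargs)
    return fn(**{k: v for k, v in kwargs.items() if k in params})
```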
### 3. Configuration Updates
#### A. Update `server_config.json`
The key changes: `search_limit` drops from 10 to 5, and `include_dependencies` now defaults to `false`. (JSON does not allow inline comments, so the rationale stays here in prose.)
```json
{
  "github": {
    "issues": {
      "analysis": {
        "search_limit": 5,
        "context_expansion": true,
        "include_dependencies": false,
        "response_verbosity": "summary",
        "include_raw_search_results": false,
        "progressive_context": {
          "enabled": true,
          "default_level": "class",
          "bug_level": "method",
          "feature_level": "class",
          "documentation_level": "file"
        }
      }
    }
  }
}
```
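For reference, a minimal sketch of reading these nested settings with safe fallbacks, so missing keys degrade to the new defaults rather than crashing (path and helper name are illustrative):

```python
import json
from pathlib import Path

def load_analysis_config(path: str = "server_config.json") -> dict:
    """Read github.issues.analysis, falling back to the new defaults."""
    config = json.loads(Path(path).read_text())
    analysis = config.get("github", {}).get("issues", {}).get("analysis", {})
    return {
        "search_limit": analysis.get("search_limit", 5),
        "context_expansion": analysis.get("context_expansion", True),
        "include_dependencies": analysis.get("include_dependencies", False),
        "response_verbosity": analysis.get("response_verbosity", "summary"),
        "include_raw_search_results": analysis.get("include_raw_search_results", False),
        "progressive_context": analysis.get(
            "progressive_context", {"enabled": True, "default_level": "class"}
        ),
    }
```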
#### B. Fix Response Verbosity Check
```python
# Line 114 in issue_analyzer.py
# Change from:
if include_raw_search and response_verbosity == "full":
# To:
if include_raw_search and response_verbosity != "summary":
```
### 4. Query Deduplication
Add deduplication before performing searches:
```python
def _perform_rag_searches(self, extracted_info: Dict[str, Any]) -> Dict[str, Any]:
    # ... build `queries` from extracted_info as before ...

    # Deduplicate queries (case-insensitive) and drop trivially short ones
    seen_queries = set()
    unique_queries = []
    for query_type, query_text in queries:
        normalized = query_text.lower().strip()
        if normalized not in seen_queries and len(normalized) > 3:
            seen_queries.add(normalized)
            unique_queries.append((query_type, query_text))

    # Limit total queries
    unique_queries = unique_queries[:8]  # max 8 unique queries
```
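A quick self-contained check of the normalization rule (illustrative data only):

```python
queries = [
    ("error", "KeyError in config loader"),
    ("error", "  keyerror in config loader"),  # duplicate after lower().strip()
    ("class", "abc"),                          # dropped: normalized length <= 3
]
seen_queries, unique_queries = set(), []
for query_type, query_text in queries:
    normalized = query_text.lower().strip()
    if normalized not in seen_queries and len(normalized) > 3:
        seen_queries.add(normalized)
        unique_queries.append((query_type, query_text))

assert unique_queries == [("error", "KeyError in config loader")]
```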
## Expected Results
### Token Usage Reduction
- **Current**: 50,000+ tokens per issue analysis
- **Expected**: 10,000-15,000 tokens, roughly a 70% reduction (see the back-of-envelope check after the list below)
### How It Works
1. **Progressive Context**: start at class level (~50% of full-context tokens), expand only if needed
2. **Reduced Search Results**: 5 instead of 10 results
3. **Smart Context Level**: Bug=method, Feature=class, Docs=file
4. **Query Deduplication**: Avoid redundant searches
5. **Summary Responses**: Only essential information returned
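A back-of-envelope check of those numbers, combining the two biggest levers above (purely illustrative arithmetic; real savings depend on the codebase):

```python
baseline = 50_000        # current tokens per issue analysis
result_ratio = 5 / 10    # search_limit reduced from 10 to 5
context_ratio = 0.5      # class-level context at ~50% of full context
print(baseline * result_ratio * context_ratio)  # 12500.0 -- inside the 10k-15k band
```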
## Implementation Steps
1. **v0.3.4.post1 Release**:
- Update tool docstrings with usage triggers
- Add progressive context to GitHub integration
- Fix response verbosity check
- Update default configuration
- Add query deduplication
2. **Testing**:
- Compare token usage before/after
- Verify search quality is maintained
- Test different issue types (bug/feature/doc)
- Ensure Claude uses correct tools
3. **Documentation**:
- Update GitHub integration guide
- Add token optimization tips
- Document progressive context usage
## Monitoring
Track these metrics after implementation (a minimal tracker is sketched after the list):
- Average tokens per issue analysis
- Tool selection accuracy (github_analyze_issue vs manual search)
- Search result quality scores
- Cache hit rates for semantic caching
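A minimal sketch of a tracker for these four signals (hypothetical class, not an existing module):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AnalysisMetrics:
    tokens_per_analysis: List[int] = field(default_factory=list)
    tool_calls: Dict[str, int] = field(default_factory=dict)   # github_analyze_issue vs search_code
    quality_scores: List[float] = field(default_factory=list)  # search result quality
    cache_hits: int = 0
    cache_lookups: int = 0

    def record_tool(self, name: str) -> None:
        self.tool_calls[name] = self.tool_calls.get(name, 0) + 1

    @property
    def avg_tokens(self) -> float:
        return (sum(self.tokens_per_analysis) / len(self.tokens_per_analysis)
                if self.tokens_per_analysis else 0.0)

    @property
    def cache_hit_rate(self) -> float:
        return self.cache_hits / self.cache_lookups if self.cache_lookups else 0.0
```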
## Future Enhancements
1. **Tool Aliases System**: Map natural language phrasings to tools (sketched below)
2. **Automatic Tool Suggestion**: Suggest tools based on query
3. **Token Budget Awareness**: Adjust search depth based on remaining tokens
4. **Smart Result Truncation**: Include only most relevant snippets
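For enhancement 1, a first cut could be a static phrase table consulted before falling back to general search (phrases and matching strategy are illustrative):

```python
from typing import Optional

TOOL_ALIASES = {
    "analyze issue": "github_analyze_issue",
    "investigate issue": "github_analyze_issue",
    "what is issue": "github_analyze_issue",
    "suggest a fix": "github_suggest_fix",
    "add a comment": "github_add_comment",
}

def suggest_tool(query: str) -> Optional[str]:
    """Return the first aliased tool whose trigger phrase appears in the query."""
    q = query.lower()
    return next((tool for phrase, tool in TOOL_ALIASES.items() if phrase in q), None)
```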