
search_context

Search codebases using natural language queries to locate relevant code snippets. Automatically indexes projects to provide up-to-date results with file paths and line numbers for development workflows.

Instructions

Search for relevant code context based on a query within a specific project. This tool automatically performs incremental indexing before searching, ensuring results are always up-to-date. Returns formatted text snippets from the codebase that are semantically related to your query. IMPORTANT: Use forward slashes (/) as path separators in project_root_path, even on Windows.

Input Schema

project_root_path (string, required)
    Absolute path to the project root directory. Use forward slashes (/) as separators. Example: C:/Users/username/projects/myproject

query (string, required)
    Provide a clear natural language description of the code behavior, workflow, or issue you want to locate. You may also add optional keywords to improve semantic matching.
    Recommended format: natural language description + optional keywords.
    Examples:
    “I want to find where the server handles chunk merging in the file upload process. Keywords: upload chunk merge, file service”
    “Locate where the system refreshes cached data after user permissions are updated. Keywords: permission update, cache refresh”
    “Find the initialization flow of message queue consumers during startup. Keywords: mq consumer init, subscribe”
    “Show me how configuration hot-reload is triggered and applied in the code. Keywords: config reload, hot update”
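Because project_root_path must use forward slashes even on Windows, a caller can normalize paths before invoking the tool. A minimal sketch using only the standard library (the helper name is illustrative):

```python
from pathlib import PureWindowsPath


def to_posix_root(path: str) -> str:
    """Convert a Windows-style path to the forward-slash form the tool expects."""
    return PureWindowsPath(path).as_posix()


print(to_posix_root(r"C:\Users\username\projects\myproject"))
# C:/Users/username/projects/myproject
```

Paths that already use forward slashes pass through unchanged, so the conversion is safe to apply unconditionally.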

Implementation Reference

  • Primary handler function for the 'search_context' MCP tool. Validates input arguments, initializes shared IndexManager, calls search_context on it, and formats the result as MCP content.
    async def search_context_tool(arguments: dict[str, Any]) -> dict[str, Any]:
        """Search for code context based on query.
    
        Args:
            arguments: Tool arguments containing:
                - project_root_path: Absolute path to the project root directory
                - query: Search query string
    
        Returns:
            Dictionary containing search results
    
        """
        try:
            project_root_path = arguments.get("project_root_path")
            query = arguments.get("query")
    
            if not project_root_path:
                return {"type": "text", "text": "Error: project_root_path is required"}
    
            if not query:
                return {"type": "text", "text": "Error: query is required"}
    
            logger.info(f"Tool invoked: search_context for project {project_root_path} with query: {query}")
    
            index_manager = await _get_index_manager()
            result = await index_manager.search_context(project_root_path, query)
    
            return {"type": "text", "text": result}
    
        except Exception as e:
            logger.exception("Error in search_context_tool")
            return {"type": "text", "text": f"Error: {e!s}"}
  • MCP server registration of the 'search_context' tool via @app.list_tools(), defining its metadata, description, and full input schema.
    @app.list_tools()
    async def list_tools() -> list[Tool]:
        """List available MCP tools.
    
        Returns:
            List of available tools
    
        """
        return [
            Tool(
                name="search_context",
                description="Search for relevant code context based on a query within a specific project. "
                "This tool automatically performs incremental indexing before searching, "
                "ensuring results are always up-to-date. "
                "Returns formatted text snippets from the codebase that are semantically related to your query. "
                "IMPORTANT: Use forward slashes (/) as path separators in project_root_path, even on Windows.",
                inputSchema={
                    "type": "object",
                    "properties": {
                        "project_root_path": {
                            "type": "string",
                            "description": "Absolute path to the project root directory. Use forward slashes (/) as separators. Example: C:/Users/username/projects/myproject",
                        },
                        "query": {
                            "type": "string",
                            "description": """Provide a clear natural language description of the code behavior, workflow, or issue you want to locate. You may also add optional keywords to improve semantic matching.
    Recommended format:
    Natural language description + optional keywords
    Examples:
    “I want to find where the server handles chunk merging in the file upload process. Keywords: upload chunk merge, file service”
    “Locate where the system refreshes cached data after user permissions are updated. Keywords: permission update, cache refresh”
    “Find the initialization flow of message queue consumers during startup. Keywords: mq consumer init, subscribe”
    “Show me how configuration hot-reload is triggered and applied in the code. Keywords: config reload, hot update”""",
                        },
                    },
                    "required": ["project_root_path", "query"],
                },
            ),
        ]
  • MCP server dispatcher for tool calls via @app.call_tool(), routing 'search_context' invocations to the search_context_tool handler.
    @app.call_tool()
    async def call_tool(name: str, arguments: dict) -> dict:
        """Handle tool calls.
    
        Args:
            name: Tool name
            arguments: Tool arguments
    
        Returns:
            Tool execution results
    
        """
        logger.info(f"Tool called: {name} with arguments: {arguments}")
    
        if name == "search_context":
            return await search_context_tool(arguments)
    
        return {"type": "text", "text": f"Unknown tool: {name}"}
  • Core helper method in IndexManager that performs automatic incremental indexing followed by semantic search via external API, excluding failed blobs, and returns formatted code context snippets.
    async def search_context(self, project_root_path: str, query: str) -> str:
        """Search for code context based on query with automatic incremental indexing.
    
        This method automatically performs incremental indexing before searching,
        ensuring the search is always performed on the latest codebase.
    
        Args:
            project_root_path: Absolute path to the project root directory
            query: Search query string
    
        Returns:
            Formatted retrieval result
    
        """
        normalized_path = self._normalize_path(project_root_path)
        logger.info(f"Searching context in project {normalized_path} with query: {query}")
    
        try:
            # Step 1: Automatically perform incremental indexing
            logger.info(f"Auto-indexing project {normalized_path} before search...")
            index_result = await self.index_project(project_root_path)
    
            if index_result["status"] == "error":
                return f"Error: Failed to index project before search. {index_result['message']}"
    
            # Log indexing stats
            if "stats" in index_result:
                stats = index_result["stats"]
                logger.info(f"Auto-indexing completed: total={stats['total_blobs']}, existing={stats['existing_blobs']}, new={stats['new_blobs']}")
    
            # Step 2: Load indexed blob names and exclude failed blobs
            projects = self._load_projects()
            all_blob_names = projects.get(normalized_path, [])
    
            if not all_blob_names:
                return f"Error: No blobs found for project {normalized_path} after indexing."
    
            # Get failed blob hashes and exclude them from search
            failed_blob_hashes = self._get_failed_blob_hashes(normalized_path)
            blob_names = [blob_hash for blob_hash in all_blob_names if blob_hash not in failed_blob_hashes]
    
            excluded_count = len(all_blob_names) - len(blob_names)
            if excluded_count > 0:
                logger.info(f"Excluded {excluded_count} failed blobs from search (total available: {len(all_blob_names)}, searching: {len(blob_names)})")
    
            if not blob_names:
                return f"Error: No valid blobs available for search in project {normalized_path}. All {len(all_blob_names)} blobs have failed upload."
    
            # Step 3: Perform search
            logger.info(f"Performing search with {len(blob_names)} blobs (excluded {excluded_count} failed blobs)...")
            payload = {
                "information_request": query,
                "blobs": {
                    "checkpoint_id": None,
                    "added_blobs": blob_names,
                    "deleted_blobs": [],
                },
                "dialog": [],
                "max_output_length": 0,
                "disable_codebase_retrieval": False,
                "enable_commit_retrieval": False,
            }
    
            client = self._get_client()
    
            async def search_request():
                response = await client.post(
                    f"{self.base_url}/agents/codebase-retrieval",
                    headers={"Authorization": f"Bearer {self.token}"},
                    json=payload,
                )
                response.raise_for_status()
                return response.json()
    
            # Retry up to 3 times with exponential backoff
            try:
                result = await self._retry_request(search_request, max_retries=3, retry_delay=2.0)
            except Exception as e:
                logger.error(f"Search request failed after retries: {e}")
                return f"Error: Search request failed after 3 retries. {e!s}"
    
            formatted_retrieval = result.get("formatted_retrieval", "")
    
            if not formatted_retrieval:
                logger.warning(f"Search returned empty result for project {normalized_path}")
                return "No relevant code context found for your query."
    
            logger.info(f"Search completed for project {normalized_path}")
            return formatted_retrieval
    
        except Exception as e:
            logger.exception(f"Failed to search context in project {normalized_path}")
            return f"Error: {e!s}"
  • Shared IndexManager singleton factory used by the tool handler to lazily initialize the index manager with configuration.
    async def _get_index_manager() -> IndexManager:
        """Create or return the shared IndexManager instance."""
        global _index_manager, _index_manager_lock
    
        if _index_manager is not None:
            return _index_manager
    
        if _index_manager_lock is None:
            _index_manager_lock = asyncio.Lock()
    
        async with _index_manager_lock:
            if _index_manager is None:
                config = get_config()
                _index_manager = IndexManager(
                    config.index_storage_path,
                    config.base_url,
                    config.token,
                    config.text_extensions,
                    config.batch_size,
                    config.max_lines_per_blob,
                    config.exclude_patterns,
                )
    
        return _index_manager

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and adds valuable behavioral context: it discloses that the tool performs automatic incremental indexing before searching (ensuring up-to-date results), returns formatted text snippets, and includes an important platform-specific note about path separators. However, it does not mention rate limits, error handling, or performance characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with key information (purpose, indexing behavior, return format, and critical note) presented efficiently in three sentences. Each sentence adds value without redundancy, making it easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (semantic search with indexing), no annotations, and no output schema, the description does well by explaining the indexing behavior, return format, and parameter nuances. However, it lacks details on output structure, error cases, or limitations, leaving some gaps in completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds meaningful context beyond the schema: it emphasizes the importance of forward slashes in 'project_root_path' and provides guidance on structuring the 'query' with natural language and keywords, which enhances understanding of parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('search for relevant code context') and resources ('within a specific project'), and distinguishes it by mentioning incremental indexing and semantic matching. It provides a complete picture of what the tool does beyond just the name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the mention of 'project_root_path' and 'query', but does not explicitly state when to use this tool versus alternatives. Since no sibling tools are listed, this is adequate but lacks explicit guidance on scenarios or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
