find_files

Search for files in code repositories using glob patterns to locate specific files, check file existence, or gather file lists for analysis.

Instructions

Find files matching a glob pattern using pre-built file index.

Use when:
- Looking for files by pattern (e.g., "*.py", "test_*.js")
- Searching by filename only (e.g., "README.md" finds all README files)
- Checking if specific files exist in the project
- Getting file lists for further analysis

Pattern matching:
- Supports both full path and filename-only matching
- Uses standard glob patterns (*, ?, [])
- Fast lookup using in-memory file index
- Uses forward slashes consistently across all platforms

Args:
    pattern: Glob pattern to match files (e.g., "*.py", "test_*.js", "README.md")

Returns:
    List of file paths matching the pattern
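
The two matching modes described above (full path and filename-only) can be illustrated with a minimal sketch. It assumes a hypothetical in-memory index held as a plain list of forward-slash paths and uses the standard library's `fnmatch` module rather than the tool's own matcher, so edge cases may differ. In particular, `fnmatch` lets `*` cross directory separators. The names `INDEX` and `find_by_pattern` are illustrative only.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List

# Hypothetical in-memory index: project-relative paths with forward slashes.
INDEX = [
    "README.md",
    "docs/README.md",
    "src/app.py",
    "src/utils/helpers.py",
    "tests/test_app.js",
]


def find_by_pattern(pattern: str, index: List[str]) -> List[str]:
    """Return paths whose full path or bare filename matches the glob."""
    # Normalize Windows-style separators so patterns look the same everywhere.
    pattern = pattern.strip().replace("\\", "/")
    hits = []
    for path in index:
        name = PurePosixPath(path).name  # filename-only view of the entry
        if fnmatchcase(path, pattern) or fnmatchcase(name, pattern):
            hits.append(path)
    return hits


print(find_by_pattern("README.md", INDEX))  # every README.md in the index
print(find_by_pattern("test_*.js", INDEX))  # filename-only match under tests/
print(find_by_pattern("src\\*.py", INDEX))  # backslash pattern is normalized
```

An empty list from such a lookup doubles as an existence check: if nothing matches, the file is not in the index.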

Input Schema

Name     Required  Description  Default
pattern  Yes       -            -

Output Schema

Name    Required  Description  Default
result  Yes       -            -

Implementation Reference

  • The primary MCP tool handler and registration for the 'find_files' tool. It defines the tool interface, documentation, and delegates execution to FileDiscoveryService.
    @handle_mcp_tool_errors(return_type='list')
    def find_files(pattern: str, ctx: Context) -> List[str]:
        """
        Find files matching a glob pattern using pre-built file index.
    
        Use when:
        - Looking for files by pattern (e.g., "*.py", "test_*.js")
        - Searching by filename only (e.g., "README.md" finds all README files)
        - Checking if specific files exist in the project
        - Getting file lists for further analysis
    
        Pattern matching:
        - Supports both full path and filename-only matching
        - Uses standard glob patterns (*, ?, [])
        - Fast lookup using in-memory file index
        - Uses forward slashes consistently across all platforms
    
        Args:
            pattern: Glob pattern to match files (e.g., "*.py", "test_*.js", "README.md")
    
        Returns:
            List of file paths matching the pattern
        """
        return FileDiscoveryService(ctx).find_files(pattern)
  • Helper service implementing business logic for file discovery: input validation, limiting results, and delegation to the shallow index manager.
    def find_files(self, pattern: str, max_results: Optional[int] = None) -> List[str]:
        """
        Find files matching the given pattern using JSON indexing.
    
        Args:
            pattern: Glob pattern to search for (e.g., "*.py", "test_*.js")
            max_results: Maximum number of results to return (None for no limit)
    
        Returns:
            List of file paths matching the pattern
    
        Raises:
            ValueError: If pattern is invalid or project not set up
        """
        # Business validation
        self._validate_discovery_request(pattern)
    
        # Get files from JSON index
        files = self._index_manager.find_files(pattern)
        
        # Apply max_results limit if specified
        if max_results and len(files) > max_results:
            files = files[:max_results]
        
        return files
  • Core implementation of glob pattern matching in the ShallowIndexManager. Converts glob patterns to regex, applies multiple matching strategies (exact, recursive, case-insensitive), and returns matching file paths from the in-memory index.
    def find_files(self, pattern: str = "*") -> List[str]:
        with self._lock:
            if not isinstance(pattern, str):
                return []
            norm = (pattern.strip() or "*").replace('\\\\','/').replace('\\','/')
            files = self._file_list or []
    
            # Fast path: wildcard all
            if norm == "*":
                return list(files)
    
            # 1) Exact, case-sensitive
            exact_regex = self._compile_glob_regex(norm)
            exact_hits = [f for f in files if exact_regex.match(f) is not None]
            if exact_hits or '/' in norm:
                return exact_hits
    
            # 2) Recursive **/ fallback (case-sensitive)
            recursive_pattern = f"**/{norm}"
            rec_regex = self._compile_glob_regex(recursive_pattern)
            rec_hits = [f for f in files if rec_regex.match(f) is not None]
            if rec_hits:
                return self._dedupe_preserve_order(exact_hits + rec_hits)
    
            # 3) Case-insensitive (root only)
            ci_regex = self._compile_glob_regex(norm, ignore_case=True)
            ci_hits = [f for f in files if ci_regex.match(f) is not None]
            if ci_hits:
                return self._dedupe_preserve_order(exact_hits + rec_hits + ci_hits)
    
            # 4) Case-insensitive recursive
            rec_ci_regex = self._compile_glob_regex(recursive_pattern, ignore_case=True)
            rec_ci_hits = [f for f in files if rec_ci_regex.match(f) is not None]
            if rec_ci_hits:
                return self._dedupe_preserve_order(
                    exact_hits + rec_hits + ci_hits + rec_ci_hits
                )
    
            return []
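
The tiered fallback in the last excerpt can be tried in isolation with the sketch below. It is not the project's code: `_compile_glob_regex` is not shown on this page, so `compile_glob` here is a hypothetical stand-in that assumes `*` and `?` stay within one path segment and `**/` spans zero or more directories, and it omits `[]` character classes for brevity. Because each later tier runs only when every earlier tier found nothing, returning the current tier's hits gives the same list as the concatenate-and-dedupe calls in the excerpt.

```python
import re
from typing import List


def compile_glob(pattern: str, ignore_case: bool = False) -> re.Pattern:
    """Hypothetical glob-to-regex translation (assumed semantics, no [] support)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # any run within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # one character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile("".join(parts) + r"\Z", flags)


def find_files(pattern: str, files: List[str]) -> List[str]:
    """Try progressively looser matches and return the first non-empty tier."""
    norm = (pattern.strip() or "*").replace("\\", "/")
    if norm == "*":
        return list(files)

    # Tier 1: exact, case-sensitive. Patterns containing '/' stop here.
    exact = [f for f in files if compile_glob(norm).match(f)]
    if exact or "/" in norm:
        return exact

    # Tiers 2-4: recursive, case-insensitive, case-insensitive recursive.
    fallbacks = ((f"**/{norm}", False), (norm, True), (f"**/{norm}", True))
    for glob, ignore_case in fallbacks:
        regex = compile_glob(glob, ignore_case)
        hits = [f for f in files if regex.match(f)]
        if hits:
            return hits
    return []


files = ["docs/README.md", "pkg/docs/README.md", "src/App.py", "src/utils/helpers.py"]
print(find_files("README.md", files))  # resolved by the recursive tier
print(find_files("app.py", files))     # resolved by the case-insensitive recursive tier
print(find_files("src/*.py", files))   # contains '/', so only the exact tier runs
```

Under these assumptions, a pattern that contains a slash never falls through to the looser tiers, which is why a path-style pattern with no exact match returns an empty list.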

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well by disclosing key behavioral traits: it uses a 'pre-built file index' for 'fast lookup', supports 'standard glob patterns', handles 'full path and filename-only matching', and 'uses forward slashes consistently across all platforms'. It doesn't mention error handling or performance limits, keeping it from a perfect score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections ('Use when:', 'Pattern matching:', 'Args:', 'Returns:'), front-loaded with the core purpose, and every sentence adds value (e.g., explaining pattern behavior, usage scenarios). No redundant or vague statements.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (1 parameter, no annotations, output schema exists), the description is complete: it covers purpose, usage, parameter details, behavioral traits, and return values. The output schema handles return structure, so the description's brief 'Returns:' statement is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate fully. It does so by explaining the 'pattern' parameter in detail: defines it as a 'Glob pattern to match files', provides examples ('*.py', 'test_*.js', 'README.md'), and clarifies pattern semantics in the 'Pattern matching' section, adding significant value beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Find files matching a glob pattern using pre-built file index.' It specifies the verb ('Find'), resource ('files'), and mechanism ('using pre-built file index'), and distinguishes from siblings like 'search_code_advanced' by focusing on filename/pattern matching rather than content search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes an explicit 'Use when:' section with four specific scenarios (e.g., 'Looking for files by pattern', 'Searching by filename only'), providing clear guidance on when to use this tool. It implicitly distinguishes from siblings by not covering content search or indexing operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/johnhuang316/code-index-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.