find_files_by_chunk_content

Find files in a project that contain chunks with specified text. Returns a file tree showing which files have matches.

Instructions

Step 1: Find files containing chunks with matching text.

Returns file tree only showing which files contain matches.
You must use find_matching_chunks_in_file on each relevant file
to see the actual matches.

Example workflow:
1. Find files:
   files = find_files_by_chunk_content(project, ["MyClass"])
2. For each file, find actual matches:
   matches = find_matching_chunks_in_file(file, ["MyClass"])
3. Get content:
   content = chunk_details(file, match_id)

Input Schema

| Name | Required | Description | Default |
| project_name | Yes | | |
| chunk_contents_filter | Yes | Match if any of these strings appear. Match all if None/null. Single empty string or empty list will match all. | |
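
As a rough sketch of how a client might pass these two parameters, the snippet below builds a JSON-RPC `tools/call` request, the method MCP uses for tool invocation. The project name `my_project`, the request id, and the helper `build_request` are illustrative assumptions and not part of this server; a real MCP client library would normally build and send the request for you.

```python
import json
from typing import Optional, Union

# Accepted filter shapes per the schema description: a list of strings,
# a single string, or None/null.
FilterValue = Optional[Union[str, list[str]]]


def build_request(
    project_name: str,
    chunk_contents_filter: FilterValue,
    request_id: int = 1,
) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "find_files_by_chunk_content",
            "arguments": {
                "project_name": project_name,
                "chunk_contents_filter": chunk_contents_filter,
            },
        },
    }


# Match files whose chunks contain either string.
any_of = build_request("my_project", ["MyClass", "my_function"])
# Match every file: None serializes to JSON null.
match_all = build_request("my_project", None)

print(json.dumps(any_of, indent=2))
```

Both parameters are required, so a request that wants every file still has to send `chunk_contents_filter`, either as null or as an empty list.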

Implementation Reference

  • The MCP tool handler function `find_files_by_chunk_content`. It is decorated with @mcp.tool() and @log_inputs_outputs(), and delegates to `_filter_files_by_chunk` with filter_on='name_or_content'.
    @mcp.tool()
    @log_inputs_outputs()
    def find_files_by_chunk_content(
        project_name: str,
        chunk_contents_filter: FilterType,
    ) -> ToolResponse:
        """Step 1: Find files containing chunks with matching text.
    
        Returns file tree only showing which files contain matches.
        You must use find_matching_chunks_in_file on each relevant file
        to see the actual matches.
    
        Example workflow:
        1. Find files:
           files = find_files_by_chunk_content(project, ["MyClass"])
        2. For each file, find actual matches:
           matches = find_matching_chunks_in_file(file, ["MyClass"])
        3. Get content:
           content = chunk_details(file, match_id)
        """
        return _filter_files_by_chunk(project_name, chunk_contents_filter, "name_or_content").render()
  • Registration via @mcp.tool() decorator on the `find_files_by_chunk_content` function, making it available as an MCP tool.
    @mcp.tool()
    @log_inputs_outputs()
  • The `_filter_files_by_chunk` helper function that performs the actual logic: iterates over project files and their chunks, checks if any chunk matches the filter (via `chunk.matches_filter`), and returns a file tree of matches.
    def _filter_files_by_chunk(
        project_name: str,
        filter_: FilterType,
        filter_on: Literal["name", "name_or_content"],
    ) -> MCPToolOutput:
        project = _get_project_or_error(project_name)
        matching_files: set[pathlib.Path] = set()
        for file in project.chunk_project.files:
            if any(c.matches_filter(filter_, filter_on) for c in file.chunks):
                matching_files.add(file.abs_path)
        data = create_file_tree(project_root=project.root, paths=matching_files)
        if data is None:
            return MCPToolOutput(text="No files found")
        elif isinstance(data, str):
            return MCPToolOutput(text=data)
        else:
            assert_never(data)
  • The `Chunk.matches_filter` method used by _filter_files_by_chunk. It checks whether the chunk's name and/or content matches the filter criteria.
    def matches_filter(
        self,
        filter_: None | list[str] | str,
        filter_on: Literal["name", "content", "name_or_content"],
    ) -> bool:
        """Return True if the chunk's name matches the given filter.
    
        str matches if the chunk's name contains the string.
        list[str] matches if the chunk's name contains any of the strings in the list.
        None matches all chunks.
        """
        if filter_on == "name":
            data = self.name
        elif filter_on == "content":
            data = self.content
        elif filter_on == "name_or_content":
            data = self.content + self.name
        else:
            assert_never(filter_on)
        return matches_filter(filter_, data)
  • The `matches_filter` utility function that performs the actual string matching logic (None matches all, str checks containment, list checks any containment).
    def matches_filter(filter_: None | list[str] | str, data: str | None) -> bool:
        """Return True if the data matches the given filter.
    
        filter_ can be:
        - None matches all data
        - str matches if the data contains the string. Empty string matches all.
        - list[str] matches if the data contains any of the strings in the list. Empty list matches all.
    
        I find the LLM likes to use an empty list to mean "all" even though it should probably
        use None so 🤷
    
        if data is None it never matches (unless filter_ is None)
        """
        if filter_ is None:
            return True
        if len(filter_) == 0:
            return True
        if data is None:
            return False
        if isinstance(filter_, str):
            return filter_ in data
        if isinstance(filter_, list):
            return any(x in data for x in filter_)
        assert_never(filter_)
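
To make the matching rules concrete, here is a small self-contained sketch that mirrors the logic shown above, with `assert_never` dropped so it runs without extra imports. The names `matches` and `chunk_matches` and the sample strings are illustrative only. Matching is plain case-sensitive substring containment, and because `name_or_content` searches `content + name` as a single string, a filter can in principle match text that spans the join between the two.

```python
from typing import Optional, Union

Filter = Optional[Union[str, list[str]]]


def matches(filter_: Filter, data: Optional[str]) -> bool:
    """Illustrative copy of the matching rules shown above."""
    if filter_ is None or len(filter_) == 0:
        return True  # None, "" and [] all match everything
    if data is None:
        return False  # nothing to search
    if isinstance(filter_, str):
        return filter_ in data
    return any(x in data for x in filter_)


def chunk_matches(filter_: Filter, name: str, content: str) -> bool:
    """Illustrative copy of filter_on='name_or_content': search content + name."""
    return matches(filter_, content + name)


content = "class MyClass:\n    pass"
name = "MyClass"

print(matches(["MyClass", "Other"], content))  # True: any-of semantics
print(matches("myclass", content))             # False: case-sensitive
print(matches([], None))                       # True: empty list matches all
print(matches("x", None))                      # False: no data to search
print(matches([""], "anything"))               # True: "" is in every string
# "passMyClass" is in neither part alone, only across the join.
print(chunk_matches("passMyClass", name, content))  # True
```

In practice the cross-boundary match is unlikely to matter for typical identifiers, but it is worth knowing when a search returns a file that does not obviously contain the filter text.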
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that the tool only returns a file tree, not the matches themselves, and that follow-up steps are required. This adds behavioral context beyond the input schema, though it does not cover error handling or permissions. With no annotations provided, the description carries the full burden and does so adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well structured, with a step-by-step workflow and a code example that make it easy to follow. It is not extremely concise, but it is front-loaded with the core purpose and nearly every sentence adds value. There is minor redundancy in the workflow explanation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lacks an explicit return value structure, stating only 'file tree' without details on its format. Given no output schema, more specificity would help. However, it does provide the workflow context and example, making it partially complete for a simple search tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 50% description coverage (only chunk_contents_filter has a description). The description does not elaborate on parameters beyond the example and workflow, adding marginal value. Baseline is 3 due to moderate schema coverage, and the description compensates only slightly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds files containing chunks with matching text and returns only the file tree. It distinguishes from sibling tools like find_matching_chunks_in_file which show actual matches, making the purpose specific and differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly outlines a three-step workflow, instructing when to use this tool first, then to use find_matching_chunks_in_file for actual matches, and finally chunk_details for content. This provides clear usage guidance and differentiates from alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jurasofish/mcpunk'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.