search_for_pattern
Search for patterns in code and non-code files using regular expressions, with flexible file filtering options for targeted results.
Instructions
Offers a flexible search for arbitrary patterns in the codebase, including the possibility to search in non-code files. Generally, symbolic operations like find_symbol or find_referencing_symbols should be preferred if you know which symbols you are looking for.
Pattern Matching Logic:
For each match, the returned result will contain the full lines where the substring pattern is found, as well as optionally some lines before and after it. The pattern will be compiled with DOTALL, meaning that the dot will match all characters, including newlines. This also means that it never makes sense to have .* at the beginning or end of the pattern, though it may make sense in the middle of complex patterns. If a pattern matches multiple lines, all of those lines will be part of the match. Be careful not to use greedy quantifiers unnecessarily; it is usually better to use non-greedy quantifiers like .*? to avoid matching too much content.
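To illustrate the DOTALL and non-greedy behavior outside the tool, here is a standalone sketch using Python's standard re module (the sample text is made up for the example):

```python
import re

text = """def apply(
    self,
    substring_pattern: str,
) -> str:
    ...

def other_method(self) -> str:
    ..."""

# With re.DOTALL, '.' also matches newlines, so a pattern can span lines.
# The non-greedy '.*?' stops at the first ') -> str:' rather than running on
# to the last one, which is what a greedy '.*' would do here.
pattern = re.compile(r"def apply\(.*?\) -> str:", re.DOTALL)
match = pattern.search(text)
print(match.group(0))  # prints only the multi-line signature of apply()
```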
File Selection Logic:
The files in which the search is performed can be restricted very flexibly. Using restrict_search_to_code_files is useful if you are only interested in code symbols (i.e., those symbols that can be manipulated with symbolic tools like find_symbol). You can also restrict the search to a specific file or directory, and provide glob patterns to include or exclude certain files on top of that. The globs are matched against relative file paths from the project root (not against the relative_path parameter, which is used to further restrict the search). Smartly combining the various restrictions allows you to perform very targeted searches.

Returns: A mapping of file paths to lists of matched consecutive lines.
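For example, the following argument set (shown as a plain Python dict; the parameter names match the schema below, while the values are purely illustrative) would search only Python files under src/ while excluding tests:

```python
# Illustrative only: values are hypothetical, parameter names are the tool's
# actual parameters. The globs are matched against paths relative to the
# project root, while relative_path narrows the search to the src/ subtree.
arguments = {
    "substring_pattern": r"class \w+Tool\(Tool\):",  # regex, compiled with DOTALL
    "relative_path": "src",
    "paths_include_glob": "**/*.py",
    "paths_exclude_glob": "*test*",
    "context_lines_after": 2,
}
```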
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| substring_pattern | Yes | Regular expression for a substring pattern to search for. | |
| context_lines_before | No | Number of lines of context to include before each match. | 0 |
| context_lines_after | No | Number of lines of context to include after each match. | 0 |
| paths_include_glob | No | Optional glob pattern specifying files to include in the search. Matches against relative file paths from the project root (e.g., "*.py", "src/**/*.ts"). Supports standard glob patterns (*, ?, [seq], **, etc.) and brace expansion {a,b,c}. Only matches files, not directories. If left empty, all non-ignored files will be included. | "" |
| paths_exclude_glob | No | Optional glob pattern specifying files to exclude from the search. Matches against relative file paths from the project root (e.g., "*test*", "**/*_generated.py"). Supports standard glob patterns (*, ?, [seq], **, etc.) and brace expansion {a,b,c}. Takes precedence over paths_include_glob. Only matches files, not directories. If left empty, no files are excluded. | "" |
| relative_path | No | Only subpaths of this path (relative to the repo root) will be analyzed. If a path to a single file is passed, only that file will be searched. The path must exist, otherwise a `FileNotFoundError` is raised. | "" |
| restrict_search_to_code_files | No | Whether to restrict the search to only those files where analyzed code symbols can be found; otherwise, all non-ignored files are searched. Set this to True if your search is only meant to discover code that can be manipulated with symbolic tools, for example when finding classes or methods from a name pattern. False is the better choice if you also want to search in non-code files, such as HTML or YAML files, which is why it is the default. | False |
| max_answer_chars | No | If the output is longer than this number of characters, no content will be returned. -1 means the default value from the config will be used. Don't adjust unless there is really no other way to get the content required for the task; if the output is too long, make a stricter query instead. | -1 |
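The result is returned as JSON mapping each file path to the list of matched (consecutive) line groups found in it. The exact per-entry formatting is produced internally and is not specified here; the following is only a sketch of the overall shape:

```python
# Hypothetical sketch of the result shape; the per-entry string formatting
# (line numbers, context rendering) is an assumption, not a documented format.
example_result = {
    "src/serena/tools/file_tools.py": [
        "...matched line(s), plus any requested context lines...",
    ],
    "docs/config.yml": [
        "...another group of matched consecutive lines...",
    ],
}
```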
Implementation Reference
- `src/serena/tools/file_tools.py:373-483` (handler): The SearchForPatternTool class provides the handler logic for the `search_for_pattern` tool, executing pattern searches in project files with configurable options such as globs, context lines, and file restrictions.

```python
class SearchForPatternTool(Tool):
    """
    Performs a search for a pattern in the project.
    """

    def apply(
        self,
        substring_pattern: str,
        context_lines_before: int = 0,
        context_lines_after: int = 0,
        paths_include_glob: str = "",
        paths_exclude_glob: str = "",
        relative_path: str = "",
        restrict_search_to_code_files: bool = False,
        max_answer_chars: int = -1,
    ) -> str:
        """
        Offers a flexible search for arbitrary patterns in the codebase, including the
        possibility to search in non-code files.
        Generally, symbolic operations like find_symbol or find_referencing_symbols
        should be preferred if you know which symbols you are looking for.

        Pattern Matching Logic:
            For each match, the returned result will contain the full lines where the
            substring pattern is found, as well as optionally some lines before and after it.
            The pattern will be compiled with DOTALL, meaning that the dot will match all
            characters including newlines.
            This also means that it never makes sense to have .* at the beginning or end of
            the pattern, but it may make sense to have it in the middle for complex patterns.
            If a pattern matches multiple lines, all those lines will be part of the match.
            Be careful to not use greedy quantifiers unnecessarily, it is usually better to
            use non-greedy quantifiers like .*? to avoid matching too much content.

        File Selection Logic:
            The files in which the search is performed can be restricted very flexibly.
            Using `restrict_search_to_code_files` is useful if you are only interested in
            code symbols (i.e., those symbols that can be manipulated with symbolic tools
            like find_symbol).
            You can also restrict the search to a specific file or directory,
            and provide glob patterns to include or exclude certain files on top of that.
            The globs are matched against relative file paths from the project root
            (not to the `relative_path` parameter that is used to further restrict the search).
            Smartly combining the various restrictions allows you to perform very targeted searches.

        :param substring_pattern: Regular expression for a substring pattern to search for
        :param context_lines_before: Number of lines of context to include before each match
        :param context_lines_after: Number of lines of context to include after each match
        :param paths_include_glob: optional glob pattern specifying files to include in the search.
            Matches against relative file paths from the project root (e.g., "*.py", "src/**/*.ts").
            Supports standard glob patterns (*, ?, [seq], **, etc.) and brace expansion {a,b,c}.
            Only matches files, not directories.
            If left empty, all non-ignored files will be included.
        :param paths_exclude_glob: optional glob pattern specifying files to exclude from the search.
            Matches against relative file paths from the project root (e.g., "*test*", "**/*_generated.py").
            Supports standard glob patterns (*, ?, [seq], **, etc.) and brace expansion {a,b,c}.
            Takes precedence over paths_include_glob. Only matches files, not directories.
            If left empty, no files are excluded.
        :param relative_path: only subpaths of this path (relative to the repo root) will be analyzed.
            If a path to a single file is passed, only that will be searched.
            The path must exist, otherwise a `FileNotFoundError` is raised.
        :param max_answer_chars: if the output is longer than this number of characters,
            no content will be returned. -1 means the default value from the config will be used.
            Don't adjust unless there is really no other way to get the content required for the task.
            Instead, if the output is too long, you should make a stricter query.
        :param restrict_search_to_code_files: whether to restrict the search to only those files
            where analyzed code symbols can be found.
            Otherwise, will search all non-ignored files.
            Set this to True if your search is only meant to discover code that can be manipulated
            with symbolic tools. For example, for finding classes or methods from a name pattern.
            Setting to False is a better choice if you also want to search in non-code files,
            like in html or yaml files, which is why it is the default.
        :return: A mapping of file paths to lists of matched consecutive lines.
        """
        abs_path = os.path.join(self.get_project_root(), relative_path)
        if not os.path.exists(abs_path):
            raise FileNotFoundError(f"Relative path {relative_path} does not exist.")

        if restrict_search_to_code_files:
            matches = self.project.search_source_files_for_pattern(
                pattern=substring_pattern,
                relative_path=relative_path,
                context_lines_before=context_lines_before,
                context_lines_after=context_lines_after,
                paths_include_glob=paths_include_glob.strip(),
                paths_exclude_glob=paths_exclude_glob.strip(),
            )
        else:
            if os.path.isfile(abs_path):
                rel_paths_to_search = [relative_path]
            else:
                _dirs, rel_paths_to_search = scan_directory(
                    path=abs_path,
                    recursive=True,
                    is_ignored_dir=self.project.is_ignored_path,
                    is_ignored_file=self.project.is_ignored_path,
                    relative_to=self.get_project_root(),
                )
            # TODO (maybe): not super efficient to walk through the files again and filter
            #   if glob patterns are provided, but it probably never matters and this version
            #   required no further refactoring
            matches = search_files(
                rel_paths_to_search,
                substring_pattern,
                file_reader=self.project.read_file,
                root_path=self.get_project_root(),
                paths_include_glob=paths_include_glob,
                paths_exclude_glob=paths_exclude_glob,
            )

        # group matches by file
        file_to_matches: dict[str, list[str]] = defaultdict(list)
        for match in matches:
            assert match.source_file_path is not None
            file_to_matches[match.source_file_path].append(match.to_display_string())
        result = self._to_json(file_to_matches)
        return self._limit_length(result, max_answer_chars)
```
- `src/serena/tools/tools_base.py:356-427` (registration): ToolRegistry automatically discovers and registers all subclasses of Tool in 'serena.tools' packages, including SearchForPatternTool, mapping class names to snake_case tool names like 'search_for_pattern' (see the naming sketch after this list).

```python
class ToolRegistry:
    def __init__(self) -> None:
        self._tool_dict: dict[str, RegisteredTool] = {}
        for cls in iter_subclasses(Tool):
            if not any(cls.__module__.startswith(pkg) for pkg in tool_packages):
                continue
            is_optional = issubclass(cls, ToolMarkerOptional)
            name = cls.get_name_from_cls()
            if name in self._tool_dict:
                raise ValueError(f"Duplicate tool name found: {name}. Tool classes must have unique names.")
            self._tool_dict[name] = RegisteredTool(tool_class=cls, is_optional=is_optional, tool_name=name)

    def get_tool_class_by_name(self, tool_name: str) -> type[Tool]:
        return self._tool_dict[tool_name].tool_class

    def get_all_tool_classes(self) -> list[type[Tool]]:
        return list(t.tool_class for t in self._tool_dict.values())

    def get_tool_classes_default_enabled(self) -> list[type[Tool]]:
        """
        :return: the list of tool classes that are enabled by default (i.e. non-optional tools).
        """
        return [t.tool_class for t in self._tool_dict.values() if not t.is_optional]

    def get_tool_classes_optional(self) -> list[type[Tool]]:
        """
        :return: the list of tool classes that are optional (i.e. disabled by default).
        """
        return [t.tool_class for t in self._tool_dict.values() if t.is_optional]

    def get_tool_names_default_enabled(self) -> list[str]:
        """
        :return: the list of tool names that are enabled by default (i.e. non-optional tools).
        """
        return [t.tool_name for t in self._tool_dict.values() if not t.is_optional]

    def get_tool_names_optional(self) -> list[str]:
        """
        :return: the list of tool names that are optional (i.e. disabled by default).
        """
        return [t.tool_name for t in self._tool_dict.values() if t.is_optional]

    def get_tool_names(self) -> list[str]:
        """
        :return: the list of all tool names.
        """
        return list(self._tool_dict.keys())

    def print_tool_overview(
        self, tools: Iterable[type[Tool] | Tool] | None = None, include_optional: bool = False, only_optional: bool = False
    ) -> None:
        """
        Print a summary of the tools. If no tools are passed, a summary of the selection of tools
        (all, default or only optional) is printed.
        """
        if tools is None:
            if only_optional:
                tools = self.get_tool_classes_optional()
            elif include_optional:
                tools = self.get_all_tool_classes()
            else:
                tools = self.get_tool_classes_default_enabled()

        tool_dict: dict[str, type[Tool] | Tool] = {}
        for tool_class in tools:
            tool_dict[tool_class.get_name_from_cls()] = tool_class
        for tool_name in sorted(tool_dict.keys()):
            tool_class = tool_dict[tool_name]
            print(f" * `{tool_name}`: {tool_class.get_tool_description().strip()}")

    def is_valid_tool_name(self, tool_name: str) -> bool:
        return tool_name in self._tool_dict
```
- Generates the input schema (FuncMetadata) for MCP tools from the apply method's signature and annotations, used for SearchForPatternTool.

```python
def get_apply_fn_metadata_from_cls(cls) -> FuncMetadata:
    """Get the metadata for the apply method from the class (static metadata).

    Needed for creating MCP tools in a separate process without running into serialization issues.
    """
    # First try to get from __dict__ to handle dynamic docstring changes
    if "apply" in cls.__dict__:
        apply_fn = cls.__dict__["apply"]
    else:
        # Fall back to getattr for inherited methods
        apply_fn = getattr(cls, "apply", None)
    if apply_fn is None:
        raise AttributeError(f"apply method not defined in {cls}. Did you forget to implement it?")
    return func_metadata(apply_fn, skip_names=["self", "cls"])
```
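The snake_case naming mentioned in the registration entry above (SearchForPatternTool becomes search_for_pattern) can be reproduced with a small sketch. This is an assumed reimplementation of the convention for illustration, not the actual get_name_from_cls code:

```python
import re

def tool_name_from_class_name(class_name: str) -> str:
    # Assumed convention (hypothetical helper): strip the trailing "Tool"
    # suffix, then convert CamelCase to snake_case.
    if class_name.endswith("Tool"):
        class_name = class_name[: -len("Tool")]
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()

assert tool_name_from_class_name("SearchForPatternTool") == "search_for_pattern"
```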