repo_map
Generate a structured repository map that lists function prototypes and variables for the specified files, along with relevant related files ranked by chat context and explicit mentions.
Instructions
Generate a repository map for the specified files, listing function prototypes and variables for those files as well as for relevant related files. Provide filenames relative to the project_root. In addition to the files you pass in, relevant related files are included automatically with a very small ranking boost.
Parameters:

- project_root: Root directory of the project to search (must be an absolute path).
- chat_files: File paths currently in the chat context. These receive the highest ranking.
- other_files: Other relevant file paths in the repository to consider for the map. They receive a lower ranking boost than mentioned_files and chat_files.
- token_limit: Maximum number of tokens the generated repository map should occupy. Defaults to 8192.
- exclude_unranked: If True, files with a PageRank of 0.0 are excluded from the map. Defaults to False.
- force_refresh: If True, forces a refresh of the repository map cache. Defaults to False.
- mentioned_files: Optional list of file paths explicitly mentioned in the conversation; they receive a mid-level ranking boost.
- mentioned_idents: Optional list of identifiers explicitly mentioned in the conversation, used to boost their ranking.
- verbose: If True, enables verbose logging for the RepoMap generation process. Defaults to False.
- max_context_window: Optional maximum context window size for token calculation, used to adjust the map token limit when no chat files are provided.

Returns a dictionary containing:

- map: the generated repository map string
- report: a dictionary with file processing details, including:
  - included: list of processed files
  - excluded: dictionary of excluded files with reasons
  - definition_matches: count of matched definitions
  - reference_matches: count of matched references
  - total_files_considered: total files processed

If an error occurred, the dictionary contains an error key instead.
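For orientation, the sketch below shows the shape of a successful result and of an error result. All values are invented; the report keys mirror those built by the handler shown under Implementation Reference.

```python
# Illustrative result shape only; every value here is made up.
result = {
    "map": "<repository map text, truncated to token_limit>",
    "report": {
        "excluded": {"src/missing.py": "[EXCLUDED] File not found"},
        "definition_matches": 42,
        "reference_matches": 117,
        "total_files_considered": 3,
    },
}

# On failure the tool returns a single error key instead, e.g.:
# {"error": "Project root directory not found: /bad/path"}

if "error" in result:
    raise RuntimeError(result["error"])
print(result["map"])
```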
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| chat_files | No | File paths currently in the chat context; these receive the highest ranking. | |
| exclude_unranked | No | If True, files with a PageRank of 0.0 are excluded from the map. | False |
| force_refresh | No | If True, forces a refresh of the repository map cache. | False |
| max_context_window | No | Maximum context window size for token calculation, used to adjust the map token limit when no chat files are provided. | |
| mentioned_files | No | File paths explicitly mentioned in the conversation; they receive a mid-level ranking boost. | |
| mentioned_idents | No | Identifiers explicitly mentioned in the conversation, used to boost their ranking. | |
| other_files | No | Other relevant file paths to consider for the map; lower ranking boost than mentioned_files and chat_files. | |
| project_root | Yes | Root directory of the project to search (must be an absolute path). | |
| token_limit | No | Maximum number of tokens the generated repository map should occupy. | 8192 |
| verbose | No | If True, enables verbose logging for the RepoMap generation process. | False |
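As a rough example, an MCP client might send arguments like the following. The paths and identifiers are placeholders, and only project_root is required.

```python
# Hypothetical argument payload for the repo_map tool (placeholder paths and names).
arguments = {
    "project_root": "/abs/path/to/project",      # required; must be absolute
    "chat_files": ["src/main.py"],               # highest ranking boost
    "other_files": ["src/utils.py", "src/db.py"],
    "mentioned_files": ["src/db.py"],            # mid-level boost
    "mentioned_idents": ["connect", "Session"],  # boosts these identifiers
    "token_limit": 4096,                         # defaults to 8192
    "exclude_unranked": True,
    "force_refresh": False,
    "verbose": False,
    "max_context_window": 128000,
}
```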
Implementation Reference
- repomap_server.py:54-176 (handler): The primary handler function for the 'repo_map' MCP tool. It validates parameters, resolves file paths, instantiates RepoMap, and asynchronously calls its get_repo_map method to generate the repository map. A direct-invocation sketch appears after this list.

  ```python
  async def repo_map(
      project_root: str,
      chat_files: Optional[List[str]] = None,
      other_files: Optional[List[str]] = None,
      token_limit: Any = 8192,  # Accept any type to handle empty strings
      exclude_unranked: bool = False,
      force_refresh: bool = False,
      mentioned_files: Optional[List[str]] = None,
      mentioned_idents: Optional[List[str]] = None,
      verbose: bool = False,
      max_context_window: Optional[int] = None,
  ) -> Dict[str, Any]:
      """Generate a repository map for the specified files, providing a list of
      function prototypes and variables for files as well as relevant related files.

      (The full parameter and return documentation matches the Instructions
      section above and is elided here.)
      """
      if not os.path.isdir(project_root):
          return {"error": f"Project root directory not found: {project_root}"}

      # 1. Handle and validate parameters
      # Convert token_limit to integer with fallback
      try:
          token_limit = int(token_limit) if token_limit else 8192
      except (TypeError, ValueError):
          token_limit = 8192

      # Ensure token_limit is positive
      if token_limit <= 0:
          token_limit = 8192

      chat_files_list = chat_files or []
      mentioned_fnames_set = set(mentioned_files) if mentioned_files else None
      mentioned_idents_set = set(mentioned_idents) if mentioned_idents else None

      # 2. If a specific list of other_files isn't provided, scan the whole root directory.
      # This should happen regardless of whether chat_files are present.
      effective_other_files = []
      if other_files:
          effective_other_files = other_files
      else:
          log.info("No other_files provided, scanning root directory for context...")
          effective_other_files = find_src_files(project_root)

      # Add a print statement for debugging so you can see what the tool is working with.
      log.debug(f"Chat files: {chat_files_list}")
      log.debug(f"Effective other_files count: {len(effective_other_files)}")

      # If after all that we have no files, we can exit early.
      if not chat_files_list and not effective_other_files:
          log.info("No files to process.")
          return {"map": "No files found to generate a map."}

      # 3. Resolve paths relative to project root
      root_path = Path(project_root).resolve()
      abs_chat_files = [str(root_path / f) for f in chat_files_list]
      abs_other_files = [str(root_path / f) for f in effective_other_files]

      # Remove any chat files from the other_files list to avoid duplication
      abs_chat_files_set = set(abs_chat_files)
      abs_other_files = [f for f in abs_other_files if f not in abs_chat_files_set]

      # 4. Instantiate and run RepoMap
      try:
          repo_mapper = RepoMap(
              map_tokens=token_limit,
              root=str(root_path),
              token_counter_func=lambda text: count_tokens(text, "gpt-4"),
              file_reader_func=read_text,
              output_handler_funcs={'info': log.info, 'warning': log.warning, 'error': log.error},
              verbose=verbose,
              exclude_unranked=exclude_unranked,
              max_context_window=max_context_window
          )
      except Exception as e:
          log.exception(f"Failed to initialize RepoMap for project '{project_root}': {e}")
          return {"error": f"Failed to initialize RepoMap: {str(e)}"}

      try:
          map_content, file_report = await asyncio.to_thread(
              repo_mapper.get_repo_map,
              chat_files=abs_chat_files,
              other_files=abs_other_files,
              mentioned_fnames=mentioned_fnames_set,
              mentioned_idents=mentioned_idents_set,
              force_refresh=force_refresh
          )

          # Convert FileReport to dictionary for JSON serialization
          report_dict = {
              "excluded": file_report.excluded,
              "definition_matches": file_report.definition_matches,
              "reference_matches": file_report.reference_matches,
              "total_files_considered": file_report.total_files_considered
          }

          return {
              "map": map_content or "No repository map could be generated.",
              "report": report_dict
          }
      except Exception as e:
          log.exception(f"Error generating repository map for project '{project_root}': {e}")
          return {"error": f"Error generating repository map: {str(e)}"}
  ```
- repomap_server.py:51-53 (registration): Registration of the MCP server and the 'repo_map' tool via the FastMCP decorator.

  ```python
  mcp = FastMCP("RepoMapServer")

  @mcp.tool()
  ```
- repomap_class.py:557-617 (helper): Core implementation in the RepoMap class: generates the ranked repository map using PageRank over Tree-sitter-extracted tags, with token-limit adjustment and caching. A worked sketch of the token-budget adjustment appears after this list.

  ```python
  def get_repo_map(
      self,
      chat_files: List[str] = None,
      other_files: List[str] = None,
      mentioned_fnames: Optional[Set[str]] = None,
      mentioned_idents: Optional[Set[str]] = None,
      force_refresh: bool = False
  ) -> Tuple[Optional[str], FileReport]:
      """Generate the repository map with file report."""
      if chat_files is None:
          chat_files = []
      if other_files is None:
          other_files = []

      # Create empty report for error cases
      empty_report = FileReport({}, 0, 0, 0)

      if self.max_map_tokens <= 0 or not other_files:
          return None, empty_report

      # Adjust max_map_tokens if no chat files
      max_map_tokens = self.max_map_tokens
      if not chat_files and self.max_context_window:
          padding = 1024
          available = self.max_context_window - padding
          max_map_tokens = min(
              max_map_tokens * self.map_mul_no_files,
              available
          )

      try:
          # get_ranked_tags_map returns (map_string, file_report)
          map_string, file_report = self.get_ranked_tags_map(
              chat_files,
              other_files,
              max_map_tokens,
              mentioned_fnames,
              mentioned_idents,
              force_refresh
          )
      except RecursionError:
          self.output_handlers['error']("Disabling repo map, git repo too large?")
          self.max_map_tokens = 0
          return None, FileReport({}, 0, 0, 0)

      # Ensure consistent return type
      if map_string is None:
          print("map_string is None")
          return None, file_report

      if self.verbose:
          tokens = self.token_count(map_string)
          self.output_handlers['info'](f"Repo-map: {tokens / 1024:.1f} k-tokens")

      # Format final output
      other = "other " if chat_files else ""
      if self.repo_content_prefix:
          repo_content = self.repo_content_prefix.format(other=other)
      else:
          repo_content = ""

      repo_content += map_string

      return repo_content, file_report
  ```
- repomap_class.py:254-401 (helper): Helper method that computes PageRank scores for files and tags based on the definition/reference graph. An illustrative, standalone PageRank sketch appears after this list.

  ```python
  def get_ranked_tags(
      self,
      chat_fnames: List[str],
      other_fnames: List[str],
      mentioned_fnames: Optional[Set[str]] = None,
      mentioned_idents: Optional[Set[str]] = None
  ) -> Tuple[List[Tuple[float, Tag]], FileReport]:
      """Get ranked tags using PageRank algorithm with file report."""
      # Return empty list and empty report if no files
      if not chat_fnames and not other_fnames:
          return [], FileReport([], {}, 0, 0, 0)

      # Initialize file report early
      included: List[str] = []
      excluded: Dict[str, str] = {}
      total_definitions = 0
      total_references = 0

      if mentioned_fnames is None:
          mentioned_fnames = set()
      if mentioned_idents is None:
          mentioned_idents = set()

      # Normalize paths to absolute
      def normalize_path(path):
          return str(Path(path).resolve())

      chat_fnames = [normalize_path(f) for f in chat_fnames]
      other_fnames = [normalize_path(f) for f in other_fnames]

      # Initialize file report
      included: List[str] = []
      excluded: Dict[str, str] = {}
      input_files: Dict[str, Dict] = {}
      total_definitions = 0
      total_references = 0

      # Collect all tags
      defines = defaultdict(set)
      references = defaultdict(set)
      definitions = defaultdict(set)
      personalization = {}

      chat_rel_fnames = set(self.get_rel_fname(f) for f in chat_fnames)
      all_fnames = list(set(chat_fnames + other_fnames))

      for fname in all_fnames:
          rel_fname = self.get_rel_fname(fname)

          if not os.path.exists(fname):
              reason = "File not found"
              excluded[fname] = reason
              self.output_handlers['warning'](f"Repo-map can't include {fname}: {reason}")
              continue

          included.append(fname)
          tags = self.get_tags(fname, rel_fname)

          for tag in tags:
              if tag.kind == "def":
                  defines[tag.name].add(rel_fname)
                  definitions[rel_fname].add(tag.name)
                  total_definitions += 1
              elif tag.kind == "ref":
                  references[tag.name].add(rel_fname)
                  total_references += 1

          # Set personalization for chat files
          if fname in chat_fnames:
              personalization[rel_fname] = 100.0

      # Build graph
      G = nx.MultiDiGraph()

      # Add nodes
      for fname in all_fnames:
          rel_fname = self.get_rel_fname(fname)
          G.add_node(rel_fname)

      # Add edges based on references
      for name, ref_fnames in references.items():
          def_fnames = defines.get(name, set())
          for ref_fname in ref_fnames:
              for def_fname in def_fnames:
                  if ref_fname != def_fname:
                      G.add_edge(ref_fname, def_fname, name=name)

      if not G.nodes():
          return [], file_report

      # Run PageRank
      try:
          if personalization:
              ranks = nx.pagerank(G, personalization=personalization, alpha=0.85)
          else:
              ranks = {node: 1.0 for node in G.nodes()}
      except:
          # Fallback to uniform ranking
          ranks = {node: 1.0 for node in G.nodes()}

      # Update excluded dictionary with status information
      for fname in set(chat_fnames + other_fnames):
          if fname in excluded:
              # Add status prefix to existing exclusion reason
              excluded[fname] = f"[EXCLUDED] {excluded[fname]}"
          elif fname not in included:
              excluded[fname] = "[NOT PROCESSED] File not included in final processing"

      # Create file report
      file_report = FileReport(
          excluded=excluded,
          definition_matches=total_definitions,
          reference_matches=total_references,
          total_files_considered=len(all_fnames)
      )

      # Collect and rank tags
      ranked_tags = []
      for fname in included:
          rel_fname = self.get_rel_fname(fname)
          file_rank = ranks.get(rel_fname, 0.0)

          # Exclude files with low Page Rank if exclude_unranked is True
          if self.exclude_unranked and file_rank <= 0.0001:
              # Use a small threshold to exclude near-zero ranks
              continue

          tags = self.get_tags(fname, rel_fname)
          for tag in tags:
              if tag.kind == "def":
                  # Boost for mentioned identifiers
                  boost = 1.0
                  if tag.name in mentioned_idents:
                      boost *= 10.0
                  if rel_fname in mentioned_fnames:
                      boost *= 5.0
                  if rel_fname in chat_rel_fnames:
                      boost *= 20.0

                  final_rank = file_rank * boost
                  ranked_tags.append((final_rank, tag))

      # Sort by rank (descending)
      ranked_tags.sort(key=lambda x: x[0], reverse=True)

      return ranked_tags, file_report
  ```
- repomap_server.py:66-89 (schema): The tool's schema comes from the handler's docstring (parameter and return descriptions), which FastMCP uses to describe the tool. Its content is identical to the Instructions section above.
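Because the handler (repomap_server.py:54-176) is an ordinary async function, it can also be exercised directly, outside the MCP transport. A minimal sketch, assuming the module is importable as repomap_server and using a placeholder project path:

```python
import asyncio

from repomap_server import repo_map  # assumed import path, matching the file referenced above


async def main() -> None:
    result = await repo_map(
        project_root="/abs/path/to/project",  # placeholder; must be absolute
        chat_files=["src/main.py"],
        token_limit=4096,
    )
    if "error" in result:
        print("repo_map failed:", result["error"])
    else:
        print(result["map"])
        print("files considered:", result["report"]["total_files_considered"])


if __name__ == "__main__":
    asyncio.run(main())
```

The token-budget adjustment in get_repo_map (repomap_class.py:557-617) only applies when there are no chat files and a max_context_window is given: it scales the map budget by map_mul_no_files but caps it at the context window minus a 1024-token padding. A worked sketch, with the multiplier value assumed (the excerpt does not show what map_mul_no_files is set to):

```python
def adjusted_map_tokens(max_map_tokens: int,
                        max_context_window: int,
                        map_mul_no_files: int = 8) -> int:
    """Mirror the no-chat-files branch of get_repo_map; the multiplier is an assumption."""
    padding = 1024
    available = max_context_window - padding
    return min(max_map_tokens * map_mul_no_files, available)


# With an 8192-token map budget and a 32768-token context window:
#   min(8192 * 8, 32768 - 1024) = min(65536, 31744) = 31744
print(adjusted_map_tokens(8192, 32768))  # 31744
```

get_ranked_tags (repomap_class.py:254-401) builds a graph whose nodes are files and whose edges run from a file that references an identifier to the file that defines it, then runs personalized PageRank with chat files weighted heavily (100.0). The standalone sketch below illustrates the idea on an invented three-file graph, using a plain DiGraph rather than the MultiDiGraph in the excerpt:

```python
import networkx as nx

# Invented example graph: edges point from the referencing file to the defining file.
G = nx.DiGraph()
G.add_edge("cli.py", "repomap_class.py")   # cli.py references symbols defined in repomap_class.py
G.add_edge("cli.py", "utils.py")
G.add_edge("repomap_class.py", "utils.py")

# Chat files get a heavy personalization weight, as in the excerpt above.
personalization = {"cli.py": 100.0}

ranks = nx.pagerank(G, personalization=personalization, alpha=0.85)
for fname, rank in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{fname}: {rank:.3f}")
# Files whose symbols the chat file references inherit rank through its outgoing edges.
```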