Analyzes C++ repositories to generate structured maps of the codebase, highlighting important definitions and code relationships
Maps Dart codebases to create navigable representations of code structure and important elements
Generates repository maps for Elixir codebases, highlighting key files and code structures
Provides code navigation capabilities for Elm repositories, identifying important files and code elements
Maps Go codebases to extract important code elements and show relationships between files and functions
Analyzes HCL files to generate repository maps highlighting important definitions and structures
Maps JavaScript repositories to identify key files and code structures, prioritizing them based on importance using PageRank
Creates structured maps of Kotlin repositories, identifying key files and code elements
Provides code navigation capabilities for Lua codebases, highlighting important files and functions
Maps OCaml repositories to show important files, code structures, and their relationships
Analyzes PHP codebases to identify important files and code elements, creating navigable repository maps
Analyzes Python codebases to generate repository maps that highlight important files, code definitions, and their relationships
Generates repository maps for Racket codebases, highlighting key files and code structures
Maps Ruby repositories to identify important files and code elements, showing their relationships
Creates structured maps of Rust codebases, highlighting important files and code elements
Provides repository mapping for Scala codebases, extracting important definitions and showing relationships
Maps Solidity codebases to identify important files and code structures, prioritizing them based on importance
Provides support for SQLite operations, including caching parsed code data for improved performance
Analyzes Swift repositories to generate navigable maps highlighting important code elements and relationships
Provides repository mapping for TypeScript codebases, extracting function/class definitions and showing relationships between code elements
RepoMap - Command-Line Tool and MCP Server
RepoMap is a tool designed to help its users, primarily LLMs, understand and navigate complex codebases. It functions both as a command-line application for on-demand analysis and as an MCP (Model Context Protocol) server that exposes repository mapping to other applications. By generating a "map" of the software repository, RepoMap highlights important files, code definitions, and their relationships. It uses Tree-sitter to parse code and the PageRank algorithm to rank code elements by importance, so that the most relevant information is prioritized.
Table of Contents
- Aider
- Example Output
- Features
- Installation
- Usage
- How It Works
- Output Format
- Dependencies
- Caching
- Supported Languages
- License
- Running as an MCP Server
Aider
RepoMap is 100% based on Aider's Repo map functionality, but I don't believe it shares any code with it. Allow me to explain.
My original effort was to take the RepoMap class from Aider, remove all the aider-specific dependencies, and then make it into a command-line tool. Python isn't my native language and I really struggled to get it to work.
So a few hours ago, I had a different idea. I took the RepoMap class and some of its related code from Aider and fed it to an LLM (either Claude or Gemini 2.5 Pro, I can't remember which) and had it write specifications for this tool based on Aider's implementation. That produced a very detailed specification for this application (minus the MCP bits), which I then fed to, well, Aider with Claude 3.7, and it built the command-line version.
I then used a combination of Aider w/Claude 3.7, Cline w/Gemini 2.5 Pro Preview & Gemini 2.5 Flash Preview, Phind.com, Gemini.com, Claude.com, and ChatGPT.com, and after a few hours I finally got the MCP server sorted out. Again, keep in mind that Python isn't really my native tongue.
Example Output
Features
- Smart Code Analysis: Uses Tree-sitter to parse source code and extract function/class definitions
- Relevance Ranking: Employs PageRank algorithm to rank code elements by importance
- Token-Aware: Respects token limits to fit within LLM context windows
- Caching: Persistent caching for fast subsequent runs
- Multi-Language: Supports Python, JavaScript, TypeScript, Java, C/C++, Go, Rust, and more
- Important File Detection: Automatically identifies and prioritizes important files (README, requirements.txt, etc.)
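As a rough illustration of the Token-Aware feature, here is a minimal sketch of fitting a ranked list into a token budget with binary search. It assumes the entries are already sorted by importance and that token count grows as entries are added. The `count_tokens` stand-in (whitespace word count), the `fit_to_budget` helper, and the sample entries are hypothetical and are not taken from RepoMap's code.

```python
def count_tokens(text: str) -> int:
    # Stand-in tokenizer: counts whitespace-separated words.
    # A real tool would use a model-specific tokenizer instead.
    return len(text.split())


def fit_to_budget(ranked_entries: list[str], max_tokens: int) -> list[str]:
    """Return the longest prefix of ranked_entries that fits in max_tokens."""
    lo, hi = 0, len(ranked_entries)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        rendered = "\n".join(ranked_entries[:mid])
        if count_tokens(rendered) <= max_tokens:
            lo = mid       # this prefix fits, try a longer one
        else:
            hi = mid - 1   # too long, try a shorter one
    return ranked_entries[:lo]


# Hypothetical entries, most important first.
entries = [
    "app.py: class App",
    "app.py: def run",
    "helpers.py: def load_config",
    "README.md",
]
selected = fit_to_budget(entries, max_tokens=8)
print(selected)
```

Binary search keeps the number of (potentially slow) token-count calls logarithmic in the number of candidate entries.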
Installation
Usage
Basic Usage
The tool prioritizes files in the following order:
- `--chat-files`: These files are given the highest priority, as they're assumed to be the files you're currently working on.
- `--mentioned-files`: These files are given a high priority, as they're explicitly mentioned in the current context.
- `--other-files`: These files are given the lowest priority and are used to provide additional context.
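How the tool applies these priorities internally isn't spelled out here. As a minimal sketch, assuming a plain tiered ordering where a file listed in several groups keeps its highest priority, the idea looks like this. The `order_by_priority` helper and the file names are hypothetical.

```python
def order_by_priority(chat_files, mentioned_files, other_files):
    """Order files chat > mentioned > other, dropping later duplicates."""
    ordered, seen = [], set()
    for group in (chat_files, mentioned_files, other_files):
        for path in group:
            if path not in seen:  # keep only the highest-priority occurrence
                seen.add(path)
                ordered.append(path)
    return ordered


ordered = order_by_priority(
    chat_files=["a.py"],
    mentioned_files=["b.py", "a.py"],
    other_files=["c.py", "b.py"],
)
print(ordered)
```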
Advanced Options
How It Works
- File Discovery: Scans the repository for source files
- Code Parsing: Uses Tree-sitter to parse code and extract definitions/references
- Graph Building: Creates a graph where files are nodes and symbol references are edges
- Ranking: Applies PageRank algorithm to rank files and symbols by importance
- Token Optimization: Uses binary search to fit the most important content within token limits
- Output Generation: Formats the results as a readable code map
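The graph-building and ranking steps above can be sketched roughly as follows. This is a plain-Python power iteration written for illustration, not the networkx call the tool depends on, and it assumes an edge points from a file that references a symbol to the file that defines it. The helper names, file names, and symbols are all hypothetical.

```python
from collections import defaultdict


def build_edges(defines, references):
    """Map (referencing file, defining file) -> number of shared symbol references."""
    definers = defaultdict(set)
    for path, symbols in defines.items():
        for sym in symbols:
            definers[sym].add(path)
    edges = defaultdict(float)
    for src, symbols in references.items():
        for sym in symbols:
            for dst in definers.get(sym, ()):
                if dst != src:  # ignore self-references
                    edges[(src, dst)] += 1.0
    return edges


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; returns node -> score (sums to 1)."""
    nodes = list(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    for _ in range(iterations):
        # Rank held by files with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical parse results: symbols each file defines and references.
defines = {"helpers.py": {"load_config"}, "app.py": {"App"}, "cli.py": set()}
references = {"cli.py": ["App", "load_config"], "app.py": ["load_config"], "helpers.py": []}

scores = pagerank(sorted(defines), build_edges(defines, references))
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy repository `helpers.py` ranks highest because both other files reference a symbol it defines.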
Output Format
The tool generates a structured view of your codebase showing:
- File paths and important code sections
- Function and class definitions
- Key relationships between code elements
- A prioritized ordering based on actual usage and references
Dependencies
- `tiktoken`: Token counting for various LLM models
- `networkx`: Graph algorithms (PageRank)
- `diskcache`: Persistent caching
- `grep-ast`: Tree-sitter integration for code parsing
- `tree-sitter`: Code parsing framework
- `pygments`: Syntax highlighting and lexical analysis
Caching
The tool uses persistent caching to speed up subsequent runs:
- Cache directory: `.repomap.tags.cache.v1/`
- Automatically invalidated when files change
- Can be cleared with `--force-refresh`
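The text above doesn't say how a file change is detected. One common approach, shown here as a sketch under that assumption, is to store each file's modification time next to its parsed result and re-parse when the two differ. The `TagCache` class is a hypothetical in-memory stand-in for the persistent cache, and its `force_refresh` parameter only mimics the idea behind the flag.

```python
import os
import tempfile


class TagCache:
    """In-memory stand-in for a persistent cache keyed by file path."""

    def __init__(self, parse):
        self._parse = parse
        self._entries = {}  # path -> (mtime_ns, parsed result)

    def get(self, path, force_refresh=False):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if not force_refresh and cached is not None and cached[0] == mtime:
            return cached[1]  # file unchanged since it was cached
        result = self._parse(path)
        self._entries[path] = (mtime, result)
        return result


calls = []


def fake_parse(path):
    # Hypothetical parser: records the call and returns the file's words.
    calls.append(path)
    with open(path) as handle:
        return handle.read().split()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "w") as handle:
        handle.write("def hello(): pass")
    cache = TagCache(fake_parse)
    cache.get(path)                      # first call parses the file
    cache.get(path)                      # served from the cache
    os.utime(path, ns=(1, 1))            # simulate an edit by changing the mtime
    cache.get(path)                      # mtime differs, so the file is re-parsed
    cache.get(path, force_refresh=True)  # re-parsed regardless of mtime

parse_count = len(calls)
print(parse_count)
```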
Supported Languages
Currently supports languages with Tree-sitter grammars:
- arduino
- chatito
- commonlisp
- cpp
- csharp
- c
- dart
- d
- elisp
- elixir
- elm
- gleam
- go
- javascript
- java
- lua
- ocaml_interface
- ocaml
- pony
- properties
- python
- racket
- r
- ruby
- rust
- solidity
- swift
- udev
- c_sharp
- hcl
- kotlin
- php
- ql
- scala
License
This implementation is based on the RepoMap design from the Aider project.
Running as an MCP Server
RepoMap can also be run as an MCP (Model Context Protocol) server, allowing other applications to access its repository mapping capabilities.
Setup
- Ensure you've added all your projects to the `projects.json` file (located in the same folder as `repomap`), specifying the full path to the root directory for each project. Giving the project a good description will help the LLM figure out which project to use without you having to tell it. This, ultimately, allows the MCP server to locate your code.
- The RepoMap MCP server uses STDIO (standard input/output) for communication. No additional configuration is required for the transport layer.
- To set up RepoMap as an MCP server with Cline (or similar tools like Roo), add the following configuration to your Cline settings file (e.g., `cline_mcp_settings.json`):
  - Replace `"/mnt/programming/RepoMapper/repomap_server.py"` with the actual path to your `repomap_server.py` file.
Usage
- Run the `repomap_server.py` script:
- The server will start and listen for requests via STDIO.
- Other applications can then use the `repo_map` tool provided by the server to generate repository maps. They'll need to specify the `project_name` corresponding to a project defined in your `projects.json` file.
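For a sense of what a client sends over STDIO, here is a minimal sketch that builds an MCP `tools/call` request for the `repo_map` tool as one line of JSON-RPC 2.0. The `build_tool_call` helper and the project name `my-project` are hypothetical, and the tool may accept or require arguments beyond `project_name`. A real client also has to complete the MCP `initialize` handshake before calling tools, which MCP client libraries normally handle.

```python
import json


def build_tool_call(request_id, project_name):
    """Build one newline-delimited JSON-RPC message for the stdio transport."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "repo_map",
            # Assumption: project_name is the only argument shown here.
            "arguments": {"project_name": project_name},
        },
    }
    # json.dumps emits no raw newlines by default, so the message stays on one line.
    return json.dumps(message) + "\n"


line = build_tool_call(1, "my-project")
print(line, end="")
```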
Example projects.json
Make sure the `root` paths in `projects.json` are absolute paths to your projects.