get_github_repo
Extracts the code from a GitHub repository branch and converts it to text, giving an LLM structured, context-preserving input for analysis and processing.
Instructions
Process and return the code from a GitHub repository branch as text
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| branch | No | | master |
| repo_url | Yes | | |
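As a rough illustration of how the schema reads, the sketch below fills in the `branch` default and rejects a call that omits `repo_url`. The helper `build_arguments` and the `INPUT_FIELDS` dictionary are hypothetical names introduced here for illustration; they are not part of the server.

```python
from typing import Any, Dict

# Field names, required flags and the default mirror the Input Schema table.
# INPUT_FIELDS itself is a hypothetical structure, not something the server defines.
INPUT_FIELDS: Dict[str, Dict[str, Any]] = {
    "repo_url": {"required": True},
    "branch": {"required": False, "default": "master"},
}


def build_arguments(**given: Any) -> Dict[str, Any]:
    """Hypothetical helper: validate tool arguments and apply defaults."""
    unknown = set(given) - set(INPUT_FIELDS)
    if unknown:
        raise ValueError(f"unknown argument(s): {sorted(unknown)}")

    args: Dict[str, Any] = {}
    for name, spec in INPUT_FIELDS.items():
        if name in given:
            args[name] = given[name]
        elif spec["required"]:
            raise ValueError(f"missing required argument: {name}")
        else:
            args[name] = spec["default"]
    return args


if __name__ == "__main__":
    # Only repo_url is supplied, so branch falls back to "master".
    print(build_arguments(repo_url="https://github.com/octocat/Hello-World"))
```

The example URL is a placeholder; any `https://github.com/<owner>/<repo>` address would be passed the same way.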
Implementation Reference
- mcp-repo2llm-server.py:33-33 (registration) Registration of the get_github_repo tool using the @mcp.tool() decorator from FastMCP.

      @mcp.tool()
- mcp-repo2llm-server.py:34-54 (handler) The handler function for the get_github_repo tool. It instantiates GithubRepo2Txt, runs its synchronous process_repo method in the event loop's default executor under a 3000-second timeout, and returns an error message string on timeout or any other exception.

      async def get_github_repo(repo_url: str, branch: str = "master") -> str:
          """
          Process and return the code from a GitHub repository branch as text
          """
          try:
              # Get the current event loop
              loop = asyncio.get_event_loop()
              # Wrap the synchronous operation in an async one with a 3000-second (50-minute) timeout
              repo_processor = GithubRepo2Txt()
              repo_name, content = await asyncio.wait_for(
                  loop.run_in_executor(None, repo_processor.process_repo, repo_url, branch),
                  timeout=3000
              )
              # logger.info(f"Processed GitHub repository: {repo_name}")
              return content
          except asyncio.TimeoutError:
              return "Processing timeout, please check repository size or network connection"
          except Exception as e:
              # logger.error(f"Error processing GitHub repository: {e}")
              return f"Processing failed: {str(e)}"
- repo2llm/githubrepo2txt.py:120-155 (helper) Core helper method that processes the GitHub repository: it extracts the README, generates the file structure, downloads and decodes text file contents (skipping binaries), and formats everything into a single string.

      def process_repo(self, repo_url, branch='master'):
          """
          Process a GitHub repository and return the processed content.

          Args:
              repo_url (str): GitHub repository URL
              branch (str, optional): Branch name. Defaults to 'master'

          Returns:
              tuple: (repo_name, content_string) - the repository name and the processed content string
          """
          repo_name = repo_url.split('/')[-1]
          repo = self.github.get_repo(repo_url.replace('https://github.com/', ''))

          # print(f"Getting {repo_name}'s README")
          readme_content = self._get_readme_content(repo, branch)

          # print(f"\nGetting {repo_name}'s repo structure")
          repo_structure = f"repo structure: {repo_name}\n"
          repo_structure += self._traverse_repo_iteratively(repo, branch)

          # print(f"\nGetting {repo_name}'s file")
          file_contents = self._get_file_contents_iteratively(repo, branch)

          instructions = "Please analyze using the following provided files and contents:\n\n"

          # Combine all content
          content = (
              instructions
              + f"README:\n{readme_content}\n\n"
              + repo_structure
              + '\n\n'
              + file_contents
          )

          return repo_name, content
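The handler's pattern of running a blocking call in an executor under a timeout can be tried without network access. The following is a minimal sketch, not the server's code: `fake_process_repo` is a hypothetical stand-in for `GithubRepo2Txt.process_repo` that only derives the repository name and `owner/repo` path the same way the helper does, and the `delay` and `timeout` parameters are assumptions added so that the timeout branch is easy to trigger.

```python
import asyncio
import time
from typing import Tuple


def fake_process_repo(repo_url: str, branch: str = "master", delay: float = 0.0) -> Tuple[str, str]:
    """Hypothetical stand-in for GithubRepo2Txt.process_repo (no network access)."""
    time.sleep(delay)  # simulate blocking work such as downloading files
    repo_name = repo_url.split("/")[-1]  # same derivation as the helper
    full_name = repo_url.replace("https://github.com/", "")  # "owner/repo"
    content = f"repo structure: {repo_name}\n(branch {branch} of {full_name})"
    return repo_name, content


async def get_repo_text(repo_url: str, branch: str = "master",
                        delay: float = 0.0, timeout: float = 3000) -> str:
    """Run the blocking function in the default executor, bounded by a timeout."""
    loop = asyncio.get_running_loop()
    try:
        repo_name, content = await asyncio.wait_for(
            loop.run_in_executor(None, fake_process_repo, repo_url, branch, delay),
            timeout=timeout,
        )
        return content
    except asyncio.TimeoutError:
        return "Processing timeout, please check repository size or network connection"
    except Exception as e:
        return f"Processing failed: {e}"


if __name__ == "__main__":
    url = "https://github.com/octocat/Hello-World"  # placeholder URL
    print(asyncio.run(get_repo_text(url, "main")))
    # A short timeout with a longer delay exercises the timeout branch.
    print(asyncio.run(get_repo_text(url, delay=0.3, timeout=0.05)))
```

One design point worth keeping in mind: when `asyncio.wait_for` times out, the coroutine returns immediately, but the worker thread started by `run_in_executor` is not interrupted and keeps running until the blocking call finishes.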