
Gitingest-MCP

by puravparab

git_files

Retrieve the contents of specific files from a GitHub repository by specifying the owner, the repository name, a list of file paths, and an optional branch.

Instructions

Get the content of specific files from a GitHub repository

Args:
	owner: The GitHub organization or username
	repo: The repository name
	file_paths: List of paths to files within the repository
	branch: Optional branch name (default: None)

Input Schema

Name        Required  Description  Default
branch      No
file_paths  Yes
owner       Yes
repo        Yes
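
As a rough illustration of the schema above, an argument payload for this tool might look like the following. The helper `check_arguments` is hypothetical and not part of Gitingest-MCP; it only mirrors the Required column, and the file path is a placeholder.

```python
from typing import Any, Dict, List

# Required/optional split taken from the input schema table.
REQUIRED = ("owner", "repo", "file_paths")
OPTIONAL = ("branch",)


def check_arguments(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    problems = [
        f"missing required argument: {name}" for name in REQUIRED if name not in args
    ]
    unknown = sorted(set(args) - set(REQUIRED) - set(OPTIONAL))
    problems += [f"unknown argument: {name}" for name in unknown]
    paths = args.get("file_paths")
    if paths is not None and not (
        isinstance(paths, list) and all(isinstance(p, str) for p in paths)
    ):
        problems.append("file_paths must be a list of strings")
    return problems


example_args = {
    "owner": "puravparab",
    "repo": "Gitingest-MCP",
    "file_paths": ["README.md"],  # placeholder path, assumed for illustration
    # "branch" is omitted: the handler defaults it to None
}
```

Note that `file_paths` is always a list, even when only one file is requested.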

Implementation Reference

  • The primary handler for the 'git_files' tool. It constructs the repo URL, initializes GitIngester, fetches repo data, and retrieves contents for the specified file_paths using the helper method.
    @mcp.tool()
    async def git_files(
    	owner: str,
    	repo: str, 
    	file_paths: List[str],
    	branch: Optional[str] = None
    ) -> Union[str, Dict[str, str]]:
    	"""
    	Get the content of specific files from a GitHub repository
    
    	Args:
    		owner: The GitHub organization or username
    		repo: The repository name
    		file_paths: List of paths to files within the repository
    		branch: Optional branch name (default: None)
    	"""
    	url = f"https://github.com/{owner}/{repo}"
    
    	try:
    		# Create GitIngester and fetch data asynchronously
    		ingester = GitIngester(url, branch=branch)
    		await ingester.fetch_repo_data()
    		
    		# Get the requested file contents
    		files_content = ingester.get_content(file_paths)
    		if not files_content:
    			return {
    				"error": "None of the requested files were found in the repository"
    			}
    		return files_content
    		
    	except Exception as e:
    		return {
    			"error": f"Failed to get file content: {str(e)}. Try https://gitingest.com/{url} instead"
    		}
  • GitIngester.get_content() helper method called by the git_files handler to extract and format content for the specified file paths from the parsed repository content using regex patterns.
    def get_content(self, file_paths: Optional[List[str]] = None) -> str:
    	"""Returns the repository content."""
    	if file_paths is None:
    		return self.content
    	return self._get_files_content(file_paths)
    
    def _get_files_content(self, file_paths: List[str]) -> str:
    	"""Helper function to extract specific files from repository content."""
    	result = {}
    	for path in file_paths:
    		result[path] = None
    	if not self.content:
    		# Nothing was fetched: return an empty string to match the declared str return type
    		return ""
    	# Get the content as a string
    	content_str = str(self.content)
    
    	# Try multiple patterns to match file content sections
    	patterns = [
    		# Standard pattern with exactly 50 equals signs
    		r"={50}\nFile: ([^\n]+)\n={50}",
    		# More flexible pattern with varying number of equals signs
    		r"={10,}\nFile: ([^\n]+)\n={10,}",
    		# Extra flexible pattern
    		r"=+\s*File:\s*([^\n]+)\s*\n=+",
    	]
    
    	for pattern in patterns:
    		# Find all matches in the content
    		matches = re.finditer(pattern, content_str)
    		matched = False
    		for match in matches:
    			matched = True
    			# Get the position of the match
    			start_pos = match.end()
    			filename = match.group(1).strip()
    			# Find the next file header or end of string
    			next_match = re.search(pattern, content_str[start_pos:])
    			if next_match:
    				end_pos = start_pos + next_match.start()
    				file_content = content_str[start_pos:end_pos].strip()
    			else:
    				file_content = content_str[start_pos:].strip()
    
    			# Check if this file matches any of the requested paths
    			for path in file_paths:
    				basename = path.split("/")[-1]
    				if path == filename or basename == filename or path.endswith("/" + filename):
    					result[path] = file_content
    		
    		# If we found matches with this pattern, no need to try others
    		if matched:
    			break
    
    	# Concatenate all found file contents with file headers
    	concatenated = ""
    	for path, content in result.items():
    		if content is not None:
    			if concatenated:
    				concatenated += "\n\n"
    			concatenated += f"==================================================\nFile: {path}\n==================================================\n{content}"
    	return concatenated
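
To make the header format concrete, here is a minimal, self-contained sketch of the splitting step using only the first (strictest) pattern from the listing. The function `split_digest` and the sample digest are hypothetical; unlike the helper above, this version collects all header matches once instead of re-searching the remaining text for each header.

```python
import re
from typing import Dict

# Strict header pattern copied from the listing: 50 '=' signs around a "File:" line.
HEADER = re.compile(r"={50}\nFile: ([^\n]+)\n={50}")


def split_digest(digest: str) -> Dict[str, str]:
    """Map each 'File: <name>' header to the text that follows it."""
    matches = list(HEADER.finditer(digest))
    sections: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the digest.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(digest)
        sections[match.group(1).strip()] = digest[match.end():end].strip()
    return sections


SEP = "=" * 50
# Made-up digest with two files, for illustration only.
sample = (
    f"{SEP}\nFile: README.md\n{SEP}\n# Demo\n\n"
    f"{SEP}\nFile: src/app.py\n{SEP}\nprint('hi')\n"
)
sections = split_digest(sample)
```

With this sample, `sections` holds two entries keyed by `README.md` and `src/app.py`, each with its surrounding whitespace stripped.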

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it states the tool gets file content, it doesn't describe important behavioral traits such as authentication requirements, rate limits, error handling (e.g., for non-existent files), response format, or whether it's read-only. The description is minimal and lacks critical operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
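
Going by the implementation reference, the handler catches exceptions instead of raising and returns either a string of concatenated file sections or a dict with a single "error" key. A caller-side sketch for telling the two apart follows; `unwrap` is a hypothetical helper, and how an MCP client surfaces this value over the wire is not covered here.

```python
from typing import Dict, Tuple, Union

# Return type declared by the git_files handler.
ToolResult = Union[str, Dict[str, str]]


def unwrap(result: ToolResult) -> Tuple[bool, str]:
    """Return (ok, text): file content on success, the error message otherwise."""
    if isinstance(result, dict) and "error" in result:
        return False, result["error"]
    return True, str(result)
```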

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence clearly states the purpose, followed by a structured Args section. There's minimal waste, though the Args formatting could be more integrated. It efficiently conveys key information without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, no annotations, no output schema), the description is incomplete. It lacks details on authentication, error handling, response format, and usage context relative to siblings. For a tool that interacts with external APIs and has multiple parameters, more comprehensive guidance is needed to ensure correct agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It lists parameters (owner, repo, file_paths, branch) with brief explanations in the Args section, adding meaning beyond the bare schema. However, it doesn't provide detailed semantics like format examples (e.g., file_paths as array of strings), constraints, or default behaviors beyond 'Optional branch name (default: None)'. This partial compensation justifies a baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
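
One parameter detail that can be read off the implementation reference is how each entry in `file_paths` is compared with the file names found in the fetched content. The sketch below isolates that comparison; `matches_header` is a hypothetical name, and the paths are made up.

```python
def matches_header(requested: str, header_name: str) -> bool:
    """Mirror the three-way comparison used in _get_files_content."""
    basename = requested.split("/")[-1]
    return (
        requested == header_name                   # exact path match
        or basename == header_name                 # header carries only the file name
        or requested.endswith("/" + header_name)   # header is a trailing part of the path
    )


exact = matches_header("src/utils/io.py", "src/utils/io.py")
by_basename = matches_header("src/utils/io.py", "io.py")
by_suffix = matches_header("src/utils/io.py", "utils/io.py")
bare_name = matches_header("io.py", "src/utils/io.py")
```

The first three comparisons succeed and the last one does not: a requested path may be longer than the header name, but a bare file name will not match a header that carries a longer path. Passing the full path as it appears in the repository is therefore the safer choice.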

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get the content of specific files from a GitHub repository.' This specifies the verb ('Get'), resource ('content of specific files'), and context ('from a GitHub repository'). However, it doesn't explicitly differentiate from sibling tools like git_summary or git_tree, which might provide summaries or directory structures rather than file contents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools (git_summary, git_tree) or explain scenarios where this tool is preferred over others. The only implied usage is retrieving file contents, but no explicit context or exclusions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/puravparab/Gitingest-MCP'
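
The same request can be sketched in Python with the standard library. The helper names are hypothetical, and treating the response body as JSON is an assumption, since the page does not describe the response format.

```python
import json
import urllib.request

# URL copied from the curl command above.
API_URL = "https://glama.ai/api/mcp/v1/servers/puravparab/Gitingest-MCP"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build a GET request equivalent to the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the request without sending it.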

If you have feedback or need assistance with the MCP directory API, please join our Discord server.