
AI Dev Assistant

github_repo_reader

Reads source files from local repositories, excluding Git files, node_modules, binaries, and large files. Returns directory structure and file contents for development analysis.

Instructions

Recursively reads all source files in a local repository directory. Automatically ignores .git, node_modules, binary files, and large files (>500KB). Returns a directory tree and the full content of each readable file.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `repo_path` | Yes | Absolute path to the local repository root. Windows example: `C:\Users\YourName\Projects\my-repo` | (none) |
| `max_files` | No | Maximum number of files to return (max: 500). | 100 |
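
The default and cap on `max_files` imply a simple clamping rule, which the handler below applies. A minimal TypeScript sketch of that rule (the function name is illustrative, not part of the tool's API):

```typescript
// Clamp max_files to the documented default (100) and hard cap (500).
function resolveMaxFiles(maxFiles?: number): number {
  return Math.min(maxFiles ?? 100, 500);
}

// resolveMaxFiles()     -> 100 (default when omitted)
// resolveMaxFiles(50)   -> 50
// resolveMaxFiles(1000) -> 500 (capped)
```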

Implementation Reference

  • The handler function for the github_repo_reader tool. It reads the local repository, constructs a directory tree, and collects file contents while applying ignore rules.
    async handler(args: { repo_path: string; max_files?: number }): Promise<string> {
      const repoPath = path.resolve(args.repo_path);
      const maxFiles = Math.min(args.max_files ?? 100, 500);
    
      if (!fs.existsSync(repoPath)) {
        return `ERROR: Path does not exist: ${repoPath}`;
      }
    
      const stat = fs.statSync(repoPath);
      if (!stat.isDirectory()) {
        return `ERROR: Path is not a directory: ${repoPath}`;
      }
    
      const tree = buildTree(repoPath);
      const files: FileEntry[] = [];
      const skipped = { count: 0 };
    
      collectFiles(repoPath, repoPath, files, skipped);
    
      const limitedFiles = files.slice(0, maxFiles);
      if (files.length > maxFiles) {
        skipped.count += files.length - maxFiles;
      }
    
      const result: RepoReadResult = {
        repoPath,
        totalFiles: limitedFiles.length,
        skippedFiles: skipped.count,
        files: limitedFiles,
        tree,
      };
    
      const sections: string[] = [
        `# Repository: ${path.basename(repoPath)}`,
        `**Path:** ${result.repoPath}`,
        `**Files Read:** ${result.totalFiles}  |  **Skipped:** ${result.skippedFiles}`,
        `\n## Directory Tree\n\`\`\`\n${result.tree}\n\`\`\``,
        `\n## File Contents`,
      ];
    
      for (const file of result.files) {
        const ext = path.extname(file.relativePath).slice(1) || "txt";
        sections.push(
          `\n### ${file.relativePath}\n` +
          `*Size: ${(file.sizeBytes / 1024).toFixed(1)} KB*\n` +
          `\`\`\`${ext}\n${file.content}\n\`\`\``
        );
      }
    
      return sections.join("\n");
    },
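
The handler delegates traversal to `buildTree` and `collectFiles`, which are not included in this excerpt. A hedged sketch of the kind of skip predicate such a collector would apply, based only on the documented ignore rules (`.git`, `node_modules`, binary files, files over 500 KB); the function name and the binary-extension list are assumptions, not the actual implementation:

```typescript
// Hypothetical skip predicate reflecting the documented ignore rules.
// The real collectFiles implementation is not shown in the excerpt above.
const IGNORED_DIRS = new Set([".git", "node_modules"]);
const BINARY_EXTS = new Set([".png", ".jpg", ".gif", ".exe", ".dll", ".zip", ".pdf"]);
const MAX_SIZE_BYTES = 500 * 1024; // "large files (>500KB)"

function shouldSkip(relativePath: string, sizeBytes: number): boolean {
  // Skip anything under an ignored directory.
  const segments = relativePath.split(/[\\/]/);
  if (segments.some((s) => IGNORED_DIRS.has(s))) return true;

  // Skip known binary extensions.
  const dot = relativePath.lastIndexOf(".");
  const ext = dot >= 0 ? relativePath.slice(dot).toLowerCase() : "";
  if (BINARY_EXTS.has(ext)) return true;

  // Skip files over the size cap.
  return sizeBytes > MAX_SIZE_BYTES;
}
```

A predicate like this keeps the traversal logic separate from the filtering policy, so the ignore rules can be extended without touching the directory walk.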
  • Input schema definition for the github_repo_reader tool, specifying required and optional parameters.
    inputSchema: {
      type: "object",
      properties: {
        repo_path: {
          type: "string",
          description:
            "Absolute path to the local repository root. " +
            "Windows example: C:\\Users\\YourName\\Projects\\my-repo",
        },
        max_files: {
          type: "number",
          description: "Maximum number of files to return (default: 100, max: 500).",
        },
      },
      required: ["repo_path"],
    },
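
Arguments that satisfy this schema look like the following; the paths are hypothetical, and the small validator only mirrors the schema's `required` list, not full JSON Schema validation:

```typescript
// Example argument objects for github_repo_reader (hypothetical paths).
const minimalArgs = {
  repo_path: "C:\\Users\\YourName\\Projects\\my-repo", // required
};

const fullArgs = {
  repo_path: "/home/dev/projects/my-repo",
  max_files: 50, // optional; the handler clamps this to 500
};

// Minimal runtime check mirroring the schema's "required" list.
function validateArgs(args: Record<string, unknown>): boolean {
  return typeof args.repo_path === "string" && args.repo_path.length > 0;
}
```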
  • Tool metadata including name and description for the github_repo_reader tool.
    return {
      name: "github_repo_reader",
      description:
        "Recursively reads all source files in a local repository directory. " +
        "Automatically ignores `.git`, `node_modules`, binary files, and large files (>500KB). " +
        "Returns a directory tree and the full content of each readable file.",
    };
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behaviors: recursive traversal, automatic ignoring of specific directories and file types, handling of binary and large files, and the return format (directory tree and file contents). This gives the agent clear expectations without contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by essential behavioral details in a compact form. Every sentence adds value (e.g., filtering rules and return format), with no redundant or vague language, making it highly efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (recursive file reading with filters), no annotations, and no output schema, the description does a good job covering behavior and output. It explains what is returned (directory tree and file contents) and key constraints, though it could benefit from mentioning error handling or performance implications for large repositories.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no meaning or context beyond what the schema provides for 'repo_path' and 'max_files' — no usage examples, and no constraints absent from the schema. The baseline score of 3 is appropriate, since the schema carries the parameter documentation on its own.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('reads', 'returns') and resources ('all source files in a local repository directory'). It distinguishes itself from potential siblings by specifying recursive reading with automatic filtering of common directories and file types, which is not implied by the tool name alone.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through details like 'local repository directory' and the filtering of '.git', 'node_modules', etc., suggesting it is intended for analyzing codebases. However, it does not explicitly state when to use this tool versus alternatives such as 'doc_search' or 'code_executor', nor does it list exclusions or prerequisites for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
