Glama · TheAlchemist6

CodeCompass MCP

get_file_content

Retrieve and process file contents from GitHub repositories with batch processing, metadata extraction, and configurable options for development workflows.

Instructions

📁 Retrieve content of specific files with smart truncation and batch processing capabilities.

⚠️ FEATURES:
• Batch processing with concurrent file retrieval
• Automatic file validation and security checks
• Rich metadata extraction (file type, language, size, line count)
• Configurable processing limits and error handling
• Support for multiple file formats with type detection

Input Schema

Name | Required | Description
url | Yes | GitHub repository URL
file_paths | Yes | Paths to files to retrieve (supports batch processing)
options | No | Processing options (see schema in the implementation reference)
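For orientation, a call to this tool might pass an arguments object like the following sketch. The repository URL, file paths, and option values here are illustrative placeholders, not values prescribed by the schema:

```typescript
// Hypothetical arguments for a get_file_content call.
// URL and file paths are placeholders for illustration only.
const args = {
  url: "https://github.com/example-org/example-repo",
  file_paths: ["src/index.ts", "package.json"],
  options: {
    max_size: 50_000,        // truncate files larger than ~50 KB
    include_metadata: true,  // attach size/language/line-count metadata
    max_concurrent: 5,       // schema default; must stay within 1..20
    continue_on_error: true, // keep going if one file fails
    format: "raw",
  },
};
```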

Implementation Reference

  • The main handler function for the 'get_file_content' tool. It validates input file paths, fetches contents from GitHub via GitHubService (the fetch loop itself is sequential; concurrency is applied at the batch-processing stage), processes files in batch with metadata extraction and truncation, handles errors, and returns a structured response with file contents, summaries, and statistics.
    async function handleGetFileContent(args: any) {
      try {
        const { url, file_paths, options = {} } = args;
        
        // Validate file paths first
        const pathValidationErrors: string[] = [];
        for (const filePath of file_paths) {
          const validation = validateFilePath(filePath);
          if (!validation.valid) {
            pathValidationErrors.push(`${filePath}: ${validation.error}`);
          }
        }
        
        if (pathValidationErrors.length > 0) {
          throw new Error(`Invalid file paths detected:\n${pathValidationErrors.join('\n')}`);
        }
        
        // Fetch file contents from GitHub
        const fileContents: Array<{ path: string; content: string }> = [];
        const fetchErrors: Record<string, string> = {};
        
        for (const filePath of file_paths) {
          try {
            const content = await githubService.getFileContent(url, filePath);
            fileContents.push({ path: filePath, content });
          } catch (error: any) {
            fetchErrors[filePath] = error.message;
          }
        }
        
        // Process files using batch processing
        const batchOptions = {
          maxConcurrent: options.max_concurrent || config.limits.maxConcurrentRequests,
          continueOnError: options.continue_on_error !== false,
          validatePaths: false, // Already validated above
          includeMetadata: options.include_metadata !== false,
          maxFileSize: options.max_size || config.limits.maxFileSize,
          allowedExtensions: options.file_extensions,
          excludePatterns: options.exclude_patterns,
        };
        
        const batchResult = await batchProcessFiles(fileContents, batchOptions);
        
        // Combine results with fetch errors
        const results: Record<string, any> = {};
        
        // Add successful and failed processing results
        batchResult.results.forEach(result => {
          if (result.success) {
            results[result.filePath] = {
              content: result.content,
              metadata: result.metadata,
              size: result.metadata?.size || 0,
              truncated: result.metadata?.size ? result.metadata.size > (options.max_size || config.limits.maxFileSize) : false,
            };
          } else {
            results[result.filePath] = {
              error: result.error?.message || 'Processing failed',
              details: result.error?.details,
            };
          }
        });
        
        // Add fetch errors
        Object.entries(fetchErrors).forEach(([filePath, error]) => {
          results[filePath] = {
            error: `Failed to fetch: ${error}`,
          };
        });
        
        // Add processing statistics
        const statistics = getFileStatistics(batchResult.results.filter(r => r.success));
        
        const response = createResponse({
          files: results,
          summary: {
            ...batchResult.summary,
            fetchErrors: Object.keys(fetchErrors).length,
            statistics,
          },
        });
        
        return formatToolResponse(response);
      } catch (error) {
        const response = createResponse(null, error, { tool: 'get_file_content', url: args.url });
        return formatToolResponse(response);
      }
    }
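The fetch loop in this handler awaits each file in turn before batch processing begins. If concurrent retrieval were wanted at the fetch stage as well, a small helper along these lines could cap the number of in-flight requests. This is a sketch only; `mapWithConcurrency` is a hypothetical name, not part of the CodeCompass codebase:

```typescript
// Sketch: run an async mapper over items with at most `limit` calls in flight.
// Hypothetical helper, not part of CodeCompass MCP.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index.
  const worker = async () => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

With such a helper, the per-file try/catch could move into `fn`, collecting successes and `fetchErrors` exactly as the sequential loop does.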
  • Tool registration definition including name, detailed description, and complete input schema for validation. This object is exported as part of consolidatedTools and served via ListToolsRequestSchema handler.
    {
      name: 'get_file_content',
      description: '📁 Retrieve content of specific files with smart truncation and batch processing capabilities.\n\n⚠️ FEATURES:\n• Batch processing with concurrent file retrieval\n• Automatic file validation and security checks\n• Rich metadata extraction (file type, language, size, line count)\n• Configurable processing limits and error handling\n• Support for multiple file formats with type detection',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'GitHub repository URL',
          },
          file_paths: {
            type: 'array',
            items: { type: 'string' },
            description: 'Paths to files to retrieve (supports batch processing)',
          },
          options: {
            type: 'object',
            properties: {
              max_size: {
                type: 'number',
                description: 'Maximum file size in bytes',
                default: 100000,
              },
              include_metadata: {
                type: 'boolean',
                description: 'Include file metadata (size, modified date, etc.)',
                default: false,
              },
              truncate_large_files: {
                type: 'boolean',
                description: 'Truncate files larger than max_size',
                default: true,
              },
              max_concurrent: {
                type: 'number',
                description: 'Maximum concurrent file processing',
                default: 5,
                minimum: 1,
                maximum: 20,
              },
              continue_on_error: {
                type: 'boolean',
                description: 'Continue processing other files if one fails',
                default: true,
              },
              file_extensions: {
                type: 'array',
                items: { type: 'string' },
                description: 'Only process files with these extensions (e.g., [".js", ".ts"])',
              },
              exclude_patterns: {
                type: 'array',
                items: { type: 'string' },
                description: 'Exclude files matching these regex patterns',
              },
              format: {
                type: 'string',
                enum: ['raw', 'parsed', 'summary'],
                description: 'Format for file content',
                default: 'raw',
              },
            },
          },
        },
        required: ['url', 'file_paths'],
      },
    }
  • Core helper method in GitHubService that retrieves individual file content from GitHub repository using Octokit REST API. Includes caching, retry logic with rate limit handling, base64 decoding, and comprehensive error handling for 404s and rate limits.
    async getFileContent(url: string, filePath: string): Promise<string> {
      const { owner, repo } = this.parseGitHubUrl(url);
      const cacheKey = this.getCacheKey('getFileContent', { owner, repo, filePath });
      
      // Check cache first
      const cached = this.getCachedResult<string>(cacheKey);
      if (cached) {
        return cached;
      }
    
      try {
        const { data } = await this.withRetry(() => 
          this.octokit.rest.repos.getContent({
            owner,
            repo,
            path: filePath,
          })
        );
    
        if ('content' in data) {
          const content = Buffer.from(data.content, 'base64').toString('utf-8');
          this.setCachedResult(cacheKey, content);
          return content;
        }
        
        throw new Error('File content not available');
      } catch (error: any) {
        if (error.status === 404) {
          throw new Error(`File not found: ${filePath}`);
        }
        if (error.status === 403 && error.message.includes('rate limit')) {
          throw new Error(`GitHub API rate limit exceeded. Please provide a GitHub token for higher limits. Error: ${error.message}`);
        }
        throw new Error(`Failed to fetch file content: ${error.message}`);
      }
    }
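The `withRetry` wrapper used above is not shown on this page. As a rough sketch of what retry-with-backoff semantics could look like (assumed behavior; the actual implementation may differ), consider:

```typescript
// Sketch of a generic retry helper with exponential backoff.
// Assumed semantics; CodeCompass MCP's actual withRetry may differ.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      // Do not retry 404s: the file simply does not exist.
      if (error?.status === 404) throw error;
      // Exponential backoff between attempts: 500ms, 1000ms, 2000ms, ...
      if (attempt < maxAttempts - 1) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```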
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses several behavioral traits: smart truncation, batch processing, automatic validation/security checks, metadata extraction, configurable limits, error handling, and format support. However, it lacks details on authentication requirements, rate limits, error types, or what 'smart truncation' specifically entails.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose but uses a feature-list format that's somewhat redundant. Bullet items like 'Automatic file validation and security checks' and 'Configurable processing limits and error handling' could be more integrated. The emoji and formatting add visual noise without substantive value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 3 parameters (including a complex nested object), no annotations, and no output schema, the description provides moderate context. It covers key capabilities but lacks details on return format, error responses, authentication, and specific limitations. Given the complexity, it should do more to guide usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67%, and the description adds little parameter-specific information beyond what's in the schema. It mentions 'batch processing' (implied by file_paths array) and 'configurable processing limits' (implied by options.max_concurrent), but doesn't explain parameter interactions or provide examples. The description doesn't compensate for the 33% coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves file content with specific capabilities (smart truncation, batch processing). It distinguishes from siblings like get_file_tree (structure) or analyze_codebase (analysis) by focusing on content retrieval. However, it doesn't explicitly contrast with all siblings like search_repository.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists features but provides no guidance on when to use this tool versus alternatives. It doesn't mention when to choose get_file_content over get_file_tree for file exploration or search_repository for finding files. No explicit when/when-not instructions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
