read_large_file_chunk

Read specific sections of large files using intelligent chunking that automatically determines optimal size based on file type, enabling efficient processing without loading entire files into memory.

Instructions

Read a specific chunk of a large file with intelligent chunking based on file type. Automatically determines optimal chunk size.

Input Schema

| Name               | Required | Description                                             | Default |
| ------------------ | -------- | ------------------------------------------------------- | ------- |
| filePath           | Yes      | Absolute path to the file                               |         |
| chunkIndex         | No       | Zero-based chunk index to read                          | 0       |
| linesPerChunk      | No       | Number of lines per chunk (auto-detected if not provided) |       |
| includeLineNumbers | No       | Include line numbers in output                          | false   |
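A hypothetical invocation might pass arguments like the following (the file path and values are illustrative, not from the source):

```typescript
// Example arguments for a read_large_file_chunk call; the path is hypothetical.
const args = {
  filePath: "/var/log/app.log", // required: absolute path to the file
  chunkIndex: 2,                // read the third chunk (zero-based)
  linesPerChunk: 500,           // override the auto-detected chunk size
  includeLineNumbers: true,     // prefix each output line with its line number
};
console.log(JSON.stringify(args, null, 2));
```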

Implementation Reference

  • The primary handler function `readChunk` that implements the core tool logic: verifies the file exists, computes metadata and optimal chunk size based on file type, calculates start/end lines with overlap, reads the line range, formats content with optional line numbers, and returns a structured `FileChunk`.
    static async readChunk(
      filePath: string,
      chunkIndex: number,
      options: ChunkOptions = {}
    ): Promise<FileChunk> {
      await this.verifyFile(filePath);
    
      const metadata = await this.getMetadata(filePath);
      const linesPerChunk = options.linesPerChunk ||
        this.getOptimalChunkSize(metadata.fileType, metadata.totalLines);
      const overlapLines = options.overlapLines || 10;
    
      const startLine = Math.max(1, chunkIndex * linesPerChunk - overlapLines + 1);
      const endLine = Math.min(metadata.totalLines, (chunkIndex + 1) * linesPerChunk);
    
      const lines = await this.readLines(filePath, startLine, endLine);
      const content = options.includeLineNumbers
        ? lines.map((line, idx) => `${startLine + idx}: ${line}`).join('\n')
        : lines.join('\n');
    
      const totalChunks = Math.ceil(metadata.totalLines / linesPerChunk);
    
      return {
        content,
        startLine,
        endLine,
        totalLines: metadata.totalLines,
        chunkIndex,
        totalChunks,
        filePath,
        byteOffset: 0, // Calculated if needed
        byteSize: Buffer.byteLength(content, 'utf-8'),
      };
    }
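The start/end arithmetic above can be sketched in isolation. This extracts the boundary computation into a standalone function (the function name is illustrative, not part of the source):

```typescript
// Sketch of readChunk's boundary arithmetic: overlap pulls the start of
// each chunk back into the previous chunk, except for chunk 0, which is
// clamped to line 1. Line numbers are 1-indexed, matching FileChunk.
function chunkBounds(
  chunkIndex: number,
  linesPerChunk: number,
  overlapLines: number,
  totalLines: number
): { startLine: number; endLine: number } {
  const startLine = Math.max(1, chunkIndex * linesPerChunk - overlapLines + 1);
  const endLine = Math.min(totalLines, (chunkIndex + 1) * linesPerChunk);
  return { startLine, endLine };
}

// Chunk 1 of a 1200-line file at 500 lines per chunk with 10 lines of
// overlap starts at line 491 (overlapping chunk 0) and ends at line 1000.
console.log(chunkBounds(1, 500, 10, 1200)); // { startLine: 491, endLine: 1000 }
```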
  • The MCP server-specific handler `handleReadChunk` that parses tool arguments, applies caching via CacheManager, delegates to FileHandler.readChunk, and formats the JSON response for MCP protocol.
    private async handleReadChunk(
      args: Record<string, unknown>
    ): Promise<{ content: Array<{ type: string; text: string }> }> {
      const filePath = args.filePath as string;
      const chunkIndex = (args.chunkIndex as number) || 0;
      const linesPerChunk = args.linesPerChunk as number | undefined;
      const includeLineNumbers = (args.includeLineNumbers as boolean) || false;
    
      const cacheKey = `chunk:${filePath}:${chunkIndex}:${linesPerChunk}:${includeLineNumbers}`;
      let chunk = this.chunkCache.get(cacheKey);
    
      if (!chunk) {
        chunk = await FileHandler.readChunk(filePath, chunkIndex, {
          linesPerChunk,
          includeLineNumbers,
          overlapLines: this.config.defaultOverlap,
        });
        this.chunkCache.set(cacheKey, chunk);
      }
    
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify(chunk, null, 2),
          },
        ],
      };
    }
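The cache key embeds every argument that affects a chunk's content, so explicit and auto-detected chunk sizes never collide. A minimal stand-in for the key construction (the real server uses a CacheManager; this only illustrates the key shape):

```typescript
// Builds the same composite key shape used by handleReadChunk. Note that
// an omitted linesPerChunk serializes as the literal string "undefined"
// in the template literal, which keeps auto-sized and explicitly-sized
// chunks in separate cache entries.
function cacheKeyFor(
  filePath: string,
  chunkIndex: number,
  linesPerChunk: number | undefined,
  includeLineNumbers: boolean
): string {
  return `chunk:${filePath}:${chunkIndex}:${linesPerChunk}:${includeLineNumbers}`;
}

console.log(cacheKeyFor("/tmp/a.log", 0, undefined, false));
// chunk:/tmp/a.log:0:undefined:false
```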
  • src/server.ts:90-115 (registration)
    Registers the 'read_large_file_chunk' tool with the MCP SDK Server by defining its name, description, and inputSchema in the tools list returned by getTools().
    {
      name: 'read_large_file_chunk',
      description: 'Read a specific chunk of a large file with intelligent chunking based on file type. Automatically determines optimal chunk size.',
      inputSchema: {
        type: 'object',
        properties: {
          filePath: {
            type: 'string',
            description: 'Absolute path to the file',
          },
          chunkIndex: {
            type: 'number',
            description: 'Zero-based chunk index to read (default: 0)',
          },
          linesPerChunk: {
            type: 'number',
            description: 'Number of lines per chunk (optional, auto-detected if not provided)',
          },
          includeLineNumbers: {
            type: 'boolean',
            description: 'Include line numbers in output (default: false)',
          },
        },
        required: ['filePath'],
      },
    },
  • TypeScript interface `FileChunk` defining the structured output returned by the read_large_file_chunk tool handler, including content, line ranges, metadata, and byte info.
    export interface FileChunk {
      /** Chunk content */
      content: string;
      /** Starting line number (1-indexed) */
      startLine: number;
      /** Ending line number (1-indexed) */
      endLine: number;
      /** Total lines in file */
      totalLines: number;
      /** Chunk index */
      chunkIndex: number;
      /** Total number of chunks */
      totalChunks: number;
      /** File path */
      filePath: string;
      /** Byte offset start */
      byteOffset: number;
      /** Chunk size in bytes */
      byteSize: number;
    }
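Because each chunk reports totalChunks, a client can page through an entire file by reading chunk 0 first and using that count to bound the loop. A sketch, assuming a hypothetical readChunkTool wrapper around the MCP call (the type is trimmed to the two fields this loop uses):

```typescript
// Trimmed view of FileChunk containing only the fields this sketch touches.
type ChunkResult = { content: string; totalChunks: number };

// Sequentially fetch every chunk of a file. readChunkTool is a hypothetical
// client-side wrapper around the read_large_file_chunk tool call.
async function readAllChunks(
  readChunkTool: (filePath: string, chunkIndex: number) => Promise<ChunkResult>,
  filePath: string
): Promise<string[]> {
  const first = await readChunkTool(filePath, 0);
  const parts = [first.content];
  for (let i = 1; i < first.totalChunks; i++) {
    parts.push((await readChunkTool(filePath, i)).content);
  }
  return parts;
}
```

Note that because chunks overlap by default, naively concatenating the parts would duplicate the overlapping lines; a client reassembling a file should either request overlapLines of 0 (if exposed) or trim the overlap using startLine/endLine.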
  • Helper function `getOptimalChunkSize` that determines the ideal number of lines per chunk based on file type and size, enabling intelligent chunking central to the tool's 'automatic' sizing feature.
    static getOptimalChunkSize(fileType: FileType, totalLines: number): number {
      const baseSizes: Record<FileType, number> = {
        [FileType.LOG]: 500,
        [FileType.CSV]: 1000,
        [FileType.JSON]: 100,
        [FileType.CODE]: 300,
        [FileType.TEXT]: 500,
        [FileType.MARKDOWN]: 200,
        [FileType.XML]: 200,
        [FileType.BINARY]: 1000,
        [FileType.UNKNOWN]: 500,
      };
    
      const baseSize = baseSizes[fileType] || 500;
    
      // Adjust for very large files
      if (totalLines > 100000) {
        return Math.min(baseSize * 2, 2000);
      }
    
      return baseSize;
    }
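The large-file adjustment can be exercised on its own: the base size comes from the per-file-type table, and files past 100,000 lines get double-sized chunks capped at 2,000 lines. A standalone replica of that arithmetic (the base size is passed in directly here rather than looked up by file type):

```typescript
// Replica of the large-file adjustment in getOptimalChunkSize: double the
// base size for files over 100,000 lines, but never exceed 2000 lines.
function adjustChunkSize(baseSize: number, totalLines: number): number {
  if (totalLines > 100000) {
    return Math.min(baseSize * 2, 2000);
  }
  return baseSize;
}

console.log(adjustChunkSize(500, 50000));   // 500  — LOG base size, under the threshold
console.log(adjustChunkSize(500, 250000));  // 1000 — doubled for a very large file
console.log(adjustChunkSize(1000, 250000)); // 2000 — doubled, hits the cap
```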
