Glama

read_file

Read-only

Read file contents from local paths or URLs with pagination support for text, PDF, Excel, and image formats. Extract PDF text as markdown and view images as base64-encoded content.

Instructions

                    Read contents from files and URLs.
                    Read PDF files and extract content as markdown and images.
                    
                    Prefer this over 'execute_command' with cat/type for viewing files.
                    
                    Supports partial file reading with:
                    - 'offset' (start line, default: 0)
                      * Positive: Start from line N (0-based indexing)
                      * Negative: Read last N lines from end (tail behavior)
                    - 'length' (max lines to read, default: configurable via 'fileReadLineLimit' setting, initially 1000)
                      * Used with positive offsets for range reading
                      * Ignored when offset is negative (reads all requested tail lines)
                    
                    Examples:
                    - offset: 0, length: 10     → First 10 lines
                    - offset: 100, length: 5    → Lines 100-104
                    - offset: -20               → Last 20 lines  
                    - offset: -5, length: 10    → Last 5 lines (length ignored)
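The offset/length rules above can be sketched as a small helper. This is a hypothetical illustration of the documented semantics only; `resolveReadWindow` is not part of the actual implementation.

```typescript
// Hypothetical helper illustrating the documented offset/length semantics.
type ReadWindow =
    | { kind: "head"; start: number; count: number } // positive offset: range read
    | { kind: "tail"; count: number };               // negative offset: tail read

function resolveReadWindow(offset = 0, length = 1000): ReadWindow {
    if (offset < 0) {
        // Tail behavior: read the last |offset| lines; length is ignored.
        return { kind: "tail", count: -offset };
    }
    // Head behavior: read `length` lines starting at 0-based line `offset`.
    return { kind: "head", start: offset, count: length };
}
```

With this sketch, `offset: 100, length: 5` resolves to a head read of 5 lines starting at line 100, and `offset: -5, length: 10` resolves to a tail read of 5 lines, matching the examples above.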
                    
                    Performance optimizations:
                    - Large files with negative offsets use reverse reading for efficiency
                    - Large files with deep positive offsets use byte estimation
                    - Small files use fast readline streaming
                    
                    When reading from the file system, only works within allowed directories.
                    Can fetch content from URLs when the isUrl parameter is set to true
                    (URLs are always read in full regardless of offset/length).
                    
                    FORMAT HANDLING (by extension):
                    - Text: Uses offset/length for line-based pagination
                    - Excel (.xlsx, .xls, .xlsm): Returns JSON 2D array
                      * sheet: "Sheet1" (name) or "0" (index as string, 0-based)
                      * range: ALWAYS use FROM:TO format (e.g., "A1:D100", "C1:C1", "B2:B50")
                      * offset/length work as row pagination (optional fallback)
                    - Images (PNG, JPEG, GIF, WebP): Base64 encoded viewable content
                    - PDF: Extracts text content as markdown with page structure
                      * offset/length work as page pagination (0-based)
                      * Includes embedded images when available
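Putting the format rules together, argument objects for each format family might look like the following. The paths and values are illustrative assumptions inferred from the schema below, not real files.

```typescript
// Illustrative read_file argument objects (paths are hypothetical examples).
const textTail  = { path: "/var/log/app.log", offset: -20 };                        // last 20 lines
const textRange = { path: "/home/user/data.csv", offset: 100, length: 5 };          // lines 100-104
const excelCall = { path: "/home/user/report.xlsx", sheet: "0", range: "A1:D100" }; // first sheet, FROM:TO range
const pdfPages  = { path: "/home/user/manual.pdf", offset: 2, length: 3 };          // pages 2-4 (0-based)
const urlFetch  = { path: "https://example.com/readme.txt", isUrl: true };          // always read in full
```

Note the Excel `sheet` index is passed as a string ("0") and `range` always uses the FROM:TO form, per the rules above.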

                    IMPORTANT: Always use absolute paths for reliability. Paths are automatically normalized regardless of slash direction. Relative paths may fail as they depend on the current working directory. Tilde paths (~/...) might not work in all contexts. Unless the user explicitly asks for relative paths, use absolute paths.
                    This command can be referenced as "DC: ..." or "use Desktop Commander to ..." in your instructions.

Input Schema

Name      Required   Description                                           Default
path      Yes        File path or URL to read
isUrl     No         Treat path as a URL                                   false
offset    No         Start line (negative reads from the end)              0
length    No         Maximum lines to read                                 1000
sheet     No         Excel sheet name or 0-based index (as a string)
range     No         Excel cell range in FROM:TO format (e.g. "A1:D100")
options   No         Free-form key/value record

Implementation Reference

  • The primary handler function for the 'read_file' MCP tool. Validates input using ReadFileArgsSchema, applies configuration limits, constructs ReadOptions, calls the readFile helper, and formats output for different file types (PDF, images, text) with appropriate MCP content structures.
    export async function handleReadFile(args: unknown): Promise<ServerResult> {
        const HANDLER_TIMEOUT = 60000; // 60 seconds total operation timeout
        // Add input validation
        if (args === null || args === undefined) {
            return createErrorResponse('No arguments provided for read_file command');
        }
        const readFileOperation = async () => {
            const parsed = ReadFileArgsSchema.parse(args);
    
            // Get the configuration for file read limits
            const config = await configManager.getConfig();
            if (!config) {
                return createErrorResponse('Configuration not available');
            }
    
            const defaultLimit = config.fileReadLineLimit ?? 1000;
    
            // Convert sheet parameter: numeric strings become numbers for Excel index access
            let sheetParam: string | number | undefined = parsed.sheet;
            if (parsed.sheet !== undefined && /^\d+$/.test(parsed.sheet)) {
                sheetParam = parseInt(parsed.sheet, 10);
            }
    
            const options: ReadOptions = {
                isUrl: parsed.isUrl,
                offset: parsed.offset ?? 0,
                length: parsed.length ?? defaultLimit,
                sheet: sheetParam,
                range: parsed.range
            };
            const fileResult = await readFile(parsed.path, options);
    
            // Handle PDF files
            if (fileResult.metadata?.isPdf) {
                const meta = fileResult.metadata;
                const author = meta?.author ? `, Author: ${meta?.author}` : "";
                const title = meta?.title ? `, Title: ${meta?.title}` : "";
    
                const pdfContent = fileResult.metadata?.pages?.flatMap((p: any) => [
                    ...(p.images?.map((image: any) => ({
                        type: "image",
                        data: image.data,
                        mimeType: image.mimeType
                    })) ?? []),
                    {
                        type: "text",
                        text: `<!-- Page: ${p.pageNumber} -->\n${p.text}`,
                    },
                ]) ?? [];
    
                return {
                    content: [
                        {
                            type: "text",
                            text: `PDF file: ${parsed.path}${author}${title} (${meta?.totalPages} pages) \n`
                        },
                        ...pdfContent
                    ]
                };
            }
    
            // Handle image files
            if (fileResult.metadata?.isImage) {
                // For image files, return as an image content type
                // Content should already be base64-encoded string from handler
                const imageData = typeof fileResult.content === 'string'
                    ? fileResult.content
                    : fileResult.content.toString('base64');
                return {
                    content: [
                        {
                            type: "text",
                            text: `Image file: ${parsed.path} (${fileResult.mimeType})\n`
                        },
                        {
                            type: "image",
                            data: imageData,
                            mimeType: fileResult.mimeType
                        }
                    ],
                };
            } else {
                // For all other files, return as text
                const textContent = typeof fileResult.content === 'string'
                    ? fileResult.content
                    : fileResult.content.toString('utf8');
                return {
                    content: [{ type: "text", text: textContent }],
                };
            }
        };
    
        // Execute with timeout at the handler level
        const result = await withTimeout(
            readFileOperation(),
            HANDLER_TIMEOUT,
            'Read file handler operation',
            null
        );
        if (result == null) {
            // Handles the impossible case where withTimeout resolves to null instead of throwing
            throw new Error('Failed to read the file');
        }
        return result;
    }
  • src/server.ts:242-291 (registration)
    Registers the 'read_file' tool in the MCP server's list of available tools, including detailed description, input schema derived from ReadFileArgsSchema, and annotations.
        name: "read_file",
        description: `
                Read contents from files and URLs.
                Read PDF files and extract content as markdown and images.
                
                Prefer this over 'execute_command' with cat/type for viewing files.
                
                Supports partial file reading with:
                - 'offset' (start line, default: 0)
                  * Positive: Start from line N (0-based indexing)
                  * Negative: Read last N lines from end (tail behavior)
                - 'length' (max lines to read, default: configurable via 'fileReadLineLimit' setting, initially 1000)
                  * Used with positive offsets for range reading
                  * Ignored when offset is negative (reads all requested tail lines)
                
                Examples:
                - offset: 0, length: 10     → First 10 lines
                - offset: 100, length: 5    → Lines 100-104
                - offset: -20               → Last 20 lines  
                - offset: -5, length: 10    → Last 5 lines (length ignored)
                
                Performance optimizations:
                - Large files with negative offsets use reverse reading for efficiency
                - Large files with deep positive offsets use byte estimation
                - Small files use fast readline streaming
                
                When reading from the file system, only works within allowed directories.
                Can fetch content from URLs when isUrl parameter is set to true
                (URLs are always read in full regardless of offset/length).
                
                FORMAT HANDLING (by extension):
                - Text: Uses offset/length for line-based pagination
                - Excel (.xlsx, .xls, .xlsm): Returns JSON 2D array
                  * sheet: "Sheet1" (name) or "0" (index as string, 0-based)
                  * range: ALWAYS use FROM:TO format (e.g., "A1:D100", "C1:C1", "B2:B50")
                  * offset/length work as row pagination (optional fallback)
                - Images (PNG, JPEG, GIF, WebP): Base64 encoded viewable content
                - PDF: Extracts text content as markdown with page structure
                  * offset/length work as page pagination (0-based)
                  * Includes embedded images when available
    
                ${PATH_GUIDANCE}
                ${CMD_PREFIX_DESCRIPTION}`,
        inputSchema: zodToJsonSchema(ReadFileArgsSchema),
        annotations: {
            title: "Read File or URL",
            readOnlyHint: true,
            openWorldHint: true,
        },
    },
  • Zod schema defining the input parameters for the read_file tool: path (required), isUrl, offset, length, sheet, range, and options.
    export const ReadFileArgsSchema = z.object({
      path: z.string(),
      isUrl: z.boolean().optional().default(false),
      offset: z.number().optional().default(0),
      length: z.number().optional().default(1000),
      sheet: z.string().optional(),  // String only for MCP client compatibility (Cursor doesn't support union types in JSON Schema)
      range: z.string().optional(),
      options: z.record(z.any()).optional()
    });
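For illustration, a minimal call resolves to the defaults declared in the schema above. The following stand-in restates that behavior without the zod dependency; `applyReadFileDefaults` is a hypothetical helper, not part of the source.

```typescript
// Hypothetical stand-in for ReadFileArgsSchema.parse() default behavior.
interface ReadFileArgs {
    path: string;
    isUrl?: boolean;
    offset?: number;
    length?: number;
    sheet?: string;
    range?: string;
}

function applyReadFileDefaults(args: ReadFileArgs) {
    // Caller-provided values override the schema defaults via the spread.
    return { isUrl: false, offset: 0, length: 1000, ...args };
}
```

So `{ path: "/tmp/a.txt" }` parses to `isUrl: false`, `offset: 0`, `length: 1000` (the handler then replaces the default length with `fileReadLineLimit` when configured).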
  • Core helper function dispatched by the handler. Determines if the path is a URL or local file and calls the appropriate reader (readFileFromUrl or readFileFromDisk), passing read options.
    export async function readFile(
        filePath: string,
        options?: ReadOptions
    ): Promise<FileResult> {
        const { isUrl, offset, length, sheet, range } = options ?? {};
        return isUrl
            ? readFileFromUrl(filePath)
            : readFileFromDisk(filePath, { offset, length, sheet, range });
    }
  • Supporting helper that performs the actual disk read: path validation, file stats, handler selection, content reading with options, timeout protection, and content formatting.
    export async function readFileFromDisk(
        filePath: string,
        options?: ReadOptions
    ): Promise<FileResult> {
        const { offset = 0, sheet, range } = options ?? {};
        let { length } = options ?? {};
    
        // Add validation for required parameters
        if (!filePath || typeof filePath !== 'string') {
            throw new Error('Invalid file path provided');
        }
    
        // Get default length from config if not provided
        if (length === undefined) {
            length = await getDefaultReadLength();
        }
    
        const validPath = await validatePath(filePath);
    
        // Get file extension for telemetry
        const fileExtension = getFileExtension(validPath);
    
        // Check file size before attempting to read
        try {
            const stats = await fs.stat(validPath);
    
            // Capture file extension in telemetry without capturing the file path
            capture('server_read_file', {
                fileExtension: fileExtension,
                offset: offset,
                length: length,
                fileSize: stats.size
            });
        } catch (error) {
            console.error('error catch ' + error);
            const errorMessage = error instanceof Error ? error.message : String(error);
            capture('server_read_file_error', { error: errorMessage, fileExtension: fileExtension });
            // If we can't stat the file, continue anyway and let the read operation handle errors
        }
    
        // Use withTimeout to handle potential hangs
        const readOperation = async () => {
            // Get appropriate handler for this file type (async - includes binary detection)
            const handler = await getFileHandler(validPath);
    
            // Use handler to read the file
            const result = await handler.read(validPath, {
                offset,
                length,
                sheet,
                range,
                includeStatusMessage: true
            });
    
            // Return with content as string
            // For images: content is already base64-encoded string from handler
            // For text: content may be string or Buffer, convert to UTF-8 string
            let content: string;
            if (typeof result.content === 'string') {
                content = result.content;
            } else if (result.metadata?.isImage) {
                // Image buffer should be base64 encoded, not UTF-8 converted
                content = result.content.toString('base64');
            } else {
                content = result.content.toString('utf8');
            }
    
            return {
                content,
                mimeType: result.mimeType,
                metadata: result.metadata
            };
        };
    
        // Execute with timeout
        const result = await withTimeout(
            readOperation(),
            FILE_OPERATION_TIMEOUTS.FILE_READ,
            `Read file operation for ${filePath}`,
            null
        );
    
        if (result == null) {
            // Handles the impossible case where withTimeout resolves to null instead of throwing
            throw new Error('Failed to read the file');
        }
    
        return result;
    }
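`withTimeout` itself is not shown in the excerpts. From its call sites (promise, timeout in milliseconds, label, null fallback) and the surrounding comments, a plausible sketch is the following; the real helper may differ.

```typescript
// Hypothetical reconstruction of withTimeout, inferred from its call sites:
// race the operation against a timer and reject with a labeled error on timeout.
async function withTimeout<T>(
    operation: Promise<T>,
    timeoutMs: number,
    label: string,
    fallback: T | null // kept for signature parity; unused in this sketch
): Promise<T | null> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
            timeoutMs
        );
    });
    try {
        // On timeout the race rejects, so callers only observe a null result
        // in the "impossible case" the handlers guard against above.
        return await Promise.race([operation, timeout]);
    } finally {
        clearTimeout(timer);
    }
}
```

This matches the callers' defensive `result == null` checks: a rejection propagates as an error, and a null resolution is treated as unreachable.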
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=true and openWorldHint=true, but the description adds substantial behavioral context: performance optimizations for large files, directory restrictions for file system reads, URL handling differences, format-specific behaviors (Excel returns JSON, images base64, PDF as markdown), and path normalization details. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but lengthy, with some redundancy (e.g., offset/length behavior is explained more than once). It is front-loaded with the core purpose, but the extensive detail could be streamlined and the structure tightened.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters with no schema descriptions, no output schema, and annotations covering only safety aspects, the description provides extensive context: parameter usage, format handling, performance, restrictions, and examples. It's nearly complete but could briefly mention error cases or response structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage for 7 parameters, the description fully compensates by explaining offset behavior (positive/negative, tail), length defaults and interactions, isUrl implications, sheet and range usage for Excel, and path requirements. It provides examples and clarifies parameter semantics beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool reads contents from files and URLs, specifically mentioning PDF extraction and format handling. It distinguishes itself from the sibling 'execute_command' tool by advising preference over it for viewing files, and from 'read_multiple_files' by being single-file focused.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance is provided: 'Prefer this over execute_command with cat/type for viewing files' and 'Always use absolute paths for reliability'. It also clarifies when offset/length parameters apply vs. are ignored (e.g., URLs always read in full, length ignored with negative offsets).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
