Glama

read_file

Read file contents from local paths or URLs with pagination support for text, PDF, Excel, and image formats. Extract PDF text as markdown and view images as base64-encoded content.

Instructions

Read contents from files and URLs. Read PDF files and extract content as markdown and images.
Prefer this over 'execute_command' with cat/type for viewing files.

Supports partial file reading with:
- 'offset' (start line, default: 0)
  * Positive: Start from line N (0-based indexing)
  * Negative: Read last N lines from end (tail behavior)
- 'length' (max lines to read, default: configurable via 'fileReadLineLimit' setting, initially 1000)
  * Used with positive offsets for range reading
  * Ignored when offset is negative (reads all requested tail lines)

Examples:
- offset: 0, length: 10 → First 10 lines
- offset: 100, length: 5 → Lines 100-104
- offset: -20 → Last 20 lines
- offset: -5, length: 10 → Last 5 lines (length ignored)

Performance optimizations:
- Large files with negative offsets use reverse reading for efficiency
- Large files with deep positive offsets use byte estimation
- Small files use fast readline streaming

When reading from the file system, only works within allowed directories.
Can fetch content from URLs when isUrl parameter is set to true (URLs are always read in full regardless of offset/length).

FORMAT HANDLING (by extension):
- Text: Uses offset/length for line-based pagination
- Excel (.xlsx, .xls, .xlsm): Returns JSON 2D array
  * sheet: "Sheet1" (name) or "0" (index as string, 0-based)
  * range: ALWAYS use FROM:TO format (e.g., "A1:D100", "C1:C1", "B2:B50")
  * offset/length work as row pagination (optional fallback)
- Images (PNG, JPEG, GIF, WebP): Base64 encoded viewable content
- PDF: Extracts text content as markdown with page structure
  * offset/length work as page pagination (0-based)
  * Includes embedded images when available

IMPORTANT: Always use absolute paths for reliability. Paths are automatically normalized regardless of slash direction. Relative paths may fail as they depend on the current working directory. Tilde paths (~/...) might not work in all contexts. Unless the user explicitly asks for relative paths, use absolute paths.

This command can be referenced as "DC: ..." or "use Desktop Commander to ..." in your instructions.
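
The offset/length rules above can be illustrated with a minimal sketch. It assumes the file is already loaded as an array of lines; `selectLines` is a hypothetical helper written for this illustration and is not part of the tool, which streams from disk and applies its own optimizations.

```typescript
// Hypothetical helper: applies the documented offset/length rules to an
// in-memory array of lines. The real tool reads from disk instead.
function selectLines(lines: string[], offset = 0, length = 1000): string[] {
  if (offset < 0) {
    // Negative offset: tail behavior, `length` is ignored.
    const count = Math.min(-offset, lines.length);
    return lines.slice(lines.length - count);
  }
  // Non-negative offset: 0-based start line, at most `length` lines.
  return lines.slice(offset, offset + length);
}

const lines = Array.from({ length: 200 }, (_, i) => `line ${i}`);

console.log(selectLines(lines, 0, 10).length); // first 10 lines
console.log(selectLines(lines, 100, 5));       // lines 100-104
console.log(selectLines(lines, -20).length);   // last 20 lines
console.log(selectLines(lines, -5, 10));       // last 5 lines, length ignored
```

The sketch covers text files only. Per the format notes above, the same two parameters page over rows for Excel files and over pages for PDFs.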

Input Schema

Name     Required  Description  Default
path     Yes
isUrl    No
offset   No
length   No
sheet    No
range    No
options  No
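
As a rough illustration of how these parameters travel over the wire, the sketch below wraps them in a JSON-RPC 2.0 `tools/call` request, the message an MCP client sends to invoke a tool. The `ReadFileArgs` interface, the `buildReadFileRequest` helper, and both file paths are assumptions made for this example only.

```typescript
// Argument shape taken from the parameter list above (types are assumed).
interface ReadFileArgs {
  path: string;
  isUrl?: boolean;
  offset?: number;
  length?: number;
  sheet?: string;
  range?: string;
  options?: Record<string, unknown>;
}

interface ToolCallRequest {
  jsonrpc: string;
  id: number;
  method: string;
  params: { name: string; arguments: ReadFileArgs };
}

// Hypothetical helper: wraps arguments in an MCP "tools/call" request.
function buildReadFileRequest(id: number, args: ReadFileArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read_file", arguments: args },
  };
}

// Tail the last 20 lines of a (hypothetical) log file.
const tailRequest = buildReadFileRequest(1, { path: "/var/log/app.log", offset: -20 });

// Read a cell range from the first sheet of a (hypothetical) workbook.
const sheetRequest = buildReadFileRequest(2, {
  path: "/data/report.xlsx",
  sheet: "0",
  range: "A1:D100",
});

console.log(JSON.stringify(tailRequest, null, 2));
console.log(JSON.stringify(sheetRequest, null, 2));
```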

Implementation Reference

  • The primary handler function for the 'read_file' MCP tool. Validates input using ReadFileArgsSchema, applies configuration limits, constructs ReadOptions, calls the readFile helper, and formats output for different file types (PDF, images, text) with appropriate MCP content structures.
    export async function handleReadFile(args: unknown): Promise<ServerResult> {
      const HANDLER_TIMEOUT = 60000; // 60 seconds total operation timeout

      // Add input validation
      if (args === null || args === undefined) {
        return createErrorResponse('No arguments provided for read_file command');
      }

      const readFileOperation = async () => {
        const parsed = ReadFileArgsSchema.parse(args);

        // Get the configuration for file read limits
        const config = await configManager.getConfig();
        if (!config) {
          return createErrorResponse('Configuration not available');
        }

        const defaultLimit = config.fileReadLineLimit ?? 1000;

        // Convert sheet parameter: numeric strings become numbers for Excel index access
        let sheetParam: string | number | undefined = parsed.sheet;
        if (parsed.sheet !== undefined && /^\d+$/.test(parsed.sheet)) {
          sheetParam = parseInt(parsed.sheet, 10);
        }

        const options: ReadOptions = {
          isUrl: parsed.isUrl,
          offset: parsed.offset ?? 0,
          length: parsed.length ?? defaultLimit,
          sheet: sheetParam,
          range: parsed.range
        };
        const fileResult = await readFile(parsed.path, options);

        // Handle PDF files
        if (fileResult.metadata?.isPdf) {
          const meta = fileResult.metadata;
          const author = meta?.author ? `, Author: ${meta?.author}` : "";
          const title = meta?.title ? `, Title: ${meta?.title}` : "";
          const pdfContent = fileResult.metadata?.pages?.flatMap((p: any) => [
            ...(p.images?.map((image: any) => ({
              type: "image",
              data: image.data,
              mimeType: image.mimeType
            })) ?? []),
            {
              type: "text",
              text: `<!-- Page: ${p.pageNumber} -->\n${p.text}`,
            },
          ]) ?? [];

          return {
            content: [
              {
                type: "text",
                text: `PDF file: ${parsed.path}${author}${title} (${meta?.totalPages} pages) \n`
              },
              ...pdfContent
            ]
          };
        }

        // Handle image files
        if (fileResult.metadata?.isImage) {
          // For image files, return as an image content type
          // Content should already be base64-encoded string from handler
          const imageData = typeof fileResult.content === 'string'
            ? fileResult.content
            : fileResult.content.toString('base64');
          return {
            content: [
              {
                type: "text",
                text: `Image file: ${parsed.path} (${fileResult.mimeType})\n`
              },
              {
                type: "image",
                data: imageData,
                mimeType: fileResult.mimeType
              }
            ],
          };
        } else {
          // For all other files, return as text
          const textContent = typeof fileResult.content === 'string'
            ? fileResult.content
            : fileResult.content.toString('utf8');
          return {
            content: [{ type: "text", text: textContent }],
          };
        }
      };

      // Execute with timeout at the handler level
      const result = await withTimeout(
        readFileOperation(),
        HANDLER_TIMEOUT,
        'Read file handler operation',
        null
      );

      if (result == null) {
        // Handles the impossible case where withTimeout resolves to null instead of throwing
        throw new Error('Failed to read the file');
      }

      return result;
    }
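
The handler's sheet conversion can be tried in isolation. The sketch below lifts the same regular-expression check into a standalone function; the name `normalizeSheet` is hypothetical and does not appear in the source.

```typescript
// Mirrors the check in handleReadFile: a string made only of digits is
// treated as a 0-based sheet index, anything else as a sheet name.
// The function name is an assumption for this example.
function normalizeSheet(sheet?: string): string | number | undefined {
  if (sheet !== undefined && /^\d+$/.test(sheet)) {
    return parseInt(sheet, 10);
  }
  return sheet;
}

console.log(normalizeSheet("0"));       // numeric index
console.log(normalizeSheet("Sheet1"));  // name kept as a string
console.log(normalizeSheet("2024 Q1")); // not all digits, kept as a string
console.log(normalizeSheet(undefined)); // no sheet requested
```

One consequence of this rule is that a sheet whose name consists only of digits cannot be selected by name through this path, because the string is converted to an index first.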
  • src/server.ts:242-291 (registration)
    Registers the 'read_file' tool in the MCP server's list of available tools, including detailed description, input schema derived from ReadFileArgsSchema, and annotations.
    name: "read_file",
    description: `
      Read contents from files and URLs.
      Read PDF files and extract content as markdown and images.
      Prefer this over 'execute_command' with cat/type for viewing files.

      Supports partial file reading with:
      - 'offset' (start line, default: 0)
        * Positive: Start from line N (0-based indexing)
        * Negative: Read last N lines from end (tail behavior)
      - 'length' (max lines to read, default: configurable via 'fileReadLineLimit' setting, initially 1000)
        * Used with positive offsets for range reading
        * Ignored when offset is negative (reads all requested tail lines)

      Examples:
      - offset: 0, length: 10 → First 10 lines
      - offset: 100, length: 5 → Lines 100-104
      - offset: -20 → Last 20 lines
      - offset: -5, length: 10 → Last 5 lines (length ignored)

      Performance optimizations:
      - Large files with negative offsets use reverse reading for efficiency
      - Large files with deep positive offsets use byte estimation
      - Small files use fast readline streaming

      When reading from the file system, only works within allowed directories.
      Can fetch content from URLs when isUrl parameter is set to true
      (URLs are always read in full regardless of offset/length).

      FORMAT HANDLING (by extension):
      - Text: Uses offset/length for line-based pagination
      - Excel (.xlsx, .xls, .xlsm): Returns JSON 2D array
        * sheet: "Sheet1" (name) or "0" (index as string, 0-based)
        * range: ALWAYS use FROM:TO format (e.g., "A1:D100", "C1:C1", "B2:B50")
        * offset/length work as row pagination (optional fallback)
      - Images (PNG, JPEG, GIF, WebP): Base64 encoded viewable content
      - PDF: Extracts text content as markdown with page structure
        * offset/length work as page pagination (0-based)
        * Includes embedded images when available

      ${PATH_GUIDANCE}
      ${CMD_PREFIX_DESCRIPTION}`,
    inputSchema: zodToJsonSchema(ReadFileArgsSchema),
    annotations: {
      title: "Read File or URL",
      readOnlyHint: true,
      openWorldHint: true,
    },
    },
  • Zod schema defining the input parameters for the read_file tool: path (required), isUrl, offset, length, sheet, range, and options.
    export const ReadFileArgsSchema = z.object({
      path: z.string(),
      isUrl: z.boolean().optional().default(false),
      offset: z.number().optional().default(0),
      length: z.number().optional().default(1000),
      sheet: z.string().optional(), // String only for MCP client compatibility (Cursor doesn't support union types in JSON Schema)
      range: z.string().optional(),
      options: z.record(z.any()).optional()
    });
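
To show what the schema defaults mean for a caller, here is a dependency-free sketch that fills in the same three defaults by hand. It does not use Zod and performs no validation; `applySchemaDefaults` and both interfaces are hypothetical names introduced for this example.

```typescript
// Input as a client might send it: only `path` is required.
interface RawReadFileArgs {
  path: string;
  isUrl?: boolean;
  offset?: number;
  length?: number;
  sheet?: string;
  range?: string;
}

// Shape after defaults are applied: the three defaulted fields are always set.
interface ParsedReadFileArgs {
  path: string;
  isUrl: boolean;
  offset: number;
  length: number;
  sheet?: string;
  range?: string;
}

// Hypothetical helper reproducing the defaults declared in the schema above
// (isUrl: false, offset: 0, length: 1000).
function applySchemaDefaults(raw: RawReadFileArgs): ParsedReadFileArgs {
  return {
    ...raw,
    isUrl: raw.isUrl ?? false,
    offset: raw.offset ?? 0,
    length: raw.length ?? 1000,
  };
}

// The path is a placeholder for this example.
console.log(applySchemaDefaults({ path: "/tmp/notes.txt" }));
console.log(applySchemaDefaults({ path: "/tmp/notes.txt", offset: -20 }));
```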
  • Core helper function dispatched by the handler. Determines if the path is a URL or local file and calls the appropriate reader (readFileFromUrl or readFileFromDisk), passing read options.
    export async function readFile(
      filePath: string,
      options?: ReadOptions
    ): Promise<FileResult> {
      const { isUrl, offset, length, sheet, range } = options ?? {};
      return isUrl
        ? readFileFromUrl(filePath)
        : readFileFromDisk(filePath, { offset, length, sheet, range });
    }
  • Supporting helper that performs the actual disk read: path validation, file stats, handler selection, content reading with options, timeout protection, and content formatting.
    export async function readFileFromDisk(
      filePath: string,
      options?: ReadOptions
    ): Promise<FileResult> {
      const { offset = 0, sheet, range } = options ?? {};
      let { length } = options ?? {};

      // Add validation for required parameters
      if (!filePath || typeof filePath !== 'string') {
        throw new Error('Invalid file path provided');
      }

      // Get default length from config if not provided
      if (length === undefined) {
        length = await getDefaultReadLength();
      }

      const validPath = await validatePath(filePath);

      // Get file extension for telemetry
      const fileExtension = getFileExtension(validPath);

      // Check file size before attempting to read
      try {
        const stats = await fs.stat(validPath);

        // Capture file extension in telemetry without capturing the file path
        capture('server_read_file', {
          fileExtension: fileExtension,
          offset: offset,
          length: length,
          fileSize: stats.size
        });
      } catch (error) {
        console.error('error catch ' + error);
        const errorMessage = error instanceof Error ? error.message : String(error);
        capture('server_read_file_error', {
          error: errorMessage,
          fileExtension: fileExtension
        });
        // If we can't stat the file, continue anyway and let the read operation handle errors
      }

      // Use withTimeout to handle potential hangs
      const readOperation = async () => {
        // Get appropriate handler for this file type (async - includes binary detection)
        const handler = await getFileHandler(validPath);

        // Use handler to read the file
        const result = await handler.read(validPath, {
          offset,
          length,
          sheet,
          range,
          includeStatusMessage: true
        });

        // Return with content as string
        // For images: content is already base64-encoded string from handler
        // For text: content may be string or Buffer, convert to UTF-8 string
        let content: string;
        if (typeof result.content === 'string') {
          content = result.content;
        } else if (result.metadata?.isImage) {
          // Image buffer should be base64 encoded, not UTF-8 converted
          content = result.content.toString('base64');
        } else {
          content = result.content.toString('utf8');
        }

        return {
          content,
          mimeType: result.mimeType,
          metadata: result.metadata
        };
      };

      // Execute with timeout
      const result = await withTimeout(
        readOperation(),
        FILE_OPERATION_TIMEOUTS.FILE_READ,
        `Read file operation for ${filePath}`,
        null
      );

      if (result == null) {
        // Handles the impossible case where withTimeout resolves to null instead of throwing
        throw new Error('Failed to read the file');
      }

      return result;
    }
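
Both the handler and `readFileFromDisk` rely on a `withTimeout` helper whose implementation is not shown on this page. The sketch below is one plausible shape for such a helper, inferred only from the call sites (promise, timeout in milliseconds, operation name, default value) and from the comment that a `null` default is expected to throw rather than resolve. Treat it as an assumption; the project's real helper may behave differently.

```typescript
// Hypothetical stand-in for the project's withTimeout helper.
// On timeout it rejects when defaultValue is null, otherwise it resolves
// with defaultValue. This behavior is assumed, not taken from the source.
function withTimeout<T>(
  operation: Promise<T>,
  timeoutMs: number,
  operationName: string,
  defaultValue: T | null
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      if (defaultValue !== null) {
        resolve(defaultValue);
      } else {
        reject(new Error(`${operationName} timed out after ${timeoutMs} ms`));
      }
    }, timeoutMs);

    operation.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// A read that finishes in time resolves normally.
withTimeout(Promise.resolve("file contents"), 50, "Fast read", null)
  .then((value) => console.log(value));

// A read that takes too long is rejected with a timeout error.
const slowRead = new Promise<string>((resolve) => setTimeout(() => resolve("done"), 200));
withTimeout(slowRead, 50, "Slow read", null)
  .catch((error: Error) => console.log(error.message));
```

Note that a helper of this kind only stops waiting; it does not cancel the underlying read, which keeps running until it settles on its own.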

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wonderwhy-er/ClaudeComputerCommander'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.