smart_read
Extract specific code sections with line numbers from files. Saves context tokens by returning only relevant code instead of full file content.
Instructions
Surgical code extraction from files. Returns ONLY relevant code sections with line numbers — not analysis.
OUTPUT: Markdown with extracted code sections (verbatim, with line numbers), minimal annotations, file metadata, latency, token usage. Shows "Context saved" metric. Unlike analyze_file which returns prose analysis, smart_read returns actual code you can act on directly.
WHEN TO USE: When you need to read a file but only care about specific sections. Use instead of the Read tool when you have a specific intent like "find the auth logic", "show error handling", "extract the database schema". Especially valuable for large files (1000+ lines) where reading the whole file wastes context tokens. For general questions about a file, use analyze_file instead.
FAILURE MODES:
"File not found" → The path is wrong. Retry with the correct absolute path.
"Binary file detected" → Only text files are supported. Do not retry with this file.
"File too large" → The file exceeds 800K chars. Try a specific section.
"No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.
"No relevant sections found" → Try a broader query, or use analyze_file for general analysis.
"Model query failed" → Try a different model or check provider status with list_models.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute path to the file to read. The file is read server-side — it never enters your context window. | |
| query | Yes | What to find or extract from the file. Be specific: 'error handling logic', 'the authentication middleware', 'database connection setup', 'how routes are registered'. | |
| model | No | Model to use for extraction. Auto-picks a large-context model (Gemini 1M) if omitted. | |
| max_response_tokens | No | Maximum tokens in the response returned to you. If the extraction exceeds this, it will be distilled by a fast model to fit — preserving code sections while compressing annotations. Omit for no compression. | |
| max_tokens | No | Maximum tokens the extraction model generates (default: 2048, higher than analyze_file to accommodate complete code sections) | |
| format | No | Response format — 'brief' for token-efficient output, 'detailed' for full metadata | detailed |
| include_raw | No | When true and compression is active, include the original uncompressed extraction for quality comparison. Use this to verify distillation preserved code sections. | false |
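
A sketch of how the optional fields resolve when omitted, per the defaults in the table (the interface name and helper are illustrative assumptions, not the tool's actual exports; `model` and `max_response_tokens` have no default and stay undefined):

```typescript
// Illustrative shape of the input, mirroring the schema table above.
interface SmartReadArgs {
  file_path: string;               // required
  query: string;                   // required
  model?: string;                  // no default: auto-picked server-side
  max_response_tokens?: number;    // no default: omit for no compression
  max_tokens?: number;             // defaults to 2048
  format?: "brief" | "detailed";   // defaults to "detailed"
  include_raw?: boolean;           // defaults to false
}

// Hypothetical helper showing the documented default resolution.
function withDefaults(args: SmartReadArgs) {
  return {
    ...args,
    max_tokens: args.max_tokens ?? 2048,
    format: args.format ?? "detailed",
    include_raw: args.include_raw ?? false,
  };
}
```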
Implementation Reference
- src/tools/smart-read.ts:167-286 (handler): Main handler function `smartRead` that orchestrates the tool: validates file existence, checks for binary files, reads the file server-side, picks a large-context model, builds an extraction prompt, queries the model, optionally compresses the response, and formats the output.
```typescript
export async function smartRead(
  provider: Provider,
  input: SmartReadInput
): Promise<string> {
  const startTime = Date.now();

  // Step 1: Validate file exists
  if (!existsSync(input.file_path)) {
    return (
      `## Smart Read Failed\n\n` +
      `File not found: \`${input.file_path}\`\n\n` +
      `**Recovery:** Check the file path. Use an absolute path.`
    );
  }

  // Step 2: Check for binary files
  if (isBinaryFile(input.file_path)) {
    return (
      `## Smart Read Failed\n\n` +
      `Binary file detected: \`${input.file_path}\`\n\n` +
      `**Recovery:** Only text files are supported. Do not retry with this file.`
    );
  }

  // Step 3: Read the file server-side
  let fileContent: string;
  try {
    fileContent = readFileSync(input.file_path, "utf-8");
  } catch (err) {
    return (
      `## Smart Read Failed\n\n` +
      `Could not read file: \`${input.file_path}\`\n` +
      `Error: ${err instanceof Error ? err.message : String(err)}\n\n` +
      `**Recovery:** Check file permissions and ensure the path is correct.`
    );
  }

  // Step 4: Check file size
  if (fileContent.length > MAX_FILE_CHARS) {
    const sizeMB = (fileContent.length / 1_000_000).toFixed(1);
    return (
      `## Smart Read Failed\n\n` +
      `File too large: \`${input.file_path}\` (${sizeMB}M chars, limit: ${MAX_FILE_CHARS / 1_000_000}M)\n\n` +
      `**Recovery:** The file exceeds the 800K character limit. ` +
      `Try a specific section or ask the user to split the file.`
    );
  }

  const fileName = basename(input.file_path);
  const fileChars = fileContent.length;
  const fileLines = fileContent.split("\n").length;

  logger.info(
    `smart_read: ${fileName} (${fileLines} lines, ${fileChars} chars) — query: "${input.query}"`
  );

  // Step 5: Pick a large-context model
  const model = await pickLargeContextModel(provider, input.model);
  if (!model) {
    return (
      `## Smart Read Failed\n\n` +
      `No models available for file extraction.\n\n` +
      `**Recovery:** Start CLIProxyAPI or Ollama, then retry. ` +
      `Call list_models to verify provider status.`
    );
  }
  logger.info(`smart_read: using model ${model}`);

  // Step 6: Build extraction prompt
  const extractionPrompt = buildExtractionPrompt(
    input.file_path,
    fileContent,
    fileLines,
    input.query
  );

  // Step 7: Query the model
  let response: QueryResponse;
  try {
    response = await provider.query(model, extractionPrompt, {
      temperature: 0.1,
      max_tokens: input.max_tokens,
    });
  } catch (err) {
    return (
      `## Smart Read Failed\n\n` +
      `Model query failed: ${err instanceof Error ? err.message : String(err)}\n\n` +
      `**Recovery:** Try a different model or check provider status with list_models.`
    );
  }

  // Step 8: Compress if requested
  let compression: CompressionResult | undefined;
  if (input.max_response_tokens) {
    compression = await compressResponse(
      provider,
      response,
      input.max_response_tokens
    );
  }

  const totalMs = Date.now() - startTime;

  // Step 9: Format response
  return formatResponse(
    response,
    input.format ?? "detailed",
    compression,
    {
      fileName,
      filePath: input.file_path,
      fileLines,
      fileChars,
      totalMs,
      query: input.query,
    },
    input.include_raw ?? false
  );
}
```
- src/tools/smart-read.ts:33-86 (schema): Zod schema `smartReadSchema` defining the input shape: file_path (string), query (string), model (optional string), max_response_tokens (optional number), max_tokens (optional number, default 2048), format (optional enum 'brief'|'detailed'), include_raw (optional boolean).
```typescript
export const smartReadSchema = z.object({
  file_path: z
    .string()
    .describe(
      "Absolute path to the file to read. The file is read server-side — it never enters your context window."
    ),
  query: z
    .string()
    .describe(
      "What to find or extract from the file. Be specific: 'error handling logic', " +
        "'the authentication middleware', 'database connection setup', 'how routes are registered'."
    ),
  model: z
    .string()
    .optional()
    .describe(
      "Model to use for extraction. Auto-picks a large-context model (Gemini 1M) if omitted."
    ),
  max_response_tokens: z
    .number()
    .int()
    .positive()
    .optional()
    .describe(
      "Maximum tokens in the response returned to you. If the extraction exceeds this, " +
        "it will be distilled by a fast model to fit — preserving code sections while " +
        "compressing annotations. Omit for no compression."
    ),
  max_tokens: z
    .number()
    .int()
    .positive()
    .optional()
    .default(2048)
    .describe(
      "Maximum tokens the extraction model generates (default: 2048, higher than " +
        "analyze_file to accommodate complete code sections)"
    ),
  format: z
    .enum(["brief", "detailed"])
    .optional()
    .default("detailed")
    .describe(
      "Response format — 'brief' for token-efficient output, 'detailed' for full metadata"
    ),
  include_raw: z
    .boolean()
    .optional()
    .default(false)
    .describe(
      "When true and compression is active, include the original uncompressed extraction " +
        "for quality comparison. Use this to verify distillation preserved code sections."
    ),
});
```
- src/tools/smart-read.ts:301-375 (helper): Helper function `formatResponse` that formats the extraction output in either 'brief' or 'detailed' mode, including file metadata, context savings, compression info, and optional raw extraction.
```typescript
function formatResponse(
  response: QueryResponse,
  format: "brief" | "detailed",
  compression: CompressionResult | undefined,
  meta: FileMetadata,
  includeRaw: boolean
): string {
  const content = compression?.content ?? response.content;

  // Calculate context savings: tokens Claude would have burned reading the file
  const fileTokensEstimate = Math.ceil(meta.fileChars / 4);
  const responseTokens =
    compression?.compressedTokens ??
    response.usage?.completion_tokens ??
    Math.ceil(content.length / 4);
  const contextSaved = fileTokensEstimate - responseTokens;

  if (format === "brief") {
    const lines = [
      `**${meta.fileName}** → ${response.model} (${meta.totalMs}ms)`,
      "",
      content,
      "",
      `*Context saved: ~${contextSaved.toLocaleString()} tokens*`,
    ];
    if (compression?.compressed) {
      const saved =
        (compression.originalTokens ?? 0) - (compression.compressedTokens ?? 0);
      lines.push(
        `*Distilled by ${compression.compressorModel} — saved additional ${saved} tokens*`
      );
    }
    return lines.join("\n");
  }

  // Detailed format
  const lines = [
    `## Smart Read: ${meta.fileName}`,
    "",
    content,
    "",
    "---",
    `**File:** \`${meta.filePath}\` (${meta.fileLines} lines, ${meta.fileChars} chars)`,
    `**Query:** "${meta.query}"`,
    `**Model:** ${response.model} | **Latency:** ${response.latency_ms}ms | **Total:** ${meta.totalMs}ms`,
    `**Context saved:** ~${contextSaved.toLocaleString()} tokens (Claude got ${responseTokens.toLocaleString()} tokens instead of reading ${meta.fileChars.toLocaleString()} chars)`,
  ];

  if (response.usage) {
    lines.push(
      `**Tokens:** ${response.usage.prompt_tokens} in → ${response.usage.completion_tokens} out (${response.usage.total_tokens} total)`
    );
  }

  if (compression?.compressed) {
    const orig = compression.originalTokens ?? 0;
    const comp = compression.compressedTokens ?? 0;
    const saved = orig - comp;
    const pct = orig > 0 ? Math.round((saved / orig) * 100) : 0;
    lines.push(
      `**Distilled:** ${orig} → ${comp} tokens by ${compression.compressorModel} (${compression.compressorLatency}ms)`
    );
    lines.push(`**Saved:** ${saved} tokens (${pct}% smaller)`);
  }

  // Escape hatch: include raw uncompressed extraction
  if (includeRaw && compression?.compressed && compression.rawContent) {
    lines.push("");
    lines.push(
      `<details>\n<summary>Raw extraction (${compression.originalTokens ?? "?"} tokens, before distillation)</summary>\n\n${compression.rawContent}\n\n</details>`
    );
  }

  return lines.join("\n");
}
```
- src/tools/smart-read.ts:124-161 (helper): Helper function `buildExtractionPrompt` that constructs the prompt sent to the LLM, instructing it to extract verbatim code sections relevant to the query with line numbers and minimal annotations.
```typescript
function buildExtractionPrompt(
  filePath: string,
  fileContent: string,
  fileLines: number,
  query: string
): string {
  return `You are a surgical code reader. Given a file and a search query, extract ONLY the relevant sections.

File: ${filePath} (${fileLines} lines)

\`\`\`
${fileContent}
\`\`\`

Query: ${query}

RULES:
- Extract verbatim code sections relevant to the query
- For each section include: line range (e.g. "Lines 45-67"), the actual code with original indentation, and a one-line explanation of relevance
- Include 2-3 lines of surrounding context for each section
- Preserve code exactly as written — do not modify, summarize, or paraphrase code
- If no relevant sections found, state clearly: "No relevant sections found for: ${query}"
- Order sections by relevance (most relevant first)
- Keep annotations minimal — this is extraction, not analysis
- Output as markdown with fenced code blocks and line annotations

OUTPUT FORMAT:

### Lines 45-67: Brief description of what this section does
\`\`\`
[exact code from the file]
\`\`\`

### Lines 112-125: Brief description
\`\`\`
[exact code from the file]
\`\`\``;
}
```
- src/server.ts:255-285 (registration): Registration of the 'smart_read' tool on the MCP server using `server.tool(...)`, importing the schema and handler from src/tools/smart-read.ts.
```typescript
// --- smart_read ---
server.tool(
  "smart_read",
  `Surgical code extraction from files. Returns ONLY relevant code sections with line numbers — not analysis.

OUTPUT: Markdown with extracted code sections (verbatim, with line numbers), minimal annotations, file metadata, latency, token usage. Shows "Context saved" metric. Unlike analyze_file which returns prose analysis, smart_read returns actual code you can act on directly.

WHEN TO USE: When you need to read a file but only care about specific sections. Use instead of the Read tool when you have a specific intent like "find the auth logic", "show error handling", "extract the database schema". Especially valuable for large files (1000+ lines) where reading the whole file wastes context tokens. For general questions about a file, use analyze_file instead.

FAILURE MODES:
- "File not found" → The path is wrong. Retry with the correct absolute path.
- "Binary file detected" → Only text files are supported. Do not retry with this file.
- "File too large" → The file exceeds 800K chars. Try a specific section.
- "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.
- "No relevant sections found" → Try a broader query, or use analyze_file for general analysis.
- "Model query failed" → Try a different model or check provider status with list_models.`,
  smartReadSchema.shape,
  async (input) => {
    logger.info(`smart_read: ${input.file_path}`);
    try {
      const result = await smartRead(provider, input);
      return { content: [{ type: "text" as const, text: result }] };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      logger.error(`smart_read failed: ${message}`);
      return {
        content: [{ type: "text" as const, text: `Error: ${message}` }],
        isError: true,
      };
    }
  }
);
```
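
The "Context saved" metric computed in `formatResponse` above boils down to a rough 4-characters-per-token heuristic: estimate what reading the raw file would have cost, then subtract what the tool actually returned. A minimal standalone sketch of that arithmetic (the function name is illustrative, not an export of the tool):

```typescript
// Minimal sketch of the context-savings arithmetic used by formatResponse:
// estimate file tokens at ~4 chars per token, then subtract the tokens
// actually returned to the caller. Negative values would mean no savings.
function estimateContextSaved(fileChars: number, responseTokens: number): number {
  const fileTokensEstimate = Math.ceil(fileChars / 4);
  return fileTokensEstimate - responseTokens;
}
```

For example, a 100,000-character file answered with a 600-token extraction saves roughly 24,400 tokens of context.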