smart_read

Extract specific code sections with line numbers from files. Saves context tokens by returning only relevant code instead of full file content.

Instructions

Surgical code extraction from files. Returns ONLY relevant code sections with line numbers — not analysis.

OUTPUT: Markdown with extracted code sections (verbatim, with line numbers), minimal annotations, file metadata, latency, token usage. Shows "Context saved" metric. Unlike analyze_file which returns prose analysis, smart_read returns actual code you can act on directly.

WHEN TO USE: When you need to read a file but only care about specific sections. Use instead of the Read tool when you have a specific intent like "find the auth logic", "show error handling", "extract the database schema". Especially valuable for large files (1000+ lines) where reading the whole file wastes context tokens. For general questions about a file, use analyze_file instead.

FAILURE MODES:

  • "File not found" → The path is wrong. Retry with the correct absolute path.

  • "Binary file detected" → Only text files are supported. Do not retry with this file.

  • "File too large" → The file exceeds 800K chars. Try a specific section.

  • "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.

  • "No relevant sections found" → Try a broader query, or use analyze_file for general analysis.

  • "Model query failed" → Try a different model or check provider status with list_models.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `file_path` | Yes | Absolute path to the file to read. The file is read server-side — it never enters your context window. | |
| `query` | Yes | What to find or extract from the file. Be specific: 'error handling logic', 'the authentication middleware', 'database connection setup', 'how routes are registered'. | |
| `model` | No | Model to use for extraction. Auto-picks a large-context model (Gemini 1M) if omitted. | |
| `max_response_tokens` | No | Maximum tokens in the response returned to you. If the extraction exceeds this, it will be distilled by a fast model to fit — preserving code sections while compressing annotations. Omit for no compression. | |
| `max_tokens` | No | Maximum tokens the extraction model generates (higher than analyze_file to accommodate complete code sections). | 2048 |
| `format` | No | Response format — 'brief' for token-efficient output, 'detailed' for full metadata. | detailed |
| `include_raw` | No | When true and compression is active, include the original uncompressed extraction for quality comparison. Use this to verify distillation preserved code sections. | false |
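As a sketch, a typical invocation payload might look like the following (the path and values are illustrative, not taken from the source):

```json
{
  "file_path": "/home/user/project/src/auth.ts",
  "query": "the authentication middleware",
  "format": "brief",
  "max_response_tokens": 800
}
```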

Implementation Reference

  • Main handler function `smartRead` that orchestrates the tool: validates file existence, checks binary, reads file server-side, picks a large-context model, builds an extraction prompt, queries the model, optionally compresses the response, and formats the output.
    import { existsSync, readFileSync } from "node:fs";
    import { basename } from "node:path";

    export async function smartRead(
      provider: Provider,
      input: SmartReadInput
    ): Promise<string> {
      const startTime = Date.now();
    
      // Step 1: Validate file exists
      if (!existsSync(input.file_path)) {
        return (
          `## Smart Read Failed\n\n` +
          `File not found: \`${input.file_path}\`\n\n` +
          `**Recovery:** Check the file path. Use an absolute path.`
        );
      }
    
      // Step 2: Check for binary files
      if (isBinaryFile(input.file_path)) {
        return (
          `## Smart Read Failed\n\n` +
          `Binary file detected: \`${input.file_path}\`\n\n` +
          `**Recovery:** Only text files are supported. Do not retry with this file.`
        );
      }
    
      // Step 3: Read the file server-side
      let fileContent: string;
      try {
        fileContent = readFileSync(input.file_path, "utf-8");
      } catch (err) {
        return (
          `## Smart Read Failed\n\n` +
          `Could not read file: \`${input.file_path}\`\n` +
          `Error: ${err instanceof Error ? err.message : String(err)}\n\n` +
          `**Recovery:** Check file permissions and ensure the path is correct.`
        );
      }
    
      // Step 4: Check file size
      if (fileContent.length > MAX_FILE_CHARS) {
        const sizeMB = (fileContent.length / 1_000_000).toFixed(1);
        return (
          `## Smart Read Failed\n\n` +
          `File too large: \`${input.file_path}\` (${sizeMB}M chars, limit: ${MAX_FILE_CHARS / 1_000_000}M)\n\n` +
          `**Recovery:** The file exceeds the 800K character limit. ` +
          `Try a specific section or ask the user to split the file.`
        );
      }
    
      const fileName = basename(input.file_path);
      const fileChars = fileContent.length;
      const fileLines = fileContent.split("\n").length;
    
      logger.info(
        `smart_read: ${fileName} (${fileLines} lines, ${fileChars} chars) — query: "${input.query}"`
      );
    
      // Step 5: Pick a large-context model
      const model = await pickLargeContextModel(provider, input.model);
      if (!model) {
        return (
          `## Smart Read Failed\n\n` +
          `No models available for file extraction.\n\n` +
          `**Recovery:** Start CLIProxyAPI or Ollama, then retry. ` +
          `Call list_models to verify provider status.`
        );
      }
    
      logger.info(`smart_read: using model ${model}`);
    
      // Step 6: Build extraction prompt
      const extractionPrompt = buildExtractionPrompt(
        input.file_path,
        fileContent,
        fileLines,
        input.query
      );
    
      // Step 7: Query the model
      let response: QueryResponse;
      try {
        response = await provider.query(model, extractionPrompt, {
          temperature: 0.1,
          max_tokens: input.max_tokens,
        });
      } catch (err) {
        return (
          `## Smart Read Failed\n\n` +
          `Model query failed: ${err instanceof Error ? err.message : String(err)}\n\n` +
          `**Recovery:** Try a different model or check provider status with list_models.`
        );
      }
    
      // Step 8: Compress if requested
      let compression: CompressionResult | undefined;
      if (input.max_response_tokens) {
        compression = await compressResponse(
          provider,
          response,
          input.max_response_tokens
        );
      }
    
      const totalMs = Date.now() - startTime;
    
      // Step 9: Format response
      return formatResponse(
        response,
        input.format ?? "detailed",
        compression,
        {
          fileName,
          filePath: input.file_path,
          fileLines,
          fileChars,
          totalMs,
          query: input.query,
        },
        input.include_raw ?? false
      );
    }
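The `isBinaryFile` helper used in Step 2 is referenced but not shown. A minimal sketch, assuming the common null-byte heuristic over the first 8 KB of the file (the actual implementation may differ):

```typescript
import { openSync, readSync, closeSync } from "node:fs";

// Heuristic: treat the content as binary if it contains a NUL byte.
export function bufferLooksBinary(buf: Buffer): boolean {
  return buf.includes(0);
}

// Read at most the first 8 KB and apply the heuristic (hypothetical helper,
// not the project's actual code).
export function isBinaryFile(path: string): boolean {
  const fd = openSync(path, "r");
  try {
    const buf = Buffer.alloc(8192);
    const bytesRead = readSync(fd, buf, 0, buf.length, 0);
    return bufferLooksBinary(buf.subarray(0, bytesRead));
  } finally {
    closeSync(fd);
  }
}
```

The NUL-byte check is a common, cheap approximation; it misclassifies some UTF-16 text files as binary, which is usually acceptable for a tool that only extracts from source code.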
  • Zod schema `smartReadSchema` defining the input shape: file_path (string), query (string), model (optional string), max_response_tokens (optional number), max_tokens (optional number, default 2048), format (optional enum 'brief'|'detailed'), include_raw (optional boolean).
    export const smartReadSchema = z.object({
      file_path: z
        .string()
        .describe(
          "Absolute path to the file to read. The file is read server-side — it never enters your context window."
        ),
      query: z
        .string()
        .describe(
          "What to find or extract from the file. Be specific: 'error handling logic', " +
          "'the authentication middleware', 'database connection setup', 'how routes are registered'."
        ),
      model: z
        .string()
        .optional()
        .describe(
          "Model to use for extraction. Auto-picks a large-context model (Gemini 1M) if omitted."
        ),
      max_response_tokens: z
        .number()
        .int()
        .positive()
        .optional()
        .describe(
          "Maximum tokens in the response returned to you. If the extraction exceeds this, " +
          "it will be distilled by a fast model to fit — preserving code sections while " +
          "compressing annotations. Omit for no compression."
        ),
      max_tokens: z
        .number()
        .int()
        .positive()
        .optional()
        .default(2048)
        .describe(
          "Maximum tokens the extraction model generates (default: 2048, higher than " +
          "analyze_file to accommodate complete code sections)"
        ),
      format: z
        .enum(["brief", "detailed"])
        .optional()
        .default("detailed")
        .describe(
          "Response format — 'brief' for token-efficient output, 'detailed' for full metadata"
        ),
      include_raw: z
        .boolean()
        .optional()
        .default(false)
        .describe(
          "When true and compression is active, include the original uncompressed extraction " +
          "for quality comparison. Use this to verify distillation preserved code sections."
        ),
    });
  • Helper function `formatResponse` that formats the extraction output in either 'brief' or 'detailed' mode, including file metadata, context savings, compression info, and optional raw extraction.
    function formatResponse(
      response: QueryResponse,
      format: "brief" | "detailed",
      compression: CompressionResult | undefined,
      meta: FileMetadata,
      includeRaw: boolean
    ): string {
      const content = compression?.content ?? response.content;
    
      // Calculate context savings: tokens Claude would have burned reading the file
      const fileTokensEstimate = Math.ceil(meta.fileChars / 4);
      const responseTokens =
        compression?.compressedTokens ??
        response.usage?.completion_tokens ??
        Math.ceil(content.length / 4);
      const contextSaved = fileTokensEstimate - responseTokens;
    
      if (format === "brief") {
        const lines = [
          `**${meta.fileName}** → ${response.model} (${meta.totalMs}ms)`,
          "",
          content,
          "",
          `*Context saved: ~${contextSaved.toLocaleString()} tokens*`,
        ];
        if (compression?.compressed) {
          const saved = (compression.originalTokens ?? 0) - (compression.compressedTokens ?? 0);
          lines.push(
            `*Distilled by ${compression.compressorModel} — saved additional ${saved} tokens*`
          );
        }
        return lines.join("\n");
      }
    
      // Detailed format
      const lines = [
        `## Smart Read: ${meta.fileName}`,
        "",
        content,
        "",
        "---",
        `**File:** \`${meta.filePath}\` (${meta.fileLines} lines, ${meta.fileChars} chars)`,
        `**Query:** "${meta.query}"`,
        `**Model:** ${response.model} | **Latency:** ${response.latency_ms}ms | **Total:** ${meta.totalMs}ms`,
        `**Context saved:** ~${contextSaved.toLocaleString()} tokens (Claude got ${responseTokens.toLocaleString()} tokens instead of reading ${meta.fileChars.toLocaleString()} chars)`,
      ];
    
      if (response.usage) {
        lines.push(
          `**Tokens:** ${response.usage.prompt_tokens} in → ${response.usage.completion_tokens} out (${response.usage.total_tokens} total)`
        );
      }
    
      if (compression?.compressed) {
        const orig = compression.originalTokens ?? 0;
        const comp = compression.compressedTokens ?? 0;
        const saved = orig - comp;
        const pct = orig > 0 ? Math.round((saved / orig) * 100) : 0;
    
        lines.push(
          `**Distilled:** ${orig} → ${comp} tokens by ${compression.compressorModel} (${compression.compressorLatency}ms)`
        );
        lines.push(`**Saved:** ${saved} tokens (${pct}% smaller)`);
      }
    
      // Escape hatch: include raw uncompressed extraction
      if (includeRaw && compression?.compressed && compression.rawContent) {
        lines.push("");
        lines.push(
          `<details>\n<summary>Raw extraction (${compression.originalTokens ?? "?"} tokens, before distillation)</summary>\n\n${compression.rawContent}\n\n</details>`
        );
      }
    
      return lines.join("\n");
    }
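The "Context saved" arithmetic from `formatResponse` can be isolated into two small functions, using the same ~4-characters-per-token estimate:

```typescript
// Rough token estimate used for the "Context saved" metric (~4 chars/token).
export function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4);
}

// Tokens the caller avoided spending by not reading the whole file.
export function contextSaved(fileChars: number, responseTokens: number): number {
  return estimateTokens(fileChars) - responseTokens;
}
```

For a 100,000-character file answered with 500 response tokens, the metric reports roughly 24,500 tokens saved.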
  • Helper function `buildExtractionPrompt` that constructs the prompt sent to the LLM, instructing it to extract verbatim code sections relevant to the query with line numbers and minimal annotations.
    function buildExtractionPrompt(
      filePath: string,
      fileContent: string,
      fileLines: number,
      query: string
    ): string {
      return `You are a surgical code reader. Given a file and a search query, extract ONLY the relevant sections.
    
    File: ${filePath} (${fileLines} lines)
    
    \`\`\`
    ${fileContent}
    \`\`\`
    
    Query: ${query}
    
    RULES:
    - Extract verbatim code sections relevant to the query
    - For each section include: line range (e.g. "Lines 45-67"), the actual code with original indentation, and a one-line explanation of relevance
    - Include 2-3 lines of surrounding context for each section
    - Preserve code exactly as written — do not modify, summarize, or paraphrase code
    - If no relevant sections found, state clearly: "No relevant sections found for: ${query}"
    - Order sections by relevance (most relevant first)
    - Keep annotations minimal — this is extraction, not analysis
    - Output as markdown with fenced code blocks and line annotations
    
    OUTPUT FORMAT:
    
    ### Lines 45-67: Brief description of what this section does
    \`\`\`
    [exact code from the file]
    \`\`\`
    
    ### Lines 112-125: Brief description
    \`\`\`
    [exact code from the file]
    \`\`\``;
    }
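The `pickLargeContextModel` helper called in Step 5 is also not shown. One plausible sketch of the selection logic, written as a pure function over model names (the preference order and name matching are assumptions, not the project's actual code):

```typescript
// Prefer an explicitly requested model if it is available; otherwise pick the
// first model whose name suggests a large context window (e.g. Gemini),
// falling back to any available model.
export function pickLargeContext(
  available: string[],
  requested?: string
): string | undefined {
  if (requested && available.includes(requested)) return requested;
  const large = available.find((name) => /gemini/i.test(name));
  return large ?? available[0];
}
```

Returning `undefined` when no models are available maps to the "No models available" failure mode described earlier.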
  • src/server.ts:255-285 (registration)
    Registration of the 'smart_read' tool on the MCP server using `server.tool(...)`, importing the schema and handler from src/tools/smart-read.ts.
      // --- smart_read ---
      server.tool(
        "smart_read",
        `Surgical code extraction from files. Returns ONLY relevant code sections with line numbers — not analysis.
    
    OUTPUT: Markdown with extracted code sections (verbatim, with line numbers), minimal annotations, file metadata, latency, token usage. Shows "Context saved" metric. Unlike analyze_file which returns prose analysis, smart_read returns actual code you can act on directly.
    
    WHEN TO USE: When you need to read a file but only care about specific sections. Use instead of the Read tool when you have a specific intent like "find the auth logic", "show error handling", "extract the database schema". Especially valuable for large files (1000+ lines) where reading the whole file wastes context tokens. For general questions about a file, use analyze_file instead.
    
    FAILURE MODES:
    - "File not found" → The path is wrong. Retry with the correct absolute path.
    - "Binary file detected" → Only text files are supported. Do not retry with this file.
    - "File too large" → The file exceeds 800K chars. Try a specific section.
    - "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.
    - "No relevant sections found" → Try a broader query, or use analyze_file for general analysis.
    - "Model query failed" → Try a different model or check provider status with list_models.`,
        smartReadSchema.shape,
        async (input) => {
          logger.info(`smart_read: ${input.file_path}`);
          try {
            const result = await smartRead(provider, input);
            return { content: [{ type: "text" as const, text: result }] };
          } catch (err) {
            const message = err instanceof Error ? err.message : String(err);
            logger.error(`smart_read failed: ${message}`);
            return {
              content: [{ type: "text" as const, text: `Error: ${message}` }],
              isError: true,
            };
          }
        }
      );
Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite having no structured annotations, the description fully discloses behavior: it reads files server-side without entering context, lists the output format (Markdown with line numbers), includes failure modes with explanations, and notes the auto-selection of a large-context model. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections: purpose, output format, when to use, and failure modes. It is front-loaded with the core verb and differentiator, and every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, no output schema, multiple failure modes), the description is remarkably complete. It covers output format, parameter behaviors, failure modes, and usage context. All necessary information for correct invocation is present.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the description adds significant value beyond the schema. For example, it explains the auto-pick behavior for the 'model' parameter, distillation behavior for 'max_response_tokens', and the purpose of 'include_raw' for quality comparison.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Surgical code extraction from files' returning only relevant code sections with line numbers. It explicitly differentiates from the sibling tool 'analyze_file' by stating it returns actual code not prose analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The 'WHEN TO USE' section provides explicit guidance: use instead of the Read tool for specific intents, especially for large files. It also specifies when not to use: for general questions, use 'analyze_file' instead. This covers usage context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
