doc_read
Read a document from a given file path and return its extracted text through TOOL4LM's MCP server. Plain-text files are decoded as UTF-8, PDF files are parsed to text, and the path must resolve inside the configured sandbox directory.
Instructions
Alias of doc.read
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | | |
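
As a rough illustration of the input shape in the table above, the following sketch checks a candidate arguments object on the client side before the tool is called. The names `DocReadArgs` and `isDocReadArgs` are hypothetical and not part of TOOL4LM, and rejecting an empty string is an extra assumption beyond what the table states.

```typescript
// Hypothetical client-side mirror of the doc_read input: one required string field.
interface DocReadArgs {
  path: string;
}

// Type guard: true when `value` is a non-null object whose `path` is a non-empty string.
// Rejecting the empty string is an assumption made for this sketch only.
function isDocReadArgs(value: unknown): value is DocReadArgs {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = (value as Record<string, unknown>).path;
  return typeof candidate === 'string' && candidate.length > 0;
}
```

A caller could run `isDocReadArgs(args)` before sending the request and report a missing or non-string `path` locally instead of waiting for the server to reject it.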
Implementation Reference
- src/tools/doc.ts:101-110 (handler): Core handler function for reading document files (text or PDF) from the sandbox directory. It resolves the path, checks the sandbox boundary, reads the file, and extracts the text, parsing PDFs when needed.

      export async function docRead(p: string) {
        const full = path.resolve(p);
        if (!full.startsWith(CONFIG.sandboxDir)) throw new Error('Access outside sandbox is not allowed');
        const buf = await fs.readFile(full);
        let text = '';
        if (/\.pdf$/i.test(full)) {
          try {
            const parsed = await pdfParseLazy(buf as unknown as Buffer);
            text = parsed.text || '';
          } catch {
            text = '';
          }
        } else {
          text = buf.toString('utf-8');
        }
        return { path: full, text };
      }

- src/server.ts:133-139 (registration): Registration of the 'doc_read' MCP tool. A thin wrapper around the docRead handler that formats the response for the MCP protocol.

      server.tool('doc_read', 'Alias of doc.read', docReadShape, OPEN, async ({ path }) => {
        const res = await docRead(path);
        return { content: [{ type: 'text', text: JSON.stringify(res) }] };
      });

- src/server.ts:125-125 (schema): Zod schema for the doc_read input; requires a 'path' string.

      const docReadShape = { path: z.string() };

- src/tools/doc.ts:10-21 (helper): Lazy-loading PDF parser helper used by docRead for PDF text extraction.

      async function pdfParseLazy(buf: Buffer): Promise<{ text: string }> {
        try {
          if (!_pdfParse) {
            const mod = await import('pdf-parse');
            _pdfParse = (mod as any).default || (mod as any);
          }
          const out = await _pdfParse(buf);
          return { text: String(out.text || '') };
        } catch {
          return { text: '' };
        }
      }
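
The handler's sandbox check compares string prefixes with `startsWith`. Unless the configured directory ends with a path separator, a prefix comparison also accepts a sibling directory whose name merely begins with the sandbox path, such as `/srv/sandbox-evil` against `/srv/sandbox`. The sketch below shows one possible stricter check built on Node's `path.relative`. The helper `isInsideSandbox` and the example paths are hypothetical and not part of TOOL4LM, and the sketch does not resolve symbolic links.

```typescript
import * as path from 'node:path';

// Hypothetical helper, not part of TOOL4LM: returns true only when `candidate`
// resolves to `sandboxDir` itself or to a location beneath it.
function isInsideSandbox(sandboxDir: string, candidate: string): boolean {
  const root = path.resolve(sandboxDir);
  const full = path.resolve(candidate);
  const rel = path.relative(root, full);
  // A path that leaves the sandbox produces a relative path that is '..',
  // starts with '..' plus a separator, or (across Windows drives) is absolute.
  const escapes =
    rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  return !escapes;
}
```

With this helper, `isInsideSandbox('/srv/sandbox', '/srv/sandbox-evil/a.txt')` is rejected, whereas a plain prefix test on the same two strings would pass. Calling `fs.realpath` on both paths first would be one way to cover symbolic links as well.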