Glama

get_pdf_text

Extract text from PDF files using native text extraction with flexible page range selection. Specify pages or ranges to retrieve content from PDFs containing native text.

Instructions

Extract text from specific pages or page ranges of a PDF file using native text extraction. Page ranges use Python-style slice notation with 1-based page numbers and an inclusive end: '5' (single page), '5:10' (pages 5 through 10), '7:' (from page 7 to the end), ':5' (from the start through page 5). Use absolute_path for a file at any location, or relative_path for a file under the ~/pdf-agent/ directory. Note: this works best with PDFs that contain native text; scanned PDFs may yield limited results.
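
To make the range notation concrete, here is a minimal Python sketch of how one spec could be expanded into page numbers. It is an illustration of the semantics described above, not the server's actual code: the function name `parse_page_range`, the `total_pages` parameter, and the choice to silently drop out-of-range pages are all assumptions.

```python
from typing import List


def parse_page_range(spec: str, total_pages: int) -> List[int]:
    """Expand one page spec into 1-based page numbers (end is inclusive).

    Hypothetical helper: it mirrors the documented forms '5', '5:10',
    '7:' and ':5', but it is not taken from the pdf-agent-mcp source.
    """
    spec = spec.strip()
    if ":" not in spec:
        # Single page, e.g. '5'.
        pages = [int(spec)]
    else:
        start_text, end_text = spec.split(":", 1)
        # An empty side means "from the start" or "to the end".
        start = int(start_text) if start_text else 1
        end = int(end_text) if end_text else total_pages
        pages = list(range(start, end + 1))
    # Assumption: pages outside the document are dropped, not reported as errors.
    return [page for page in pages if 1 <= page <= total_pages]
```

Note that, unlike a real Python slice, the end of the range is included here because the tool documents '5:10' as pages 5 through 10.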

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| absolute_path | No | Absolute path to the PDF file (e.g., '/Users/john/documents/report.pdf') | |
| relative_path | No | Path relative to ~/pdf-agent/ directory (e.g., 'reports/annual.pdf') | |
| use_pdf_home | No | Use PDF agent home directory for relative paths (default: true) | |
| page_range | No | Page range in enhanced Python-style format: '5' (page 5), '5:10' (pages 5-10), '7:' (page 7 to end), ':5' (start to page 5). Also supports comma-separated combinations: '1,3:5,7' (pages 1, 3-5, and 7), '1-3,7,10:' (pages 1-3, 7, and 10 to end). Default: '1:' (all pages) | 1: |
| extraction_strategy | No | Text extraction strategy: 'hybrid' (enhanced native extraction with better error handling), 'native' (standard PDF.js extraction). Default: 'hybrid' | hybrid |
| preserve_formatting | No | Preserve text formatting and spacing (default: true) | |
| line_breaks | No | Preserve line breaks in extracted text (default: true) | |
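
As a rough illustration of how these parameters fit together, the following Python sketch builds an MCP `tools/call` JSON-RPC request for this tool. The helper name `build_get_pdf_text_call` is hypothetical, and the rule that exactly one of `absolute_path` or `relative_path` must be given is an assumption drawn from the instructions, not a documented server check.

```python
import json
from typing import Any, Dict, Optional


def build_get_pdf_text_call(
    request_id: int,
    *,
    absolute_path: Optional[str] = None,
    relative_path: Optional[str] = None,
    page_range: str = "1:",
    extraction_strategy: str = "hybrid",
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that calls the get_pdf_text tool."""
    # Assumption: the caller supplies exactly one of the two path arguments.
    if (absolute_path is None) == (relative_path is None):
        raise ValueError("provide exactly one of absolute_path or relative_path")
    # The schema lists only these two strategies.
    if extraction_strategy not in ("hybrid", "native"):
        raise ValueError("extraction_strategy must be 'hybrid' or 'native'")

    arguments: Dict[str, Any] = {
        "page_range": page_range,
        "extraction_strategy": extraction_strategy,
    }
    if absolute_path is not None:
        arguments["absolute_path"] = absolute_path
    else:
        arguments["relative_path"] = relative_path

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_pdf_text", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_get_pdf_text_call(
        1, relative_path="reports/annual.pdf", page_range="1,3:5,7"
    )
    print(json.dumps(request, indent=2))
```

How the request is delivered (stdio, HTTP, or an MCP client library) depends on how the server is run and is left out of this sketch.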

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vlad-ds/pdf-agent-mcp'
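
The same request could be made from Python with only the standard library. This is a sketch under assumptions: the helper names `build_request` and `fetch_server_info` are hypothetical, and the code assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request
from typing import Any

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/vlad-ds/pdf-agent-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Create the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(json.dumps(fetch_server_info(), indent=2))
```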

If you have feedback or need assistance with the MCP directory API, please join our Discord server.