read_pdf_text

Extract and convert PDF document text to markdown format for AI processing. This tool reads PDF content and returns clean text from all pages, simplifying document analysis for agents.

Instructions

Read a PDF document and return only the markdown text content from all pages.

This is a simpler alternative to read_pdf that returns just the text content
without the full OCR metadata, which can be easier for agents to process.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| absolute_path | Yes | | |

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |
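FastMCP derives these schemas from the handler's type hints (`absolute_path: str` in, `str` out). The following is a hypothetical reconstruction of what the generated JSON Schemas likely look like, expressed as Python dicts; the exact output depends on the FastMCP version, so treat it as an illustration rather than the server's actual schemas:

```python
# Hypothetical sketch of the schemas FastMCP would derive from
# `def read_pdf_text(absolute_path: str) -> str`.
input_schema = {
    "type": "object",
    "properties": {
        "absolute_path": {"type": "string"},
    },
    "required": ["absolute_path"],
}

output_schema = {
    "type": "object",
    "properties": {
        "result": {"type": "string"},
    },
    "required": ["result"],
}
```

Note that neither parameter carries a schema-level description, which is why the review below treats the tool description as the sole source of parameter documentation.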

Implementation Reference

  • main.py:128-146 (handler)
    The handler function for the 'read_pdf_text' tool, registered via @mcp.tool(). It invokes Lizeur to OCR the PDF, then extracts and concatenates the markdown text from all pages.
    @mcp.tool()
    def read_pdf_text(absolute_path: str) -> str:
        """Read a PDF document and return only the markdown text content from all pages.
    
        This is a simpler alternative to read_pdf that returns just the text content
        without the full OCR metadata, which can be easier for agents to process.
        """
        ocr_response = Lizeur().read_document(Path(absolute_path))
        if ocr_response is None:
            return "Error: Failed to process document"
    
        # Combine all pages' markdown content
        all_text = []
        for i, page in enumerate(ocr_response.pages):
            if hasattr(page, "markdown") and page.markdown:
                all_text.append(f"--- Page {i+1} ---\n{page.markdown}")
    
        return "\n\n".join(all_text) if all_text else "No text content found"
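The page-joining loop above can be exercised in isolation. This sketch uses a hypothetical `Page` dataclass in place of Mistral's page model (only the `markdown` field the handler reads is modeled) and mirrors the handler's logic: pages with empty markdown are skipped, but the 1-based page numbering still follows the original index:

```python
from dataclasses import dataclass

@dataclass
class Page:
    # Stand-in for the Mistral OCR page model; only the field
    # the handler reads is modeled here.
    markdown: str

def join_pages(pages):
    # Mirrors the loop in read_pdf_text: keep pages with non-empty
    # markdown, prefix each with a separator using its original
    # 1-based index, and join with blank lines.
    all_text = []
    for i, page in enumerate(pages):
        if getattr(page, "markdown", None):
            all_text.append(f"--- Page {i+1} ---\n{page.markdown}")
    return "\n\n".join(all_text) if all_text else "No text content found"

result = join_pages([Page("# Title"), Page(""), Page("Body text")])
```

A consequence worth noting: a blank page leaves a gap in the emitted page numbers (here the output jumps from "Page 1" to "Page 3"), so agents should not assume the separators are consecutive.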
  • main.py:33-64 (helper)
    Helper method on the Lizeur class that returns the OCR response for a document path, serving it from a JSON cache when available and caching fresh results otherwise; used by the read_pdf_text handler.
    def read_document(self, path: Path) -> OCRResponse | None:
        """Read a document and return the OCRResponse."""
        logging.info(f"read_document: Reading document {path.name}")
        # Check if the document is already cached
        cached_document_path = self.cache_path / path.name
        if cached_document_path.exists():
            logging.info(f"read_document: Document {path.name} is already cached.")
            try:
                with open(cached_document_path, "r") as f:
                    cached_json = f.read()
                    # Parse JSON and reconstruct OCRResponse
                    cached_data = json.loads(cached_json)
                    return OCRResponse.model_validate(cached_data)
            except (json.JSONDecodeError, ValueError) as e:
                logging.warning(f"Failed to load cached document {path.name}: {e}")
                # Remove corrupted cache file
                cached_document_path.unlink(missing_ok=True)
    
        # OCR the document
        ocr_response = self._ocr_document(path)
        if ocr_response is None:
            return None
    
        # Cache the document using model_dump_json() for direct JSON serialization
        try:
            with open(cached_document_path, "w") as f:
                f.write(ocr_response.model_dump_json(indent=2))
            logging.info(f"Successfully cached document {path.name}")
        except Exception as e:
            logging.error(f"Failed to cache document {path.name}: {e}")
    
        return ocr_response
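One caveat visible in the snippet above: the cache key is `path.name`, not the full path, so two different PDFs that happen to share a file name map to the same cache entry and the second one read would be served the first one's cached OCR result. A minimal illustration (the cache directory and file paths here are hypothetical):

```python
from pathlib import Path

cache_path = Path("/tmp/lizeur-cache")  # hypothetical cache directory

# Two distinct documents collide because only the file name
# (path.name) is used as the cache key.
a = cache_path / Path("/reports/2023/summary.pdf").name
b = cache_path / Path("/reports/2024/summary.pdf").name
# a and b are the same cache entry: /tmp/lizeur-cache/summary.pdf
```

Keying the cache on a content hash (or the full resolved path) would avoid this collision, at the cost of hashing the file on every call.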
  • Private helper method in the Lizeur class that performs the actual OCR via the Mistral AI API, including file upload, processing, and cleanup of the uploaded file.
    def _ocr_document(self, path: Path) -> OCRResponse | None:
        """OCR a document and return the OCRResponse."""
        try:
            # Upload the file to MistralAI
            uploaded_file = self.mistral.files.upload(
                file={
                    "file_name": path.stem,
                    "content": path.read_bytes(),
                },
                purpose="ocr",
            )
    
            # Process the uploaded file with OCR
            ocr_response = self.mistral.ocr.process(
                document={
                    "type": "file",
                    "file_id": uploaded_file.id,
                },
                model="mistral-ocr-latest",
                include_image_base64=True,
            )
    
            # Clean up the uploaded file
            try:
                self.mistral.files.delete(uploaded_file.id)
            except Exception as e:
                logging.warning(
                    f"Failed to delete uploaded file {uploaded_file.id}: {e}"
                )
    
            return ocr_response
    
        except Exception as e:
            logging.error(f"OCR processing failed for {path}: {e}")
            return None
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the tool's behavior by stating it reads PDFs and returns markdown text, but lacks details on error handling, performance, or limitations (e.g., file size, supported PDF formats). The description adds some value but doesn't fully cover behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by a concise explanation of its advantage over the sibling tool. Every sentence adds value without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (which handles return values), no annotations, and low complexity, the description is mostly complete. It clearly states the purpose, usage guidelines, and output type. However, it could benefit from more behavioral details (e.g., error cases) to be fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate. It doesn't explicitly mention the 'absolute_path' parameter, but implies it by referring to reading 'a PDF document.' Since there's only one parameter, the baseline is 4, as the description provides enough context to infer the parameter's purpose without detailed semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Read') and resource ('a PDF document'), specifies the output ('markdown text content from all pages'), and explicitly distinguishes it from its sibling tool ('read_pdf') by noting it's a simpler alternative that returns just text without full OCR metadata. This provides specific differentiation and purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool vs. its alternative: 'This is a simpler alternative to read_pdf that returns just the text content without the full OCR metadata, which can be easier for agents to process.' It clearly defines the context for choosing this tool over its sibling.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/SilverBzH/lizeur'
