Glama

convert_pdf_url

Convert PDF files from URLs to structured Markdown format using OCR technology, preserving document structure and extracting images for easy editing and sharing.

Instructions

Convert a PDF from a URL to Markdown. The output is saved in the directory specified by --output-dir.

Args:
    url: A single PDF URL or multiple URLs separated by spaces, commas, or newlines.

Returns:
    A dictionary with the conversion results.

Input Schema

Name    Required    Description    Default
url     Yes         -              -

Output Schema

Name      Required    Description    Default
result    Yes         -              -
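The table lists only a single result field. Judging from the implementation reference on this page, the returned dictionary carries a top-level success flag and a results list with one entry per URL. The following is a minimal sketch of that shape; the sample values and the markdown_paths helper are hypothetical and only illustrate how a caller might read the result.

```python
from typing import Any, Dict, List

# Hypothetical sample that mirrors the keys the tool builds for each URL.
sample: Dict[str, Any] = {
    "success": True,
    "results": [
        {
            "url": "https://example.com/report.pdf",
            "success": True,
            "markdown_file": "output/report/report.md",
            "images": [],
            "output_directory": "output/report",
            "content_length": 1234,
        },
        {
            "url": "https://example.com/missing.pdf",
            "success": False,
            "error": "Failed to download URL: ...",
        },
    ],
}


def markdown_paths(result: Dict[str, Any]) -> List[str]:
    """Collect the Markdown file paths of the entries that converted successfully."""
    return [
        entry["markdown_file"]
        for entry in result.get("results", [])
        if entry.get("success")
    ]


print(markdown_paths(sample))
```

Failed entries carry an error string instead of the path fields, so a caller should check each entry's success value before reading markdown_file.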

Implementation Reference

  • Registers the convert_pdf_url function as an MCP tool using the FastMCP decorator.
    @mcp.tool()
  • The core implementation of the convert_pdf_url tool. It downloads the PDF from each URL, encodes it as base64, processes it with the Mistral OCR API, and saves the resulting Markdown and any images to a subdirectory of OUTPUT_DIR.
    async def convert_pdf_url(url: str) -> Dict[str, Any]:
        """
        Convert a PDF from a URL to Markdown. The output is saved in the directory specified by --output-dir.
    
        Args:
            url: A single PDF URL or multiple URLs separated by spaces, commas, or newlines.
    
        Returns:
            A dictionary with the conversion results.
        """
        if not MISTRAL_API_KEY:
            return {"success": False, "error": "Missing API key, please set environment variable MISTRAL_API_KEY"}
    
        try:
            client = Mistral(api_key=MISTRAL_API_KEY)
        except Exception as e:
            return {"success": False, "error": f"Error initializing Mistral client: {e}"}
    
        urls = parse_input_string(url)
        results = []
    
        async with httpx.AsyncClient(timeout=120.0) as http_client:
            for u in urls:
                try:
                    response = await http_client.get(u, follow_redirects=True)
                    response.raise_for_status()
                    base64_pdf = base64.b64encode(response.content).decode('utf-8')
    
                    pdf_name = Path(u.split('?')[0]).stem
                    # Create a specific subdirectory for this URL's content
                    output_dir = Path(OUTPUT_DIR) / pdf_name
                    output_dir.mkdir(parents=True, exist_ok=True)
    
                    output_md_path = output_dir / f"{pdf_name}.md"
    
                    ocr_response = client.ocr.process(
                        model="mistral-ocr-latest",
                        document={"type": "document_url", "document_url": f"data:application/pdf;base64,{base64_pdf}"},
                        include_image_base64=True
                    )
    
                    markdown_content, saved_images = save_ocr_response_to_markdown_and_images(
                        ocr_response, output_md_path, output_dir
                    )
    
                    if markdown_content is not None:
                        results.append({
                            "url": u,
                            "success": True,
                            "markdown_file": str(output_md_path),
                            "images": saved_images,
                            "output_directory": str(output_dir),
                            "content_length": len(markdown_content)
                        })
                    else:
                        results.append({"url": u, "success": False, "error": "Could not save markdown or images."})
    
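                except httpx.HTTPError as e:
                    # HTTPError also covers the HTTPStatusError raised by raise_for_status(),
                    # which RequestError alone does not catch.
                    results.append({"url": u, "success": False, "error": f"Failed to download URL: {e}"})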
                except Exception as e:
                    results.append({"url": u, "success": False, "error": f"Error processing URL '{u}': {e}"})
    
        return {"success": any(r.get("success", False) for r in results), "results": results}
  • Docstring providing input/output schema: input is str url (supports multiple), output is dict with 'success' bool and 'results' list of per-URL dicts containing paths and status.
    """
    Convert a PDF from a URL to Markdown. The output is saved in the directory specified by --output-dir.
    
    Args:
        url: A single PDF URL or multiple URLs separated by spaces, commas, or newlines.
    
    Returns:
        A dictionary with the conversion results.
    """
  • Helper utility to parse the input url string into a list of individual URLs, handling quotes and separators.
    def parse_input_string(input_string: str) -> List[str]:
        """Parses a string of paths or URLs separated by spaces, commas, or newlines."""
        if (input_string.startswith('"') and input_string.endswith('"')) or \
           (input_string.startswith("'") and input_string.endswith("'")):
            input_string = input_string[1:-1]
        # Treat commas as whitespace, then split on any run of whitespace.
        items = input_string.replace(",", " ").split()
        cleaned_items = []
        for item in items:
            if (item.startswith('"') and item.endswith('"')) or \
               (item.startswith("'") and item.endswith("'")):
                cleaned_items.append(item[1:-1])
            else:
                cleaned_items.append(item)
        return [item for item in cleaned_items if item]
  • Helper function to process OCR response: writes markdown from pages to file, saves base64 images using save_image, returns content and image paths.
    def save_ocr_response_to_markdown_and_images(ocr_response, output_md_path, output_dir_for_images):
        """
        Saves the markdown content from each page of the OCR response to a file
        and saves any associated images.
        """
        full_markdown_content = []
        saved_images = []
        try:
            with open(output_md_path, "wt", encoding='utf-8') as f:
                for page in ocr_response.pages:
                    f.write(page.markdown)
                    full_markdown_content.append(page.markdown)
                    for image in page.images:
                        saved_image_path = save_image(image, output_dir_for_images)
                        if saved_image_path:
                            saved_images.append(saved_image_path)
            return "".join(full_markdown_content), saved_images
        except Exception as e:
            print(f"Error saving markdown file '{output_md_path}' or processing images: {e}")
            return None, []
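The path handling and the data URI construction in the core implementation can be tried in isolation, without any network call. The sketch below assumes an OUTPUT_DIR value of "output" (standing in for --output-dir); plan_output and to_data_uri are hypothetical names that simply repeat the expressions used in the listing.

```python
import base64
from pathlib import Path
from typing import Tuple

OUTPUT_DIR = "output"  # assumption: stands in for the --output-dir value


def plan_output(url: str) -> Tuple[Path, Path]:
    """Return (output directory, Markdown path) derived the same way as in the listing."""
    # Drop the query string, then take the file name without its extension.
    pdf_name = Path(url.split("?")[0]).stem
    output_dir = Path(OUTPUT_DIR) / pdf_name
    return output_dir, output_dir / f"{pdf_name}.md"


def to_data_uri(pdf_bytes: bytes) -> str:
    """Wrap raw PDF bytes in the base64 data URI passed as document_url."""
    encoded = base64.b64encode(pdf_bytes).decode("utf-8")
    return f"data:application/pdf;base64,{encoded}"


directory, markdown_path = plan_output("https://example.com/docs/report.pdf?token=abc")
print(directory, markdown_path)
print(to_data_uri(b"%PDF-1.7")[:40])
```

One consequence of deriving the directory from the file name alone: two URLs that end in the same file name map to the same output directory, so a later conversion would overwrite an earlier one.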
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses key behavioral traits: the conversion output is saved to a directory (--output-dir), and it handles multiple URLs. However, it misses details like error handling, rate limits, authentication needs, or file format specifics.
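One error-handling detail is visible in the implementation reference: the top-level success value is computed with any(...) over the per-URL entries, so it is true as soon as one URL converts, even if others fail. A minimal sketch of that rule follows; overall_success and failed_urls are hypothetical helper names, and the sample entries are made up.

```python
from typing import Any, Dict, List, Tuple


def overall_success(results: List[Dict[str, Any]]) -> bool:
    """Same aggregation rule as the tool's final return statement."""
    return any(r.get("success", False) for r in results)


def failed_urls(results: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """List (url, error) pairs for the entries that did not convert."""
    return [
        (r["url"], r.get("error", ""))
        for r in results
        if not r.get("success", False)
    ]


mixed = [
    {"url": "https://example.com/a.pdf", "success": True},
    {"url": "https://example.com/b.pdf", "success": False,
     "error": "Failed to download URL: timeout"},
]

print(overall_success(mixed))
print(failed_urls(mixed))
```

A caller that needs every URL to succeed should therefore inspect the individual entries rather than rely on the top-level flag. An empty URL list yields a top-level success of false.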

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the core purpose, followed by specific sections for Args and Returns. Every sentence adds value without redundancy, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (conversion with output handling), no annotations, and an output schema present, the description is mostly complete. It covers purpose, parameters, and output behavior, but lacks some contextual details like prerequisites or error scenarios that could enhance agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate. It adds meaningful semantics: the 'url' parameter can be a single URL or multiple URLs separated by spaces, commas, or newlines, which clarifies usage beyond the basic schema. It doesn't detail URL validation or examples, keeping it from a perfect score.
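To make the separator rule concrete, here is a small sketch that mirrors the splitting step of parse_input_string from the implementation reference (quote stripping omitted). The name split_urls and the example URLs are hypothetical.

```python
from typing import List


def split_urls(raw: str) -> List[str]:
    """Split on commas and any whitespace, dropping empty pieces."""
    return raw.replace(",", " ").split()


urls = split_urls(
    "https://example.com/a.pdf, https://example.com/b.pdf\n"
    "https://example.com/c.pdf"
)
print(urls)
```

Because commas and spaces always act as separators, a URL that contains a literal comma or an unencoded space would be split into several pieces under this rule; percent-encoding such characters avoids that.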

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: converting PDFs from URLs to Markdown with specific output behavior. It uses precise verbs ('Convert', 'saved') and distinguishes from the sibling tool convert_pdf_file by specifying URL-based input rather than file-based input.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implicitly indicates when to use this tool (for URL-based PDF conversion) versus the sibling convert_pdf_file (for file-based conversion). However, it lacks explicit guidance on when not to use it or detailed alternatives beyond the sibling tool name.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zicez/mcp-pdf2md'
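The same request can be issued from Python with the standard library. This is a sketch only: it assumes the endpoint returns JSON, which this page does not state, and build_request and fetch_server_info are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any

API_URL = "https://glama.ai/api/mcp/v1/servers/zicez/mcp-pdf2md"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL, timeout: float = 30.0) -> Any:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request):
# info = fetch_server_info()
```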

If you have feedback or need assistance with the MCP directory API, please join our Discord server.