convert_pdf_url

Convert PDF files from URLs to Markdown format with optional OCR support for extracting text from scanned documents.

Instructions

Convert PDF URL to Markdown, supports single URL or URL list

Args:
    url: PDF file URL or URL list, can be separated by spaces, commas, or newlines
    enable_ocr: Whether to enable OCR (default: True)

Returns:
    dict: Conversion result information

Input Schema

Name        Required  Description  Default
url         Yes       -            -
enable_ocr  No        -            -
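Since no output schema is published, the shape of the returned dict has to be inferred from the handler's success path in the Implementation Reference below. A hypothetical result (all values illustrative, not actual API output) might look like:

```python
# Hypothetical success result; keys mirror the handler's return value,
# all values are illustrative
result = {
    "success": True,
    "downloaded_files": ["./downloads/paper_1.md"],  # local Markdown paths
    "batch_id": "example-batch-id",
    "total_urls": 1,
    "processed_urls": 1,
}

# On failure, the handler instead returns a dict of the form
# {"success": False, "error": "<error message>"}
```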

Implementation Reference

  • The core handler function implementing the logic for the convert_pdf_url tool. It parses the input URLs, submits a batch job to the MinerU API, polls for completion, then downloads and extracts the Markdown results.
    async def convert_pdf_url(url: str, enable_ocr: bool = True) -> Dict[str, Any]:
        """
        Convert PDF URL to Markdown, supports single URL or URL list
        
        Args:
            url: PDF file URL or URL list, can be separated by spaces, commas, or newlines
            enable_ocr: Whether to enable OCR (default: True)
    
        Returns:
            dict: Conversion result information
        """
        if not MINERU_API_KEY:
            return {"success": False, "error": "Missing API key, please set environment variable MINERU_API_KEY"}
        
        if isinstance(url, str):
            urls = parse_url_string(url)
        else:
            urls = [url]  
        
        async with httpx.AsyncClient(timeout=300.0) as client:
            try:
                files = []
                for i, url_item in enumerate(urls):
                    files.append({
                        "url": url_item, 
                        "is_ocr": enable_ocr, 
                        "data_id": f"url_convert_{i+1}_{int(time.time())}"
                    })
                
                batch_data = {
                    "enable_formula": True,
                    "language": "auto",
                    "layout_model": "doclayout_yolo",
                    "enable_table": True,
                    "files": files
                }
                
                response = await client.post(
                    MINERU_BATCH_API,
                    headers=HEADERS,
                    json=batch_data,
                    timeout=300.0
                )
                
                if response.status_code != 200:
                    return {"success": False, "error": f"Request failed: {response.status_code}"}
                
                try:
                    status_data = response.json()
                    
                    if status_data.get("code") not in (0, 200):
                        error_msg = status_data.get("msg", "Unknown error")
                        return {"success": False, "error": f"API returned error: {error_msg}"}
                        
                    batch_id = status_data.get("data", {}).get("batch_id", "")
                    if not batch_id:
                        return {"success": False, "error": "Failed to get batch ID"}
                    
                    task_status = await check_task_status(client, batch_id)
                    
                    if not task_status.get("success"):
                        return task_status
                    
                    downloaded_files = await download_batch_results(client, task_status.get("extract_results", []))
                    
                    return {
                        "success": True, 
                        "downloaded_files": downloaded_files,
                        "batch_id": batch_id,
                        "total_urls": len(urls),
                        "processed_urls": len(downloaded_files)
                    }
                    
                except json.JSONDecodeError as e:
                    return {"success": False, "error": f"Failed to parse JSON: {e}"}
                    
            except Exception as e:
                return {"success": False, "error": str(e)}
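The MinerU batch request body assembled in the handler can be sketched as a standalone helper; `build_batch_payload` is a hypothetical name used for illustration, not part of the server:

```python
import time

def build_batch_payload(urls, enable_ocr=True):
    # Mirrors the batch_data dict assembled in convert_pdf_url above
    files = [
        {
            "url": u,
            "is_ocr": enable_ocr,
            "data_id": f"url_convert_{i + 1}_{int(time.time())}",
        }
        for i, u in enumerate(urls)
    ]
    return {
        "enable_formula": True,
        "language": "auto",
        "layout_model": "doclayout_yolo",
        "enable_table": True,
        "files": files,
    }
```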
  • The @mcp.tool() decorator registers the convert_pdf_url function as an MCP tool.
    @mcp.tool()  
  • Docstring providing input schema (parameters) and output description for the tool.
    """
    Convert PDF URL to Markdown, supports single URL or URL list
    
    Args:
        url: PDF file URL or URL list, can be separated by spaces, commas, or newlines
        enable_ocr: Whether to enable OCR (default: True)
    
    Returns:
        dict: Conversion result information
    """
  • Helper function used by the handler to parse input URL strings into a list, handling quotes, spaces, commas, and newlines.
    def parse_url_string(url_string):
        """
        Parse URL string separated by spaces, commas, or newlines
        
        Args:
            url_string: URL string
            
        Returns:
            list: List of URLs
        """
        if isinstance(url_string, str):
            if (url_string.startswith('"') and url_string.endswith('"')) or \
               (url_string.startswith("'") and url_string.endswith("'")):
                url_string = url_string[1:-1]
        
        # str.split() already splits on any whitespace (including newlines),
        # so only commas need additional handling
        urls = []
        for part in url_string.split():
            urls.extend(p for p in part.split(',') if p)
        
        cleaned_urls = []
        for url in urls:
            if (url.startswith('"') and url.endswith('"')) or \
               (url.startswith("'") and url.endswith("'")):
                cleaned_urls.append(url[1:-1])
            else:
                cleaned_urls.append(url)
        
        return cleaned_urls
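The separator handling above can be exercised with a simplified, self-contained reimplementation; `parse_urls` is a hypothetical name used for illustration:

```python
def parse_urls(url_string: str) -> list:
    # Normalize commas and newlines to spaces, split on whitespace,
    # then strip any surrounding quotes from each token
    normalized = url_string.replace(",", " ").replace("\n", " ")
    return [part.strip("'\"") for part in normalized.split()]

urls = parse_urls(
    '"https://example.com/a.pdf", https://example.com/b.pdf\nhttps://example.com/c.pdf'
)
# → ['https://example.com/a.pdf', 'https://example.com/b.pdf', 'https://example.com/c.pdf']
```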
  • Helper function to poll the MinerU API for batch task completion status.
    async def check_task_status(client, batch_id, max_retries=60, sleep_seconds=5):
        """
        Check batch task status
        
        Args:
            client: HTTP client
            batch_id: Batch ID
            max_retries: Maximum number of retries
            sleep_seconds: Seconds between retries
            
        Returns:
            dict: Dictionary containing task status information, or error message if failed
        """
        for _ in range(max_retries):
            try:
                status_response = await client.get(
                    f"{MINERU_BATCH_RESULTS_API}/{batch_id}",
                    headers=HEADERS,
                    timeout=60.0
                )

                # Transient HTTP failure: wait and retry
                if status_response.status_code != 200:
                    await asyncio.sleep(sleep_seconds)
                    continue

                try:
                    status_data = status_response.json()
                except json.JSONDecodeError:
                    await asyncio.sleep(sleep_seconds)
                    continue

                task_data = status_data.get("data", {})
                extract_results = task_data.get("extract_result", [])

                all_done, _ = print_task_status(extract_results)

                if all_done:
                    return {
                        "success": True,
                        "extract_results": extract_results,
                        "task_data": task_data,
                        "status_data": status_data
                    }

                await asyncio.sleep(sleep_seconds)

            except Exception:
                await asyncio.sleep(sleep_seconds)
        
        return {
            "success": False,
            "error": "Polling timeout, unable to get final results"
        }
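The retry loop above follows a generic poll-until-done pattern, which can be sketched independently of the MinerU response schema; `poll_until_done` and the boolean `done` flag are hypothetical simplifications:

```python
import asyncio

async def poll_until_done(fetch_status, max_retries=60, sleep_seconds=5):
    # fetch_status is any coroutine returning a dict; here we assume a
    # boolean "done" flag rather than MinerU's per-file extract states
    for _ in range(max_retries):
        status = await fetch_status()
        if status.get("done"):
            return {"success": True, "status": status}
        await asyncio.sleep(sleep_seconds)
    return {"success": False, "error": "Polling timeout, unable to get final results"}

# Usage with a fake status source that completes on the third call
calls = {"n": 0}

async def fake_status():
    calls["n"] += 1
    return {"done": calls["n"] >= 3}

result = asyncio.run(poll_until_done(fake_status, sleep_seconds=0))
# → {'success': True, 'status': {'done': True}}
```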
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions OCR support with a default setting, which adds some context, but fails to describe critical behaviors such as rate limits, authentication requirements, error handling, or what the conversion result information includes. For a tool that processes external URLs and performs conversion, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with the core purpose followed by parameter details in a structured format. Every sentence adds value, with no redundant information. However, the use of 'dict' in the returns section is slightly vague, though this is mitigated by the lack of an output schema.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (processing PDF URLs with OCR options) and the absence of annotations and output schema, the description is minimally adequate. It covers the basic purpose and parameters but lacks details on behavioral traits, error cases, and output structure. This leaves gaps that could hinder an agent's ability to use the tool effectively in varied contexts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds meaningful semantics beyond the input schema, which has 0% description coverage. It explains that 'url' can be a single URL or a list separated by spaces, commas, or newlines, and clarifies the default value and purpose of 'enable_ocr'. This compensates well for the schema's lack of descriptions, making the parameters understandable without relying on the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: converting PDF URLs to Markdown format. It specifies the resource (PDF URLs) and the action (convert to Markdown), which is specific and actionable. However, it doesn't explicitly differentiate from its sibling tool 'convert_pdf_file' beyond mentioning URL vs. file handling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by mentioning support for single URLs or URL lists, but it doesn't provide explicit guidance on when to use this tool versus alternatives like 'convert_pdf_file'. No when-not-to-use scenarios or prerequisites are mentioned, leaving the agent to infer context from the tool name and description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
