Glama
rsi-ai-platform

rsi-search-pro-mcp

Official

pdf_fetch

Download and extract text from PDFs that return binary content in web fetch, enabling access to reports, bulletins, and circulars.

Instructions

Download a PDF directly and extract its text with pypdf.

Use this WHENEVER a `web_fetch` or `web_fetch_structured` call comes
back saying the content was "binary" or "not extractable" — that's
almost always a Tavily limitation on PDFs that are actually text-based
and perfectly extractable with a proper PDF library. Common cases:
PPAC monthly reports, RBI bulletins, MoSPI press release PDFs, PIB
statements, regulator circulars.
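
The fallback rule above can be sketched as a small routing helper. This is only an illustration of the decision the description asks for, not the server's implementation: the helper names (`should_retry_with_pdf_fetch`, `looks_like_pdf`, `build_pdf_fetch_args`) are hypothetical, and the check assumes the `web_fetch` failure message contains the words "binary" or "not extractable" as quoted in the description.

```python
from urllib.parse import urlparse

# Phrases the tool description says signal a failed PDF extraction.
BINARY_MARKERS = ("binary", "not extractable")


def looks_like_pdf(url: str) -> bool:
    """True when the URL path ends in .pdf (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".pdf")


def should_retry_with_pdf_fetch(fetch_message: str) -> bool:
    """True when a web_fetch message reports binary or non-extractable content."""
    text = fetch_message.lower()
    return any(marker in text for marker in BINARY_MARKERS)


def build_pdf_fetch_args(url, pages=None, max_pages=None):
    """Assemble pdf_fetch arguments, leaving out optional values that are unset."""
    args = {"url": url}
    if pages is not None:
        args["pages"] = list(pages)
    if max_pages is not None:
        args["max_pages"] = max_pages
    return args
```

A caller would typically check `should_retry_with_pdf_fetch` on the earlier result and, if it is true, pass `build_pdf_fetch_args(url)` to `pdf_fetch`. Note that `looks_like_pdf` only covers the `.pdf`-in-path case; a server that returns `Content-Type: application/pdf` from another path would need a header check instead.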

Args:
    url: The PDF URL (.pdf in path, or a server that returns
         Content-Type: application/pdf).
    pages: Optional 1-indexed list of pages to extract (e.g. [1, 2, 5]).
           If omitted, the first `max_pages` are extracted.
    max_pages: Cap on auto-extracted pages when `pages` is omitted.

Returns:
    {url, domain, content, fetched_at, page_count, pages_extracted,
     content_truncated, kind: "pdf"}.
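
The page-selection rule and the return shape described above might look roughly like the following. This is a sketch under stated assumptions, not the tool's actual code: the default `max_pages=20`, the `max_chars` truncation limit, the choice to silently drop out-of-range pages, and treating `pages_extracted` as a list of page numbers are all placeholders, since the description does not specify them. With pypdf, the text of 1-indexed page `n` would come from `PdfReader(...).pages[n - 1].extract_text()`; that step is left out here so the example stays dependency-free.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def select_pages(page_count, pages=None, max_pages=20):
    """Return the 1-indexed page numbers to extract.

    max_pages=20 is a placeholder; the description does not state a default.
    """
    if pages is not None:
        # Assumption: requested pages outside the document are ignored.
        return [p for p in pages if 1 <= p <= page_count]
    # No explicit list: take the first max_pages pages (or fewer).
    return list(range(1, min(page_count, max_pages) + 1))


def build_result(url, page_texts, page_count, pages_extracted, max_chars=50_000):
    """Assemble a dict with the keys listed under Returns.

    max_chars is a hypothetical limit used to set content_truncated.
    """
    content = "\n\n".join(page_texts)
    return {
        "url": url,
        "domain": urlparse(url).netloc,
        "content": content[:max_chars],
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "page_count": page_count,
        "pages_extracted": pages_extracted,
        "content_truncated": len(content) > max_chars,
        "kind": "pdf",
    }
```

Keeping page numbers 1-indexed until the last moment matches the `pages` argument and leaves the single `n - 1` conversion at the point where the PDF library is called.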

Input Schema

Name       Required  Description  Default
url        Yes
pages      No
max_pages  No
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It transparently describes extraction with pypdf, parameter effects, and the return structure. It could mention potential size or rate limits, but overall the behavior is well covered for a read-only tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured, with a clear 'when to use' section followed by Args. Every sentence adds value. It is slightly verbose and could be tightened, but it is still highly effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and no annotations, the description fully explains usage, parameter details, return format, and use cases. Nothing critical is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description supplies the full meaning: url is the PDF URL, pages is a 1-indexed list, and max_pages is the auto-extraction cap. It also explains what happens when pages is omitted. This compensates completely for the schema gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it downloads a PDF and extracts text with pypdf. Distinguishes from siblings like web_fetch by specifying when to use it (when binary/not extractable) and lists common use cases (PPAC reports, RBI bulletins, etc.).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this WHENEVER a web_fetch or web_fetch_structured call comes back saying the content was binary or not extractable', providing clear context and alternatives. The description of common cases further guides correct usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rsi-ai-platform/rsi-search-pro-mcp'
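
The same request could be made from Python with the standard library. This is a minimal sketch: the `owner/name` path pattern is inferred from the single URL above, the helper names are hypothetical, and the response is assumed to be JSON.

```python
import json
from urllib.request import Request, urlopen

# Base path taken from the curl example; other endpoints are not covered here.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the server URL, assuming an owner/name path layout."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0):
    """GET the server record and parse it, assuming a JSON response body."""
    request = Request(server_url(owner, name), method="GET")
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info("rsi-ai-platform", "rsi-search-pro-mcp")` issues the same GET as the curl command.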

If you have feedback or need assistance with the MCP directory API, please join our Discord server.