Glama
127,227 tools. Last updated 2026-05-05 10:33

"A tool for reading and extracting content from PDF papers" matching MCP tools:

  • Fetches any public web page and returns clean, readable plain text stripped of HTML, navigation, scripts, advertisements, and boilerplate. Returns the page title, meta description, word count, and main body text ready for analysis or summarisation. Use this tool when an agent needs to read the content of a specific web page or article URL — for example to summarise an article, extract facts from a page, verify a claim by reading the source, or convert a web page into plain text to pass to another tool. Pass article URLs returned by web_news_headlines to this tool to read full article content. Do not use this tool to discover current news headlines — use web_news_headlines instead. Does not execute JavaScript — best suited for standard HTML content pages. Will not work with paywalled, login-protected, or JavaScript-rendered single-page applications.
    Connector
  • Find quantum computing researchers and potential collaborators from 1000+ active profiles. Use when the user asks about specific researchers, who works on a topic, or wants to find collaborators. NOT for jobs (use searchJobs) or papers (use searchPapers). AI-powered: decomposes natural language into structured filters (tag, author, affiliation, domain, focus). Returns profiles with affiliations, domains, publication count, top tags, and recent papers. Data from arXiv papers published in the last 12 months. Max 50 results. Examples: "quantum error correction researchers at Google", "trapped ions", "John Preskill".
    Connector
  • Primary tool for reading a filing's content. Pass a `document_id` from `list_filings` / `get_financials`. MANDATORY for any substantive answer - filing metadata (dates, form codes, descriptions) alone doesn't answer the user; the numbers and text live inside the document.
    ── RESPONSE SHAPES ──
    • `kind='embedded'` (PDF up to ~20 MB; structured text up to `max_bytes`): returns `bytes_base64` with the full document, `source_url_official` (evergreen registry URL for citation, auto-resolved), and `source_url_direct` (short-TTL signed proxy URL). For PDFs the host converts bytes into a document content block - you read it natively including scans.
    • `kind='resource_link'` (document exceeds `max_bytes`): NO `bytes_base64`. Returns `reason`, `next_steps`, the two source URLs, plus `index_preview` for PDFs (`{page_count, text_layer, outline_present, index_status}`). Use the navigation tools below.
    ── WORKFLOW FOR kind='resource_link' ──
    1. Read `index_preview.text_layer`. Values: `full` (every page has real text), `partial` (mixed), `none` (scanned / image-only), `oversized_skipped` (indexing skipped), `encrypted` / `failed`.
    2. If `full` / `partial`: call `get_document_navigation` (outline + previews + landmarks) and/or `search_document` to locate pages. If `none` / `oversized_skipped`: skip search.
    3. Call `fetch_document_pages(pages='N-M', format='pdf'|'text'|'png')` to get actual content. Prefer `pdf` for citations, `text` for skim, `png` for scanned or oversized.
    ── CRITICAL RULES ──
    • **Navigation-aids-only**: previews, snippets, landmark matches, and outline titles returned by the navigation tools are for LOCATING pages. NEVER cite them as source material - quote only from `fetch_document_pages` output or this tool's inline bytes.
    • **No fallback to memory**: if this tool fails (rate limit, 5xx, disconnect), do NOT fill in names / numbers / dates from training data. Tell the user what failed and offer retry or `source_url_official`.
    • Don't reflexively retry with a larger `max_bytes` - for big PDFs the bytes are unreadable to you anyway. Use the navigation tools instead.
    `source_url_official` is auto-resolved from a session-side cache populated by the most recent `list_filings` call. The optional `company_id` / `transaction_id` / `filing_type` / `filing_description` inputs are OVERRIDES for the rare case where `document_id` didn't come through `list_filings`. Per-country document availability, format, and pricing - call `list_jurisdictions({jurisdiction:"<code>"})`.
    Connector
  • Return specific pages of a PDF in one of three formats:
    • format='pdf' - pdf-lib page slice, preserves the original text layer and fonts (no re-encoding). This is the ONLY format that gives you byte-exact, citation-grade content. Use this for financial numbers, legal quotes, and any answer requiring precision.
    • format='text' - raw extracted text from pdfjs. Machine-readable but NOT authoritative - OCR errors on bad-quality text layers can silently garble digits. Use only for summarisation / light reading, and cross-check numbers by re-fetching with format='pdf'.
    • format='png' - page rasterization via Cloudflare Browser Rendering, for documents with text_layer='none' (scanned PDFs). Phase 6 - may return 'not implemented' in current deployment.
    The response includes at most 100 pages (Anthropic document-block hard cap). Split larger ranges into multiple calls. Requires the document's bytes to already be cached - call fetch_document on the full document first if this is a new filing.
    Connector
  • Download a completed report as PDF. Returns base64-encoded PDF content. Confirm report status='completed' via atlas_get_report(report_id) first. report_id from atlas_start_report response or atlas_list_reports. Free.
    Connector
  • Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies. ## When to Use This Tool - When search results provide incomplete information or truncated content - When you need complete step-by-step procedures or tutorials - When you need troubleshooting sections, prerequisites, or detailed explanations - When search results reference a specific page that seems highly relevant - For comprehensive guides that require full context ## Usage Pattern Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture. ## URL Requirements - The URL must be a valid HTML documentation webpage from the microsoft.com domain - Binary files (PDF, DOCX, images, etc.) are not supported ## Output Format markdown with headings, code blocks, tables, and links preserved.
    Connector
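
The `fetch_document_pages` entry above caps each response at 100 pages and asks callers to split larger ranges into multiple calls. A minimal sketch of that splitting step follows. The helper name `split_page_range` and the constant `MAX_PAGES_PER_CALL` are hypothetical, not part of the tool; only the 100-page cap and the `'N-M'` range string come from the listing.

```python
from typing import List

# Cap stated in the fetch_document_pages description; the constant name is ours.
MAX_PAGES_PER_CALL = 100


def split_page_range(first: int, last: int, cap: int = MAX_PAGES_PER_CALL) -> List[str]:
    """Split an inclusive page range into 'N-M' strings of at most `cap` pages each."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    if cap < 1:
        raise ValueError("cap must be at least 1")

    chunks: List[str] = []
    start = first
    while start <= last:
        end = min(start + cap - 1, last)  # inclusive end of this chunk
        chunks.append(f"{start}-{end}")
        start = end + 1
    return chunks


if __name__ == "__main__":
    # Each string would be passed as the `pages` argument of one call.
    for pages in split_page_range(1, 250):
        print(pages)
```

Each returned string is intended as the `pages` argument of one separate call; how the calls are issued depends on the MCP client in use and is not shown here.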

Matching MCP Servers

Matching MCP Connectors

  • Send transactional PDFs for AI agents via SMTP. Templates included.

  • The verified hub for conferences and journals. Powered by AI to match your scholarly ambitions with the world's most prestigious academic opportunities.

  • List all 197 papers in the Urantia Book with their metadata (id, title, partId, labels). Use toc.get for a hierarchical view instead.
    Connector
  • Schedule multiple posts at once from CSV content. USE THIS WHEN: • User has a spreadsheet or list of posts to schedule • Planning a content calendar for a month • Migrating content from another tool CSV FORMAT (required columns): • platform: linkedin, instagram, x, tiktok, threads • scheduled_time: ISO 8601 format (e.g., 2024-02-15T10:00:00Z) • text: Post content/caption OPTIONAL COLUMNS: • media_url: Image or video URL • first_comment: First comment to add (Instagram/LinkedIn) • hashtags: Additional hashtags to append PROCESS: 1. First call with validate_only: true to check for errors 2. Review validation report with user 3. Call again with validate_only: false to execute import
    Connector
  • Get today's quantum computing papers from arXiv — no parameters needed. Use when the user asks "what's new in quantum computing?" or wants a daily paper briefing. Returns the most recent day's papers with title, authors, date, AI-generated hook (one-line summary), and tags. For date-range or topic-filtered search, use searchPapers instead. Use getPaperDetails for full abstract and analysis of a specific paper.
    Connector
  • Create a job description from text within a hiring context. Returns a JD object with 'id' and stored content. Use JD content as jd_text in atlas_fit_match, atlas_fit_rank, atlas_start_jd_fit_batch, and atlas_start_jd_analysis. Requires context_id from atlas_create_context or atlas_list_contexts. Free.
    Connector
  • Search quantum computing research papers from arXiv. Use when the user asks about recent research, specific papers, or academic topics in quantum computing. NOT for jobs (use searchJobs) or researcher profiles (use searchCollaborators). Supports natural language queries decomposed via AI into structured filters (topic, tag, author, affiliation, domain). Date range defaults to last 7 days; max lookback 12 months. Returns newest first, max 50 results. Use getPaperDetails for full abstract and analysis of a specific paper. Examples: "trapped ion papers from Google", "QEC review papers this month", "quantum error correction".
    Connector
  • Stake SOL with Blueprint validator in a single call. Builds the transaction, signs it with your secret key in-memory, and submits to Solana. Returns the confirmed transaction signature. Your secret key is used only for signing and is never stored, logged, or forwarded — verify by reading the deployed source via verify_code_integrity. This is the recommended tool for autonomous agents.
    Connector
  • Starts a crawl job on a website and extracts content from all pages. **Best for:** Extracting content from multiple related pages, when you need comprehensive coverage. **Not recommended for:** Extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow). **Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control. **Common mistakes:** Setting limit or maxDiscoveryDepth too high (causes token overflow) or too low (causes missing pages); using crawl for a single page (use scrape instead). Using a /* wildcard is not recommended. **Prompt Example:** "Get all blog posts from the first two levels of example.com/blog." **Usage Example:** ```json { "name": "firecrawl_crawl", "arguments": { "url": "https://example.com/blog/*", "maxDiscoveryDepth": 5, "limit": 20, "allowExternalLinks": false, "deduplicateSimilarURLs": true, "sitemap": "include" } } ``` **Returns:** Operation ID for status checking; use firecrawl_check_crawl_status to check progress. **Safe Mode:** Read-only crawling. Webhooks and interactive actions are disabled for security.
    Connector
  • Explain the Guard product using CurrencyGuard's approved product and FAQ content. Use this for any question about what the Guard is, how it works, who it is for, how it compares to forwards or options, and for any legal, regulatory, accounting, or eligibility question. Do not answer those questions from memory — always call this tool.
    Connector
  • List available markdown holdings reports for Bulgarian pension funds. Reports contain detailed portfolio holdings data extracted from official PDF filings and converted to structured markdown with metadata (allocation %, exposure, top holdings). Use this tool to discover what reports are available before loading specific ones with `read_holdings_report`. Filter by manager, fund type, or date range.
    Connector
  • Retrieve a shipment document (commercial invoice) as binary PDF. **IMPORTANT:** This tool returns only metadata (content type and size) because MCP cannot transmit binary data. For usable document links, prefer calling `get_shipment` with `format="URL"` instead — it returns clickable download URLs. Only use this tool if you specifically need to confirm a document exists or check its file size. Required authorization scope: `public.shipment_document:read` Args: easyship_shipment_id: The Easyship shipment ID, e.g. "ESSG10006001". document_type: The type of document to retrieve. Must be "commercial_invoice". page_size: Page size for the document: "4x6" or "A4". Default: "A4". Returns: Metadata only (content type and size). For downloadable URLs, use `get_shipment` with format="URL".
    Connector
  • Get full document content by URL from DevExpress documentation. Use this tool to retrieve the complete markdown content of a specific documentation page. PREREQUISITE: ALWAYS call `devexpress_docs_search` before using this tool to get valid URLs. The URL parameter must be obtained from the results of the `devexpress_docs_search` tool.
    Connector
  • Get report status and metadata (without PDF). Returns status (pending/processing/completed/failed), title, type, inputs, and summary. This is the polling tool for ceevee_generate_report — call every 30 seconds, up to 40 times (20 min max). When status='completed', download PDF with ceevee_download_report(report_id). If status='failed', relay error_message. If still processing after 40 polls, stop and give the user the report_id to check later. Free.
    Connector
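
The report-status entry above describes a polling schedule: check every 30 seconds, at most 40 times, and stop early on `completed` or `failed`. A minimal sketch of that loop follows. The function `poll_report` and its `get_status` callback are hypothetical stand-ins for however the client invokes the status tool, and the `"timeout"` value is a local marker, not a status the service is known to return.

```python
import time
from typing import Any, Callable, Dict


def poll_report(
    get_status: Callable[[], Dict[str, Any]],
    interval_s: float = 30.0,   # interval from the tool description
    max_polls: int = 40,        # 40 polls x 30 s = 20 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `get_status` until the report is completed or failed, or polls run out."""
    for attempt in range(max_polls):
        report = get_status()
        if report.get("status") in ("completed", "failed"):
            return report
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    # Local marker only: the caller should hand the report_id back to the user.
    return {"status": "timeout", "polls": max_polls}


if __name__ == "__main__":
    # Simulated status sequence standing in for real tool calls.
    responses = iter([
        {"status": "pending"},
        {"status": "processing"},
        {"status": "completed", "title": "demo"},
    ])
    print(poll_report(lambda: next(responses), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays. On `completed` the caller would go on to download the PDF, on `failed` relay the `error_message`, and on the local timeout marker give the user the `report_id` to check later, as the entry describes.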