Glama
260,860 tools. Last updated 2026-07-05 08:54

"A search for information related to PDF files" matching MCP tools:

  • Search the ShippingRates database by keyword — matches against carrier names, port names, country names, and charge types. Use this for exploratory queries when you don't know exact codes. For example, search "mumbai" to find port codes, or "hapag" to find Hapag-Lloyd data coverage. Returns matching trade lanes, local charges, and shipping line information. FREE — no payment required. Returns: { trade_lanes: [...], local_charges: [...], lines: [...] } matching the keyword. Related tools: Use shippingrates_port for structured port lookup by UN/LOCODE, shippingrates_lines for full carrier listing.
    Connector
  • Merge multiple PDF files into a single document. Preserves bookmarks, links, and formatting. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Minimum 2 files, no maximum. Files are concatenated in array order. 100 sats per merge regardless of file count. Use convert_file instead if you need format conversion (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key, no account needed. Requires create_payment with toolName='merge_pdfs'.
    Connector
  • Convert HTML or Markdown to a pixel-perfect PDF. Returns JSON: { url } — a temporary download URL (valid ~1 hour). Great for generating invoices, reports, receipts, or formatted documents programmatically. Supports full HTML/CSS including tables, images (base64 or URL), and inline styles. For Markdown input, set format='markdown'. 50 sats per conversion. Use convert_file instead for converting existing files between formats (e.g., DOCX→PDF). Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='convert_html_to_pdf'.
    Connector
  • Download a PDF from a URL and extract all text content, page by page. Use this to read the full text of a specific document — for example, an annual report PDF linked from a search_filings result. Best combined with search_filings: use search_filings to locate the document, then parse_pdf_to_text for the full text. Do not use for PDFs that are already well-represented in the database — search_filings is faster and returns pre-ranked, relevant excerpts. Not suitable for scanned (image-only) PDFs without embedded text; those pages will be returned as "(no extractable text)". Args: pdf_url: Direct HTTPS URL to the PDF file, e.g. https://example.com/report.pdf. Must be publicly accessible; authentication-protected URLs will fail. Returns: All text from the PDF with "--- Page N ---" separators between pages. Returns an error string if the download fails, the URL does not point to a valid PDF, or the document exceeds the 60-second download timeout.
    Connector
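The page-separated output described above can be split back into per-page strings. The following is a minimal Python sketch, assuming each separator sits on its own line exactly as "--- Page N ---"; the helper names `split_pages` and `pages_without_text` are hypothetical and not part of the tool.

```python
import re

# Separator format as stated in the tool description: "--- Page N ---".
# Assumption: the separator occupies a whole line.
PAGE_MARKER = re.compile(r"^--- Page (\d+) ---[ \t]*$", re.MULTILINE)


def split_pages(text: str) -> dict[int, str]:
    """Map page number -> page text for '--- Page N ---' separated output."""
    pages: dict[int, str] = {}
    matches = list(PAGE_MARKER.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A page runs until the next separator, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(match.group(1))] = text[start:end].strip()
    return pages


def pages_without_text(pages: dict[int, str]) -> list[int]:
    """List pages reported as image-only, i.e. '(no extractable text)'."""
    return sorted(n for n, body in pages.items() if body == "(no extractable text)")
```

Pages returned by `pages_without_text` are candidates for an OCR-capable tool, since this one does not handle scanned PDFs.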
  • Answer questions using knowledge base (uploaded documents, handbooks, files). Use for QUESTIONS that need an answer synthesized from documents or messages. Returns an evidence pack with source citations, KG entities, and extracted numbers. Modes: - 'auto' (default): Smart routing — works for most questions - 'rag': Semantic search across documents & messages - 'entity': Entity-centric queries (e.g., 'Tell me about [entity]') - 'relationship': Two-entity queries (e.g., 'How is [entity A] related to [entity B]?') Examples: - 'What did we discuss about the budget?' → knowledge.query - 'Tell me about [entity]' → knowledge.query mode=entity - 'How is [A] related to [B]?' → knowledge.query mode=relationship NOT for finding/listing files, threads, or links — use search.files / search.threads / search.links for that.
    Connector
  • Upload a file (base64) and attach it to a page (editor+) — an image, PDF, dataset, etc. Returns the serve URL plus a ready-to-paste `markdown` snippet; then call update_page or patch_page to place it in the body (images render inline as ![](…), other files as a download card). The payload is inline base64 and rides through the model's context, so it is capped at 5 MB — keep it to small files (screenshots, charts, short PDFs). For larger files use request_attachment_upload (a direct PUT URL, bytes off-context), or the tela editor (drag-drop).
    Connector

Matching MCP Servers

Matching MCP Connectors

  • Hosted MCP server: convert PDFs to clean, LLM-ready Markdown with tables, formulas and OCR.

  • Markdown to PDF: headings, bold, code, lists, rules. A4/Letter/Legal. Free 30/hr. MCP + REST.

  • General search tool. This is your FIRST entry point for looking up tokens, entities, and addresses related to a query. Do NOT use this tool for prediction markets. For Polymarket names, topics, event slugs, or URLs, use `prediction_market_lookup` instead. Nansen MCP does not support NFTs; however, regular tokens and NFTs can share a name, so still use this tool to check whether the query relates to a token. This tool allows you to: - Check if a (fungible) token exists by name, symbol, or contract address - Search information about a token - Current price in USD - Trading volume - Contract address and chain information - Market cap and supply data when available - Search information about an entity - Find Nansen labels of an address (EOA) or resolve a domain (.eth, .sol)
    Connector
  • Read the **text content** of an attached file. Works for .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files: for IMAGES use `files.get_base64`; AUDIO/VIDEO cannot be transcribed via this tool; for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error, so reading this routing hint before deciding saves you a turn.
    Connector
  • Search the web and optionally extract content from search results. This is the most powerful web search tool available; if it is available, default to it for any web search need. The query also supports search operators that you can use, if needed, to refine the search:

    | Operator | Functionality | Examples |
    |---|---|---|
    | `""` | Non-fuzzy matches a string of text | `"Firecrawl"` |
    | `-` | Excludes certain keywords or negates other operators | `-bad`, `-site:firecrawl.dev` |
    | `site:` | Only returns results from a specified website | `site:firecrawl.dev` |
    | `inurl:` | Only returns results that include a word in the URL | `inurl:firecrawl` |
    | `allinurl:` | Only returns results that include multiple words in the URL | `allinurl:git firecrawl` |
    | `intitle:` | Only returns results that include a word in the title of the page | `intitle:Firecrawl` |
    | `allintitle:` | Only returns results that include multiple words in the title of the page | `allintitle:firecrawl playground` |
    | `related:` | Only returns results that are related to a specific domain | `related:firecrawl.dev` |
    | `imagesize:` | Only returns images with exact dimensions | `imagesize:1920x1080` |
    | `larger:` | Only returns images larger than specified dimensions | `larger:1920x1080` |

    **Best for:** Finding specific information across multiple websites, when you don't know which website has the information; when you need the most relevant content for a query.
    **Not recommended for:** When you need to search the filesystem; when you already know which website to scrape (use scrape); when you need comprehensive coverage of a single website (use map or crawl).
    **Common mistakes:** Using crawl or map for open-ended questions (use search instead).
    **Prompt Example:** "Find the latest research papers on AI published in 2023."
    **Sources:** web, images, news; default to web unless images or news are needed.
    **Categories:** Optional filter to limit result types: `github` (GitHub repositories, code, issues, and docs), `research` (academic and research sources), `pdf` (PDF results). Example: `categories: ["github", "research"]`.
    **Domain filters:** Use includeDomains to restrict results to specific domains, or excludeDomains to remove domains. Do not use both in the same request. Domains must be hostnames only, without protocol or path.
    **Scrape Options:** Only use scrapeOptions when you think it is absolutely necessary. When you do, default to a low limit (5 or lower) to avoid timeouts.
    **Optimal Workflow:** Search first using firecrawl_search without formats; then, after fetching the results, use the scrape tool to get the content of the relevant page(s).
    **After the search:** Once you have processed the results (or decided they were not useful), call `firecrawl_search_feedback` with the `id` from this response. The first feedback per search refunds 1 credit and helps Firecrawl improve search quality.

    **Usage Example without formats (Preferred):**

    ```json
    {
      "name": "firecrawl_search",
      "arguments": {
        "query": "top AI companies",
        "limit": 5,
        "includeDomains": ["example.com"],
        "sources": [{ "type": "web" }]
      }
    }
    ```

    **Usage Example with formats:**

    ```json
    {
      "name": "firecrawl_search",
      "arguments": {
        "query": "latest AI research papers 2023",
        "limit": 5,
        "categories": ["github", "research"],
        "lang": "en",
        "country": "us",
        "sources": [{ "type": "web" }, { "type": "images" }, { "type": "news" }],
        "scrapeOptions": { "formats": ["markdown"], "onlyMainContent": true }
      }
    }
    ```

    **Returns:** A JSON envelope of the form `{ success, data: { web?, images?, news? }, id, creditsUsed }`. Each result array contains the search results (with optional scraped content). Pass the top-level `id` to `firecrawl_search_feedback` after you've used the results.
    Connector
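The domain-filter rules in the firecrawl_search description (never both filters in one request, bare hostnames only) can be checked before a call is sent. This is a minimal Python sketch under those stated rules; `build_search_call` and `feedback_id` are hypothetical helper names, and only the argument keys shown in the description's own examples are used.

```python
from typing import Optional, Sequence


def build_search_call(
    query: str,
    limit: int = 5,
    include_domains: Optional[Sequence[str]] = None,
    exclude_domains: Optional[Sequence[str]] = None,
) -> dict:
    """Build a firecrawl_search tool call, enforcing the domain-filter rules."""
    if include_domains and exclude_domains:
        raise ValueError("use includeDomains or excludeDomains, not both")
    for host in list(include_domains or []) + list(exclude_domains or []):
        # Hostnames only: reject anything carrying a protocol or a path.
        if "://" in host or "/" in host:
            raise ValueError(f"domain must be a bare hostname: {host!r}")

    arguments: dict = {"query": query, "limit": limit, "sources": [{"type": "web"}]}
    if include_domains:
        arguments["includeDomains"] = list(include_domains)
    if exclude_domains:
        arguments["excludeDomains"] = list(exclude_domains)
    return {"name": "firecrawl_search", "arguments": arguments}


def feedback_id(response: dict) -> str:
    """Return the top-level id to pass to firecrawl_search_feedback."""
    return response["id"]
```

The sketch leaves out scrapeOptions on purpose, matching the description's preferred search-first, scrape-later workflow.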
  • Parse a file using Firecrawl's /v2/parse endpoint. In local/non-cloud MCP mode, this tool reads filePath from the MCP server filesystem and posts multipart data to the configured self-hosted FIRECRAWL_API_URL, preserving the existing direct-read behavior. In hosted CLOUD_SERVICE mode, this tool is a two-call flow because hosted MCP cannot read your local filesystem: 1. Call with filePath, contentType, parse options, and optional declaredSizeBytes. The hosted server mints a short-lived upload URL and returns a safe local curl PUT command plus nextToolCall. 2. Run the returned curl command locally, then call firecrawl_parse again with uploadRef and the desired parse options. The hosted server calls /v2/parse server-side with your session credential. **Best for:** Extracting content from a local document (PDF, Word, Excel, HTML, etc.); pulling structured data out of a file with JSON format; converting binary documents into markdown for downstream reasoning. **Not recommended for:** Remote URLs (use firecrawl_scrape); multiple files at once (call parse multiple times); documents that require interactive actions, screenshots, or change tracking — those aren't supported by the parse endpoint. **Common mistakes:** In hosted mode, do not pass both filePath and uploadRef. Phase 1 uses filePath only to generate upload instructions; phase 2 uses uploadRef only to parse server-side. **Supported file types:** .html, .htm, .xhtml, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls **Unsupported options:** actions, screenshot/branding/changeTracking formats, waitFor > 0, location, mobile, proxy values other than "auto" or "basic". **Privacy:** Set `redactPII: true` to return content with personally identifiable information redacted. **CRITICAL - Format Selection (same rules as firecrawl_scrape):** When the user asks for SPECIFIC data points from a document, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE document content. 
**Handling PDFs:** Add `"parsers": ["pdf"]` (optionally with `pdfOptions.maxPages`) when parsing a PDF so the PDF engine is invoked explicitly. For very long documents, cap `maxPages` to keep the response within token limits. **Hosted phase 1 example:** ```json { "name": "firecrawl_parse", "arguments": { "filePath": "/absolute/path/to/document.pdf", "contentType": "application/pdf", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Hosted phase 2 example:** ```json { "name": "firecrawl_parse", "arguments": { "uploadRef": "upload-ref-from-phase-1", "formats": ["markdown"], "parsers": ["pdf"], "zeroDataRetention": true } } ``` **Returns:** Phase 1 hosted upload instructions or a parsed document with markdown, html, links, summary, json, or query results depending on the requested formats.
    Connector
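The hosted two-call flow for firecrawl_parse hinges on one rule: phase 1 carries filePath only, phase 2 carries uploadRef only. A minimal Python sketch of that rule follows; `build_parse_call` and its `pdf` flag are hypothetical, and the argument keys are the ones that appear in the description's examples.

```python
from typing import Optional, Sequence


def build_parse_call(
    *,
    file_path: Optional[str] = None,
    upload_ref: Optional[str] = None,
    content_type: Optional[str] = None,
    formats: Sequence[str] = ("markdown",),
    pdf: bool = False,
) -> dict:
    """Build a firecrawl_parse call for hosted phase 1 (filePath) or phase 2 (uploadRef)."""
    # Exactly one of the two must be given; passing both is the documented mistake.
    if (file_path is None) == (upload_ref is None):
        raise ValueError("pass exactly one of filePath (phase 1) or uploadRef (phase 2)")

    arguments: dict = {"formats": list(formats)}
    if file_path is not None:
        arguments["filePath"] = file_path
        if content_type:
            arguments["contentType"] = content_type
    else:
        arguments["uploadRef"] = upload_ref
    if pdf:
        # Invoke the PDF engine explicitly, as the description recommends.
        arguments["parsers"] = ["pdf"]
    return {"name": "firecrawl_parse", "arguments": arguments}
```

Between the two calls, the curl command returned by phase 1 still has to be run locally; the sketch only covers building the two payloads.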
  • "PDB entry [1abc] details" / "fetch protein structure [pdb_id]" / "metadata for [PDB ID]" — full PDB entry record by ID (e.g. "1abc", "7BV2"). Returns experimental method (X-ray / cryo-EM / NMR), resolution, authors, deposition date, organism, ligands, related entities. Use after `search` to inspect a specific structure.
    Connector
  • Use when a user wants a SHAREABLE, branded multi-page Site Analysis PDF for ONE lat/lon (a powered-land parcel, a candidate campus) — the polished client deliverable, not just a score. Example: "Make the Site Analysis PDF for this Carrier Mills parcel, 150 MW, for TON Infrastructure." — generate_site_analysis lat=37.694 lon=-88.65 capacity_mw=150 prepared_for="TON Infrastructure" prepared_by="Martone Advisors". Params: lat (-90 to 90, required), lon (-180 to 180, required), capacity_mw (target load MW, e.g. 50-500), prepared_for (client name on the cover), prepared_by (your firm — brands the report; defaults to DC Hub), latency_target (optional metro override; default = nearest real carrier hotel). Returns: {survey:{verdict, power/transmission, gas, water, air-permitting, fiber carriers, latency-to-nearest-carrier-hotel, market, tax}, pdf_report_url}. pdf_report_url is a ready-to-open link to download the branded 5-page PDF — no login needed, valid ~7 days; hand it to your human. For just the numeric suitability score (no PDF), use analyze_site instead.
    Connector
  • Find working SOURCE CODE examples from 37 indexed Senzing GitHub repositories. REQUIRED: either `query` (string, for search) or `repo` with `file_path` or `list_files=true` — the call WILL FAIL without one. Three modes: (1) Search: pass `query` to find examples across all repos, (2) File listing: pass `repo` + `list_files=true`, (3) File retrieval: pass `repo` + `file_path`. Indexes source code (.py, .java, .cs, .rs) and READMEs — NOT build/data files. For sample data, use get_sample_data. Covers Python, Java, C#, Rust SDK patterns: initialization, ingestion, search, redo, configuration, message queues, REST APIs. Use max_lines to limit large files. Returns GitHub raw URLs for file retrieval.
    Connector
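The three call modes of the Senzing code-example tool can be told apart from the arguments alone. This is a minimal Python sketch of that check, assuming `query` takes precedence when several arguments are supplied (the description does not say); `pick_mode` is a hypothetical helper, not part of the tool.

```python
def pick_mode(args: dict) -> str:
    """Classify a call as 'search', 'file_listing', or 'file_retrieval'.

    Raises ValueError for argument sets the description says will fail.
    """
    if args.get("query"):
        return "search"  # assumption: query wins if repo is also present
    repo = args.get("repo")
    if repo and args.get("list_files") is True:
        return "file_listing"
    if repo and args.get("file_path"):
        return "file_retrieval"
    raise ValueError("need `query`, or `repo` with `file_path` or `list_files=true`")
```

Running this check client-side avoids spending a tool call on a request that is documented to fail.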
  • Fetch and convert a Microsoft Learn documentation webpage to markdown format. This tool retrieves the latest complete content of Microsoft documentation webpages including Azure, .NET, Microsoft 365, and other Microsoft technologies. ## When to Use This Tool - When search results provide incomplete information or truncated content - When you need complete step-by-step procedures or tutorials - When you need troubleshooting sections, prerequisites, or detailed explanations - When search results reference a specific page that seems highly relevant - For comprehensive guides that require full context ## Usage Pattern Use this tool AFTER microsoft_docs_search when you identify specific high-value pages that need complete content. The search tool gives you an overview; this tool gives you the complete picture. ## URL Requirements - The URL must be a valid HTML documentation webpage from the microsoft.com domain - Binary files (PDF, DOCX, images, etc.) are not supported ## Output Format markdown with headings, code blocks, tables, and links preserved.
    Connector
  • Return an inline PDF artifact from supplied report_meta, tables, metrics, and summary content; this read-only renderer does not persist hosted files. Use this only when a structured report payload already exists; use report_docx_generate for editable Word output or compliance_edd_report to build the memo first.
    Connector
  • Extract plain text from a PDF or image (base64-encoded). Use when you need raw text for downstream AI analysis (summarization, claim checking, structured extraction). For documents at a public URL, use extract_url instead (no base64 encoding needed). Returns: { pages: number, text: string } Example prompts: - "Extract the text from this scanned contract so I can search it." - "Give me the raw text from this PDF document." - "OCR this image and return the text content."
    Connector