150,830 tools. Last updated 2026-05-28 06:19
"A tool to extract or read text and images from PDFs" matching MCP tools:
- Fetches any public web page and returns clean, readable plain text stripped of HTML, navigation, scripts, advertisements, and boilerplate. Returns the page title, meta description, word count, and main body text ready for analysis or summarisation. Use this tool when an agent needs to read the content of a specific web page or article URL — for example to summarise an article, extract facts from a page, verify a claim by reading the source, or convert a web page into plain text to pass to another tool. Pass article URLs returned by web_news_headlines to this tool to read full article content. Do not use this tool to discover current news headlines — use web_news_headlines instead. Does not execute JavaScript — best suited for standard HTML content pages. Will not work with paywalled, login-protected, or JavaScript-rendered single-page applications.Connector
- Fetches a single URL and returns its content. Use this when you have a specific URL in mind — for example, after web.search returns a link you want to read, or when the user pastes a URL. Modes (extract): - 'auto' (default): picks the right mode based on response content type. - 'markdown': for HTML pages; returns cleaned markdown plus the page <title>. - 'text': for JSON/XML/plaintext APIs; returns the raw decoded body. - 'file': for images, PDFs, audio, video, archives, or any binary — ingests the bytes into the user's file storage and returns a file_id you can pass to messages.send (to send as an attachment), agents.add_file (to add to agent knowledge), or files.read. Use web.fetch (not files.upload) when you need the file_id immediately for the next tool call — files.upload(source_url=…) is async and won't have the file ready in the same turn. Use web.search (not web.fetch) when you don't have a specific URL yet and need to find one.Connector
- Extract structured transaction data from a contract at a URL. Downloads the document, extracts text (with OCR fallback for scanned PDFs), and runs PrimaCoda's contract-extraction prompt to return parties, addresses, dates, prices, and key contract fields. Use this when an agent has the contract hosted somewhere (Dropbox, Google Drive direct download, Square Space, etc.) and wants to skip the upload step. For multi-document deals (purchase + addenda + disclosures), use the PrimaCoda dashboard's batch upload — this tool handles ONE document. Args: pdf_url: Direct download URL for the contract (PDF, DOCX, TXT, or image). Must be reachable from the PrimaCoda server. Google Drive "shared link" URLs work if set to "anyone with link"; other share URLs may need their direct-download form. api_key: Your PrimaCoda MCP API key (starts 'pck_').Connector
- Upload a base64-encoded file to a site's container. Use this for binary files (images, archives, fonts, etc.). For text files, prefer write_file(). Requires: API key with write scope. Args: slug: Site identifier path: Relative path including filename (e.g. "images/logo.png") content_b64: Base64-encoded file content Returns: {"success": true, "path": "images/logo.png", "size": 45678} Errors: VALIDATION_ERROR: Invalid base64 encoding FORBIDDEN: Protected system pathConnector
- Retrieves the latest real-time news headlines and article summaries from BBC News and The Guardian across nine topic categories. Returns structured articles with headline, description, source name, article URL, and publication date — sorted most recent first. No API key required. Use this tool when an agent needs current news about a specific topic, wants to summarise today's headlines, needs to research recent events, monitor a subject area for new developments, or build a news briefing. Do not use this tool to read the full content of a specific article — use web_url_reader instead, passing the article URL returned by this tool. Do not use when news from sources outside BBC News and The Guardian is required.Connector
- Retrieves the latest real-time news headlines and article summaries from BBC News and The Guardian across nine topic categories. Returns structured articles with headline, description, source name, article URL, and publication date — sorted most recent first. No API key required. Use this tool when an agent needs current news about a specific topic, wants to summarise today's headlines, needs to research recent events, monitor a subject area for new developments, or build a news briefing. Do not use this tool to read the full content of a specific article — use web_url_reader instead, passing the article URL returned by this tool. Do not use when news from sources outside BBC News and The Guardian is required.Connector
Matching MCP Servers
- FlicenseBqualityBmaintenanceTurn natural language into 3D models in Fusion 360. 64 CAD tools including sketches, extrudes, fillets,and JIS standard parts.Last updated652
- FlicenseBqualityCmaintenanceEnables downloading videos from platforms like YouTube and converting them to text using OpenAI Whisper and ffmpeg. It supports multiple output formats including TXT, JSON, SRT, and VTT for transcriptions.Last updated212
Matching MCP Connectors
Transform any blog post or article URL into ready-to-post social media content for Twitter/X threads, LinkedIn posts, Instagram captions, Facebook posts, and email newsletters. Pay-per-event: $0.07 for all 5 platforms, $0.03 for single platform.
Daily world briefing that tells AI assistants what's actually happening right now. Leaders, conflicts, deaths, economic data, holidays. Updated daily so they stop getting current events wrong.
- Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.Connector
- Fetches clean text from any public HTTPS URL. Use x711_web_search first to find the URL, then this tool to read it. Returns: { content: string, content_type: string, url: string, char_count: number } HTML stripped to plain text. JSON returned as-is. Blocked: localhost, private IPs, .internal domains.Connector
- Read a workspace's doc (TipTap rich-text) body. Format is negotiable via `format`: `markdown` (default — CommonMark + GFM, ready to feed to an LLM or render in a non-ProseMirror surface), `content` (TipTap JSON, round-trippable into update_doc for structural edits), `text` (plain text, best for search, summarisation, word-count heuristics), or `all` for the legacy three-in-one shape. Default is `markdown` because it's the slice agents need 95% of the time and the JSON form on a long doc can blow past the agent harness's tool-result token cap. Pass `format: "content"` only when you're round-tripping into update_doc for a structural edit. A workspace can hold any combination of doc and table surfaces, one or many of either kind; omit `surface_slug` to read the primary doc surface, or pass it to target a specific doc tab (use `list_surfaces` to enumerate). An unwritten or absent doc returns the requested format empty (markdown="", content={}, text=""); a `surface_slug` that doesn't match any live doc surface 404s.Connector
- Delete a single item by id. `kind` MUST match the item type: 'text' for text nodes, 'line' for freehand strokes, 'image' for images — the wrong kind silently targets the wrong table and is a common mistake. Get the id + type from `get_board` (texts[], lines[], images[]). There is no bulk/erase-all tool: loop if you need to delete multiple items.Connector
- Get full catalog metadata for a single scholarly work by OpenAlex id (e.g. 'W2741809807') or DOI (e.g. '10.1038/nature12373'). Returns title, authors, venue, year, citation count, open-access status, and a free full-text URL when available. For a bare arXiv id, use paper_get_text with paper_key 'arxiv:<id>' to read indexed text, or paper_search by title for OpenAlex metadata.Connector
- Upload one or more images to a Wix site's Media Manager. Returns the uploaded file URL (wixstatic.com) and media ID usable in other Wix APIs. ⚠️ You MUST provide image data — calling this tool without image data will fail. ⚠️ NEVER call this tool more than once when uploading multiple images. Always pass ALL images together in a single call using the image array. Choose ONE of the two supported input methods: Option A — image array (use when the user attaches image files OR provides image URLs): Pass siteId + image array with ALL images at once. Each item requires download_url. If you are a ChatGPT/OpenAI client: user-attached files are automatically resolved to download_urls — just pass them in the image array. Even for a single image, wrap it in an array. Option B — imageBase64 (use only when you can read and encode the file yourself): Read the file, encode it as base64, and pass siteId + imageBase64 + mimeType. Supports one image at a time.Connector
- List all generated reports with status and summary info. Returns an array of report objects with id, report_type, status, title, and summary. Use the report id with atlas_get_report for details or atlas_download_report to download completed PDFs. Free.Connector
- Add a document to a deal's data room. Creates the deal if needed. This is the primary way to get documents into Sieve for screening. Upload a pitch deck, financials, or any document -- then call sieve_screen to analyze everything in the data room. Provide company_name to create a new deal (or find existing), or deal_id to add to an existing deal. Provide exactly one content source: file_path (local file), text (raw text/markdown), or url (fetch from URL). Args: title: Document title (e.g. "Pitch Deck Q1 2026"). company_name: Company name -- creates deal if new, finds existing if not. deal_id: Add to an existing deal (from sieve_deals or previous sieve_dataroom_add). website_url: Company website URL (used when creating a new deal). document_type: Type: 'pitch_deck', 'financials', 'legal', or 'other'. file_path: Path to a local file (PDF, DOCX, XLSX). The tool reads and uploads it. text: Raw text or markdown content (alternative to file). url: URL to fetch document from (alternative to file).Connector
- Extract text from PDFs and images as clean Markdown. Uses Mistral OCR — handles complex layouts, tables, handwriting, multi-column documents, and mathematical notation. Preserves document hierarchy in structured Markdown. 10 sats/page. Pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='extract_document' and quantity=pageCount for multi-page PDFs.Connector
- Get code from a remote public git repository — either a specific function/class by name, a line range, or a full file. PREFERRED WORKFLOW: When search results or findings have already identified a specific function, method, or class, use symbol_name to extract just that declaration. This avoids fetching entire files and keeps context focused. Only fetch full files when you need a broad understanding of a file you haven't seen before. For supported languages (Go, Python, TypeScript, JavaScript, Java, C, C++, C#, Kotlin, Swift, Rust) the response includes a symbols list of declarations with line ranges. This is not a first-call tool — use code_analyze or code_search first to identify targets, then extract precisely what you need.Connector
- Analyze an image from a component's datasheet using vision AI. Use this when read_datasheet returns a section containing images and you need to extract data from a graph, package drawing, pin diagram, or circuit schematic. Pass the image_key from the read_datasheet response (the storage path in the image URL). Optionally pass a specific question to focus the analysis. IMPORTANT: For precise numeric values (electrical specs, max ratings), prefer read_datasheet text tables first — they are more reliable than vision-extracted graph data. Use analyze_image for visual information not available in text: package dimensions from drawings, pin assignments from diagrams, graph trends, and approximate values from characteristic curves. Examples: - analyze_image(part_number='IRFZ44N', image_key='images/abc123.png') -> classifies and describes the image - analyze_image(part_number='IRFZ44N', image_key='images/abc123.png', question='What is the drain current at Vgs=5V?')Connector
- Fetches any public web page and returns clean, readable plain text stripped of HTML, navigation, scripts, advertisements, and boilerplate. Returns the page title, meta description, word count, and main body text ready for analysis or summarisation. Use this tool when an agent needs to read the content of a specific web page or article URL — for example to summarise an article, extract facts from a page, verify a claim by reading the source, or convert a web page into plain text to pass to another tool. Pass article URLs returned by web_news_headlines to this tool to read full article content. Do not use this tool to discover current news headlines — use web_news_headlines instead. Does not execute JavaScript — best suited for standard HTML content pages. Will not work with paywalled, login-protected, or JavaScript-rendered single-page applications.Connector
- Fetches any public web page and returns clean, readable plain text stripped of HTML, navigation, scripts, advertisements, and boilerplate. Returns the page title, meta description, word count, and main body text ready for analysis or summarisation. Use this tool when an agent needs to read the content of a specific web page or article URL — for example to summarise an article, extract facts from a page, verify a claim by reading the source, or convert a web page into plain text to pass to another tool. Pass article URLs returned by web_news_headlines to this tool to read full article content. Do not use this tool to discover current news headlines — use web_news_headlines instead. Does not execute JavaScript — best suited for standard HTML content pages. Will not work with paywalled, login-protected, or JavaScript-rendered single-page applications.Connector
- Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.Connector