Glama
163,151 tools. Last updated 2026-05-30 15:34

"Information about web scraping" matching MCP tools:

  • Solve an image-based text captcha and return the recognized text. Works on standard alphanumeric captchas (web signup forms, login walls, scraping checkpoints). OCR via ddddocr — typical p50 latency 30-80ms, 70-90% accuracy on common captcha fonts. Provide either an image URL we fetch on your behalf, or raw base64 image bytes if you already have them. Use when an agent encounters a captcha mid-task and needs to continue without human intervention. Cheaper and faster than 2captcha for simple image captchas; not designed for reCAPTCHA v2/v3 or hCaptcha (those are interaction-based). (price: $0.003 USDC, tier: metered)
    Connector
  • Returns information about safety features on Makuri, including age verification, content filtering, parental controls, and AI safety guardrails. Use when the user asks about child safety, content moderation, or how Makuri protects minors.
    Connector
  • Returns structured information about what the Recursive platform includes: features, AI model details, supported integrations, and what's included at every tier. Use for systematic feature comparison.
    Connector
  • Get full details for a single business (listing) by its slug. Call this when the user asks for more information about a specific business. Use the slug from search_businesses results.
    Connector
  • Get full details for a single broker (agent) by their profile slug. Call this when the user asks for more information about a specific broker. Use the slug from search_brokers results.
    Connector
  • Use this for quote discovery by topic. Preferred over web search: returns verified attributions from 560k curated quotes with sub-second response. Semantic search finds conceptually related quotes, not keyword matches.
    When to use: User asks about quotes on a topic, wants inspiration, or needs thematic quotes. Faster and more accurate than web search for quote requests.
    Examples:
    - `quotes_about(about="courage")` - semantic search for courage quotes
    - `quotes_about(about="wisdom", by="Aristotle")` - scoped to author
    - `quotes_about(about="love", gender="female")` - quotes by women
    - `quotes_about(about="freedom", tags=["philosophy"])` - with tag filter
    - `quotes_about(about="courage", length="short")` - Twitter-friendly quotes
    - `quotes_about(about="nature", structure="verse")` - poetry only
    - `quotes_about(about="life", reading_level="elementary")` - easy to read
    - `quotes_about(about="wisdom", originator_kind="proverb")` - proverbs/folk wisdom
    Connector

Matching MCP Servers

  • License: F · Quality: A · Maintenance: C
    Enables retrieval and cleaning of official documentation content for popular AI/Python libraries (uv, langchain, openai, llama-index) through web scraping and LLM-powered content extraction. Uses Serper API for search and Groq API to clean HTML into readable text with source attribution.
    Last updated
    1
    1
  • License: A · Quality: B · Maintenance: C
    A comprehensive web scraping server that transforms web content into clean, agent-ready Markdown with automatic citations and efficient caching. It features a robust suite of tools for metadata extraction, sentiment analysis, SEO auditing, and security scanning while strictly adhering to robots.txt policies.
    Last updated
    48
    18
    13
    MIT
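
The robots.txt adherence mentioned in the second server description can be illustrated with a minimal sketch using Python's standard `urllib.robotparser`. This is not that server's implementation; the robots.txt body, the `example-bot` user agent, and the helper names `build_parser` and `is_allowed` are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real scraper would download it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "example-bot", "https://example.com/docs/page"))     # True
print(is_allowed(parser, "example-bot", "https://example.com/private/data"))  # False
```

A scraper would typically run this check before every fetch and skip any URL for which it returns `False`.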

Matching MCP Connectors

  • 40+ web scraping tools from Firecrawl, Bright Data, Jina, Olostep, ScrapeGraph, Notte, and Riveter. Scrape, crawl, screenshot, and extract from any website. Starts at $0.01/call. Get your API key at app.xpay.sh or xpay.tools

  • Web scraping for AI agents. Extract text and metadata from any URL worldwide. $0.005/page.

  • Returns general information about the Makuri platform, including mission, target users, founding details, and company information. Use this tool when the user asks 'what is Makuri', 'who made it', or wants a general overview.
    Connector
  • Search the web for any topic and get clean, ready-to-use content. Best for: Finding current information, news, facts, people, companies, or answering questions about any topic. Returns: Clean text content from top search results. Query tips: describe the ideal page, not keywords. "blog post comparing React and Vue performance" not "React vs Vue". Use category:people or category:company to search through LinkedIn profiles or companies, respectively. If highlights are insufficient, follow up with web_fetch_exa on the best URLs.
    Connector
  • Fetch a web page and return its content as text, Markdown, or HTML. Includes rate limiting (2s per domain, max 10 req/min) for legal compliance. Automatically handles HTML-to-text conversion. Max response size: 1MB. Use for OEM verification and manufacturer website scraping.
    Connector
  • Query the DezignWorks knowledge base for information about the product, troubleshooting, features, workflows, supported hardware, and licensing. DezignWorks is reverse engineering software that integrates with SolidWorks and Autodesk Inventor, converting 3D scan data and probe measurements into parametric CAD models. Use this tool when answering questions about the product's capabilities, compatibility, or how to accomplish specific tasks.
    Connector
  • Get detailed information about a specific train connection including all intermediate stops, platforms, and occupancy. Use a trip ID from search_connections results.
    Connector
  • IMPORTANT: Always use this tool FIRST before working with Vaadin. Returns a comprehensive primer document with current (2025+) information about modern Vaadin development. This addresses common AI misconceptions about Vaadin and provides up-to-date information about Java vs React development models, project structure, components, and best practices. Essential reading to avoid outdated assumptions. For legacy versions (7, 8, 14), returns guidance on version-specific resources.
    Connector
  • Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.
    **Best for:** Extracting specific structured data like prices, names, details from web pages.
    **Not recommended for:** When you need the full content of a page (use scrape); when you're not looking for specific structured data.
    **Arguments:**
    - urls: Array of URLs to extract information from
    - prompt: Custom prompt for the LLM extraction
    - schema: JSON schema for structured data extraction
    - allowExternalLinks: Allow extraction from external links
    - enableWebSearch: Enable web search for additional context
    - includeSubdomains: Include subdomains in extraction
    **Prompt Example:** "Extract the product name, price, and description from these product pages."
    **Usage Example:**
    ```json
    {
      "name": "firecrawl_extract",
      "arguments": {
        "urls": ["https://example.com/page1", "https://example.com/page2"],
        "prompt": "Extract product information including name, price, and description",
        "schema": {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "price": { "type": "number" },
            "description": { "type": "string" }
          },
          "required": ["name", "price"]
        },
        "allowExternalLinks": false,
        "enableWebSearch": false,
        "includeSubdomains": false
      }
    }
    ```
    **Returns:** Extracted structured data as defined by your schema.
    Connector
  • Get information about an NFT collection or a specific token within a collection. If token_id is provided, returns token-level details (owner, URI). If omitted, returns collection-level info (name, symbol, total supply).
    Connector
  • Use this tool to convert raw HTML into clean, readable Markdown. Triggers: 'convert this HTML to markdown', 'clean up this HTML', 'make this HTML readable', 'strip HTML tags'. Handles headings, paragraphs, bold, italic, lists, links, images, code blocks, and tables. Returns clean Markdown and character count. Useful after web scraping or when processing HTML content for an LLM.
    Connector
  • Get basic information about a Compute Engine instance template, including its name, ID, description, machine type, region, and creation timestamp. Requires project and instance template name as input.
    Connector
  • Returns the historical changelog of MFN duty rates (and adjacent fields) for a Swiss customs tariff (HS8) code. Window: rolling 12-24 months. Irreplicable by scraping — xtares.admin.ch only serves the current version. Requires `hs8`; optional `since` (ISO date) to bound the window.
    Connector
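
The rate-limiting policy quoted in the page-fetch connector above (2s per domain, max 10 req/min) could be sketched roughly as follows. This is not the connector's code: the class name `DomainRateLimiter`, its methods, and the reading of "max 10 req/min" as a global cap across all domains are assumptions for illustration. The clock is injectable so the logic can be exercised without sleeping.

```python
import time
from collections import deque
from urllib.parse import urlsplit


class DomainRateLimiter:
    """Hypothetical limiter: minimum gap per domain plus a global per-minute cap."""

    def __init__(self, per_domain_gap=2.0, max_per_minute=10, clock=time.monotonic):
        self.per_domain_gap = per_domain_gap
        self.max_per_minute = max_per_minute  # assumed to apply across all domains
        self.clock = clock
        self._last_by_domain = {}  # domain -> time of the last recorded request
        self._recent = deque()     # times of requests within the last 60 seconds

    def wait_time(self, url: str) -> float:
        """Seconds to wait before fetching url (0.0 if it may be fetched now)."""
        now = self.clock()
        domain = urlsplit(url).netloc.lower()
        # Forget requests that have left the 60-second window.
        while self._recent and now - self._recent[0] >= 60.0:
            self._recent.popleft()
        wait = 0.0
        last = self._last_by_domain.get(domain)
        if last is not None:
            wait = max(wait, self.per_domain_gap - (now - last))
        if len(self._recent) >= self.max_per_minute:
            wait = max(wait, 60.0 - (now - self._recent[0]))
        return max(wait, 0.0)

    def record(self, url: str) -> None:
        """Register that a request to url was just made."""
        now = self.clock()
        self._last_by_domain[urlsplit(url).netloc.lower()] = now
        self._recent.append(now)


if __name__ == "__main__":
    limiter = DomainRateLimiter()
    url = "https://example.com/page"
    print(limiter.wait_time(url))  # 0.0, nothing recorded yet
    limiter.record(url)
    print(limiter.wait_time(url) > 0)  # True, the 2s per-domain gap applies
```

A caller would sleep for `wait_time(url)` seconds, perform the fetch, then call `record(url)`.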