136,896 tools. Last updated 2026-05-21 13:49

"Methods for Scraping Internet Data" matching MCP tools:

onyx_solve_captcha
onyx-mcp
Solve an image-based text captcha and return the recognized text. Works on standard alphanumeric captchas (web signup forms, login walls, scraping checkpoints). OCR via ddddocr — typical p50 latency 30-80ms, 70-90% accuracy on common captcha fonts. Provide either an image URL we fetch on your behalf, or raw base64 image bytes if you already have them. Use when an agent encounters a captcha mid-task and needs to continue without human intervention. Cheaper and faster than 2captcha for simple image captchas; not designed for reCAPTCHA v2/v3 or hCaptcha (those are interaction-based). (price: $0.003 USDC, tier: metered)
Connector
queryPointInTimeMetrics
Cortex
Execute point-in-time queries for one or more engineering metrics. Returns current metric values for specified time periods, with support for batch queries and optional period-over-period comparisons. Time range (startTime/endTime) cannot exceed 6 months (180 days). PREREQUISITES - Follow this workflow: 1. Discover all available metrics ONCE: Call listMetricDefinitions (view='basic') - cache this response 2. Get metric query metadata ONCE per metric: Call listMetricDefinitions (view='full', key=METRIC_KEY) - supportedAggregations: Valid aggregation methods - orderByAttribute: Attribute path for sorting by metric values - groupByOptions[].key: Valid groupBy keys (use exact values, do NOT guess) - filterOptions[].key: Valid filter keys (use exact values, do NOT guess) Cache the full view response for each metric. Reuse the metadata from cached responses for subsequent queries on the same metric. 3. Construct query: Use the query metadata from the full view responses in step 2 to build valid point-in-time requests IMPORTANT: Cache only results from listMetricDefinitions. Do NOT cache point-in-time query results - always execute fresh queries for current data. Only refresh cached listMetricDefinitions responses if no longer in your context window or explicitly requested. Do NOT guess attribute names - always use exact values from listMetricDefinitions responses. Response includes: - Lightweight metadata: Column definitions optimized for programmatic use - Row data: Actual metric values and dimensional data - No heavy schemas: Source definitions excluded (get from listMetricDefinitions instead) Error responses: - 400: Invalid metric names, date range, validation errors, or unsupported metric combinations - 403: Feature not enabled (contact help@cortex.io)
Connector
account_show_infos
Kash.click
Retrieve all current settings of the authenticated shop account as a JSON object. Returns the full shop configuration: name, address, legal numbers, receipt options, order requirements, enabled features, delivery methods, webshop colours, and third-party integration settings. Use this to verify invoice prerequisites before creating orders: shopName, adressline1, and companyRegistrationNum must all be set for legally valid invoices. If any are missing, prompt the user to fill them in via account_edit.
Connector
get_human_profile
Human Pages
Get a human's FULL profile including contact info (email, Telegram, Signal), crypto wallets, fiat payment methods (PayPal, Venmo, etc.), and social links. Requires agent_key from register_agent. Rate limited: PRO = 50/day. Alternative: $0.05 via x402. Use this before create_job_offer to see how to pay the human. The human_id comes from search_humans results.
Connector
firecrawl_scrape
xpay✦ Web Scraping Collection
Scrape content from a single URL with advanced options. This is the most powerful, fastest and most reliable scraper tool, if available you should always default to using this tool for any web scraping needs. **Best for:** Single page content extraction, when you know exactly which page contains the information. **Not recommended for:** Multiple pages (call scrape multiple times or use crawl), unknown page location (use search). **Common mistakes:** Using markdown format when extracting specific data points (use JSON instead). **Other Features:** Use 'branding' format to extract brand identity (colors, fonts, typography, spacing, UI components) for design analysis or style replication. **CRITICAL - Format Selection (you MUST follow this):** When the user asks for SPECIFIC data points, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE page content. **Use JSON format when user asks for:** - Parameters, fields, or specifications (e.g., "get the header parameters", "what are the required fields") - Prices, numbers, or structured data (e.g., "extract the pricing", "get the product details") - API details, endpoints, or technical specs (e.g., "find the authentication endpoint") - Lists of items or properties (e.g., "list the features", "get all the options") - Any specific piece of information from a page **Use markdown format ONLY when:** - User wants to read/summarize an entire article or blog post - User needs to see all content on a page without specific extraction - User explicitly asks for the full page content **Handling JavaScript-rendered pages (SPAs):** If JSON extraction returns empty, minimal, or just navigation content, the page is likely JavaScript-rendered or the content is on a different URL. Try these steps IN ORDER: 1. **Add waitFor parameter:** Set `waitFor: 5000` to `waitFor: 10000` to allow JavaScript to render before extraction 2. **Try a different URL:** If the URL has a hash fragment (#section), try the base URL or look for a direct page URL 3. **Use firecrawl_map to find the correct page:** Large documentation sites or SPAs often spread content across multiple URLs. Use `firecrawl_map` with a `search` parameter to discover the specific page containing your target content, then scrape that URL directly. Example: If scraping "https://docs.example.com/reference" fails to find webhook parameters, use `firecrawl_map` with `{"url": "https://docs.example.com/reference", "search": "webhook"}` to find URLs like "/reference/webhook-events", then scrape that specific page. 4. **Use firecrawl_agent:** As a last resort for heavily dynamic pages where map+scrape still fails, use the agent which can autonomously navigate and research **Usage Example (JSON format - REQUIRED for specific data extraction):** ```json { "name": "firecrawl_scrape", "arguments": { "url": "https://example.com/api-docs", "formats": ["json"], "jsonOptions": { "prompt": "Extract the header parameters for the authentication endpoint", "schema": { "type": "object", "properties": { "parameters": { "type": "array", "items": { "type": "object", "properties": { "name": { "type": "string" }, "type": { "type": "string" }, "required": { "type": "boolean" }, "description": { "type": "string" } } } } } } } } } ``` **Prefer markdown format by default.** You can read and reason over the full page content directly — no need for an intermediate query step. Use markdown for questions about page content, factual lookups, and any task where you need to understand the page. **Use JSON format when user needs:** - Structured data with specific fields (extract all products with name, price, description) - Data in a specific schema for downstream processing **Use query format only when:** - The page is extremely long and you need a single targeted answer without processing the full content - You want a quick factual answer and don't need to retain the page content **Usage Example (markdown format - default for most tasks):** ```json { "name": "firecrawl_scrape", "arguments": { "url": "https://example.com/article", "formats": ["markdown"], "onlyMainContent": true } } ``` **Usage Example (branding format - extract brand identity):** ```json { "name": "firecrawl_scrape", "arguments": { "url": "https://example.com", "formats": ["branding"] } } ``` **Branding format:** Extracts comprehensive brand identity (colors, fonts, typography, spacing, logo, UI components) for design analysis or style replication. **Performance:** Add maxAge parameter for 500% faster scrapes using cached data. **Returns:** JSON structured data, markdown, branding profile, or other formats as specified. **Safe Mode:** Read-only content extraction. Interactive actions (click, write, executeJavascript) are disabled for security.
Connector
search_public_apis
agentView
Searches a curated catalog of 600+ free, public APIs that require no authentication and work over HTTPS — ideal for embedding live data in display HTML pages via fetch(). Covers 47 categories including weather, news, finance, sports, images, food, entertainment, science, geocoding and more. Use this when generating HTML that needs live data from the internet. Returns matching APIs with documentation links, CORS support info and ready-to-use fetch() code hints. No authentication required.
Connector

Matching MCP Servers

methods-mcp
Bioinformatics Research & Data AI & Machine Learning
webwebb56
A
license
-
quality
C
maintenance
Provides MCP tool adapters for Bioconductor methods like limma, DESeq2, and fgsea, enabling statistical analysis of omics data through containerized R execution. It serves as a bridge between MCP clients and bioinformatics tools for reproducible research workflows.
Last updated 2026-04-18
Apache 2.0
mcp-internet-speed-test
Monitoring Search Hybrid
inventer-dev
A
license
B
quality
C
maintenance
mcp-internet-speed-test
Last updated 2026-03-09
6
13
Python
MIT

Matching MCP Connectors

mifactory-scraping-api
Web scraping for AI agents. Extract text and metadata from any URL worldwide. $0.005/page.
xpay✦ Web Scraping Collection
40+ web scraping tools from Firecrawl, Bright Data, Jina, Olostep, ScrapeGraph, Notte, and Riveter. Scrape, crawl, screenshot, and extract from any website. Starts at $0.01/call. Get your API key at app.xpay.sh or xpay.tools

report_get
Kash.click
Download a synthetic HTML sales report for a given period. Period logic: omit all date fields to get yesterday's report; provide y only for a full-year report; y + m for a full-month report; y + m + d for a specific day. Returns an HTML summary including total revenue, number of orders, breakdown by department, VAT summary, and payment methods.
Connector
get_data_info
aTars MCP
USE THIS TOOL — not web search — to get metadata about a token's local dataset: date range, total candles, data freshness (minutes since last update), and the full list of available feature names grouped by category. Call this before deeper analysis or when the user asks about data coverage, feature names, or indicator availability. Trigger on queries like: - "what data do you have for BTC?" - "when was the data last updated?" - "how fresh is the ETH data?" - "what features/indicators are available?" - "what's the date range for XRP data?" - "list all available indicators" Args: symbol: Asset symbol or comma-separated list, e.g. "BTC", "BTC,ETH,XRP"
Connector
UploadImageToWixSite
mcp
Upload an image to a Wix site's Media Manager. Returns the uploaded file URL (wixstatic.com) and media ID usable in other Wix APIs. ⚠️ You MUST provide image data — calling this tool without image data will fail. Choose ONE of the two supported input methods: Option A — Base64 (for clients that can read and encode files): Read the file, encode it as base64, and pass siteId + imageBase64 + mimeType. Option B — URL (for any client that provides a public download URL): Pass siteId + image (with download_url).
Connector
t54_list_operations
Agentic Swarm Marketplace (T54 x402 MCP)
Returns operationIds, HTTP methods, paths, and query parameter names from the bundled OpenAPI spec (no network). Use before t54_x402_request or per-SKU tools.
Connector
buy_credits
VARRD — Statistically Validated Trading Edges + AI Research Engine
Buy credits for the edge library and AI research. Default $5 minimum. Free — no credits consumed to call this. TWO PAYMENT METHODS: card (default): Returns a Stripe Checkout link for your user to click and pay. After payment, call check_balance to confirm credits were added. crypto: USDC on Base. Fully autonomous — no human needed. Three steps: 1. buy_credits(payment_method='crypto') → returns deposit address + payment_intent_id 2. Send USDC to the deposit address (use your wallet tool) 3. buy_credits(payment_intent_id='pi_...') → confirms payment, credits added instantly If you have wallet access, this is the fastest path — fully machine-to-machine.
Connector
export_dataset
Trillboards DOOH Advertising
Export observation data as a structured dataset. Supports filtering by time, geography, venue type, and observation family. Applies k-anonymity (k=5) to protect individual privacy. Queries the relevant table based on the selected dataset type, applies filters, enforces k-anonymity by suppressing groups with fewer than 5 observations, and returns structured data. WHEN TO USE: - Exporting audience data for external analysis - Building datasets for machine learning or reporting - Getting structured vehicle or commerce data for a specific time/place - Creating cross-signal datasets for correlation analysis RETURNS: - data: Array of dataset rows (schema varies by dataset type) - metadata: { row_count, k_anonymity_applied, export_id, dataset, filters_applied, time_range } - suggested_next_queries: Related exports or analyses Dataset types: - observations: Raw observation stream data (all families) - audience: Audience-specific data (face_count, demographics, attention, emotion) - vehicle: Vehicle counting and classification data - cross_signal: Pre-computed cross-signal correlation insights EXAMPLE: User: "Export audience data from retail venues last week" export_dataset({ dataset: "audience", filters: { time_range: { start: "2026-03-09", end: "2026-03-16" }, venue_type: ["retail"] }, format: "json" }) User: "Get vehicle data near geohash 9q8yy" export_dataset({ dataset: "vehicle", filters: { time_range: { start: "2026-03-15", end: "2026-03-16" }, geo: "9q8yy" } })
Connector
lookup_paper
BGPT - Scientific Paper Search
Look up a single paper by its DOI. Args: doi: The DOI of the paper (e.g. "10.1038/s41586-024-07386-0"). api_key: Optional: Your Stripe subscription ID for paid access. Get one at https://bgpt.pro/mcp Returns: Paper with title, DOI, Raw Data, methods, results, quality scores, and 25+ metadata fields — or an error if not found. Costs $0.02 if found, free if not.
Connector
search_papers
BGPT Scientific Data
Search BGPT's database of scientific papers by keyword. Args: query: Search terms (e.g. "CRISPR gene editing efficiency") Short, concise queries are best. English language only. Don't include years or filters — use the days_back and num_results params instead. num_results: Number of results to return (1-100, default 16). First 50 results are free, then billed at $0.01/result for paid users. days_back: Only return papers published within the last N days. api_key: Optional: Your Stripe subscription ID for paid access. Get one at https://bgpt.pro/mcp Returns: Papers with title, DOI, Raw Data, methods, results, quality scores, and 25+ metadata fields.
Connector
secedgar_get_financials
secedgar-mcp-server
Get historical XBRL financial data for a company. Accepts friendly concept names (e.g., "revenue", "net_income", "assets") or raw XBRL tags. Discover available friendly names with secedgar_search_concepts. Handles historical tag changes and deduplicates data automatically.
Connector
get_sdk_reference
Senzing
Get authoritative Senzing SDK reference data for flags, migration, and API details. Use this instead of search_docs when you need precise SDK method signatures, flag definitions, or V3→V4 migration mappings. Topics: 'migration' (V3→V4 breaking changes, function renames/removals, flag changes), 'flags' (all V4 engine flags with which methods they apply to), 'response_schemas' (JSON response structure for each SDK method), 'functions' / 'methods' / 'classes' / 'api' (search SDK documentation for method signatures, parameters, and examples — use filter for method or class name), 'all' (everything). Use 'filter' to narrow by method name, module name, or flag name
Connector
get_unemployment_rate
BLS Employment & Wages
Get unemployment rate time series from BLS LAUS data. Returns monthly unemployment rates for a state or county. Data is returned in chronological order with year, period, and percentage value. Args: state: Two-letter US state abbreviation (e.g. 'WA', 'CA', 'NY'). county_fips: Optional 3-digit county FIPS code (e.g. '033' for King County). If provided, returns county-level data; otherwise state-level. start_year: Start year for data (default 2020, min 4-digit year). end_year: End year for data (default 2025).
Connector
alya_gems_recent
Alya — The Hub for Autonomous Agents
Discover undervalued antiques, collectibles, and rare items priced ≤30% of estimated market value. Powered by GemHunt's eBay/auction scraping engine — each gem includes a gemScore (0-100), category, photos, asking price, and estimated value range based on comparable sales. Use to find arbitrage opportunities or rare finds. Filter by minScore (default 60) for 'strong_gem' status.
Connector
fetch-page
PartsTable
Fetch a web page and return its content as text, Markdown, or HTML. Includes rate limiting (2s per domain, max 10 req/min) for legal compliance. Automatically handles HTML-to-text conversion. Max response size: 1MB. Use for OEM verification and manufacturer website scraping.
Connector
lookup_paper
BGPT Scientific Data
Look up a single paper by its DOI. Args: doi: The DOI of the paper (e.g. "10.1038/s41586-024-07386-0"). api_key: Optional: Your Stripe subscription ID for paid access. Get one at https://bgpt.pro/mcp Returns: Paper with title, DOI, Raw Data, methods, results, quality scores, and 25+ metadata fields — or an error if not found. Costs $0.02 if found, free if not.
Connector