
ScrapeGraph MCP Server

searchscraper

Perform AI-powered web searches to extract structured data from search results for research, competitive analysis, and multi-source information gathering.

Instructions

Perform AI-powered web searches with structured data extraction.

This tool searches the web based on your query and uses AI to extract structured information from the search results. Ideal for research, competitive analysis, and gathering information from multiple sources. Each website searched costs 10 credits (default 3 websites = 30 credits). Read-only operation but results may vary over time (non-idempotent).

Args:

user_prompt (str): Search query or natural language instructions for information to find.
  - Can be a simple search query or detailed extraction instructions
  - The AI will search the web and extract relevant data from found pages
  - Be specific about what information you want extracted
  - Examples:
    * "Find latest AI research papers published in 2024 with author names and abstracts"
    * "Search for Python web scraping tutorials with ratings and difficulty levels"
    * "Get current cryptocurrency prices and market caps for top 10 coins"
    * "Find contact information for tech startups in San Francisco"
    * "Search for job openings for data scientists with salary information"
  - Tips for better results:
    * Include the specific fields you want extracted
    * Mention timeframes or filters (e.g., "latest", "2024", "top 10")
    * Specify the data types needed (prices, dates, ratings, etc.)

num_results (Optional[int]): Number of websites to search and extract data from.
  - Default: 3 websites (costs 30 credits total)
  - Range: 1-20 websites (recommended to stay under 10 for cost efficiency)
  - Each website costs 10 credits, so total cost = num_results × 10 (see the cost sketch below)
  - Examples:
    * 1: Quick single-source lookup (10 credits)
    * 3: Standard research (30 credits) - good balance of coverage and cost
    * 5: Comprehensive research (50 credits)
    * 10: Extensive analysis (100 credits)
  - Note: More results provide broader coverage but increase cost and processing time

number_of_scrolls (Optional[int]): Number of infinite scrolls per searched webpage.
  - Default: 0 (no scrolling on search result pages)
  - Range: 0-10 scrolls per page
  - Useful when search results point to pages with dynamic content loading
  - Each scroll waits for content to load before continuing
  - Examples:
    * 0: Static content pages, news articles, documentation
    * 2: Social media pages, product listings with lazy loading
    * 5: Extensive feeds, long-form content with infinite scroll
  - Note: Increases processing time significantly (adds 5-10 seconds per scroll per page)
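To make the cost arithmetic above concrete, here is a minimal sketch. estimate_search_credits is a hypothetical helper based solely on the documented 10-credits-per-website rate and the default of 3 websites; it is not part of the server.

    from typing import Optional

    # Hypothetical helper illustrating the documented credit model:
    # each searched website costs 10 credits; num_results defaults to 3.
    CREDITS_PER_WEBSITE = 10   # per the parameter documentation above
    DEFAULT_NUM_RESULTS = 3

    def estimate_search_credits(num_results: Optional[int] = None) -> int:
        """Return the expected credit cost of one searchscraper call."""
        websites = DEFAULT_NUM_RESULTS if num_results is None else num_results
        if not 1 <= websites <= 20:
            raise ValueError("num_results must be between 1 and 20")
        return websites * CREDITS_PER_WEBSITE

    # estimate_search_credits()   -> 30  (default: 3 websites)
    # estimate_search_credits(10) -> 100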

Returns: Dictionary containing:
  - search_results: Array of extracted data from each website found
  - sources: List of URLs that were searched and processed
  - total_websites_processed: Number of websites successfully analyzed
  - credits_used: Total credits consumed (num_results × 10)
  - processing_time: Total time taken for search and extraction
  - search_query_used: The actual search query sent to search engines
  - metadata: Additional information about the search process

Raises:
  - ValueError: If user_prompt is empty or num_results is out of range
  - HTTPError: If search engines are unavailable or return errors
  - TimeoutError: If the search or extraction process exceeds timeout limits
  - RateLimitError: If too many requests are made in a short time period

Note:
  - Results may vary between calls due to changing web content (non-idempotent)
  - Search engines may return different results over time
  - Some websites may be inaccessible or block automated access
  - Processing time increases with num_results and number_of_scrolls
  - Consider using smartscraper on specific URLs if you know the target sites
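For orientation, here is a minimal sketch of calling this tool from an MCP client with the official mcp Python SDK. The launch command (uvx scrapegraph-mcp) and the SGAI_API_KEY environment variable are assumptions about how the server is started and authenticated; adjust both to match your installation.

    import asyncio
    import os

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main() -> None:
        # Assumption: the server runs over stdio; the exact command and the
        # API-key environment variable depend on how you installed it.
        server = StdioServerParameters(
            command="uvx",
            args=["scrapegraph-mcp"],
            env={"SGAI_API_KEY": os.environ["SGAI_API_KEY"]},
        )
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.call_tool(
                    "searchscraper",
                    arguments={
                        "user_prompt": "Get current cryptocurrency prices for the top 10 coins",
                        "num_results": 3,        # 3 websites = 30 credits
                        "number_of_scrolls": 0,  # no infinite scrolling
                    },
                )
                print(result.content)

    asyncio.run(main())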

Input Schema

Name                Required   Description                                              Default
user_prompt         Yes        Search query or natural language extraction instructions  -
num_results         No         Number of websites to search (1-20)                       3
number_of_scrolls   No         Infinite scrolls per searched page (0-10)                 0
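The JSON Schema view of this table is not reproduced here. As a rough approximation (an assumption, not the server's exact output), FastMCP derives something along these lines from the function signature; optional fields may be encoded differently in the real schema.

    # Approximate shape of the generated input schema (assumption).
    SEARCHSCRAPER_INPUT_SCHEMA = {
        "type": "object",
        "properties": {
            "user_prompt": {"type": "string"},
            "num_results": {"type": "integer"},
            "number_of_scrolls": {"type": "integer"},
        },
        "required": ["user_prompt"],
    }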

Implementation Reference

  • MCP tool handler function for 'searchscraper'. Registers the tool, defines input schema via type hints and docstring, handles authentication via get_api_key, instantiates ScapeGraphClient, and delegates to the client's searchscraper method.
    @mcp.tool(annotations={"readOnlyHint": True, "destructiveHint": False, "idempotentHint": False})
    def searchscraper(
        user_prompt: str,
        ctx: Context,
        num_results: Optional[int] = None,
        number_of_scrolls: Optional[int] = None,
    ) -> Dict[str, Any]:
        """
        Perform AI-powered web searches with structured data extraction.

        This tool searches the web based on your query and uses AI to extract
        structured information from the search results. Ideal for research,
        competitive analysis, and gathering information from multiple sources.
        Each website searched costs 10 credits (default 3 websites = 30 credits).
        Read-only operation but results may vary over time (non-idempotent).

        [... docstring abridged; the full Args/Returns/Raises/Note text is
        identical to the Instructions section above ...]
        """
        try:
            api_key = get_api_key(ctx)
            client = ScapeGraphClient(api_key)
            return client.searchscraper(user_prompt, num_results, number_of_scrolls)
        except Exception as e:
            return {"error": str(e)}
  • Core implementation of searchscraper in ScapeGraphClient class. Constructs POST request to API endpoint https://api.scrapegraphai.com/v1/searchscraper with user_prompt and optional num_results/number_of_scrolls, handles HTTP response and errors.
    def searchscraper(
        self,
        user_prompt: str,
        num_results: int = None,
        number_of_scrolls: int = None,
    ) -> Dict[str, Any]:
        """
        Perform AI-powered web searches with structured results.

        Args:
            user_prompt: Search query or instructions
            num_results: Number of websites to search (optional, default: 3 websites = 30 credits)
            number_of_scrolls: Number of infinite scrolls to perform on each website (optional)

        Returns:
            Dictionary containing search results and reference URLs
        """
        url = f"{self.BASE_URL}/searchscraper"
        data = {"user_prompt": user_prompt}

        # Add num_results to the request if provided
        if num_results is not None:
            data["num_results"] = num_results

        # Add number_of_scrolls to the request if provided
        if number_of_scrolls is not None:
            data["number_of_scrolls"] = number_of_scrolls

        response = self.client.post(url, headers=self.headers, json=data)

        if response.status_code != 200:
            error_msg = f"Error {response.status_code}: {response.text}"
            raise Exception(error_msg)

        return response.json()
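For reference, the same request can be issued outside the MCP server. The sketch below uses httpx and mirrors the endpoint and JSON body shown above; the SGAI-APIKEY header name is an assumption (the snippet only shows an opaque self.headers), so confirm it against the ScrapeGraphAI API documentation.

    from typing import Any, Dict, Optional

    import httpx

    def searchscraper_direct(
        api_key: str,
        user_prompt: str,
        num_results: Optional[int] = None,
        number_of_scrolls: Optional[int] = None,
    ) -> Dict[str, Any]:
        # Endpoint and payload keys are taken from the client code above;
        # the header name is an assumption.
        payload: Dict[str, Any] = {"user_prompt": user_prompt}
        if num_results is not None:
            payload["num_results"] = num_results
        if number_of_scrolls is not None:
            payload["number_of_scrolls"] = number_of_scrolls

        response = httpx.post(
            "https://api.scrapegraphai.com/v1/searchscraper",
            headers={"SGAI-APIKEY": api_key, "Content-Type": "application/json"},
            json=payload,
            timeout=120.0,  # search plus extraction can take a while
        )
        response.raise_for_status()
        return response.json()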
  • FastMCP decorator that registers the searchscraper tool with annotations indicating it's read-only but non-idempotent.
    @mcp.tool(annotations={"readOnlyHint": True, "destructiveHint": False, "idempotentHint": False})
  • Input schema defined by function signature (user_prompt required str, optional num_results and number_of_scrolls ints) and comprehensive docstring describing parameters, constraints, examples, returns, and errors.
    def searchscraper(
        user_prompt: str,
        ctx: Context,
        num_results: Optional[int] = None,
        number_of_scrolls: Optional[int] = None,
    ) -> Dict[str, Any]:
        ...  # body and docstring identical to the handler shown above
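To show how the documented return shape is typically consumed, here is a short sketch that walks the fields listed under Returns. The field names follow that documentation and the handler's {"error": ...} failure shape, but real responses may nest or name fields differently.

    from typing import Any, Dict

    def summarize_result(result: Dict[str, Any]) -> None:
        # The handler above returns {"error": "..."} when anything goes wrong.
        if "error" in result:
            print(f"searchscraper failed: {result['error']}")
            return

        # Field names follow the Returns section of the docstring above.
        print(f"Websites processed: {result.get('total_websites_processed')}")
        print(f"Credits used:       {result.get('credits_used')}")

        for url in result.get("sources", []):
            print(f"  source: {url}")

        for item in result.get("search_results", []):
            print(item)  # structure depends on the user_prompt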
