searchscraper
Perform AI-powered web searches to extract structured data from search results for research, competitive analysis, and multi-source information gathering.
Instructions
Perform AI-powered web searches with structured data extraction.
This tool searches the web based on your query and uses AI to extract structured information from the search results. It is ideal for research, competitive analysis, and gathering information from multiple sources. Each website searched costs 10 credits (the default of 3 websites costs 30 credits). The operation is read-only but non-idempotent: results may vary over time.
Args:
user_prompt (str): Search query or natural language instructions for information to find.
- Can be a simple search query or detailed extraction instructions
- The AI will search the web and extract relevant data from found pages
- Be specific about what information you want extracted
- Examples:
* "Find latest AI research papers published in 2024 with author names and abstracts"
* "Search for Python web scraping tutorials with ratings and difficulty levels"
* "Get current cryptocurrency prices and market caps for top 10 coins"
* "Find contact information for tech startups in San Francisco"
* "Search for job openings for data scientists with salary information"
- Tips for better results:
* Include specific fields you want extracted
* Mention timeframes or filters (e.g., "latest", "2024", "top 10")
* Specify data types needed (prices, dates, ratings, etc.)
num_results (Optional[int]): Number of websites to search and extract data from.
- Default: 3 websites (costs 30 credits total)
- Range: 1-20 websites (recommended to stay under 10 for cost efficiency)
- Each website costs 10 credits, so total cost = num_results × 10
- Examples:
* 1: Quick single-source lookup (10 credits)
* 3: Standard research (30 credits) - good balance of coverage and cost
* 5: Comprehensive research (50 credits)
* 10: Extensive analysis (100 credits)
- Note: More results provide broader coverage but increase costs and processing time
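The cost rule above (total cost = num_results × 10) can be sketched as a small helper for budgeting a call in advance. The function name `estimate_credits` and the constants are illustrative assumptions, not part of the tool itself; only the 10-credit price, the default of 3, and the 1-20 range come from the description.

```python
# Values taken from the parameter description above.
CREDITS_PER_WEBSITE = 10
MIN_RESULTS, MAX_RESULTS = 1, 20


def estimate_credits(num_results: int = 3) -> int:
    """Return the expected credit cost for one call (hypothetical helper)."""
    if not MIN_RESULTS <= num_results <= MAX_RESULTS:
        raise ValueError(
            f"num_results must be between {MIN_RESULTS} and {MAX_RESULTS}"
        )
    return num_results * CREDITS_PER_WEBSITE


if __name__ == "__main__":
    # Reproduce the documented examples: 1, 3, 5 and 10 websites.
    for n in (1, 3, 5, 10):
        print(f"{n} websites -> {estimate_credits(n)} credits")
```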
number_of_scrolls (Optional[int]): Number of infinite scrolls per searched webpage.
- Default: 0 (no scrolling on search result pages)
- Range: 0-10 scrolls per page
- Useful when search results point to pages with dynamic content loading
- Each scroll waits for content to load before continuing
- Examples:
* 0: Static content pages, news articles, documentation
* 2: Social media pages, product listings with lazy loading
* 5: Extensive feeds, long-form content with infinite scroll
- Note: Increases processing time significantly (adds 5-10 seconds per scroll per page)
Returns: Dictionary containing:
- search_results: Array of extracted data from each website found
- sources: List of URLs that were searched and processed
- total_websites_processed: Number of websites successfully analyzed
- credits_used: Total credits consumed (num_results × 10)
- processing_time: Total time taken for search and extraction
- search_query_used: The actual search query sent to search engines
- metadata: Additional information about the search process
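To illustrate how the returned dictionary might be consumed, here is a minimal sketch that builds a short text summary from it. The key names come from the Returns description; the helper `summarize_result`, the sample values, and the value types (for example, processing_time as a float) are assumptions made for the example.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Build a short summary from a searchscraper result (hypothetical helper)."""
    processed = result.get("total_websites_processed", 0)
    credits = result.get("credits_used", 0)
    sources = result.get("sources", [])
    lines = [f"Processed {processed} website(s) for {credits} credits"]
    # List each URL that was searched and processed.
    lines.extend(f"- {url}" for url in sources)
    return "\n".join(lines)


# Hypothetical sample shaped like the documented return value; not real output.
sample = {
    "search_results": [{"title": "Example"}],
    "sources": ["https://example.com/a"],
    "total_websites_processed": 1,
    "credits_used": 10,
    "processing_time": 4.2,
    "search_query_used": "example query",
    "metadata": {},
}

if __name__ == "__main__":
    print(summarize_result(sample))
```

Reading keys with `.get` keeps the helper tolerant of missing fields, which may matter when some websites are inaccessible.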
Raises:
- ValueError: If user_prompt is empty or num_results is out of range
- HTTPError: If search engines are unavailable or return errors
- TimeoutError: If search or extraction process exceeds timeout limits
- RateLimitError: If too many requests are made in a short time period
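The ValueError conditions above can be checked on the client before spending credits. The following is a sketch under stated assumptions: `validate_args` is a hypothetical helper, and treating an out-of-range number_of_scrolls as a ValueError is an assumption, since the description documents the 0-10 range but not the error raised for it.

```python
def validate_args(
    user_prompt: str,
    num_results: int = 3,
    number_of_scrolls: int = 0,
) -> None:
    """Raise ValueError for arguments outside the documented limits."""
    # Documented: ValueError if user_prompt is empty.
    if not user_prompt or not user_prompt.strip():
        raise ValueError("user_prompt must not be empty")
    # Documented: ValueError if num_results is outside 1-20.
    if not 1 <= num_results <= 20:
        raise ValueError("num_results must be between 1 and 20")
    # Assumption: the 0-10 range is documented, the error type is not.
    if not 0 <= number_of_scrolls <= 10:
        raise ValueError("number_of_scrolls must be between 0 and 10")


if __name__ == "__main__":
    validate_args("Find contact information for tech startups in San Francisco")
    print("arguments look valid")
```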
Note:
- Results may vary between calls due to changing web content (non-idempotent)
- Search engines may return different results over time
- Some websites may be inaccessible or block automated access
- Processing time increases with num_results and number_of_scrolls
- Consider using smartscraper on specific URLs if you know the target sites
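Since processing time grows with both num_results and number_of_scrolls, a rough bound on the extra time spent scrolling can be worked out from the documented figure of 5-10 seconds per scroll per page. This sketch assumes the scroll time simply adds up across pages, which the description does not confirm (pages may be processed in parallel); `added_scroll_seconds` is a hypothetical helper.

```python
def added_scroll_seconds(
    num_results: int = 3,
    number_of_scrolls: int = 0,
) -> tuple[int, int]:
    """Return (low, high) estimates of extra seconds spent scrolling."""
    # One scroll event per scroll per page.
    scroll_events = num_results * number_of_scrolls
    # Documented range: 5-10 seconds per scroll per page.
    return scroll_events * 5, scroll_events * 10


if __name__ == "__main__":
    low, high = added_scroll_seconds(num_results=3, number_of_scrolls=2)
    print(f"Scrolling may add roughly {low}-{high} seconds")
```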
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
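| user_prompt | Yes | Search query or natural language instructions for the information to find | |
| num_results | No | Number of websites to search and extract data from (1-20, 10 credits each) | 3 |
| number_of_scrolls | No | Number of infinite scrolls per searched webpage (0-10) | 0 |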
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |