smartcrawler_fetch_results
Retrieve results from an asynchronous web crawling operation by polling with a request ID until completion, providing structured data or markdown content from processed webpages.
Instructions
Retrieve the results of an asynchronous SmartCrawler operation.
This tool fetches the results of a previously initiated crawling operation using its request_id. The crawl runs asynchronously in the background, so keep polling this endpoint until the status field indicates 'completed'; while the crawl is still processing, you'll receive status updates instead. This is a read-only operation that retrieves results without side effects.
Args:
- request_id: The unique request ID returned by smartcrawler_initiate. Use it to retrieve the crawling results, and keep polling until status is 'completed'. Example: 'req_abc123xyz'

Returns: A dictionary containing:
- status: current status of the crawl operation ('processing', 'completed', 'failed')
- results: the crawled data (structured extraction or markdown), included once completed
- metadata: information about processed pages, URLs visited, and processing statistics

Keep polling until status is 'completed' to get the final results.
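Since completion is signalled only through the status field, callers typically wrap this tool in a polling loop. A minimal sketch, where `call_tool` is a hypothetical stand-in for your MCP client's tool-invocation function and the 5-second interval is an arbitrary choice:

```python
import time

def wait_for_crawl(call_tool, request_id: str, interval: float = 5.0,
                   timeout: float = 300.0) -> dict:
    """Poll smartcrawler_fetch_results until the crawl completes or fails."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call_tool("smartcrawler_fetch_results", {"request_id": request_id})
        status = result.get("status")
        if status == "completed":
            return result  # 'results' and 'metadata' are populated now
        if status == "failed" or "error" in result:
            raise RuntimeError(f"Crawl did not complete: {result}")
        time.sleep(interval)  # still 'processing'; wait before polling again
    raise TimeoutError(f"Crawl {request_id} did not finish within {timeout}s")
```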
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| request_id | Yes | The unique request ID returned by smartcrawler_initiate (e.g. 'req_abc123xyz') | (none) |
Implementation Reference
- `src/scrapegraph_mcp/server.py:1822-1847` (handler): MCP tool handler that wraps the client method. It reads the API key from the request context, creates a `ScapeGraphClient`, calls its `smartcrawler_fetch_results` method, and returns an error dictionary if an exception is raised.

```python
def smartcrawler_fetch_results(request_id: str, ctx: Context) -> Dict[str, Any]:
    """
    Retrieve the results of an asynchronous SmartCrawler operation.

    This tool fetches the results from a previously initiated crawling
    operation using the request_id. The crawl request processes
    asynchronously in the background. Keep polling this endpoint until
    the status field indicates 'completed'. While processing, you'll
    receive status updates. Read-only operation that safely retrieves
    results without side effects.

    Args:
        request_id: The unique request ID returned by smartcrawler_initiate.
            Use this to retrieve the crawling results. Keep polling until
            status is 'completed'. Example: 'req_abc123xyz'

    Returns:
        Dictionary containing:
        - status: Current status of the crawl operation
          ('processing', 'completed', 'failed')
        - results: Crawled data (structured extraction or markdown)
          when completed
        - metadata: Information about processed pages, URLs visited,
          and processing statistics

        Keep polling until status is 'completed' to get final results.
    """
    try:
        api_key = get_api_key(ctx)
        client = ScapeGraphClient(api_key)
        return client.smartcrawler_fetch_results(request_id)
    except Exception as e:
        return {"error": str(e)}
```
- `ScapeGraphClient` helper method: makes a GET request to the `/crawl/{request_id}` API endpoint and returns the JSON response, raising an exception on any non-200 status code.

```python
def smartcrawler_fetch_results(self, request_id: str) -> Dict[str, Any]:
    """
    Fetch the results of a SmartCrawler operation.

    Args:
        request_id: The request ID returned by smartcrawler_initiate

    Returns:
        Dictionary containing the crawled data (structured extraction or
        markdown) and metadata about processed pages

    Note:
        The request is processed asynchronously, so the response may
        contain only a status while the crawl is still running. Keep
        polling until the status is "completed", at which point the
        results are included.
    """
    endpoint = f"{self.BASE_URL}/crawl/{request_id}"
    response = self.client.get(endpoint, headers=self.headers)

    if response.status_code != 200:
        error_msg = f"Error {response.status_code}: {response.text}"
        raise Exception(error_msg)

    return response.json()
```
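Outside the MCP server, the same endpoint can be called directly. A minimal sketch with httpx, assuming the public ScrapeGraphAI base URL and the `SGAI-APIKEY` header name (both are assumptions here, not confirmed by the snippet above):

```python
import os

import httpx

BASE_URL = "https://api.scrapegraphai.com/v1"  # assumed base URL

def fetch_crawl_results(request_id: str) -> dict:
    """GET /crawl/{request_id} and return the parsed JSON body."""
    headers = {"SGAI-APIKEY": os.environ["SGAI_API_KEY"]}  # header name assumed
    response = httpx.get(f"{BASE_URL}/crawl/{request_id}", headers=headers, timeout=30.0)
    response.raise_for_status()  # surface non-2xx responses as exceptions
    return response.json()
```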
- `src/scrapegraph_mcp/server.py:1822` (registration): FastMCP tool registration decorator that registers the function as the `smartcrawler_fetch_results` tool.

```python
def smartcrawler_fetch_results(request_id: str, ctx: Context) -> Dict[str, Any]:
```
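The registration reference above shows only the decorated function signature. For context, registering a tool with FastMCP generally follows the pattern below; this is a minimal sketch, and the server name is an assumption rather than taken from the repository:

```python
from typing import Any, Dict

from mcp.server.fastmcp import Context, FastMCP

mcp = FastMCP("ScrapeGraph MCP Server")  # server name is an assumption

@mcp.tool()  # registers the function under its own name, 'smartcrawler_fetch_results'
def smartcrawler_fetch_results(request_id: str, ctx: Context) -> Dict[str, Any]:
    """Tool body elided; see the handler reference above."""
    ...
```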