Glama
oxylabs
by oxylabs

oxylabs_scraper

Extract web content using the Oxylabs Web Scraper API, with optional parsing and JavaScript rendering for retrieving data from complex websites.

Instructions

Scrape a URL using the Oxylabs Web Scraper API.

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| parse | No | Whether the result should be parsed. If the result is not parsed, the HTML is stripped and converted to Markdown. | |
| render | No | Whether a headless browser should be used to render the page. See: https://developers.oxylabs.io/scraper-apis/web-scraper-api/features/javascript-rendering. `html` returns the rendered HTML page; `None` scrapes without rendering. | |
| url | Yes | URL to scrape. | |
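
As a rough illustration of how these three inputs might be combined into a request body, the sketch below builds a payload dictionary. The helper name `build_payload` is hypothetical, and the key names (`url`, `parse`, `render`) simply mirror the input schema. The exact payload the server sends is not shown on this page, so treat the mapping as an assumption.

```python
import typing


def build_payload(
    url: str,
    parse: typing.Optional[bool] = None,
    render: typing.Optional[str] = None,
) -> dict[str, typing.Any]:
    """Build a request body from the tool inputs (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if render not in (None, "html"):
        raise ValueError("render must be 'html' or None")

    # Assumption: payload keys mirror the input schema names one-to-one.
    payload: dict[str, typing.Any] = {"url": url}
    # Optional inputs are only included when the caller sets them.
    if parse:
        payload["parse"] = True
    if render is not None:
        payload["render"] = render
    return payload
```

Leaving `parse` and `render` unset yields a payload containing only `url`, which matches the schema's statement that `url` is the sole required input.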

Implementation Reference

  • Configuration defining the Oxylabs scraper API endpoint URL.
    OXYLABS_SCRAPER_URL: str = "https://realtime.oxylabs.io/v1/queries"
  • Implementation of the scrape method in the _OxylabsClientWrapper class, which sends a POST request with the payload to the Oxylabs scraper API, logs the job ID and status when a job is created, and raises on an HTTP error status.
    async def scrape(self, payload: dict[str, typing.Any]) -> dict[str, typing.Any]:
        await self._ctx.info(f"Create job with params: {json.dumps(payload)}")
        response = await self._client.post(settings.OXYLABS_SCRAPER_URL, json=payload)
        response_json: dict[str, typing.Any] = response.json()
        if response.status_code == status.HTTP_201_CREATED:
            await self._ctx.info(
                f"Job info: "
                f"job_id={response_json['job']['id']} "
                f"job_status={response_json['job']['status']}"
            )
        response.raise_for_status()
        return response_json
  • Async context manager providing the Oxylabs HTTP client wrapper used by scraper tools.
    @asynccontextmanager
    async def oxylabs_client() -> AsyncIterator[_OxylabsClientWrapper]:
        """Async context manager for Oxylabs client that is used in MCP tools."""
        headers = _get_default_headers()
        username, password = get_oxylabs_auth()
        if not username or not password:
            raise ValueError("Oxylabs username and password must be set.")
        auth = BasicAuth(username=username, password=password)
        async with AsyncClient(
            timeout=Timeout(settings.OXYLABS_REQUEST_TIMEOUT_S),
            verify=True,
            headers=headers,
            auth=auth,
        ) as client:
            try:
                yield _OxylabsClientWrapper(client)
            except HTTPStatusError as e:
                raise MCPServerError(
                    f"HTTP error during POST request: {e.response.status_code} - {e.response.text}"
                ) from None
            except RequestError as e:
                raise MCPServerError(f"Request error during POST request: {e}") from None
            except Exception as e:
                raise MCPServerError(f"Error: {str(e) or repr(e)}") from None
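
The error-translation pattern in the context manager above can be tried without network access or credentials. The following is a minimal sketch using only the standard library: `FakeClient`, `fake_oxylabs_client`, and `run_tool` are hypothetical stand-ins, `MCPServerError` is redefined locally as a plain exception, and the response shape returned by the fake client is an assumption made for the example.

```python
import asyncio
import typing
from contextlib import asynccontextmanager


class MCPServerError(Exception):
    """Local stand-in for the server's error type (assumed to be a plain exception)."""


class FakeClient:
    """Hypothetical client for this sketch only; it performs no network I/O."""

    async def scrape(self, payload: dict[str, typing.Any]) -> dict[str, typing.Any]:
        if "url" not in payload:
            raise KeyError("url")
        # Assumed response shape, chosen only for illustration.
        return {"results": [{"content": f"scraped {payload['url']}"}]}


@asynccontextmanager
async def fake_oxylabs_client() -> typing.AsyncIterator[FakeClient]:
    """Yield a client and re-raise any failure in the caller's block as MCPServerError."""
    try:
        yield FakeClient()
    except Exception as e:
        # Same idea as the catch-all branch: fall back to repr() when str() is empty.
        raise MCPServerError(f"Error: {str(e) or repr(e)}") from None


async def run_tool(payload: dict[str, typing.Any]) -> str:
    """Tool-style caller: returns the content on success, the error text on failure."""
    try:
        async with fake_oxylabs_client() as client:
            response = await client.scrape(payload)
            return response["results"][0]["content"]
    except MCPServerError as e:
        return str(e)
```

Because exceptions raised inside the `async with` body are thrown back into the generator at the `yield`, the `try`/`except` around `yield` sees them. This lets callers handle a single error type regardless of what failed underneath.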

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/oxylabs/oxylabs-mcp'
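
The same request could be made from Python with the standard library. This is a sketch under assumptions: the helper names are hypothetical, the `{owner}/{name}` path pattern is inferred from the single URL above, and the response is assumed to be a JSON object.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server, matching the curl example."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record. Assumes the endpoint returns a JSON object."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("oxylabs", "oxylabs-mcp")` would issue the same GET request as the curl command.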

If you have feedback or need assistance with the MCP directory API, please join our Discord server.