get_archiveresults
Retrieve ArchiveResult entries from ArchiveBox, filtering by ID, snapshot URL, snapshot tag, status, extractor, or a free-text search term.
Instructions
List all ArchiveResult entries matching these filters.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Filter by ID | |
| search | No | Search across snapshot url, title, tags, extractor, output, id | |
| snapshot_id | No | Filter by snapshot ID | |
| snapshot_url | No | Filter by snapshot URL | |
| snapshot_tag | No | Filter by snapshot tag | |
| status | No | Filter by status | |
| output | No | Filter by output | |
| extractor | No | Filter by extractor | |
| cmd | No | Filter by command | |
| pwd | No | Filter by working directory | |
| cmd_version | No | Filter by command version | |
| created_at | No | Filter by exact creation date (ISO 8601) | |
| created_at__gte | No | Filter by creation date >= (ISO 8601) | |
| created_at__lt | No | Filter by creation date < (ISO 8601) | |
| limit | No | Number of results to return | 10 |
| offset | No | Offset for pagination | 0 |
| page | No | Page number for pagination | 0 |
| api_key_param | No | API key for QueryParamTokenAuth | |
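The date and pagination parameters above combine naturally into a half-open time window plus an offset. The following is a minimal sketch of assembling such an arguments dict; `build_archiveresult_filters` is a hypothetical helper, not part of the tool, and the `extractor` and `status` values are illustrative assumptions. Whether the server accepts a timezone offset in the ISO 8601 string is also assumed.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def build_archiveresult_filters(
    start: datetime,
    days: int,
    extractor: Optional[str] = None,
    status: Optional[str] = None,
    limit: int = 10,
    page_index: int = 0,
) -> Dict[str, Any]:
    """Build an arguments dict for get_archiveresults (hypothetical helper)."""
    end = start + timedelta(days=days)
    args: Dict[str, Any] = {
        # Half-open window: created_at >= start and created_at < end.
        "created_at__gte": start.isoformat(),
        "created_at__lt": end.isoformat(),
        "limit": limit,
        # Offset-based pagination: skip the rows of all earlier pages.
        "offset": page_index * limit,
    }
    # Optional filters are only included when the caller sets them.
    if extractor is not None:
        args["extractor"] = extractor
    if status is not None:
        args["status"] = status
    return args


# Illustrative values only: third page of 10 results for one week of data.
filters = build_archiveresult_filters(
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    days=7,
    extractor="wget",
    status="succeeded",
    limit=10,
    page_index=2,
)
print(filters)
```

Using `created_at__gte` together with `created_at__lt` keeps consecutive windows from overlapping, since an entry created exactly at the boundary falls into only one of them.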
Implementation Reference
- archivebox_api/archivebox_mcp.py:331-428 (handler) The primary MCP tool handler for 'get_archiveresults'. Decorated with @mcp.tool, it defines the input schema via Pydantic Fields, creates an ArchiveBox Api client, calls the client's get_archiveresults method with the supplied parameters, and returns the JSON response.

      @mcp.tool(
          exclude_args=[
              "archivebox_url",
              "username",
              "password",
              "token",
              "api_key",
              "verify",
          ],
          tags={"core"},
      )
      def get_archiveresults(
          id: Optional[str] = Field(None, description="Filter by ID"),
          search: Optional[str] = Field(
              None,
              description="Search across snapshot url, title, tags, extractor, output, id",
          ),
          snapshot_id: Optional[str] = Field(None, description="Filter by snapshot ID"),
          snapshot_url: Optional[str] = Field(None, description="Filter by snapshot URL"),
          snapshot_tag: Optional[str] = Field(None, description="Filter by snapshot tag"),
          status: Optional[str] = Field(None, description="Filter by status"),
          output: Optional[str] = Field(None, description="Filter by output"),
          extractor: Optional[str] = Field(None, description="Filter by extractor"),
          cmd: Optional[str] = Field(None, description="Filter by command"),
          pwd: Optional[str] = Field(None, description="Filter by working directory"),
          cmd_version: Optional[str] = Field(None, description="Filter by command version"),
          created_at: Optional[str] = Field(
              None, description="Filter by exact creation date (ISO 8601)"
          ),
          created_at__gte: Optional[str] = Field(
              None, description="Filter by creation date >= (ISO 8601)"
          ),
          created_at__lt: Optional[str] = Field(
              None, description="Filter by creation date < (ISO 8601)"
          ),
          limit: int = Field(10, description="Number of results to return"),
          offset: int = Field(0, description="Offset for pagination"),
          page: int = Field(0, description="Page number for pagination"),
          api_key_param: Optional[str] = Field(
              None, description="API key for QueryParamTokenAuth"
          ),
          archivebox_url: str = Field(
              default=os.environ.get("ARCHIVEBOX_URL", None),
              description="The URL of the ArchiveBox instance",
          ),
          username: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_USERNAME", None),
              description="Username for authentication",
          ),
          password: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
              description="Password for authentication",
          ),
          token: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_TOKEN", None),
              description="Bearer token for authentication",
          ),
          api_key: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_API_KEY", None),
              description="API key for authentication",
          ),
          verify: Optional[bool] = Field(
              default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
              description="Whether to verify SSL certificates",
          ),
      ) -> dict:
          """
          List all ArchiveResult entries matching these filters.
          """
          client = Api(
              url=archivebox_url,
              username=username,
              password=password,
              token=token,
              api_key=api_key,
              verify=verify,
          )
          response = client.get_archiveresults(
              id=id,
              search=search,
              snapshot_id=snapshot_id,
              snapshot_url=snapshot_url,
              snapshot_tag=snapshot_tag,
              status=status,
              output=output,
              extractor=extractor,
              cmd=cmd,
              pwd=pwd,
              cmd_version=cmd_version,
              created_at=created_at,
              created_at__gte=created_at__gte,
              created_at__lt=created_at__lt,
              limit=limit,
              offset=offset,
              page=page,
              api_key=api_key_param,
          )
          return response.json()

- Helper method in the ArchiveBox API client class that performs the actual HTTP GET request to the /api/v1/core/archiveresults endpoint, used by the MCP tool handler.

      @require_auth
      def get_archiveresults(
          self,
          id: Optional[str] = None,
          search: Optional[str] = None,
          snapshot_id: Optional[str] = None,
          snapshot_url: Optional[str] = None,
          snapshot_tag: Optional[str] = None,
          status: Optional[str] = None,
          output: Optional[str] = None,
          extractor: Optional[str] = None,
          cmd: Optional[str] = None,
          pwd: Optional[str] = None,
          cmd_version: Optional[str] = None,
          created_at: Optional[str] = None,
          created_at__gte: Optional[str] = None,
          created_at__lt: Optional[str] = None,
          limit: int = 200,
          offset: int = 0,
          page: int = 0,
          api_key: Optional[str] = None,
      ) -> requests.Response:
          """
          List all ArchiveResult entries matching these filters

          Args:
              id: Filter by ID (startswith, icontains, snapshot-related fields).
              search: Search across snapshot url, title, tags, extractor, output, id.
              snapshot_id: Filter by snapshot ID (startswith, icontains).
              snapshot_url: Filter by snapshot URL (icontains).
              snapshot_tag: Filter by snapshot tag (icontains).
              status: Filter by status (exact).
              output: Filter by output (icontains).
              extractor: Filter by extractor (icontains).
              cmd: Filter by command (icontains).
              pwd: Filter by working directory (icontains).
              cmd_version: Filter by command version (exact).
              created_at: Filter by exact creation date (ISO 8601 format).
              created_at__gte: Filter by creation date >= (ISO 8601 format).
              created_at__lt: Filter by creation date < (ISO 8601 format).
              limit: Number of results to return (default: 200).
              offset: Offset for pagination (default: 0).
              page: Page number for pagination (default: 0).
              api_key: API key for QueryParamTokenAuth (optional).

          Returns:
              Response: The response object from the GET request.

          Raises:
              ParameterError: If the provided parameters are invalid.
          """
          params = {
              k: v
              for k, v in locals().items()
              if k != "self" and v is not None and k != "api_key"
          }
          if api_key:
              params["api_key"] = api_key
          try:
              response = self._session.get(
                  url=f"{self.url}/api/v1/core/archiveresults",
                  params=params,
                  headers=self.headers,
                  verify=self.verify,
              )
          except ValidationError as e:
              raise ParameterError(f"Invalid parameters: {e.errors()}")
          return response