cli_update

Update and manage archived web snapshots in ArchiveBox by executing update commands with options to filter, resume, or index content.

Instructions

Execute archivebox update command.

Input Schema

Name             Required  Description                            Default
resume           No        Resume from timestamp
only_new         No        Update only new snapshots
index_only       No        Index without archiving
overwrite        No        Overwrite existing files
after            No        Filter snapshots after timestamp
before           No        Filter snapshots before timestamp
status           No        Filter by status                       unarchived
filter_type      No        Filter type                            substring
filter_patterns  No        List of filter patterns
extractors       No        Comma-separated list of extractors
extra_data       No        Additional parameters as a dictionary
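
As an illustration of the schema above, the following sketch assembles one possible argument set for the tool. It assumes that the `after` and `before` timestamps are Unix epoch seconds, which the schema does not state explicitly; `to_unix_timestamp` and the `arguments` values are hypothetical and chosen only for the example.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def to_unix_timestamp(year: int, month: int, day: int) -> float:
    """Return midnight UTC of the given date as Unix epoch seconds."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


# Hypothetical argument set: restrict the update to snapshots whose
# timestamps fall in January 2024 (assuming epoch-second timestamps).
arguments: Dict[str, Any] = {
    "after": to_unix_timestamp(2024, 1, 1),
    "before": to_unix_timestamp(2024, 2, 1),
    "status": "unarchived",       # default listed in the schema
    "filter_type": "substring",   # default listed in the schema
    "filter_patterns": ["example.com"],
    "index_only": True,
}
```

Every parameter is optional, so a caller would normally pass only the fields that differ from the defaults.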

Implementation Reference

  • The handler function for the MCP 'cli_update' tool. It defines the tool parameters (schema), registers it via @mcp.tool, and implements the logic by calling the ArchiveBox API client's cli_update method.
    @mcp.tool(
        exclude_args=[
            "archivebox_url",
            "username",
            "password",
            "token",
            "api_key",
            "verify",
        ],
        tags={"cli"},
    )
    def cli_update(
        resume: Optional[float] = Field(0, description="Resume from timestamp"),
        only_new: bool = Field(True, description="Update only new snapshots"),
        index_only: bool = Field(False, description="Index without archiving"),
        overwrite: bool = Field(False, description="Overwrite existing files"),
        after: Optional[float] = Field(0, description="Filter snapshots after timestamp"),
        before: Optional[float] = Field(
            999999999999999, description="Filter snapshots before timestamp"
        ),
        status: Optional[str] = Field("unarchived", description="Filter by status"),
        filter_type: Optional[str] = Field("substring", description="Filter type"),
        filter_patterns: Optional[List[str]] = Field(
            None, description="List of filter patterns"
        ),
        extractors: Optional[str] = Field(
            "", description="Comma-separated list of extractors"
        ),
        extra_data: Optional[Dict] = Field(
            None, description="Additional parameters as a dictionary"
        ),
        archivebox_url: str = Field(
            default=os.environ.get("ARCHIVEBOX_URL", None),
            description="The URL of the ArchiveBox instance",
        ),
        username: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_USERNAME", None),
            description="Username for authentication",
        ),
        password: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
            description="Password for authentication",
        ),
        token: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_TOKEN", None),
            description="Bearer token for authentication",
        ),
        api_key: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_API_KEY", None),
            description="API key for authentication",
        ),
        verify: Optional[bool] = Field(
            default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
            description="Whether to verify SSL certificates",
        ),
    ) -> dict:
        """
        Execute archivebox update command.
        """
        client = Api(
            url=archivebox_url,
            username=username,
            password=password,
            token=token,
            api_key=api_key,
            verify=verify,
        )
        response = client.cli_update(
            resume=resume,
            only_new=only_new,
            index_only=index_only,
            overwrite=overwrite,
            after=after,
            before=before,
            status=status,
            filter_type=filter_type,
            filter_patterns=filter_patterns,
            extractors=extractors,
            extra_data=extra_data,
        )
        return response.json()
  • Supporting helper method in the ArchiveBox API client class that performs the actual HTTP POST request to the ArchiveBox server's /api/v1/cli/update endpoint to execute the update command.
    @require_auth
    def cli_update(
        self,
        resume: Optional[float] = 0,
        only_new: bool = True,
        index_only: bool = False,
        overwrite: bool = False,
        after: Optional[float] = 0,
        before: Optional[float] = 999999999999999,
        status: Optional[str] = "unarchived",
        filter_type: Optional[str] = "substring",
        filter_patterns: Optional[List[str]] = None,
        extractors: Optional[str] = "",
        extra_data: Optional[Dict] = None,
    ) -> requests.Response:
        """
        Execute archivebox update command

        Args:
            resume: Resume from timestamp (default: 0).
            only_new: Update only new snapshots (default: True).
            index_only: Index without archiving (default: False).
            overwrite: Overwrite existing files (default: False).
            after: Filter snapshots after timestamp (default: 0).
            before: Filter snapshots before timestamp (default: 999999999999999).
            status: Filter by status (default: "unarchived").
            filter_type: Filter type (default: "substring").
            filter_patterns: List of filter patterns (default: ["https://example.com"]).
            extractors: Comma-separated list of extractors (default: "").
            extra_data: Additional parameters as a dictionary (optional).

        Returns:
            Response: The response object from the POST request.

        Raises:
            ParameterError: If the provided parameters are invalid.
        """
        data = {
            k: v
            for k, v in locals().items()
            if k != "self" and v is not None and k != "extra_data"
        }
        if filter_patterns is None:
            data["filter_patterns"] = ["https://example.com"]
        if extra_data:
            data.update(extra_data)
        try:
            response = self._session.post(
                url=f"{self.url}/api/v1/cli/update",
                json=data,
                headers=self.headers,
                verify=self.verify,
            )
        except ValidationError as e:
            raise ParameterError(f"Invalid parameters: {e.errors()}")
        return response
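
The payload logic in the helper above can be sketched as a standalone function that makes no HTTP request. This is a minimal illustration, not the library's code: `build_update_payload` and `DEFAULT_FILTER_PATTERNS` are hypothetical names, and the function only mirrors the three steps visible in the helper (drop `None` values, substitute the placeholder filter pattern, merge `extra_data` last).

```python
from typing import Any, Dict, List, Optional

# Placeholder pattern the helper substitutes when no patterns are given.
DEFAULT_FILTER_PATTERNS = ["https://example.com"]


def build_update_payload(
    resume: Optional[float] = 0,
    only_new: bool = True,
    index_only: bool = False,
    overwrite: bool = False,
    after: Optional[float] = 0,
    before: Optional[float] = 999999999999999,
    status: Optional[str] = "unarchived",
    filter_type: Optional[str] = "substring",
    filter_patterns: Optional[List[str]] = None,
    extractors: Optional[str] = "",
    extra_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the JSON body the helper would POST, without sending it."""
    candidates = {
        "resume": resume,
        "only_new": only_new,
        "index_only": index_only,
        "overwrite": overwrite,
        "after": after,
        "before": before,
        "status": status,
        "filter_type": filter_type,
        "filter_patterns": filter_patterns,
        "extractors": extractors,
    }
    # Step 1: parameters explicitly set to None are left out of the body.
    data = {k: v for k, v in candidates.items() if v is not None}
    # Step 2: a missing pattern list falls back to the placeholder.
    if filter_patterns is None:
        data["filter_patterns"] = list(DEFAULT_FILTER_PATTERNS)
    # Step 3: extra_data is merged last, so its keys override the rest.
    if extra_data:
        data.update(extra_data)
    return data


# Illustrative values only; valid statuses and extractors depend on the server.
payload = build_update_payload(status="archived", extractors="title")
```

Because `extra_data` is merged after everything else, a key such as `status` inside it silently replaces the named argument, which is worth keeping in mind when passing both.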

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Knuckles-Team/archivebox-api'
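
The same request can be sketched in Python with only the standard library. The helper names `build_server_request` and `fetch_server_info` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by the text above.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for one server entry."""
    url = f"{API_BASE}/{owner}/{name}"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request performs no network I/O.
request = build_server_request("Knuckles-Team", "archivebox-api")
```

Calling `fetch_server_info("Knuckles-Team", "archivebox-api")` would perform the actual network request.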

If you have feedback or need assistance with the MCP directory API, please join our Discord server.