cli_schedule
Schedule automated web archiving tasks to regularly capture and preserve website snapshots at specified intervals with customizable crawl settings.
Instructions
Execute the `archivebox schedule` command.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| import_path | No | Path to import file | `None` |
| add | No | Enable adding new URLs | `False` |
| every | No | Schedule frequency (e.g., 'daily') | `None` |
| tag | No | Comma-separated tags | `""` |
| depth | No | Crawl depth | `0` |
| overwrite | No | Overwrite existing files | `False` |
| update | No | Update existing snapshots | `False` |
| clear | No | Clear existing schedules | `False` |
| extra_data | No | Additional parameters as a dictionary | `None` |
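As a rough illustration of the schema above, the following sketch checks a set of tool arguments against the parameter names and Python types that appear in the handler signature on this page. The helper `check_schedule_args`, the `SCHEDULE_ARG_TYPES` mapping, and the example values (including the file path) are hypothetical and not part of the package; treating `None` as "not provided" is also an assumption.

```python
from typing import Any, Dict, List

# Expected Python type per parameter, taken from the handler signature on this page.
SCHEDULE_ARG_TYPES = {
    "import_path": str,
    "add": bool,
    "every": str,
    "tag": str,
    "depth": int,
    "overwrite": bool,
    "update": bool,
    "clear": bool,
    "extra_data": dict,
}


def check_schedule_args(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look fine."""
    problems = []
    for name, value in args.items():
        expected = SCHEDULE_ARG_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
            continue
        if value is None:
            # Assumption: None means "not provided", since every parameter is optional.
            continue
        # bool is a subclass of int in Python, so reject True/False for int fields.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Illustrative arguments for a daily schedule; the path and tags are made up.
example_args = {
    "import_path": "/data/sources/bookmarks.txt",
    "add": True,
    "every": "daily",
    "tag": "news,rss",
    "depth": 1,
}
print(check_schedule_args(example_args))
```

A check like this is only a convenience before calling the tool; the server and the ArchiveBox API remain the authority on which values are accepted.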
Implementation Reference
- archivebox_api/archivebox_mcp.py:710-783 (handler): The primary MCP tool handler for 'cli_schedule', decorated with @mcp.tool. It creates an ArchiveBox Api client using provided or env credentials and invokes the cli_schedule method to execute the schedule command via POST to the API endpoint, returning the JSON response.

      @mcp.tool(
          exclude_args=[
              "archivebox_url",
              "username",
              "password",
              "token",
              "api_key",
              "verify",
          ],
          tags={"cli"},
      )
      def cli_schedule(
          import_path: Optional[str] = Field(None, description="Path to import file"),
          add: bool = Field(False, description="Enable adding new URLs"),
          every: Optional[str] = Field(
              None, description="Schedule frequency (e.g., 'daily')"
          ),
          tag: str = Field("", description="Comma-separated tags"),
          depth: int = Field(0, description="Crawl depth"),
          overwrite: bool = Field(False, description="Overwrite existing files"),
          update: bool = Field(False, description="Update existing snapshots"),
          clear: bool = Field(False, description="Clear existing schedules"),
          extra_data: Optional[Dict] = Field(
              None, description="Additional parameters as a dictionary"
          ),
          archivebox_url: str = Field(
              default=os.environ.get("ARCHIVEBOX_URL", None),
              description="The URL of the ArchiveBox instance",
          ),
          username: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_USERNAME", None),
              description="Username for authentication",
          ),
          password: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
              description="Password for authentication",
          ),
          token: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_TOKEN", None),
              description="Bearer token for authentication",
          ),
          api_key: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_API_KEY", None),
              description="API key for authentication",
          ),
          verify: Optional[bool] = Field(
              default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
              description="Whether to verify SSL certificates",
          ),
      ) -> dict:
          """
          Execute archivebox schedule command.
          """
          client = Api(
              url=archivebox_url,
              username=username,
              password=password,
              token=token,
              api_key=api_key,
              verify=verify,
          )
          response = client.cli_schedule(
              import_path=import_path,
              add=add,
              every=every,
              tag=tag,
              depth=depth,
              overwrite=overwrite,
              update=update,
              clear=clear,
              extra_data=extra_data,
          )
          return response.json()

- Supporting helper function in the ArchiveBox API client class that constructs the request data from parameters and sends a POST request to the /api/v1/cli/schedule endpoint.

      def cli_schedule(
          self,
          import_path: Optional[str] = None,
          add: bool = False,
          every: Optional[str] = None,
          tag: str = "",
          depth: int = 0,
          overwrite: bool = False,
          update: bool = False,
          clear: bool = False,
          extra_data: Optional[Dict] = None,
      ) -> requests.Response:
          """
          Execute archivebox schedule command

          Args:
              import_path: Path to import file (optional).
              add: Enable adding new URLs (default: False).
              every: Schedule frequency (e.g., "daily").
              tag: Comma-separated tags (default: "").
              depth: Crawl depth (default: 0).
              overwrite: Overwrite existing files (default: False).
              update: Update existing snapshots (default: False).
              clear: Clear existing schedules (default: False).
              extra_data: Additional parameters as a dictionary (optional).

          Returns:
              Response: The response object from the POST request.

          Raises:
              ParameterError: If the provided parameters are invalid.
          """
          data = {
              k: v
              for k, v in locals().items()
              if k != "self" and v is not None and k != "extra_data"
          }
          if extra_data:
              data.update(extra_data)
          try:
              response = self._session.post(
                  url=f"{self.url}/api/v1/cli/schedule",
                  json=data,
                  headers=self.headers,
                  verify=self.verify,
              )
          except ValidationError as e:
              raise ParameterError(f"Invalid parameters: {e.errors()}")
          return response
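
The payload construction in the client helper can be easier to follow in isolation. The sketch below is a standalone approximation, not the package's code: `build_schedule_payload` is a hypothetical name, and it uses an explicit dictionary in place of `locals()`. It is meant to show two consequences of the filtering rule described above: only `None` values are dropped (so `False`, `0`, and `""` are still sent), and keys in `extra_data` override the named parameters.

```python
from typing import Any, Dict, Optional


def build_schedule_payload(
    import_path: Optional[str] = None,
    add: bool = False,
    every: Optional[str] = None,
    tag: str = "",
    depth: int = 0,
    overwrite: bool = False,
    update: bool = False,
    clear: bool = False,
    extra_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the request body built by the client helper."""
    params = {
        "import_path": import_path,
        "add": add,
        "every": every,
        "tag": tag,
        "depth": depth,
        "overwrite": overwrite,
        "update": update,
        "clear": clear,
    }
    # Drop only None values; falsy values such as False, 0 and "" are kept.
    data = {key: value for key, value in params.items() if value is not None}
    # Merge extra_data last so that its keys take precedence.
    if extra_data:
        data.update(extra_data)
    return data


payload = build_schedule_payload(every="daily", depth=1, extra_data={"depth": 2})
print(payload)
```

In this example `import_path` is omitted from the payload because it is `None`, while `depth` ends up as 2 because `extra_data` is merged after the named parameters. Per the helper shown above, the resulting dictionary would then be sent as the JSON body of a POST to `/api/v1/cli/schedule`.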