cli_schedule
Schedule automated web archiving tasks in ArchiveBox to capture and preserve URLs at specified intervals with customizable crawl depth and tagging.
Instructions
Execute archivebox schedule command.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| import_path | No | Path to import file | |
| add | No | Enable adding new URLs | |
| every | No | Schedule frequency (e.g., 'daily') | |
| tag | No | Comma-separated tags | |
| depth | No | Crawl depth | |
| overwrite | No | Overwrite existing files | |
| update | No | Update existing snapshots | |
| clear | No | Clear existing schedules | |
| extra_data | No | Additional parameters as a dictionary |
Implementation Reference
- archivebox_api/archivebox_mcp.py:710-782 (handler)MCP handler for 'cli_schedule' tool: decorated with @mcp.tool(), defines input schema via Pydantic Fields, creates Api client, calls client.cli_schedule(), and returns JSON response. This is the primary implementation and registration point.@mcp.tool( exclude_args=[ "archivebox_url", "username", "password", "token", "api_key", "verify", ], tags={"cli"}, ) def cli_schedule( import_path: Optional[str] = Field(None, description="Path to import file"), add: bool = Field(False, description="Enable adding new URLs"), every: Optional[str] = Field( None, description="Schedule frequency (e.g., 'daily')" ), tag: str = Field("", description="Comma-separated tags"), depth: int = Field(0, description="Crawl depth"), overwrite: bool = Field(False, description="Overwrite existing files"), update: bool = Field(False, description="Update existing snapshots"), clear: bool = Field(False, description="Clear existing schedules"), extra_data: Optional[Dict] = Field( None, description="Additional parameters as a dictionary" ), archivebox_url: str = Field( default=os.environ.get("ARCHIVEBOX_URL", None), description="The URL of the ArchiveBox instance", ), username: Optional[str] = Field( default=os.environ.get("ARCHIVEBOX_USERNAME", None), description="Username for authentication", ), password: Optional[str] = Field( default=os.environ.get("ARCHIVEBOX_PASSWORD", None), description="Password for authentication", ), token: Optional[str] = Field( default=os.environ.get("ARCHIVEBOX_TOKEN", None), description="Bearer token for authentication", ), api_key: Optional[str] = Field( default=os.environ.get("ARCHIVEBOX_API_KEY", None), description="API key for authentication", ), verify: Optional[bool] = Field( default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")), description="Whether to verify SSL certificates", ), ) -> dict: """ Execute archivebox schedule command. """ client = Api( url=archivebox_url, username=username, password=password, token=token, api_key=api_key, verify=verify, ) response = client.cli_schedule( import_path=import_path, add=add, every=every, tag=tag, depth=depth, overwrite=overwrite, update=update, clear=clear, extra_data=extra_data, ) return response.json()
- Supporting Api.cli_schedule method: constructs JSON payload from arguments and sends POST request to ArchiveBox server's /api/v1/cli/schedule endpoint.def cli_schedule( self, import_path: Optional[str] = None, add: bool = False, every: Optional[str] = None, tag: str = "", depth: int = 0, overwrite: bool = False, update: bool = False, clear: bool = False, extra_data: Optional[Dict] = None, ) -> requests.Response: """ Execute archivebox schedule command Args: import_path: Path to import file (optional). add: Enable adding new URLs (default: False). every: Schedule frequency (e.g., "daily"). tag: Comma-separated tags (default: ""). depth: Crawl depth (default: 0). overwrite: Overwrite existing files (default: False). update: Update existing snapshots (default: False). clear: Clear existing schedules (default: False). extra_data: Additional parameters as a dictionary (optional). Returns: Response: The response object from the POST request. Raises: ParameterError: If the provided parameters are invalid. """ data = { k: v for k, v in locals().items() if k != "self" and v is not None and k != "extra_data" } if extra_data: data.update(extra_data) try: response = self._session.post( url=f"{self.url}/api/v1/cli/schedule", json=data, headers=self.headers, verify=self.verify, ) except ValidationError as e: raise ParameterError(f"Invalid parameters: {e.errors()}") return response