cli_add
Add URLs to ArchiveBox for web archiving, with options to tag content, control crawl depth, update existing snapshots, and configure extraction methods.
Instructions
Execute the archivebox add command.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to archive | |
| tag | No | Comma-separated tags | "" |
| depth | No | Crawl depth | 0 |
| update | No | Update existing snapshots | false |
| update_all | No | Update all snapshots | false |
| index_only | No | Index without archiving | false |
| overwrite | No | Overwrite existing files | false |
| init | No | Initialize collection if needed | false |
| extractors | No | Comma-separated list of extractors to use | "" |
| parser | No | Parser type | auto |
| extra_data | No | Additional parameters as a dictionary | |
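As an illustration of how the schema above maps onto a tool call, the following is a minimal sketch that builds an MCP `tools/call` JSON-RPC request for `cli_add`. The helper name `make_cli_add_request` and the example URL and tags are assumptions made for this example; they are not part of the server. The credential parameters (`archivebox_url`, `token`, and so on) are left out because they do not appear in the input schema.

```python
import json
from typing import Any, Dict, List

# Optional arguments listed in the Input Schema table.
ALLOWED_OPTIONS = {
    "tag", "depth", "update", "update_all", "index_only",
    "overwrite", "init", "extractors", "parser", "extra_data",
}


def make_cli_add_request(
    urls: List[str], request_id: int = 1, **options: Any
) -> Dict[str, Any]:
    """Hypothetical helper: build a JSON-RPC 2.0 `tools/call` request for cli_add."""
    if not urls:
        # `urls` is the only required argument in the schema.
        raise ValueError("urls is required and must contain at least one URL")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown cli_add arguments: {sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "cli_add",
            "arguments": {"urls": list(urls), **options},
        },
    }


request = make_cli_add_request(
    ["https://example.com"], tag="docs,reference", depth=1
)
print(json.dumps(request, indent=2))
```

Arguments that are omitted fall back to the defaults in the table, so only `urls` and the options being changed need to be sent.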
Implementation Reference
- archivebox_api/archivebox_mcp.py:547-624 (handler): The primary handler and registration for the MCP 'cli_add' tool. This FastMCP @tool-decorated function defines the tool schema via Pydantic Fields and executes the logic by calling the underlying Api.cli_add method.

          exclude_args=[
              "archivebox_url",
              "username",
              "password",
              "token",
              "api_key",
              "verify",
          ],
          tags={"cli"},
      )
      def cli_add(
          urls: List[str] = Field(
              description="List of URLs to archive",
          ),
          tag: str = Field("", description="Comma-separated tags"),
          depth: int = Field(0, description="Crawl depth"),
          update: bool = Field(False, description="Update existing snapshots"),
          update_all: bool = Field(False, description="Update all snapshots"),
          index_only: bool = Field(False, description="Index without archiving"),
          overwrite: bool = Field(False, description="Overwrite existing files"),
          init: bool = Field(False, description="Initialize collection if needed"),
          extractors: str = Field(
              "", description="Comma-separated list of extractors to use"
          ),
          parser: str = Field("auto", description="Parser type"),
          extra_data: Optional[Dict] = Field(
              None, description="Additional parameters as a dictionary"
          ),
          archivebox_url: str = Field(
              default=os.environ.get("ARCHIVEBOX_URL", None),
              description="The URL of the ArchiveBox instance",
          ),
          username: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_USERNAME", None),
              description="Username for authentication",
          ),
          password: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
              description="Password for authentication",
          ),
          token: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_TOKEN", None),
              description="Bearer token for authentication",
          ),
          api_key: Optional[str] = Field(
              default=os.environ.get("ARCHIVEBOX_API_KEY", None),
              description="API key for authentication",
          ),
          verify: Optional[bool] = Field(
              default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
              description="Whether to verify SSL certificates",
          ),
      ) -> dict:
          """
          Execute archivebox add command.
          """
          client = Api(
              url=archivebox_url,
              username=username,
              password=password,
              token=token,
              api_key=api_key,
              verify=verify,
          )
          response = client.cli_add(
              urls=urls,
              tag=tag,
              depth=depth,
              update=update,
              update_all=update_all,
              index_only=index_only,
              overwrite=overwrite,
              init=init,
              extractors=extractors,
              parser=parser,
              extra_data=extra_data,
          )
          return response.json()
- Supporting helper: the Api.cli_add method in the client library that performs the HTTP POST to the ArchiveBox server's /api/v1/cli/add endpoint to execute the add command.

      def cli_add(
          self,
          urls: List[str],
          tag: str = "",
          depth: int = 0,
          update: bool = False,
          update_all: bool = False,
          index_only: bool = False,
          overwrite: bool = False,
          init: bool = False,
          extractors: str = "",
          parser: str = "auto",
          extra_data: Optional[Dict] = None,
      ) -> requests.Response:
          """
          Execute archivebox add command

          Args:
              urls: List of URLs to archive.
              tag: Comma-separated tags (default: "").
              depth: Crawl depth (default: 0).
              update: Update existing snapshots (default: False).
              update_all: Update all snapshots (default: False).
              index_only: Index without archiving (default: False).
              overwrite: Overwrite existing files (default: False).
              init: Initialize collection if needed (default: False).
              extractors: Comma-separated list of extractors to use (default: "").
              parser: Parser type (default: "auto").
              extra_data: Additional parameters as a dictionary (optional).

          Returns:
              Response: The response object from the POST request.

          Raises:
              ParameterError: If the provided parameters are invalid.
          """
          data = {
              "urls": urls,
              "tag": tag,
              "depth": depth,
              "update": update,
              "update_all": update_all,
              "index_only": index_only,
              "overwrite": overwrite,
              "init": init,
              "extractors": extractors,
              "parser": parser,
          }
          if extra_data:
              data.update(extra_data)
          try:
              response = self._session.post(
                  url=f"{self.url}/api/v1/cli/add",
                  json=data,
                  headers=self.headers,
                  verify=self.verify,
              )
          except ValidationError as e:
              raise ParameterError(f"Invalid parameters: {e.errors()}")
          return response
- Helper prompt function that generates natural language suggestions for using the cli_add tool.

      @mcp.prompt
      def cli_add_prompt(
          urls: List[str],
          tag: str = "",
          depth: int = 0,
      ) -> str:
          """
          Generates a prompt for executing archivebox add command.
          """
          return f"Add new URLs to ArchiveBox: {urls}, with tags: '{tag}', depth: {depth}. Use the cli_add tool."
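One detail of the Api.cli_add helper is worth seeing in isolation: `extra_data` is merged into the request body after the named fields are set, so a key in `extra_data` that matches a named argument replaces it. The sketch below reproduces only that payload-building step, without any HTTP call. The function name `build_add_payload` and the `note` key are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List, Optional


def build_add_payload(
    urls: List[str],
    tag: str = "",
    depth: int = 0,
    update: bool = False,
    update_all: bool = False,
    index_only: bool = False,
    overwrite: bool = False,
    init: bool = False,
    extractors: str = "",
    parser: str = "auto",
    extra_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the body construction inside Api.cli_add."""
    data: Dict[str, Any] = {
        "urls": urls,
        "tag": tag,
        "depth": depth,
        "update": update,
        "update_all": update_all,
        "index_only": index_only,
        "overwrite": overwrite,
        "init": init,
        "extractors": extractors,
        "parser": parser,
    }
    if extra_data:
        # Merged last, so keys here win over the named arguments above.
        data.update(extra_data)
    return data


payload = build_add_payload(
    ["https://example.com"],
    depth=1,
    # "note" is a made-up key; "depth" deliberately collides with the argument.
    extra_data={"depth": 0, "note": "hypothetical extra key"},
)
print(payload["depth"])  # 0: the extra_data value replaced depth=1
```

Callers who pass `extra_data` may therefore want to check it for keys that collide with the named parameters.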