cli_add

Add URLs to an ArchiveBox web archive with options for crawling depth, tagging, updating snapshots, and customizing extraction methods.

Instructions

Execute archivebox add command.

Input Schema

Name         Required  Description                                Default
urls         Yes       List of URLs to archive
tag          No        Comma-separated tags
depth        No        Crawl depth
update       No        Update existing snapshots
update_all   No        Update all snapshots
index_only   No        Index without archiving
overwrite    No        Overwrite existing files
init         No        Initialize collection if needed
extractors   No        Comma-separated list of extractors to use
parser       No        Parser type                                auto
extra_data   No        Additional parameters as a dictionary
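
As a rough illustration of the schema above, the following sketch assembles an argument dictionary for cli_add. The helper name build_cli_add_args and the example URL are hypothetical and not part of the server; the default values are copied from the handler signature shown under Implementation Reference.

```python
from typing import Any, Dict, List

# Defaults copied from the handler signature in the Implementation Reference.
CLI_ADD_DEFAULTS: Dict[str, Any] = {
    "tag": "",
    "depth": 0,
    "update": False,
    "update_all": False,
    "index_only": False,
    "overwrite": False,
    "init": False,
    "extractors": "",
    "parser": "auto",
    "extra_data": None,
}


def build_cli_add_args(urls: List[str], **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller overrides into the documented defaults."""
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")
    unknown = set(overrides) - set(CLI_ADD_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown cli_add arguments: {sorted(unknown)}")
    # Later keys win, so overrides replace the defaults.
    return {"urls": list(urls), **CLI_ADD_DEFAULTS, **overrides}


# Example: archive one page, tag it, and follow links one level deep.
args = build_cli_add_args(["https://example.com"], tag="docs,reference", depth=1)
```

Only urls is required; every other key falls back to its default, which matches the Required column in the table.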

Implementation Reference

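  • The cli_add MCP tool handler, registered with the @mcp.tool decorator. It defines the input schema using Pydantic Field annotations, then creates an ArchiveBox API client and calls its cli_add method to add URLs for archiving. The connection and authentication arguments default to environment variables and are excluded from the tool's input schema via exclude_args.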
    @mcp.tool(
        exclude_args=[
            "archivebox_url",
            "username",
            "password",
            "token",
            "api_key",
            "verify",
        ],
        tags={"cli"},
    )
    def cli_add(
        urls: List[str] = Field(
            description="List of URLs to archive",
        ),
        tag: str = Field("", description="Comma-separated tags"),
        depth: int = Field(0, description="Crawl depth"),
        update: bool = Field(False, description="Update existing snapshots"),
        update_all: bool = Field(False, description="Update all snapshots"),
        index_only: bool = Field(False, description="Index without archiving"),
        overwrite: bool = Field(False, description="Overwrite existing files"),
        init: bool = Field(False, description="Initialize collection if needed"),
        extractors: str = Field(
            "", description="Comma-separated list of extractors to use"
        ),
        parser: str = Field("auto", description="Parser type"),
        extra_data: Optional[Dict] = Field(
            None, description="Additional parameters as a dictionary"
        ),
        archivebox_url: str = Field(
            default=os.environ.get("ARCHIVEBOX_URL", None),
            description="The URL of the ArchiveBox instance",
        ),
        username: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_USERNAME", None),
            description="Username for authentication",
        ),
        password: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
            description="Password for authentication",
        ),
        token: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_TOKEN", None),
            description="Bearer token for authentication",
        ),
        api_key: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_API_KEY", None),
            description="API key for authentication",
        ),
        verify: Optional[bool] = Field(
            default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
            description="Whether to verify SSL certificates",
        ),
    ) -> dict:
        """
        Execute archivebox add command.
        """
        client = Api(
            url=archivebox_url,
            username=username,
            password=password,
            token=token,
            api_key=api_key,
            verify=verify,
        )
        response = client.cli_add(
            urls=urls,
            tag=tag,
            depth=depth,
            update=update,
            update_all=update_all,
            index_only=index_only,
            overwrite=overwrite,
            init=init,
            extractors=extractors,
            parser=parser,
            extra_data=extra_data,
        )
        return response.json()
  • Helper method in the ArchiveBox API client class that performs the actual HTTP POST request to the ArchiveBox server's /api/v1/cli/add endpoint to add URLs.
    def cli_add(
        self,
        urls: List[str],
        tag: str = "",
        depth: int = 0,
        update: bool = False,
        update_all: bool = False,
        index_only: bool = False,
        overwrite: bool = False,
        init: bool = False,
        extractors: str = "",
        parser: str = "auto",
        extra_data: Optional[Dict] = None,
    ) -> requests.Response:
        """
        Execute archivebox add command

        Args:
            urls: List of URLs to archive.
            tag: Comma-separated tags (default: "").
            depth: Crawl depth (default: 0).
            update: Update existing snapshots (default: False).
            update_all: Update all snapshots (default: False).
            index_only: Index without archiving (default: False).
            overwrite: Overwrite existing files (default: False).
            init: Initialize collection if needed (default: False).
            extractors: Comma-separated list of extractors to use (default: "").
            parser: Parser type (default: "auto").
            extra_data: Additional parameters as a dictionary (optional).

        Returns:
            Response: The response object from the POST request.

        Raises:
            ParameterError: If the provided parameters are invalid.
        """
        data = {
            "urls": urls,
            "tag": tag,
            "depth": depth,
            "update": update,
            "update_all": update_all,
            "index_only": index_only,
            "overwrite": overwrite,
            "init": init,
            "extractors": extractors,
            "parser": parser,
        }
        if extra_data:
            data.update(extra_data)
        try:
            response = self._session.post(
                url=f"{self.url}/api/v1/cli/add",
                json=data,
                headers=self.headers,
                verify=self.verify,
            )
        except ValidationError as e:
            raise ParameterError(f"Invalid parameters: {e.errors()}")
        return response
  • MCP prompt helper that generates instructional text referencing the cli_add tool for adding URLs.
    @mcp.prompt
    def cli_add_prompt(
        urls: List[str],
        tag: str = "",
        depth: int = 0,
    ) -> str:
        """
        Generates a prompt for executing archivebox add command.
        """
        return f"Add new URLs to ArchiveBox: {urls}, with tags: '{tag}', depth: {depth}. Use the cli_add tool."
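
The client helper above builds its JSON body from the named arguments and then merges extra_data on top. The sketch below isolates that step so the merge order is easy to see. It makes no network call; the function name build_add_payload and the "note" key are hypothetical and used only for illustration.

```python
from typing import Any, Dict, List, Optional


def build_add_payload(
    urls: List[str],
    tag: str = "",
    depth: int = 0,
    update: bool = False,
    update_all: bool = False,
    index_only: bool = False,
    overwrite: bool = False,
    init: bool = False,
    extractors: str = "",
    parser: str = "auto",
    extra_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the body-building part of the client's cli_add."""
    data: Dict[str, Any] = {
        "urls": urls,
        "tag": tag,
        "depth": depth,
        "update": update,
        "update_all": update_all,
        "index_only": index_only,
        "overwrite": overwrite,
        "init": init,
        "extractors": extractors,
        "parser": parser,
    }
    if extra_data:
        # Same merge as in the helper: extra_data keys are applied last.
        data.update(extra_data)
    return data


payload = build_add_payload(
    ["https://example.com"],
    depth=0,
    extra_data={"depth": 1, "note": "hypothetical"},  # illustrative keys only
)
```

Because extra_data is merged last, a key that collides with a named argument (depth in this example) replaces that argument's value in the request body, so callers may want to keep extra_data limited to keys the named parameters do not cover.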


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Knuckles-Team/archivebox-api'
