cli_add

Add URLs to ArchiveBox for web archiving, with options to tag content, control crawl depth, update existing snapshots, and configure extraction methods.

Instructions

Execute the archivebox add command.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| urls | Yes | List of URLs to archive | |
| tag | No | Comma-separated tags | |
| depth | No | Crawl depth | |
| update | No | Update existing snapshots | |
| update_all | No | Update all snapshots | |
| index_only | No | Index without archiving | |
| overwrite | No | Overwrite existing files | |
| init | No | Initialize collection if needed | |
| extractors | No | Comma-separated list of extractors to use | |
| parser | No | Parser type | auto |
| extra_data | No | Additional parameters as a dictionary | |
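
To make the schema concrete, here is a minimal sketch that merges caller-supplied arguments over the tool's defaults and rejects missing or unknown parameters. The names `CLI_ADD_DEFAULTS` and `build_cli_add_arguments` are hypothetical and exist only for illustration; the default values mirror the Field definitions listed in the Implementation Reference, and treating an empty `urls` list as missing is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict

# Defaults mirrored from the tool's Field definitions; "urls" has no default.
CLI_ADD_DEFAULTS: Dict[str, Any] = {
    "tag": "",
    "depth": 0,
    "update": False,
    "update_all": False,
    "index_only": False,
    "overwrite": False,
    "init": False,
    "extractors": "",
    "parser": "auto",
    "extra_data": None,
}


def build_cli_add_arguments(**arguments: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller arguments over the documented defaults."""
    # Assumption: an empty or absent "urls" value is treated as a missing required field.
    if not arguments.get("urls"):
        raise ValueError("'urls' is required and must be a non-empty list")
    unknown = set(arguments) - set(CLI_ADD_DEFAULTS) - {"urls"}
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    # Caller-supplied values take precedence over the defaults.
    return {**CLI_ADD_DEFAULTS, **arguments}


args = build_cli_add_arguments(urls=["https://example.com"], tag="news,tech", depth=1)
print(args["parser"])  # falls back to the documented default
```

Only `urls` has to be supplied; every other parameter falls back to its default.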

Implementation Reference

  • The primary handler and registration for the MCP 'cli_add' tool. This FastMCP @tool-decorated function defines the tool schema via Pydantic Fields and executes the logic by calling the underlying Api.cli_add method.
        exclude_args=[
            "archivebox_url",
            "username",
            "password",
            "token",
            "api_key",
            "verify",
        ],
        tags={"cli"},
    )
    def cli_add(
        urls: List[str] = Field(
            description="List of URLs to archive",
        ),
        tag: str = Field("", description="Comma-separated tags"),
        depth: int = Field(0, description="Crawl depth"),
        update: bool = Field(False, description="Update existing snapshots"),
        update_all: bool = Field(False, description="Update all snapshots"),
        index_only: bool = Field(False, description="Index without archiving"),
        overwrite: bool = Field(False, description="Overwrite existing files"),
        init: bool = Field(False, description="Initialize collection if needed"),
        extractors: str = Field(
            "", description="Comma-separated list of extractors to use"
        ),
        parser: str = Field("auto", description="Parser type"),
        extra_data: Optional[Dict] = Field(
            None, description="Additional parameters as a dictionary"
        ),
        archivebox_url: str = Field(
            default=os.environ.get("ARCHIVEBOX_URL", None),
            description="The URL of the ArchiveBox instance",
        ),
        username: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_USERNAME", None),
            description="Username for authentication",
        ),
        password: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_PASSWORD", None),
            description="Password for authentication",
        ),
        token: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_TOKEN", None),
            description="Bearer token for authentication",
        ),
        api_key: Optional[str] = Field(
            default=os.environ.get("ARCHIVEBOX_API_KEY", None),
            description="API key for authentication",
        ),
        verify: Optional[bool] = Field(
            default=to_boolean(os.environ.get("ARCHIVEBOX_VERIFY", "True")),
            description="Whether to verify SSL certificates",
        ),
    ) -> dict:
        """
        Execute archivebox add command.
        """
        client = Api(
            url=archivebox_url,
            username=username,
            password=password,
            token=token,
            api_key=api_key,
            verify=verify,
        )
        response = client.cli_add(
            urls=urls,
            tag=tag,
            depth=depth,
            update=update,
            update_all=update_all,
            index_only=index_only,
            overwrite=overwrite,
            init=init,
            extractors=extractors,
            parser=parser,
            extra_data=extra_data,
        )
        return response.json()
  • Supporting helper: the Api.cli_add method in the client library that performs the HTTP POST to the ArchiveBox server's /api/v1/cli/add endpoint to execute the add command.
    def cli_add(
        self,
        urls: List[str],
        tag: str = "",
        depth: int = 0,
        update: bool = False,
        update_all: bool = False,
        index_only: bool = False,
        overwrite: bool = False,
        init: bool = False,
        extractors: str = "",
        parser: str = "auto",
        extra_data: Optional[Dict] = None,
    ) -> requests.Response:
        """
        Execute archivebox add command

        Args:
            urls: List of URLs to archive.
            tag: Comma-separated tags (default: "").
            depth: Crawl depth (default: 0).
            update: Update existing snapshots (default: False).
            update_all: Update all snapshots (default: False).
            index_only: Index without archiving (default: False).
            overwrite: Overwrite existing files (default: False).
            init: Initialize collection if needed (default: False).
            extractors: Comma-separated list of extractors to use (default: "").
            parser: Parser type (default: "auto").
            extra_data: Additional parameters as a dictionary (optional).

        Returns:
            Response: The response object from the POST request.

        Raises:
            ParameterError: If the provided parameters are invalid.
        """
        data = {
            "urls": urls,
            "tag": tag,
            "depth": depth,
            "update": update,
            "update_all": update_all,
            "index_only": index_only,
            "overwrite": overwrite,
            "init": init,
            "extractors": extractors,
            "parser": parser,
        }

        if extra_data:
            data.update(extra_data)

        try:
            response = self._session.post(
                url=f"{self.url}/api/v1/cli/add",
                json=data,
                headers=self.headers,
                verify=self.verify,
            )
        except ValidationError as e:
            raise ParameterError(f"Invalid parameters: {e.errors()}")
        return response
  • Helper prompt function that generates natural language suggestions for using the cli_add tool.
    @mcp.prompt
    def cli_add_prompt(
        urls: List[str],
        tag: str = "",
        depth: int = 0,
    ) -> str:
        """
        Generates a prompt for executing archivebox add command.
        """
        return f"Add new URLs to ArchiveBox: {urls}, with tags: '{tag}', depth: {depth}. Use the cli_add tool."
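
The payload assembly inside the Api.cli_add helper listed above can be isolated into a pure function to inspect what would be posted to /api/v1/cli/add. The sketch below copies only the dictionary-building logic from that method; the function name `build_add_payload` is hypothetical, and no HTTP request is made. Because `extra_data` is applied with `dict.update`, any key it shares with a named parameter replaces that parameter's value.

```python
import json
from typing import Dict, List, Optional


def build_add_payload(
    urls: List[str],
    tag: str = "",
    depth: int = 0,
    update: bool = False,
    update_all: bool = False,
    index_only: bool = False,
    overwrite: bool = False,
    init: bool = False,
    extractors: str = "",
    parser: str = "auto",
    extra_data: Optional[Dict] = None,
) -> Dict:
    """Hypothetical stand-in for the dict construction inside Api.cli_add."""
    data = {
        "urls": urls,
        "tag": tag,
        "depth": depth,
        "update": update,
        "update_all": update_all,
        "index_only": index_only,
        "overwrite": overwrite,
        "init": init,
        "extractors": extractors,
        "parser": parser,
    }
    # Keys in extra_data are merged last, so they override the named parameters.
    if extra_data:
        data.update(extra_data)
    return data


payload = build_add_payload(
    urls=["https://example.com"],
    tag="docs",
    extra_data={"depth": 1},  # overrides the default depth of 0
)
print(json.dumps(payload, indent=2))
```

Note that `extra_data` itself is never sent as a key; only its contents are merged into the request body.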

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Knuckles-Team/archivebox-api'
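
The same request can be issued from Python. This is a minimal sketch using only the standard library; the response format is not described here, so the code merely assumes the body is JSON. The names `SERVER_URL`, `build_request`, and `fetch_server_info` are hypothetical, and the `Accept` header is an addition of this sketch, not part of the curl command.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/Knuckles-Team/archivebox-api"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build a GET request equivalent to the curl command above."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},  # assumption: the API serves JSON
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is UTF-8 encoded JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the actual network request; `build_request()` alone only constructs it, which is convenient for inspection or testing.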

If you have feedback or need assistance with the MCP directory API, please join our Discord server.