push

Upload quantized model files and model card to HuggingFace Hub. Provide repository ID and local directory. Optionally specify original model ID and bit width for metadata. Returns repository URL and file count.

Instructions

Push a quantized model to HuggingFace Hub.

Uploads all model files from the output directory to a HuggingFace repository. Generates a model card (README.md) with metadata. Requires HuggingFace authentication (huggingface-cli login or HF_TOKEN).

Args:
- repo_id: HuggingFace repository ID (e.g. 'username/model-GGUF-4bit').
- model_dir: Local directory containing the quantized model files.
- model: Original model ID for the model card (optional).
- bits: Bit width used during quantization (for model card metadata).

Returns: Upload result with repository URL and file count.
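
Both failure modes called out above (missing directory, missing authentication) can be checked before the tool is ever invoked. A minimal sketch, assuming credentials arrive via the HF_TOKEN environment variable; the `preflight` helper is hypothetical, not part of the server:

```python
import os

def preflight(model_dir: str) -> list[str]:
    """Return problems that would make `push` fail early (hypothetical helper)."""
    problems = []
    if not os.path.isdir(model_dir):
        problems.append(f"Directory does not exist: {model_dir}")
    if not os.environ.get("HF_TOKEN"):
        problems.append("HF_TOKEN not set; run `huggingface-cli login` or export HF_TOKEN")
    return problems
```

An empty list means both preconditions hold; otherwise each string mirrors an error the tool itself would return.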

Input Schema

| Name | Required | Description | Default |
|----------|----------|-------------|---------|
| repo_id | Yes | | |
| model_dir | Yes | | |
| model | No | | |
| bits | No | | |
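
Since the schema carries no descriptions or defaults, a concrete argument payload helps. All values below are illustrative placeholders, not real repositories:

```python
# Illustrative tool-call arguments for `push`; every value is a placeholder
args = {
    "repo_id": "username/model-GGUF-4bit",  # target HuggingFace repository
    "model_dir": "./quantized-output",      # local directory with model files
    "model": "username/base-model",         # optional: base model for the card
    "bits": 4,                              # optional: handler defaults to 4
}

required = {"repo_id", "model_dir"}
missing = required - args.keys()  # empty set when the call is valid
```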

Output Schema


No arguments

Implementation Reference

  • The `push` tool handler function. Uses `@mcp.tool()` decorator to register as an MCP tool. Validates the model directory, authenticates with HuggingFace Hub, creates the repository, generates a model card, and uploads all files from the local directory.
    import os
    from typing import Any

    @mcp.tool()
    def push(
        repo_id: str,
        model_dir: str,
        model: str | None = None,
        bits: int = 4,
    ) -> dict[str, Any]:
        """Push a quantized model to HuggingFace Hub.
    
        Uploads all model files from the output directory to a HuggingFace
        repository. Generates a model card (README.md) with metadata.
        Requires HuggingFace authentication (huggingface-cli login or HF_TOKEN).
    
        Args:
            repo_id: HuggingFace repository ID (e.g. 'username/model-GGUF-4bit').
            model_dir: Local directory containing the quantized model files.
            model: Original model ID for the model card (optional).
            bits: Bit width used during quantization (for model card metadata).
    
        Returns:
            Upload result with repository URL and file count.
        """
        if not os.path.isdir(model_dir):
            return {
                "success": False,
                "error": f"Directory does not exist: {model_dir}",
            }
    
        try:
            from huggingface_hub import HfApi
        except ImportError:
            return {
                "success": False,
                "error": "huggingface-hub required. Install: pip install huggingface-hub",
                "install_cmd": "pip install huggingface-hub",
            }
    
        api = HfApi()
    
        # Check authentication
        try:
            user_info = api.whoami()
            username = user_info.get("name", "unknown")
        except Exception:
            return {
                "success": False,
                "error": (
                    "Not authenticated with HuggingFace. "
                    "Run: huggingface-cli login, or set HF_TOKEN environment variable."
                ),
            }
    
        # Create repo if needed
        try:
            api.create_repo(repo_id, exist_ok=True, repo_type="model")
        except Exception as e:
            return {
                "success": False,
                "error": f"Failed to create repository: {e}",
            }
    
        # Generate model card
        model_source = model or "unknown"
        card = _generate_model_card(model_source, repo_id, bits)
        card_path = os.path.join(model_dir, "README.md")
        with open(card_path, "w") as f:
            f.write(card)
    
        # Upload all files
        files_uploaded = 0
        errors = []
        for root, _dirs, files in os.walk(model_dir):
            for fname in files:
                fpath = os.path.join(root, fname)
                rel_path = os.path.relpath(fpath, model_dir)
                try:
                    api.upload_file(
                        path_or_fileobj=fpath,
                        path_in_repo=rel_path,
                        repo_id=repo_id,
                        repo_type="model",
                    )
                    files_uploaded += 1
                except Exception as e:
                    errors.append(f"{rel_path}: {e}")
    
        result = {
            "success": files_uploaded > 0,
            "repository": f"https://huggingface.co/{repo_id}",
            "files_uploaded": files_uploaded,
            "authenticated_as": username,
        }
    
        if errors:
            result["upload_errors"] = errors
    
        return result
  • The `@mcp.tool()` decorator that registers the `push` function as an MCP tool named 'push'.
    @mcp.tool()
  • The `_generate_model_card` helper function used by `push` to create a README.md for the HuggingFace repository with metadata about the base model, quantization bits, and usage instructions.
    def _generate_model_card(model_id: str, hub_repo: str, bits: int) -> str:
        """Generate a HuggingFace model card for uploaded quantized models."""
        return f"""---
    base_model: {model_id}
    tags:
    - quantized
    - turboquant
    - {bits}bit
    license: mit
    ---
    
    # {hub_repo.split('/')[-1]}
    
    **Quantized version of [{model_id}](https://huggingface.co/{model_id})**
    
    Quantized with [TurboQuant MCP](https://github.com/ShipItAndPray/mcp-turboquant) -- compress any LLM via MCP tools.
    
    ## Details
    
    | Property | Value |
    |----------|-------|
    | Base Model | [{model_id}](https://huggingface.co/{model_id}) |
    | Quantization | {bits}-bit |
    
    ## Usage
    
    ### GGUF (Ollama / llama.cpp / LM Studio)
    
    ```bash
    huggingface-cli download {hub_repo} --include "*.gguf"
    ```
    
    ### GPTQ / AWQ (vLLM / TGI)
    
    ```python
    from transformers import AutoModelForCausalLM
    model = AutoModelForCausalLM.from_pretrained("{hub_repo}")
    ```
    
    ---
    *Quantized with [mcp-turboquant](https://github.com/ShipItAndPray/mcp-turboquant)*
    """
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses uploading files, generating a model card, requiring authentication, and returning a result. It does not mention overwrite behavior or potential issues, but covers key behavioral traits adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is mostly concise with clear sections for args and returns, but could be more structured (e.g., bullet points). It is not overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters and an output schema, the description covers purpose, prerequisites, parameter meanings, and return values. It provides sufficient context for an agent to decide and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description fully explains each parameter: repo_id format, model_dir as local directory, model for model card, bits for metadata. This adds value beyond the schema property names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool pushes a quantized model to HuggingFace Hub, uploading model files and generating a model card. It uses specific verbs and resources, and differentiates from sibling tools like check, evaluate, and quantize.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use the tool (after quantization, with required HuggingFace authentication) and provides prerequisites, but does not explicitly state when not to use or compare with alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
