push
Upload quantized model files and model card to HuggingFace Hub. Provide repository ID and local directory. Optionally specify original model ID and bit width for metadata. Returns repository URL and file count.
Instructions
Push a quantized model to HuggingFace Hub.
Uploads all model files from the output directory to a HuggingFace repository. Generates a model card (README.md) with metadata. Requires HuggingFace authentication (huggingface-cli login or HF_TOKEN).
Args:
- `repo_id`: HuggingFace repository ID (e.g. 'username/model-GGUF-4bit').
- `model_dir`: Local directory containing the quantized model files.
- `model`: Original model ID for the model card (optional).
- `bits`: Bit width used during quantization (for model card metadata).
Returns: Upload result with repository URL and file count.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| repo_id | Yes | HuggingFace repository ID (e.g. 'username/model-GGUF-4bit') | |
| model_dir | Yes | Local directory containing the quantized model files | |
| model | No | Original model ID for the model card | `None` |
| bits | No | Bit width used during quantization (for model card metadata) | `4` |
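As a sketch of how an MCP client might fill in these arguments (the repository ID, paths, and model name below are hypothetical examples, not values from this project):

```python
import json

# Hypothetical arguments for the `push` tool; names and paths are made up.
args = {
    "repo_id": "alice/Llama-3.2-1B-GGUF-4bit",  # HuggingFace repository ID
    "model_dir": "/tmp/quantized-model",        # local dir with quantized files
    "model": "meta-llama/Llama-3.2-1B",         # original model, for the card
    "bits": 4,                                  # quantization bit width
}
print(json.dumps(args, indent=2))
```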
Output Schema
No output schema is declared. The handler returns a plain dictionary with `success`, `repository`, `files_uploaded`, `authenticated_as`, and, when some uploads fail, `upload_errors` (see Implementation Reference).
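Based on the handler's return statement shown under Implementation Reference, a successful upload produces a dict shaped like the following (the concrete values here are illustrative only):

```python
# Illustrative result shape; the field names come from the handler's return value.
result = {
    "success": True,
    "repository": "https://huggingface.co/alice/Llama-3.2-1B-GGUF-4bit",
    "files_uploaded": 3,
    "authenticated_as": "alice",
    # "upload_errors": [...]  # present only when some files failed to upload
}
print(result["repository"])
```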
Implementation Reference
- `mcp_turboquant/server.py:322-418` (handler): The `push` tool handler function. Uses the `@mcp.tool()` decorator to register as an MCP tool. Validates the model directory, authenticates with HuggingFace Hub, creates the repository, generates a model card, and uploads all files from the local directory.

  ```python
  @mcp.tool()
  def push(
      repo_id: str,
      model_dir: str,
      model: str | None = None,
      bits: int = 4,
  ) -> dict[str, Any]:
      """Push a quantized model to HuggingFace Hub.

      Uploads all model files from the output directory to a HuggingFace
      repository. Generates a model card (README.md) with metadata.
      Requires HuggingFace authentication (huggingface-cli login or HF_TOKEN).

      Args:
          repo_id: HuggingFace repository ID (e.g. 'username/model-GGUF-4bit').
          model_dir: Local directory containing the quantized model files.
          model: Original model ID for the model card (optional).
          bits: Bit width used during quantization (for model card metadata).

      Returns:
          Upload result with repository URL and file count.
      """
      if not os.path.isdir(model_dir):
          return {
              "success": False,
              "error": f"Directory does not exist: {model_dir}",
          }

      try:
          from huggingface_hub import HfApi
      except ImportError:
          return {
              "success": False,
              "error": "huggingface-hub required. Install: pip install huggingface-hub",
              "install_cmd": "pip install huggingface-hub",
          }

      api = HfApi()

      # Check authentication
      try:
          user_info = api.whoami()
          username = user_info.get("name", "unknown")
      except Exception:
          return {
              "success": False,
              "error": (
                  "Not authenticated with HuggingFace. "
                  "Run: huggingface-cli login, or set HF_TOKEN environment variable."
              ),
          }

      # Create repo if needed
      try:
          api.create_repo(repo_id, exist_ok=True, repo_type="model")
      except Exception as e:
          return {
              "success": False,
              "error": f"Failed to create repository: {e}",
          }

      # Generate model card
      model_source = model or "unknown"
      card = _generate_model_card(model_source, repo_id, bits)
      card_path = os.path.join(model_dir, "README.md")
      with open(card_path, "w") as f:
          f.write(card)

      # Upload all files
      files_uploaded = 0
      errors = []
      for root, _dirs, files in os.walk(model_dir):
          for fname in files:
              fpath = os.path.join(root, fname)
              rel_path = os.path.relpath(fpath, model_dir)
              try:
                  api.upload_file(
                      path_or_fileobj=fpath,
                      path_in_repo=rel_path,
                      repo_id=repo_id,
                      repo_type="model",
                  )
                  files_uploaded += 1
              except Exception as e:
                  errors.append(f"{rel_path}: {e}")

      result = {
          "success": files_uploaded > 0,
          "repository": f"https://huggingface.co/{repo_id}",
          "files_uploaded": files_uploaded,
          "authenticated_as": username,
      }
      if errors:
          result["upload_errors"] = errors
      return result
  ```

- `mcp_turboquant/server.py:322-322` (registration): The `@mcp.tool()` decorator that registers the `push` function as an MCP tool named 'push'.

  ```python
  @mcp.tool()
  ```

- `mcp_turboquant/server.py:421-462` (helper): The `_generate_model_card` helper function used by `push` to create a README.md for the HuggingFace repository with metadata about the base model, quantization bits, and usage instructions.

  ````python
  def _generate_model_card(model_id: str, hub_repo: str, bits: int) -> str:
      """Generate a HuggingFace model card for uploaded quantized models."""
      return f"""---
  base_model: {model_id}
  tags:
  - quantized
  - turboquant
  - {bits}bit
  license: mit
  ---

  # {hub_repo.split('/')[-1]}

  **Quantized version of [{model_id}](https://huggingface.co/{model_id})**

  Quantized with [TurboQuant MCP](https://github.com/ShipItAndPray/mcp-turboquant) -- compress any LLM via MCP tools.

  ## Details

  | Property | Value |
  |----------|-------|
  | Base Model | [{model_id}](https://huggingface.co/{model_id}) |
  | Quantization | {bits}-bit |

  ## Usage

  ### GGUF (Ollama / llama.cpp / LM Studio)

  ```bash
  huggingface-cli download {hub_repo} --include "*.gguf"
  ```

  ### GPTQ / AWQ (vLLM / TGI)

  ```python
  from transformers import AutoModelForCausalLM

  model = AutoModelForCausalLM.from_pretrained("{hub_repo}")
  ```

  ---

  *Quantized with [mcp-turboquant](https://github.com/ShipItAndPray/mcp-turboquant)*
  """
  ````
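The handler's upload loop maps each local file to its path inside the repository with `os.path.relpath`. A minimal, local-only sketch of that mapping (no network calls; the directory contents below are made up):

```python
import os
import tempfile

# Build a throwaway model directory with a nested file, as `push` might see it.
model_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(model_dir, "sub"), exist_ok=True)
for name in ["model.gguf", os.path.join("sub", "config.json")]:
    with open(os.path.join(model_dir, name), "w") as f:
        f.write("stub")

# Same walk/relpath logic as the handler's upload loop: every file under
# model_dir becomes a repo-relative path passed as `path_in_repo`.
repo_paths = []
for root, _dirs, files in os.walk(model_dir):
    for fname in files:
        fpath = os.path.join(root, fname)
        repo_paths.append(os.path.relpath(fpath, model_dir))

print(sorted(repo_paths))
```

Because paths are made relative to `model_dir`, the repository mirrors the local directory layout, including subdirectories.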