info
Retrieve model metadata from HuggingFace including architecture, parameters, size, and specifications for quantization planning.
Instructions
Get model info from HuggingFace — parameters, size, architecture.
Lightweight call using the HuggingFace API. No GPU or heavy dependencies required.
Args: model: HuggingFace model ID (e.g. 'meta-llama/Llama-3.1-8B-Instruct') or local path to a model directory.
Returns: Model metadata including architecture, parameter count, size, hidden dimensions, number of layers, vocabulary size, and context length.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
No arguments | |||
Implementation Reference
- mcp_turboquant/server.py:37-87 (handler)The "info" tool is registered here and defines the handler that calls `get_model_info` and formats the returned dictionary.
@mcp.tool() def info(model: str) -> dict[str, Any]: """Get model info from HuggingFace — parameters, size, architecture. Lightweight call using the HuggingFace API. No GPU or heavy dependencies required. Args: model: HuggingFace model ID (e.g. 'meta-llama/Llama-3.1-8B-Instruct') or local path to a model directory. Returns: Model metadata including architecture, parameter count, size, hidden dimensions, number of layers, vocabulary size, and context length. """ result = get_model_info(model) if not result.get("found"): return { "error": f"Model not found: {result.get('error', 'unknown')}", "model": model, } # Build a clean response (strip internal fields like raw config) output = { "model": result.get("model_id", result.get("source")), "found": True, "architecture": result.get("arch", "unknown"), "parameters": result.get("params_human", "unknown"), "size": result.get("size_human", "unknown"), "size_bytes": result.get("size_bytes", 0), "hidden_size": result.get("hidden_size", 0), "num_layers": result.get("num_layers", 0), "vocab_size": result.get("vocab_size", 0), "context_length": result.get("context_length", 0), } if result.get("local"): output["local"] = True # Add compression estimates if result.get("size_bytes"): sz = result["size_bytes"] output["estimated_sizes"] = { "4bit": format_size(sz / estimate_compression(16, 4)), "8bit": format_size(sz / estimate_compression(16, 8)), } return output