## Server Configuration

Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
## Capabilities

Features and capabilities supported by this server.
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
## Tools

Functions exposed to the LLM to take actions.
| Name | Description |
|---|---|
| infoA | Get model info from HuggingFace: parameters, size, architecture. Lightweight call using the HuggingFace API; no GPU or heavy dependencies required.<br>**Args:** `model`: HuggingFace model ID (e.g. `meta-llama/Llama-3.1-8B-Instruct`) or local path to a model directory.<br>**Returns:** Model metadata including architecture, parameter count, size, hidden dimensions, number of layers, vocabulary size, and context length. |
| checkA | Check available quantization backends on this system. Reports which quantization engines (GGUF/GPTQ/AWQ) are installed, whether PyTorch and transformers are available, GPU information (CUDA or Apple MPS), and system RAM. Lightweight system check; no arguments required.<br>**Returns:** Dictionary of available backends and hardware info. |
| recommendA | Recommend the best quantization format and bit width for a model. Analyzes the model size and your hardware (GPU VRAM, Apple Silicon, system RAM) to suggest the optimal format (GGUF/GPTQ/AWQ) and bit width (2-8), returning ranked recommendations with use-case explanations.<br>**Args:** `model`: HuggingFace model ID (e.g. `meta-llama/Llama-3.1-8B-Instruct`) or local path to a model directory.<br>**Returns:** Ranked recommendations with format, bits, reasoning, and use cases. |
| quantizeA | Quantize a HuggingFace model to GGUF, GPTQ, or AWQ format. This is a heavy operation that downloads and compresses the model; it requires the appropriate backend dependencies to be installed.<br>**Args:** `model`: HuggingFace model ID (e.g. `meta-llama/Llama-3.1-8B-Instruct`) or local path to a model directory. `format`: Output format: `gguf`, `gptq`, or `awq` (default: `gguf`). `bits`: Quantization bit width: 2, 3, 4, 5, or 8 (default: 4). `output_dir`: Directory to write output files (default: temp directory). `target`: Deployment target; `ollama`/`llamacpp`/`lmstudio` force GGUF, `vllm` forces AWQ.<br>**Returns:** Quantization result with file paths, sizes, and compression ratios. |
| evaluateA | Run perplexity evaluation on a quantized model. Measures model quality after quantization using perplexity scoring (lower perplexity means better quality) and includes a quality assessment (EXCELLENT/GOOD/FAIR/DEGRADED/POOR).<br>**Args:** `model_path`: Path to the quantized model file (GGUF) or directory (GPTQ/AWQ). `format`: Format of the quantized model; one of `gguf`, `gptq`, `awq`. `bits`: Bit width used during quantization (for quality context).<br>**Returns:** Perplexity score, quality assessment, and evaluation metadata. |
| pushA | Push a quantized model to HuggingFace Hub. Uploads all model files from the output directory to a HuggingFace repository and generates a model card (README.md) with metadata. Requires HuggingFace authentication (`huggingface-cli login` or `HF_TOKEN`).<br>**Args:** `repo_id`: HuggingFace repository ID (e.g. `username/model-GGUF-4bit`). `model_dir`: Local directory containing the quantized model files. `model`: Original model ID for the model card (optional). `bits`: Bit width used during quantization (for model card metadata).<br>**Returns:** Upload result with repository URL and file count. |
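As a sketch of how a client might invoke one of the tools above: MCP clients call tools through the JSON-RPC 2.0 `tools/call` method, passing the tool name and its arguments. The request shape below follows the MCP specification; the `id` value and the example model are illustrative, not fixed by this server.

```python
import json

# Hypothetical tools/call request for the `recommendA` tool. The `name` and
# `arguments` keys match the MCP tools/call params shape; the model ID is
# just an example value.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "recommendA",
        "arguments": {
            "model": "meta-llama/Llama-3.1-8B-Instruct",
        },
    },
}

# Serialize for sending over an MCP transport (stdio or HTTP).
payload = json.dumps(request)
print(payload)
```

The server's response arrives as a JSON-RPC result whose `content` carries the tool output, here the ranked format and bit-width recommendations.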
## Prompts

Interactive templates invoked by user choice.
| Name | Description |
|---|---|
| No prompts | |
## Resources

Contextual data attached and managed by the client.
| Name | Description |
|---|---|
| No resources | |