quelllm-mcp
MCP server exposing the quelllm.fr catalog of 190+ open-weights LLMs via Model Context Protocol tools. Use it from Claude Code, Cursor, Continue, or any MCP-compatible client to query models, compare them, estimate VRAM, and compute API vs self-hosted cost.
Tools exposed
| Tool | Description |
| list_models | List models with filters (origin code, family, max params in B) |
| | Full record for one model (params, VRAM per quant, context window, family, tags, license, URLs) |
| compare | Side-by-side comparison with verdict |
| estimate_vram | VRAM in GB at the chosen quant, plus recommended GPU/Mac tiers |
| estimate_cost | Cost in EUR: full table of API providers vs self-hosted hardware, or a specific id |
| | Fuzzy search by name, family, tag, author |
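As a rough illustration of what a VRAM estimate involves, the sketch below multiplies the parameter count by bytes per weight and adds a flat overhead. The bits-per-weight values and the 20% overhead are assumptions made for illustration, not the figures used by quelllm.fr, and `rough_vram_gb` is a hypothetical helper that is not part of this server.

```python
# Hypothetical bits-per-weight figures; real quantization formats vary.
BITS_PER_WEIGHT = {"fp16": 16.0, "q8": 8.0, "q4": 4.0}


def rough_vram_gb(params_b: float, quant: str, overhead: float = 0.20) -> float:
    """Approximate VRAM in GB: weight size plus a flat overhead (assumed 20%)."""
    bytes_per_weight = BITS_PER_WEIGHT[quant] / 8
    weights_gb = params_b * bytes_per_weight  # 1B params at 1 byte each is ~1 GB
    return round(weights_gb * (1 + overhead), 1)
```

Under these assumptions `rough_vram_gb(24, "q4")` gives 14.4. The catalog stores VRAM per quant for each model, so the tool's own numbers should be preferred over this back-of-the-envelope figure.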
Install
Install from source (not yet on PyPI):
pip install git+https://github.com/MGM-FALCON/quelllm-mcp.git
Or run without installing, using uv:
uvx --from git+https://github.com/MGM-FALCON/quelllm-mcp.git quelllm-mcp
For local development:
git clone https://github.com/MGM-FALCON/quelllm-mcp.git
cd quelllm-mcp
pip install -e .
Use with Claude Code
Add to ~/.claude.json or a project's .mcp.json. If you installed with pip:
{
  "mcpServers": {
    "quelllm": {
      "command": "quelllm-mcp"
    }
  }
}
Or zero-install with uvx:
{
  "mcpServers": {
    "quelllm": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/MGM-FALCON/quelllm-mcp.git", "quelllm-mcp"]
    }
  }
}
Use with Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
  "mcpServers": {
    "quelllm": {
      "command": "quelllm-mcp"
    }
  }
}
Use with Cursor / Continue / Cline
Most MCP clients accept the same JSON config:
{
  "command": "quelllm-mcp"
}
Example queries (from your client)
> Which Mistral LLMs can run on an RTX 5070 Ti 16GB?
→ list_models(filter_family='Mistral', max_params_b=24)
→ estimate_vram('mistral-small-24b', 'q4')
> Compare Llama 3.3 70B vs Qwen 2.5 32B
→ compare('llama33-70b', 'qwen25-32b')
> I use 10M input tokens + 2.5M output tokens per month. How much would I pay with OpenAI vs DeepSeek?
→ estimate_cost(10_000_000, 2_500_000)
Data source
All data pulled from quelllm.fr/api/ (CC BY 4.0, no key, CORS-enabled). Cached locally for 1h to avoid rate-limiting.
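The one-hour cache can be pictured as a small time-to-live wrapper around the fetch call. This is a minimal sketch rather than the server's actual code: `TTLCache`, its `fetch` callback, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache fetch results per key for ttl_seconds (3600 matches the 1h above)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the network call
        value = fetch()  # missing or expired: fetch again and remember when
        self._store[key] = (now, value)
        return value
```

A caller would pass the HTTP request as the `fetch` callback, so repeated tool calls within the hour reuse the stored response instead of hitting the API again.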
API pricing data (GPT-5, Claude Opus 4.7, Gemini 2.5, DeepSeek, Mistral) and hardware pricing (RTX 50-series, Mac M4) are hardcoded as of 2026-05; re-verify them every six months.
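With a hardcoded price table, a cost estimate reduces to simple arithmetic. The sketch below uses the volumes from the example query (10M input and 2.5M output tokens per month). The provider names and per-million-token prices are placeholders, not the values shipped with the server, and `monthly_cost_eur` is a hypothetical function.

```python
# Placeholder prices in EUR per 1M tokens as (input, output). Not real quotes.
PRICES_EUR_PER_M = {
    "provider_a": (1.00, 4.00),
    "provider_b": (0.25, 1.00),
}


def monthly_cost_eur(input_tokens: int, output_tokens: int) -> dict:
    """Return {provider: monthly cost in EUR} for the given token volumes."""
    costs = {}
    for name, (price_in, price_out) in PRICES_EUR_PER_M.items():
        cost = input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out
        costs[name] = round(cost, 2)
    return costs
```

With these placeholder prices, `monthly_cost_eur(10_000_000, 2_500_000)` yields 20.0 for `provider_a` and 5.0 for `provider_b`. Because the table is static, stale prices translate directly into wrong totals, which is why the periodic check matters.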
License
MIT — see LICENSE.
Contributing
Source: https://github.com/MGM-FALCON/quelllm-mcp
Issues and PRs are welcome, particularly:
- API pricing updates (every six months)
- Hardware additions (new GPUs, Mac Mx series)
- New tools (e.g. find_alternatives_to(model_id), recommend_gpu(budget_eur))
Tests
A pytest smoke suite lives under tests/. It covers all 6 tools and the v1.1.0
output invariants, never touches the network (local fixture + mocked httpx),
and stubs the mcp SDK when it isn't importable — so it also runs on Python 3.9.
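Stubbing an optional dependency, as the suite does for the mcp SDK, usually means registering a placeholder module in `sys.modules` before the code under test imports it. A minimal sketch of that pattern follows; `ensure_importable` is a hypothetical helper, and the real test suite may do this differently.

```python
import importlib
import sys
import types


def ensure_importable(name: str) -> bool:
    """Import `name`, or register an empty stub module. Returns True if stubbed."""
    try:
        importlib.import_module(name)
        return False
    except ImportError:
        sys.modules[name] = types.ModuleType(name)  # empty stand-in module
        return True
```

In a `conftest.py` this would run before the server module is imported, for example `ensure_importable("mcp")`. A package whose submodules are imported directly would also need stub entries for those submodule names.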
pip install -e ".[test]"
pytest
Author
Mohamed Meguedmi: LinkedIn · Hugging Face
Founder of La Gazette IA and QuelLLM.fr.