HF MCP Server
Provides tools to search models and datasets, fetch metadata, run inference on text, images and audio, and more from the Hugging Face Hub.
A Model Context Protocol server that gives Claude (and any MCP-compatible client) direct access to the Hugging Face Hub — search models and datasets, fetch metadata, run inference on text, images and audio, all from a single conversation.
There is no official Hugging Face MCP server. This fills that gap.
What you can do
Ask Claude things like:
"Find the top 5 trending text-generation models on Hugging Face"
"Compare gpt2 and distilgpt2 — which has more downloads and likes?"
"What does the README of meta-llama/Llama-2-7b say about usage?"
"Is cardiffnlp/twitter-roberta-base-sentiment-latest ready for inference?"
"Classify the sentiment of: I absolutely loved this film"
"What's in this image?" (with an image URL)
"Transcribe this audio file" (with an audio URL or local path)
Tools
| Tool | Description |
| --- | --- |
| | Search models by query, task, sort criteria |
| | Full metadata for a specific model |
| | README of a model (usage docs, examples, paper) |
| | Side-by-side stats for a list of models |
| | Currently trending models, optionally filtered by task |
| | Check if a model is warm/cold/loading |
| | Run text inference (classification, QA, zero-shot, etc.) |
| run_image_inference | Image classification / object detection from URL or file |
| run_audio_inference | Speech-to-text / audio classification from URL or file |
| generate_text | Text generation with streaming (requires HF Pro or inference credits) |
| | Search datasets on the Hub |
| | Combined metadata + README in one call |
Requirements
Python 3.11+
A Hugging Face account and access token (free tier works for most tools; generate_text requires Pro/credits)
Installation
# 1. Clone the repo
git clone https://github.com/YOUR_USERNAME/hf-mcp-server.git
cd hf-mcp-server
# 2. Create and activate a virtual environment
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Set your Hugging Face token
cp .env.example .env
# Edit .env and replace the placeholder with your real token
Configuration
Edit .env:
HF_TOKEN=hf_your_token_here
LOG_LEVEL=INFO
Get your token at huggingface.co/settings/tokens. A Read token is enough for all tools.
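The snippet below is a minimal sketch of how config.py might read these two variables. It assumes the values are already in the process environment (exported in the shell, or loaded from .env by a helper). The names Settings and load_settings are hypothetical and are not taken from the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two values defined in .env."""
    hf_token: str
    log_level: str


def load_settings(env=None):
    """Read HF_TOKEN and LOG_LEVEL from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    # Assumption: the placeholder from .env.example counts as "not set".
    if not token or token == "hf_your_token_here":
        raise ValueError("HF_TOKEN is missing; edit .env and add a real token")
    # LOG_LEVEL falls back to INFO, matching the example above.
    level = env.get("LOG_LEVEL", "INFO").upper()
    return Settings(hf_token=token, log_level=level)
```

Passing a plain dict instead of os.environ keeps the function easy to exercise without touching the real environment.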
Connect to Claude Desktop
Open your Claude Desktop config file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Add the mcpServers entry (adjust the path to match your setup):
{
  "mcpServers": {
    "huggingface": {
      "command": "/absolute/path/to/hf-mcp-server/venv/bin/python",
      "args": ["/absolute/path/to/hf-mcp-server/main.py"]
    }
  }
}
Windows example:
{
  "mcpServers": {
    "huggingface": {
      "command": "C:\\Users\\YourName\\Projects\\hf-mcp-server\\venv\\Scripts\\python.exe",
      "args": ["C:\\Users\\YourName\\Projects\\hf-mcp-server\\main.py"]
    }
  }
}
Restart Claude Desktop. You should see the Hugging Face tools available in the toolbar.
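If you would rather not type the absolute paths by hand, a small helper can build the entry for you. This is a sketch only: mcp_server_entry is a hypothetical name, and it assumes the virtual environment is called venv and sits inside the repository, as in the installation steps above.

```python
import json
import sys
from pathlib import Path


def mcp_server_entry(repo_dir, platform=sys.platform):
    """Build the mcpServers entry for a checkout located at repo_dir."""
    repo = Path(repo_dir)
    # The venv layout differs: Scripts\python.exe on Windows, bin/python elsewhere.
    if platform.startswith("win"):
        python = repo / "venv" / "Scripts" / "python.exe"
    else:
        python = repo / "venv" / "bin" / "python"
    return {
        "mcpServers": {
            "huggingface": {
                "command": str(python),
                "args": [str(repo / "main.py")],
            }
        }
    }


if __name__ == "__main__":
    # Run from inside the repository; paste the output into the config file.
    print(json.dumps(mcp_server_entry(Path.cwd().resolve()), indent=2))
```

json.dumps escapes backslashes, which is why Windows paths appear doubled in the example above.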
Running the tests
pytest tests/ -v
All tests mock the HF API, so they make no network calls and need no token.
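To give a feel for what a mocked test looks like, here is an illustrative sketch using unittest.mock.AsyncMock from the standard library. The function get_model_info, the client method model_info, and the "success" envelope are assumptions made for this example; only the error envelope is described in this README, and the real tests in tests/ may be organised differently.

```python
import asyncio
from unittest.mock import AsyncMock


async def get_model_info(client, model_id):
    """Hypothetical stand-in for a tool that wraps a single client call."""
    try:
        data = await client.model_info(model_id)
        return {"status": "success", "data": data}  # assumed success shape
    except Exception as exc:
        # Errors are reported in the response instead of being raised.
        return {"status": "error", "error": str(exc)}


def test_get_model_info_uses_mocked_client():
    client = AsyncMock()  # no network, no token
    client.model_info.return_value = {"id": "gpt2"}
    result = asyncio.run(get_model_info(client, "gpt2"))
    client.model_info.assert_awaited_once_with("gpt2")
    assert result == {"status": "success", "data": {"id": "gpt2"}}


def test_get_model_info_reports_errors():
    client = AsyncMock()
    client.model_info.side_effect = RuntimeError("boom")
    result = asyncio.run(get_model_info(client, "gpt2"))
    assert result == {"status": "error", "error": "boom"}
```

Because the client is replaced wholesale, the tests exercise only the tool's own logic and response shape.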
Architecture
hf-mcp-server/
├── main.py                  # FastMCP server: 12 tools defined with @mcp.tool()
├── config.py                # Environment variables and constants
├── src/
│   └── clients/
│       └── hf_client.py     # Async HF API wrapper
│           ├── HFClient     # Main client (httpx.AsyncClient)
│           ├── RateLimiter  # Sliding-window limiter (async, safe across concurrent tasks)
│           └── TTLCache     # In-memory cache with TTL
└── tests/
    ├── test_hf_client.py    # Unit tests for RateLimiter and TTLCache
    └── test_tools.py        # Unit tests for all 12 MCP tools (mocked client)
Key design decisions:
Async throughout: httpx.AsyncClient + asyncio, no blocking requests calls.
Rate limiting: sliding window (not a fixed counter), implemented with asyncio.Lock so concurrent tool calls don't race each other.
TTL cache: all GET metadata calls are cached for 1 hour by default. Inference and inference-status calls skip the cache.
truststore: uses the OS native certificate store (needed on networks with TLS inspection/corporate proxies).
Error handling: every tool catches exceptions and returns {"status": "error", "error": "..."} instead of crashing the MCP connection.
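The rate limiter and cache described above can be sketched roughly as follows. This is not the code in hf_client.py: the class names match the architecture tree, but the constructor parameters, method names, and the injectable clock are assumptions chosen to keep the example short and testable.

```python
import asyncio
import time
from collections import deque


class RateLimiter:
    """Sliding window: at most max_calls within any `window` seconds."""

    def __init__(self, max_calls, window):
        self.max_calls = max_calls
        self.window = window
        self._calls = deque()        # timestamps of recent calls, oldest first
        self._lock = asyncio.Lock()  # serialises concurrent acquire() calls

    async def acquire(self):
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop timestamps that have slid out of the window.
                while self._calls and now - self._calls[0] >= self.window:
                    self._calls.popleft()
                if len(self._calls) < self.max_calls:
                    self._calls.append(now)
                    return
                # Wait until the oldest call expires, then re-check.
                await asyncio.sleep(self.window - (now - self._calls[0]))


class TTLCache:
    """In-memory cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl=3600.0, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: evict lazily on read
            return default
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self.ttl, value)
```

Unlike a fixed counter that resets on a schedule, the deque of timestamps means the limit holds over every rolling window. Note that asyncio.Lock coordinates tasks on one event loop; it does not protect against access from multiple OS threads.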
Notes
generate_text uses Server-Sent Events streaming internally and returns the complete text when done. It requires a HF Pro account or inference credits; most text-generation models are not available on the free tier.
run_image_inference and run_audio_inference accept both remote URLs and absolute local file paths.
The HF Inference API routes requests through router.huggingface.co/hf-inference. Not all models are available on all providers; if you get a "Model not supported by provider" error, try a different model or check the HF Inference docs.
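As a rough illustration of the first two notes, the sketch below collects streamed text from Server-Sent Events lines and tells remote URLs apart from local paths. Both function names are hypothetical. The chunk layout (choices[0].delta.content) and the [DONE] sentinel are assumptions based on the common OpenAI-style streaming format; the payload this server actually receives may differ.

```python
import json
from urllib.parse import urlparse


def collect_sse_text(lines):
    """Join the text deltas from an iterable of SSE lines into one string."""
    parts = []
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry payloads; blank lines and comments are skipped.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        # Assumed chunk shape: {"choices": [{"delta": {"content": "..."}}]}
        choices = chunk.get("choices") or [{}]
        delta = choices[0].get("delta") or {}
        parts.append(delta.get("content") or "")
    return "".join(parts)


def is_remote(source):
    """True for http(s) URLs; anything else is treated as a local file path."""
    return urlparse(source).scheme in ("http", "https")
```

Accumulating the deltas and returning one string is what lets a streaming endpoint sit behind a tool that returns a single response.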
License
MIT — see LICENSE.