# ollama-mcp

An MCP (Model Context Protocol) server that exposes local Ollama instances as tools for Claude Code. It lets Claude offload code generation, drafts, embeddings, and quick questions to your local GPUs.
## Setup

Run the setup script:

```sh
bash setup.sh
```

This creates a venv, installs dependencies, generates a machine-specific `config.json`, and registers the MCP server with Claude Code.

> **Note:** `setup.sh` uses `cygpath` and targets Windows (Git Bash / MSYS2). On Linux/macOS, replace the `cygpath -w` calls with the paths directly, or register manually:
>
> ```sh
> claude mcp add ollama -s user -- /path/to/.venv/bin/python /path/to/src/ollama_mcp/server.py
> ```

Then restart Claude Code.
## Tools

| Tool | Description |
| --- | --- |
|  | Single-turn prompt → response |
|  | Multi-turn conversation |
|  | Generate embedding vectors |
|  | List models on your Ollama instances |
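For context, the single-turn tool corresponds to Ollama's standard `/api/generate` HTTP endpoint. The sketch below shows what such a call looks like against a local Ollama instance; it is illustrative only, not this repo's actual implementation, and the function names are made up for the example:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port


def build_generate_payload(model: str, prompt: str) -> dict:
    """Build the request body for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama for a single JSON object instead of a stream.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str, host: str = OLLAMA_URL) -> str:
    """Send a single-turn prompt to Ollama and return the response text."""
    body = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

With a model pulled locally, `generate("llama3", "Write a haiku about GPUs.")` returns the model's completion as a string.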
## Configuration

Copy `config.example.json` to `config.json` and fill in your machine details, or let `setup.sh` generate it interactively.
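The authoritative schema is whatever `config.example.json` contains. Purely as an illustration, a multi-host config for this kind of server might look something like the following (every field name here is an assumption, not this repo's actual schema):

```json
{
  "hosts": [
    {"name": "workstation", "url": "http://192.168.1.10:11434"},
    {"name": "laptop", "url": "http://localhost:11434"}
  ],
  "default_model": "llama3",
  "timeout_seconds": 120
}
```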
## Requirements

- Python 3.10+
- Ollama 0.4.0+ running on at least one machine
- Claude Code with MCP support
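To confirm a host meets the Ollama version requirement, you can query Ollama's `/api/version` endpoint and compare against 0.4.0. A standalone sketch (the helper names are illustrative):

```python
import json
import urllib.request


def version_tuple(v: str) -> tuple:
    """Parse a dotted version string like '0.4.2' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))


def meets_requirement(version: str, minimum: str = "0.4.0") -> bool:
    """True if `version` is at least `minimum` (needed for the embed API)."""
    return version_tuple(version) >= version_tuple(minimum)


def ollama_version(host: str = "http://localhost:11434") -> str:
    """Fetch the running Ollama version from its /api/version endpoint."""
    with urllib.request.urlopen(f"{host}/api/version") as resp:
        return json.load(resp)["version"]
```

For example, `meets_requirement(ollama_version())` tells you whether the local instance can serve embedding requests.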
## Development
## Troubleshooting

| Problem | Cause | Fix |
| --- | --- | --- |
|  | Setup not run | Run `setup.sh` |
| 404 on embed calls | Ollama < 0.4.0 | Upgrade Ollama to 0.4.0+ |
|  | Ollama not running on target host | Start Ollama on that host |
|  | Large model / slow hardware | Increase the timeout |
|  | Host unreachable | Check network, firewall, and Ollama's port 11434 |
|  | Running `setup.sh` on Linux/macOS | See the setup note above |
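For the "host unreachable" case, a quick TCP check against the Ollama port narrows things down before you dig into firewall rules. A standalone sketch, not part of this repo:

```python
import socket


def port_reachable(host: str, port: int = 11434, timeout: float = 2.0) -> bool:
    """Attempt a TCP connection to host:port; True if it accepts."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures alike.
        return False
```

Calling `port_reachable("192.168.1.10")` from the machine running Claude Code tells you whether the problem is the network path or the Ollama process itself.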