ollama-code-mcp
Allows delegation of coding tasks such as code generation, review, refactoring, test writing, explanation, and batch processing to a local or LAN Ollama instance running a Qwen3 model.
An MCP server that lets Claude Code delegate
coding tasks to a local (or LAN) Ollama instance running
a Qwen3 model. Point it at a GPU box on your network -- a Tesla V100 running
qwen3:32b, for example -- and Claude Code can hand off boilerplate
generation, test writing, diff review, and batch refactors to it instead of
spending cloud tokens and context window on them.
See CLAUDE.md for the routing guidance Claude Code reads to
decide what to delegate versus what to keep in the cloud.
Why
- Saves Claude's context window. The file-aware tool variants take a file_path (or diff_file, or a glob_pattern for batches) and read the content server-side, so Claude never has to paste large files into a tool call just to hand them off.
- Uses idle GPU capacity. If you already run Ollama on a home GPU box or in a k3s cluster, this turns that capacity into a first-class Claude Code tool instead of a chat window you have to copy-paste into.
- Fails safe. If Ollama is unreachable, times out, or the model isn't pulled, tools return a clear, non-fatal message telling Claude to just handle the task itself rather than getting stuck retrying.
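A minimal sketch of the fail-safe idea, not the server's actual code: the helper name `call_ollama_safely` and the fallback wording are hypothetical, and the real server may use a different HTTP client and different exception types.

```python
import socket
import urllib.error
from typing import Callable

# Hypothetical fallback wording; the real server's message may differ.
FALLBACK = "Ollama unavailable ({reason}). Handle this task directly instead of retrying."


def call_ollama_safely(request: Callable[[], str]) -> str:
    """Run an Ollama request and turn connection failures into plain text."""
    try:
        return request()
    except (urllib.error.URLError, socket.timeout, ConnectionError) as exc:
        # Unreachable host, refused connection, or timeout: report it, don't raise.
        return FALLBACK.format(reason=exc)


def unreachable() -> str:
    """Stand-in for a request against a host that is down."""
    raise urllib.error.URLError("connection refused")


print(call_ollama_safely(unreachable))
```

Returning a message instead of raising keeps the tool call successful from Claude's point of view, which is what lets it fall back to doing the work itself.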
Tools
| Tool | Purpose | Inputs |
| --- | --- | --- |
| | Generate new code from an instruction | |
| review_code | Review code for bugs, security issues, simplifications | |
| refactor_code | Refactor code per an instruction, behavior-preserving | |
| | Diagnose and fix a bug | |
| | Write tests for given code | |
| explain_code | Explain what code does | |
| | Review a git diff, PR-review style | |
| batch_refactor | Apply an instruction across files matching a glob, sequentially | |
| ollama_status | Health check: reachability, configured model, available models | -- |
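The health check in the last row could be sketched roughly as follows. This assumes Ollama's `GET /api/tags` endpoint, which lists locally pulled models; the function names and the shape of the returned dict are hypothetical, not taken from this server.

```python
import json
import urllib.request


def parse_model_names(tags_payload: dict) -> list[str]:
    """Extract model tags from the JSON body of Ollama's GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def check_ollama(base_url: str, model: str, timeout: float = 5.0) -> dict:
    """Hypothetical status report: reachability, configured model, available models."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            names = parse_model_names(json.load(resp))
    except OSError as exc:  # URLError and timeouts are OSError subclasses
        return {"reachable": False, "model": model, "error": str(exc)}
    return {
        "reachable": True,
        "model": model,
        "model_pulled": model in names,
        "available": names,
    }
```

For example, `check_ollama("http://192.168.1.50:11434", "qwen3:32b")` would report whether the configured tag is among the pulled models.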
Every tool (except ollama_status) accepts a think: bool parameter. This
toggles Qwen3's extended reasoning by appending /think or /no_think to
the prompt (see Think mode below). All coding tools default to
think=True; explain_code defaults to False since explanations are
usually fast enough without it.
code / file_path (and diff / diff_file) pairs are mutually exclusive
-- pass exactly one. File paths are resolved and confined to
OLLAMA_MCP_ALLOWED_DIR (see Configuration); attempts to
read or write outside it are rejected.
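A sketch of how the "exactly one" rule and the directory confinement might be enforced. The helper names `resolve_confined` and `pick_source` are hypothetical; the server's real checks may differ in detail.

```python
from pathlib import Path
from typing import Optional


def resolve_confined(path: str, allowed_dir: str) -> Path:
    """Resolve path against allowed_dir and reject anything that escapes it."""
    base = Path(allowed_dir).resolve()
    # Joining an absolute path replaces base, so it is caught by the check below.
    candidate = (base / path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"{path} resolves outside {base}")
    return candidate


def pick_source(code: Optional[str], file_path: Optional[str], allowed_dir: str) -> str:
    """Enforce 'exactly one of code / file_path' and return the code text."""
    if (code is None) == (file_path is None):
        raise ValueError("pass exactly one of code or file_path")
    if code is not None:
        return code
    return resolve_confined(file_path, allowed_dir).read_text()
```

Resolving before the containment check matters: it collapses `..` segments and follows symlinks, so a path like `../../etc/passwd` is judged by where it actually points.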
refactor_code returns the refactored code as text -- it does not touch
disk. batch_refactor is the one tool that can write files, and only when
called with dry_run=False; by default it returns a unified diff per file
so you can review before applying.
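The dry-run behaviour can be illustrated with the standard library's `difflib`. This is a sketch under assumptions: `apply_or_preview` is a hypothetical helper, and `new_text` stands in for whatever the model returned for that file.

```python
import difflib
from pathlib import Path


def apply_or_preview(path: Path, new_text: str, dry_run: bool = True) -> str:
    """Return a unified diff for path; write new_text only when dry_run is False."""
    old_text = path.read_text()
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"a/{path.name}",
            tofile=f"b/{path.name}",
        )
    )
    if not dry_run:
        path.write_text(new_text)
    return diff
```

Defaulting `dry_run` to True means a caller has to opt in explicitly before anything on disk changes.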
Installation
Requires Python 3.10+.
git clone git@github.com:darthzen/ollama-code-mcp.git
cd ollama-code-mcp
pip install -e .
Or with uv:
uv pip install -e .
Register with Claude Code
Add to your Claude Code MCP config (claude mcp add or the mcpServers
block in your settings), pointing OLLAMA_BASE_URL at wherever Ollama
actually listens -- commonly a LAN address, not localhost, since the model
runs on a dedicated GPU host:
{
  "mcpServers": {
    "ollama-code": {
      "command": "ollama-code-mcp",
      "env": {
        "OLLAMA_BASE_URL": "http://192.168.1.50:11434",
        "OLLAMA_MODEL": "qwen3:32b",
        "OLLAMA_MCP_ALLOWED_DIR": "/Users/you/code"
      }
    }
  }
}
Or run it straight from the repo without installing:
{
  "mcpServers": {
    "ollama-code": {
      "command": "/path/to/ollama-code-mcp/.venv/bin/python",
      "args": ["-m", "ollama_code_mcp.server"],
      "env": { "OLLAMA_BASE_URL": "http://192.168.1.50:11434" }
    }
  }
}
OLLAMA_MCP_ALLOWED_DIR should be set to the project root (or a parent of
every project) you want file-aware tools to be able to read/write. It
defaults to the server's current working directory.
Configuration
All configuration is via environment variables:
| Variable | Default | Description |
| --- | --- | --- |
| OLLAMA_BASE_URL | | Where Ollama listens. LAN addresses and bare |
| OLLAMA_MODEL | | Model tag to use, as shown by |
| | | Request timeout in seconds. Defaults to 15 minutes to allow large generations/refactors on modest hardware. |
| | | TCP connect timeout in seconds. |
| | | Context window passed to Ollama's |
| OLLAMA_MCP_ALLOWED_DIR | server CWD | Base directory that file-aware tools are confined to. |
| OLLAMA_MCP_MAX_FILE_BYTES | | Per-file size cap for server-side reads. |
| OLLAMA_MCP_MAX_BATCH_FILES | | Max files processed per |
| | | Default value for each tool's |
| | | |
| | | Bind host for |
| | | Bind port for |
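As an illustration of environment-driven configuration, here is a sketch that reads the variables named in this README. The `Settings` class and `load_settings` function are hypothetical, and every default except the working-directory fallback for OLLAMA_MCP_ALLOWED_DIR is a placeholder assumption, not the server's real default.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    base_url: str
    model: str
    allowed_dir: Path
    max_file_bytes: int
    max_batch_files: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from the environment variables named in this README."""
    return Settings(
        # Placeholder defaults: assumptions, except the CWD fallback.
        base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        model=env.get("OLLAMA_MODEL", "qwen3:32b"),
        allowed_dir=Path(env.get("OLLAMA_MCP_ALLOWED_DIR", os.getcwd())).resolve(),
        max_file_bytes=int(env.get("OLLAMA_MCP_MAX_FILE_BYTES", "1000000")),
        max_batch_files=int(env.get("OLLAMA_MCP_MAX_BATCH_FILES", "20")),
    )
```

Passing a plain dict, such as `load_settings({"OLLAMA_MODEL": "qwen3:32b"})`, makes the loader easy to exercise without touching the real environment.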
Think mode
Qwen3 exposes a soft toggle for its extended chain-of-thought reasoning:
appending /think or /no_think to the end of a prompt turns it on or off
for that turn. This server does that automatically based on each tool's
think parameter, and separates the model's <think>...</think> block from
its final answer in the response, so you get:
[review_code] via qwen3:32b (4213 ms, 812 tokens)
<the actual review>
--- model reasoning ---
<the model's chain of thought, if think=True>
Use think=True (the default for most tools) for review, refactor, fix, and
test-writing tasks where reasoning quality matters. Use think=False for
quick, low-stakes generations or explanations where latency matters more.
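The two steps described above, appending the soft switch and splitting the reasoning from the answer, might look like this. The function names are hypothetical and the parsing is a simplified sketch that assumes at most one `<think>` block per response.

```python
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def add_think_toggle(prompt: str, think: bool) -> str:
    """Append Qwen3's soft switch to the end of the prompt."""
    return f"{prompt.rstrip()} {'/think' if think else '/no_think'}"


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (answer, reasoning) from a raw model response."""
    match = THINK_BLOCK.search(raw)
    if match is None:
        return raw.strip(), ""
    reasoning = match.group(1).strip()
    # Everything outside the <think> block is treated as the final answer.
    answer = (raw[: match.start()] + raw[match.end():]).strip()
    return answer, reasoning
```

Keeping the reasoning as a separate string is what allows it to be printed under its own heading after the answer rather than mixed into it.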
Running standalone
OLLAMA_BASE_URL=http://192.168.1.50:11434 ollama-code-mcp
By default this speaks MCP over stdio, which is what Claude Code expects
when it spawns the process itself. To run it as a long-lived network
service instead (for the Docker/k8s deployment below), set
MCP_TRANSPORT=streamable-http.
Docker
docker build -t ollama-code-mcp .
docker run --rm -p 8765:8765 \
  -e OLLAMA_BASE_URL=http://192.168.1.50:11434 \
  -e MCP_TRANSPORT=streamable-http \
  -v /path/to/your/code:/workspace \
  -e OLLAMA_MCP_ALLOWED_DIR=/workspace \
  ollama-code-mcp
Claude Code would then connect to it as a remote MCP server at
http://<host>:8765/mcp.
Kubernetes / k3s
Manifests are in k8s/. They assume Ollama is already running in
the same cluster (e.g. via the ollama-helm chart with a LoadBalancer
service on port 11434, as in ollama-current-values.yaml), and reach it over
the cluster-internal service DNS name rather than a LAN IP:
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
Edit k8s/configmap.yaml to point OLLAMA_BASE_URL at your Ollama
service (e.g. http://ollama.ollama.svc.cluster.local:11434) and adjust the
image reference in k8s/deployment.yaml to wherever you push the built
image. The manifests expose the server via streamable-http on a
ClusterIP service; use a LAN-reachable LoadBalancer or port-forward if
Claude Code runs outside the cluster.
Security notes
- File-aware tools are confined to OLLAMA_MCP_ALLOWED_DIR via path resolution + containment checks -- paths that resolve outside it (e.g. ../../etc/passwd) are rejected.
- batch_refactor writes are opt-in (dry_run=False) and capped in count (OLLAMA_MCP_MAX_BATCH_FILES) and per-file size (OLLAMA_MCP_MAX_FILE_BYTES).
- This server has no authentication of its own. If you deploy it with a network transport (sse / streamable-http), keep it on a trusted LAN or put it behind a network policy / VPN -- do not expose it to the public internet.
Development
pip install -e ".[dev]"
pytest
Tests mock all Ollama HTTP calls (via respx) and use tmp_path for file
operations, so they run offline and don't need a real Ollama instance.
License
MIT -- see LICENSE.