
mcp-ollama

MCP server that wraps local Ollama models so API-priced orchestrators can offload work to them.


Exposes nine tools that pass work to a local model (text generation, summarisation, code tasks, mechanical transforms, commit/PR/changelog drafting). The orchestrator decides what to route locally; this server does the routing.

  • Transport: stdio

  • Runtime: Node 18+

  • Default model: hermes3:8b (override via OLLAMA_MODEL)

  • Ollama host: http://localhost:11434 (override via OLLAMA_HOST)

  • Ships no model weights, no cloud call-outs, no telemetry. Every request stays on the host where Ollama is running.

  • License: Apache-2.0

Why

Orchestrators priced by the token (Claude Code, Cursor, the Anthropic API, Cline, Aider) pay for every classification, every docstring, every commit message. Most of that work doesn't need a frontier model. Routed to Ollama on the same machine, the same work is free and faster. mcp-ollama is the routing surface.

The orchestrating model decides what to route where. This server is plumbing — it does not try to be clever about task classification. Pick the right tool, pass the text, get a result back.

Install

From source

git clone https://github.com/true-alter/mcp-ollama.git
cd mcp-ollama
npm install
npm run build

You also need a running Ollama instance with at least one model pulled:

# Default — 8B, fast, good for classifications and short generations
ollama pull hermes3:8b

# Optional — code-specialised, heavier, better for local_code tasks
ollama pull qwen2.5-coder:32b
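
Before wiring the server into a client, a quick sanity check that Ollama is up and the model is present (assuming the default host):

# Should list hermes3:8b (or whatever you pulled)
ollama list

# Same information over HTTP; this is the endpoint the server will call
curl http://localhost:11434/api/tags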

Docker

docker build -t mcp-ollama .
docker run -i --rm \
  -e OLLAMA_HOST=http://host.docker.internal:11434 \
  -e OLLAMA_MODEL=hermes3:8b \
  mcp-ollama

The supplied Dockerfile points at host.docker.internal:11434 so the container reaches Ollama on the host.

Run (stdio)

node dist/index.js

Stdio servers are launched by the MCP client (Claude Code, Cursor, etc.) — running it directly is only useful for debugging.
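
For ad-hoc debugging without a full client, you can drive the server over stdio with the MCP Inspector; a sketch, assuming the reference inspector is available via npx:

npx @modelcontextprotocol/inspector node dist/index.js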

Configure Claude Code

claude mcp add --transport stdio ollama -- node /absolute/path/to/mcp-ollama/dist/index.js

Or in ~/.claude/settings.json:

{
  "mcpServers": {
    "ollama": {
      "transport": "stdio",
      "command": "node",
      "args": ["/absolute/path/to/mcp-ollama/dist/index.js"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "OLLAMA_MODEL": "hermes3:8b"
      }
    }
  }
}
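
If you built the Docker image instead of running from source, the same entry can point at docker run; a minimal sketch, assuming the image name from the Docker section:

{
  "mcpServers": {
    "ollama": {
      "transport": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "OLLAMA_HOST=http://host.docker.internal:11434",
        "mcp-ollama"
      ]
    }
  }
}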

Tools

Tool               Purpose
local_generate     General-purpose generation with system + user prompt
local_summarize    Summarise a blob of text
local_analyze      Analyse text against a specific question
local_draft        Draft content in a given style
local_code         Code tasks: docstring / test / explain / review / types / refactor-suggest
local_diff         Diff-driven tasks: commit-message / pr-description / changelog / summary / impact
local_transform    Mechanical code transformations
local_models       List models available on the local Ollama host
local_pull         Pull a model onto the local Ollama host

Full tool schemas are exposed over MCP introspection — any MCP-aware client will enumerate them automatically.
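
As a rough illustration, a local_summarize call carries the text to summarise plus optional generation controls. The argument names below are indicative only; the schema the server advertises is authoritative:

{
  "name": "local_summarize",
  "arguments": {
    "text": "<the blob to summarise>",
    "max_tokens": 512
  }
}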

Environment variables

Variable        Default                   Purpose
OLLAMA_HOST     http://localhost:11434    Ollama HTTP endpoint
OLLAMA_MODEL    hermes3:8b                Default model when a tool call omits model

Any tool call may override model explicitly — the env default only applies when unset. local_code tends to work better with a code-specialised model passed per-call, while local_summarize and local_draft are fine on the default.
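
For example, a local_code call can name the heavier coder model directly while everything else keeps the env default. Argument names other than model are indicative only; check the advertised schema:

{
  "name": "local_code",
  "arguments": {
    "task": "docstring",
    "code": "<function source here>",
    "model": "qwen2.5-coder:32b"
  }
}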

Model selection guidance

Workload                                  Recommended model                Rationale
Classification, one-liners, tags          hermes3:8b                       Fastest round-trip, cheap to run
Commit messages, changelogs, summaries    qwen2.5-14b-instruct             Higher quality, still comfortable on a 16GB GPU
Code review, docstrings, tests            qwen2.5-coder:32b                Code-specialised
Fallback / unknown model                  whatever local_models returns    Inspect first, then route

Use local_models at session start if you're unsure what's available on a host.

Troubleshooting

Ollama error 404 when calling a tool. The model isn't pulled. Run ollama pull <name> or call local_pull from the client.

fetch failed / connection refused. Ollama isn't running, or OLLAMA_HOST points at the wrong address. Verify with curl $OLLAMA_HOST/api/tags. Inside a container, localhost is the container itself; use host.docker.internal on macOS/Windows or a bridge IP on Linux.

Tool calls feel slow. The first call to a cold model pays the cost of loading it into memory; subsequent calls within the same Ollama process are much faster. If the model is larger than available VRAM, Ollama falls back to CPU; watch ollama ps to confirm.

Empty or truncated output. max_tokens defaults to 2048 per tool. For long generations, pass max_tokens explicitly in the tool call.

Security posture

mcp-ollama makes no network call of its own beyond the configured OLLAMA_HOST. It ships no telemetry, no analytics, no auto-update pinger. Tool inputs are forwarded to Ollama's HTTP API verbatim and the response is relayed back; the server itself is stateless between calls.

If you run Ollama on localhost (the default), the entire loop stays on the host. If you point OLLAMA_HOST at a remote endpoint, that endpoint's security posture becomes your own; a typo can silently send prompts to a third-party host.

To report a security issue, see SECURITY.md.

Contributing

Bug reports and small patches welcome — see CONTRIBUTING.md. Larger design changes: please open an issue first so we can talk about scope before you invest time.

Part of ALTER

mcp-ollama is maintained by ALTER as part of the identity infrastructure for the AI economy. The ALTER identity MCP server is hosted at mcp.truealter.com — see @truealter/sdk for the TypeScript client.

License

Apache License 2.0. See LICENSE for the full text. Copyright 2026 Alter Meridian Pty Ltd (ABN 54 696 662 049).
