Ollama-Omega
Ollama-Omega is a hardened MCP server that bridges the full Ollama ecosystem, letting you interact with local and cloud-hosted AI models from any MCP-compatible IDE through six validated tools:
- Check server health (ollama_health): Verify connectivity to the Ollama daemon and see which models are currently loaded in memory.
- List available models (ollama_list_models): Retrieve all models with details like size, loaded status, and modification date.
- Chat with a model (ollama_chat): Send multi-turn chat completion requests with message history, optional system prompts, and configurable parameters like temperature and max tokens.
- Generate text (ollama_generate): Generate a response from a single prompt (no chat history), with optional system prompt and sampling controls.
- Inspect a model (ollama_show_model): View detailed information about a specific model, including its license, parameters, and configuration.
- Download a model (ollama_pull_model): Pull any model from the Ollama library directly through the MCP interface, with support for large cloud models via extended timeouts.
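Each tool is invoked over MCP's standard JSON-RPC `tools/call` method. The sketch below shows what a request to ollama_chat might look like; the exact argument names (`model`, `messages`, `temperature`) are assumptions based on the tool descriptions above, not the server's published schema.

```python
import json

# Hypothetical MCP "tools/call" request for ollama_chat.
# Argument names are illustrative guesses from the tool description.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "ollama_chat",
        "arguments": {
            "model": "llama3",
            "messages": [
                {"role": "system", "content": "You are concise."},
                {"role": "user", "content": "Summarize MCP in one sentence."},
            ],
            "temperature": 0.2,
        },
    },
}

print(json.dumps(request, indent=2))
```

In practice your IDE's MCP client builds this envelope for you; the arguments object is what the server's validator checks against the tool schema.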
All operations are secured with SSRF protection, input validation, error sanitization, and structured logging.
A hardened MCP server that bridges the full Ollama ecosystem — local models and cloud-hosted behemoths alike — into any MCP-compatible IDE. No wrapper scripts. No bloated SDK. Just a single Python file with two dependencies.
DESIGN PRINCIPLE: Ollama-Omega does not abstract away Ollama. It exposes the complete Ollama API surface through 6 validated, error-handled MCP tools with zero information loss.
Architecture
┌─────────────────────────────────────────────────────┐
│ MCP Client (IDE) │
│ Claude Desktop / Antigravity / etc. │
└──────────────────────┬──────────────────────────────┘
│ stdio (JSON-RPC 2.0)
┌──────────────────────▼──────────────────────────────┐
│ ollama_mcp_server.py │
│ ┌──────────┐ ┌──────────┐ ┌───────────────────┐ │
│ │ Validator│ │ Dispatch │ │ Singleton httpx │ │
│ │ + Schema │→│ Router │→│ AsyncClient │ │
│ └──────────┘ └──────────┘ │ (no redirects) │ │
│ └─────────┬─────────┘ │
└───────────────────────────────────────┼──────────────┘
│ HTTP
┌───────────────────────────────────────▼──────────────┐
│ Ollama Daemon │
│ Local models (GPU) │ Cloud models (API proxy) │
└───────────────────────────────────────────────────────┘

Tools (6)
Tool | Purpose |
ollama_health | Check connectivity and list currently running/loaded models |
ollama_list_models | List all available models with size, loaded status, and modification date |
ollama_chat | Send a chat completion request with message history and system prompt |
ollama_generate | Generate a response for a given prompt without chat history |
ollama_show_model | Show detailed information about a specific model (license, parameters) |
ollama_pull_model | Download a model from the Ollama library |
Hardening Audit
# | Category | Mitigation |
1 | SSRF | Redirects disabled on the httpx client (follow_redirects=False) |
2 | Resource Leak | Singleton httpx.AsyncClient reused for all requests |
3 | Input Validation | Tool arguments checked against each tool's schema before dispatch |
4 | JSON Safety | Daemon responses parsed defensively via a safe-JSON helper |
5 | Structured Logging | All stderr output via the logging framework |
6 | DRY Payloads | Shared options builder assembles request payloads once |
7 | Error Sanitization | Exception details scrubbed before being returned to the client |
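As an illustration of item 7, error sanitization can be as simple as redacting internal URLs before an exception message leaves the server, so the daemon address never reaches the MCP client. This is a minimal sketch; the real server's helper name and redaction rules may differ.

```python
import re

OLLAMA_HOST = "http://localhost:11434"  # assumed default, matching the config below

def sanitize_error(exc: Exception) -> str:
    """Sketch of error sanitization: return a typed, generic message
    with anything URL-shaped redacted. Illustrative only."""
    message = str(exc)
    # Redact internal endpoints so the daemon address does not leak.
    message = re.sub(r"https?://\S+", "[redacted]", message)
    return f"{type(exc).__name__}: {message}"

print(sanitize_error(ConnectionError(f"cannot reach {OLLAMA_HOST}/api/chat")))
# → ConnectionError: cannot reach [redacted]
```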
Quick Start
Requirements
Python 3.11+
pip install mcp httpx
Configure in Claude Desktop / Antigravity
{
"mcpServers": {
"ollama": {
"command": "uv",
"args": [
"--directory",
"path/to/ollama-mcp",
"run",
"python",
"ollama_mcp_server.py"
],
"env": {
"PYTHONUTF8": "1",
"OLLAMA_HOST": "http://localhost:11434",
"OLLAMA_TIMEOUT": "300"
}
}
}
}

Environment Variables
Variable | Default | Description |
OLLAMA_HOST | http://localhost:11434 | Ollama daemon URL |
OLLAMA_TIMEOUT | 300 | Request timeout in seconds (long for large model pulls/cloud inference) |
| — | Set to |
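The server presumably resolves these variables at startup with the defaults shown above; this sketch mirrors that pattern (the variable names come from the table, the coercion to float is an assumption).

```python
import os

# Read configuration from the environment, falling back to the
# documented defaults. Names match the table above; float() for the
# timeout is an illustrative assumption.
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
OLLAMA_TIMEOUT = float(os.environ.get("OLLAMA_TIMEOUT", "300"))

print(OLLAMA_HOST, OLLAMA_TIMEOUT)
```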
Cloud Models
Ollama-Omega is version-agnostic. If your Ollama daemon exposes cloud-hosted models (e.g., qwen3.5:397b-cloud via API proxy), they are accessible through the same 6 tools — no configuration change required.
File Structure
Ollama-Omega/
ollama_mcp_server.py # MCP server (~307 lines) — hardened, single-file
pyproject.toml # Package metadata, CLI entry, PyPI classifiers
requirements.txt # mcp>=1.0.0, httpx>=0.27.0
glama.json # Glama MCP directory registration
LICENSE # MIT
CHANGELOG.md # Version history
tests/
test_server.py # 48 tests — tools, dispatch, errors, SSRF, config
examples/
basic_usage.py # Programmatic MCP client example
docs/
BUILD_SPEC.md # Internal build specification

Testing
pip install pytest
python -m pytest tests/ -v

48 tests covering:
Tool Definitions — schema validation, required fields, descriptions
Helper Functions — options builder, validation, JSON safety, error formatting
Dispatcher — all 6 tool paths with mocked HTTP responses
Error Handling — connection, timeout, HTTP status, exception sanitization
Configuration — environment defaults, SSRF mitigation, server identity
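The dispatcher tests run every tool path against mocked HTTP responses, so no Ollama daemon is needed. A sketch of that pattern, assuming a hypothetical health handler that calls the daemon's /api/ps endpoint (the real suite's fixtures and function names will differ):

```python
import asyncio
from unittest.mock import AsyncMock

async def dispatch_health(client) -> dict:
    """Stand-in for the server's health path: GET /api/ps on the
    daemon and report which models are loaded. Illustrative only."""
    resp = await client.get("/api/ps")
    return {"ok": True, "loaded": resp.json().get("models", [])}

def test_health_lists_loaded_models():
    # AsyncMock lets the handler await client.get() without real HTTP.
    client = AsyncMock()
    client.get.return_value.json = lambda: {"models": [{"name": "llama3"}]}
    result = asyncio.run(dispatch_health(client))
    assert result["ok"] and result["loaded"][0]["name"] == "llama3"

test_health_lists_loaded_models()
print("test passed")
```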
Companion Server
Ollama-Omega is the transport layer for the Omega Brain MCP — cross-session episodic memory + 10-gate VERITAS Build pipeline. Together they form the sovereign intelligence stack.
License
MIT