ollama_chat
Send multi-turn chat requests to Ollama models for conversational interactions requiring message history, such as follow-up questions or multi-step reasoning with context.
Instructions
Send a multi-turn chat completion request to an Ollama model. Use this tool for conversational interactions where message history matters, for example follow-up questions, multi-step reasoning, or dialogue with context. Do not use it for single-prompt completions without history; use ollama_generate instead to avoid the overhead of the messages array.

Prerequisites: The 'model' must already be installed locally. Call ollama_list_models to verify availability; use ollama_pull_model to download a missing model.

Behavior: Read-only (no state changes on the server) but not idempotent: each call generates a new response even with identical inputs. No authentication required. No rate limits. Network-dependent; response time varies from seconds to minutes depending on model size and prompt length. Safe to retry on timeout. On a model-not-found error, the tool returns an error object without throwing.
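The request this tool issues can be sketched as follows. This is a minimal, hypothetical helper (the function name `build_chat_request` is illustrative, not part of the tool) that assembles a body in the shape Ollama's standard `/api/chat` endpoint expects; it builds the payload offline rather than sending it:

```python
import json

def build_chat_request(model, messages, temperature=None, max_tokens=None, system=None):
    """Assemble a request body in the shape of Ollama's /api/chat endpoint."""
    if system is not None:
        # The 'system' shortcut is prepended as the first message and
        # takes precedence over any system-role message in 'messages'.
        messages = [{"role": "system", "content": system}] + [
            m for m in messages if m["role"] != "system"
        ]
    body = {"model": model, "messages": messages, "stream": False}
    options = {}
    if temperature is not None:
        options["temperature"] = temperature
    if max_tokens is not None:
        # 'max_tokens' maps to Ollama's internal 'num_predict' option.
        options["num_predict"] = max_tokens
    if options:
        body["options"] = options
    return body

request = build_chat_request(
    "llama3.1:8b",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.2,
    max_tokens=256,
)
print(json.dumps(request, indent=2))
```

The `stream: false` field is set because this tool returns a single complete response rather than a token stream.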
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| model | Yes | Exact Ollama model identifier. Must match a 'name' value from ollama_list_models output (e.g., 'llama3.1:8b', 'qwen2.5:7b'). Cloud-hosted models use a '-cloud' suffix (e.g., 'deepseek-v3:671b-cloud'). If unsure which models are available, call ollama_list_models first. | |
| messages | Yes | Ordered conversation history sent to the model. Place system instructions first (role 'system'), then alternate user/assistant turns. The model sees all messages in order. If you only need a system prompt with one user message, consider using the 'system' parameter instead of a system-role message. | |
| temperature | No | Sampling temperature controlling output randomness. 0.0 = deterministic (always pick the most likely token), 2.0 = maximum creativity. Default is model-dependent, typically ~0.7. Use low values (0.0–0.3) for factual tasks, higher (0.7–1.0) for creative tasks. | |
| max_tokens | No | Maximum number of tokens to generate in the response. Maps to Ollama's internal 'num_predict' parameter. Use -1 for unlimited generation (model stops at its natural end token). Default is model-dependent, typically ~2048. | |
| system | No | System prompt prepended before the messages array. Use this as a shortcut to set model behavior without adding a system-role message to the 'messages' array. If both this field and a system-role message are provided, this field takes precedence. | |
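A typical 'messages' array for a follow-up question looks like the sketch below (the conversation content is invented for illustration). The final user turn only makes sense because the model sees the earlier turns in order:

```python
# Ordered history: system instructions first, then alternating user/assistant turns.
history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Who wrote Dune?"},
    {"role": "assistant", "content": "Frank Herbert."},
    # Follow-up: "it" is resolved only via the preceding turns.
    {"role": "user", "content": "When was it published?"},
]
print(len(history), history[-1]["content"])
```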
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| message | No | The assistant's response message. | |
| model | Yes | The model that generated the response. | |
| total_duration | No | Total time in nanoseconds including load and inference. | |
| eval_count | No | Number of tokens generated in the response. | |
| error | No | Error message if the request failed (e.g., model not found). Only present on failure. | |
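Because failures surface as an 'error' field rather than a thrown exception, callers should branch on that field before reading 'message'. A minimal handling sketch, using invented sample results (the helper name `handle_chat_result` is illustrative):

```python
def handle_chat_result(result):
    """Return (reply_text, error). Exactly one of the two is None."""
    # On failure (e.g., model not found) the tool returns an error object
    # instead of throwing, so check for 'error' first.
    if "error" in result:
        return None, result["error"]
    return result["message"]["content"], None

ok = {
    "model": "llama3.1:8b",
    "message": {"role": "assistant", "content": "Paris."},
    "eval_count": 3,
    "total_duration": 1_200_000_000,  # nanoseconds
}
failed = {"model": "nope:1b", "error": "model 'nope:1b' not found"}

reply, err = handle_chat_result(ok)
bad_reply, bad_err = handle_chat_result(failed)

# total_duration is in nanoseconds; divide by 1e9 for seconds.
seconds = ok["total_duration"] / 1e9
```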