# Ollama-Omega
## Server Configuration
Describes the environment variables used to configure the server; all are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| PYTHONUTF8 | No | Set to '1' for Windows Unicode safety | |
| OLLAMA_HOST | No | Ollama daemon URL | http://localhost:11434 |
| OLLAMA_TIMEOUT | No | Request timeout in seconds (long for large model pulls/cloud inference) | 300 |
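As a sketch of how these settings resolve, the snippet below reads the variables from the table and falls back to the documented defaults when they are unset. The `resolve_config` helper is hypothetical and only illustrates the precedence; it is not part of the server's code.

```python
import os


def resolve_config(env=None):
    """Resolve server settings from environment variables, using the
    documented defaults (host http://localhost:11434, timeout 300 s)
    when a variable is unset."""
    env = os.environ if env is None else env
    return {
        "host": env.get("OLLAMA_HOST", "http://localhost:11434"),
        "timeout": float(env.get("OLLAMA_TIMEOUT", "300")),
    }


# With no overrides set, the documented defaults apply.
print(resolve_config(env={}))
# An override takes precedence over the default.
print(resolve_config(env={"OLLAMA_TIMEOUT": "30"}))
```

Setting `PYTHONUTF8=1` in the same environment addresses the Windows Unicode concern noted in the table.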
## Capabilities
Features and capabilities supported by this server.
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| experimental | `{}` |
## Tools
Functions exposed to the LLM to take actions.
| Name | Description |
|---|---|
| ollama_health | Check Ollama daemon connectivity and list currently running models. Use this tool as the first call to verify the Ollama service is reachable before calling any other tool in this server. Do not use this to list all installed models — use ollama_list_models instead. Behavior: Read-only, idempotent, safe to retry. No authentication required. No rate limits. Makes a single HTTP GET to the Ollama daemon. On connection failure returns an error object without throwing. |
| ollama_list_models | List all Ollama models installed on the local machine with their memory load status. Use this tool to discover available model names before calling ollama_chat, ollama_generate, or ollama_show_model. Do not use this to check if the Ollama daemon is running — use ollama_health instead. Behavior: Read-only, idempotent, safe to retry. No authentication required. No rate limits. Returns an empty models array if no models are installed. |
| ollama_chat | Send a multi-turn chat completion request to an Ollama model. Use this tool for conversational interactions where message history matters — for example, follow-up questions, multi-step reasoning, or dialogue with context. Do not use this for single-prompt completions without history; use ollama_generate instead to avoid the overhead of the messages array. Prerequisites: The 'model' must already be installed locally. Call ollama_list_models to verify availability; use ollama_pull_model to download if missing. Behavior: Read-only (no state changes on the server), not idempotent — each call generates a new response even with identical inputs. No authentication required. No rate limits. Network-dependent; response time varies from seconds to minutes based on model size and prompt length. Safe to retry on timeout. On model-not-found error, returns an error object without throwing. |
| ollama_generate | Generate a single-turn text completion from an Ollama model without conversation history. Use this tool for one-shot tasks: code generation, text transformation, summarization, translation, or any prompt that does not require prior context. Do not use this for multi-turn conversations where message history matters; use ollama_chat instead. Prerequisites: The 'model' must already be installed. Call ollama_list_models to verify; use ollama_pull_model to download if missing. Behavior: Read-only, not idempotent — each call produces a different generation even with identical inputs. No authentication required. No rate limits. Network-dependent; response time varies with model size and prompt length. Safe to retry on timeout. On model-not-found error, returns an error object without throwing. |
| ollama_show_model | Retrieve detailed metadata about a specific installed Ollama model. Use this tool to inspect a model's architecture, license, quantization level, prompt template, and default parameters before using it with ollama_chat or ollama_generate. Do not use this to list all models — use ollama_list_models instead. Do not use this to download new models — use ollama_pull_model instead. Prerequisites: The model must already be installed locally (verify with ollama_list_models). Behavior: Read-only, idempotent, safe to retry. No authentication required. No rate limits. Returns the same metadata for the same model every time. On model-not-found error, returns an error object without throwing. |
| ollama_pull_model | Download a model from the Ollama library to the local machine. Use this tool when a model is needed but not yet installed locally. Do not use this if the model is already available — call ollama_list_models first to check. Do not use this to run inference — use ollama_chat or ollama_generate after pulling. Behavior: WRITE operation — downloads large files (1–100+ GB) and stores them on disk. Idempotent — re-pulling an already-installed model is safe and verifies integrity. No authentication required. No rate limits. Execution time ranges from seconds to hours depending on model size and network bandwidth. Not destructive (does not delete existing data). On network failure, returns an error object without throwing. |
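Because this server speaks MCP, a client invokes any of the tools above with a standard JSON-RPC `tools/call` request. A minimal sketch of such a payload for ollama_generate follows; the `model` and `prompt` argument names are assumptions for illustration, since the tool input schemas are not shown in this listing.

```python
import json

# Hypothetical tools/call request for ollama_generate. The argument
# names ("model", "prompt") are illustrative assumptions; consult the
# tool's actual input schema before use.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "ollama_generate",
        "arguments": {
            "model": "llama3.2",
            "prompt": "Summarize the following text in one sentence.",
        },
    },
}

# Serialize for transport over the MCP connection.
print(json.dumps(request))
```

Per the tool descriptions above, a client would typically call ollama_health first, then ollama_list_models to confirm the model name before issuing this request.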
## Prompts
Interactive templates invoked by user choice.
| Name | Description |
|---|---|
| No prompts | |
## Resources
Contextual data attached and managed by the client.
| Name | Description |
|---|---|
| No resources | |
## MCP directory API
We provide all the information about MCP servers via our MCP API:
```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/VrtxOmega/Ollama-Omega'
```
If you have feedback or need assistance with the MCP directory API, please join our Discord server.