
Server Configuration

Describes the environment variables the server reads. None of them is required; each falls back to the default shown.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| OLLAMA_URL | No | Base URL of the Ollama server | http://localhost:11434 |
| OLLAMA_NUM_CTX | No | Context window, in tokens | 32768 |
| OLLAMA_TIMEOUT_S | No | Per-request timeout, in seconds | 600 |
| OLLAMA_KEEP_ALIVE | No | How long to keep the model resident in VRAM | 30m |
| OLLAMA_DEFAULT_MODEL | No | Default model for handoffs | qwen2.5-coder:14b |
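As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a single settings object. The `OllamaSettings` class and `load_settings` function are hypothetical names chosen for this example; the server's actual configuration code is not shown on this page and may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class OllamaSettings:
    """Hypothetical container for the five documented variables."""
    url: str
    num_ctx: int
    timeout_s: float
    keep_alive: str
    default_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> OllamaSettings:
    """Read each variable, falling back to the default listed in the table."""
    return OllamaSettings(
        # Strip a trailing slash so paths can be appended safely.
        url=env.get("OLLAMA_URL", "http://localhost:11434").rstrip("/"),
        num_ctx=int(env.get("OLLAMA_NUM_CTX", "32768")),
        timeout_s=float(env.get("OLLAMA_TIMEOUT_S", "600")),
        keep_alive=env.get("OLLAMA_KEEP_ALIVE", "30m"),
        default_model=env.get("OLLAMA_DEFAULT_MODEL", "qwen2.5-coder:14b"),
    )
```

Passing a plain dict instead of `os.environ` makes the fallback behavior easy to check without touching the real environment.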

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | {"listChanged": false} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |
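In MCP, a `listChanged` flag tells the client whether the server will announce changes to a list, and `subscribe` tells it whether individual resources can be watched. With every flag set to false, a client should expect no such notifications from this server and can fetch each list once. The sketch below makes that reading explicit; `expected_notifications` is a hypothetical helper, not part of any SDK.

```python
import json

# The capabilities advertised above, as a single JSON object.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": false},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "experimental": {}
}
""")


def expected_notifications(capabilities: dict) -> list[str]:
    """Return the MCP notification methods a client should be ready to handle."""
    names = []
    for feature in ("tools", "prompts", "resources"):
        if capabilities.get(feature, {}).get("listChanged"):
            names.append(f"notifications/{feature}/list_changed")
    if capabilities.get("resources", {}).get("subscribe"):
        names.append("notifications/resources/updated")
    return names
```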

Tools

Functions exposed to the LLM to take actions

ask_local

Send a one-shot prompt to a local Ollama model and return the response.

Use for any handoff where the cloud model's full reasoning isn't needed: drafts, boilerplate, simple extractions, formatting, quick lookups. Runs on the user's own GPU and consumes no cloud-LLM usage.

Args:
    prompt: The task / question.
    model: Override the default model.
    system: Optional system prompt to shape behavior.
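To make the three arguments concrete, here is a minimal sketch of the request body a one-shot call could send to Ollama's `/api/generate` endpoint. It assumes the tool forwards to that endpoint and applies the configured context size and keep-alive; the page does not show the server's implementation, and `build_generate_payload` is a hypothetical name.

```python
from typing import Optional


def build_generate_payload(
    prompt: str,
    model: Optional[str] = None,
    system: Optional[str] = None,
    *,
    default_model: str = "qwen2.5-coder:14b",
    num_ctx: int = 32768,
    keep_alive: str = "30m",
) -> dict:
    """Build a non-streaming /api/generate body (assumed mapping, not confirmed)."""
    payload = {
        "model": model or default_model,  # fall back to the default model
        "prompt": prompt,
        "stream": False,                  # one-shot: return a single response
        "keep_alive": keep_alive,
        "options": {"num_ctx": num_ctx},
    }
    if system:
        payload["system"] = system        # only sent when a system prompt is given
    return payload
```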

chat_local

Multi-turn chat against a local Ollama model.

Use when the handoff needs more than one turn of context. messages is a list of {"role": "user"|"assistant"|"system", "content": str}.
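The message shape described above can be checked before a call is made. The following sketch validates a `messages` list against the three allowed roles; `validate_messages` is a hypothetical helper, and whether the server performs the same checks is an assumption.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def validate_messages(messages: list) -> list:
    """Return a cleaned copy of messages, or raise ValueError on a bad entry."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    cleaned = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            raise ValueError(f"messages[{index}] is not a dict")
        role = message.get("role")
        content = message.get("content")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"messages[{index}] has unknown role {role!r}")
        if not isinstance(content, str):
            raise ValueError(f"messages[{index}] content must be a string")
        # Keep only the two documented keys.
        cleaned.append({"role": role, "content": content})
    return cleaned


conversation = validate_messages([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Name one use of a context manager."},
])
```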

summarize_local

Summarize a block of text using the local model.

Cheap offload for long files, logs, transcripts, or docs the cloud model doesn't need to fully ingest. Returns a concise structured summary.

Args:
    text: The content to summarize. Can be very long (context window is configurable).
    focus: Optional focus hint, e.g. "errors and stack traces" or "API surface only".
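An MCP client invokes a tool like this one with a JSON-RPC `tools/call` request. The sketch below builds such a request for `summarize_local` using the two documented arguments; `tools_call_request` is a hypothetical helper, and in practice an MCP client library would normally construct and send this message for you.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC ids


def tools_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the named tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


request = tools_call_request(
    "summarize_local",
    {"text": "...long log output...", "focus": "errors and stack traces"},
)
```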

code_review_local

Quick first-pass code review using the local coder model.

Catches obvious bugs, style issues, and risky patterns. Use as a cheap pre-filter before asking the cloud model for a deeper review.

Args:
    diff_or_code: A unified diff or a code block to review.

draft_commit_message_local

Draft a conventional-style commit message from a diff using the local model.

Cheap and fast — good for routine commits where the cloud model's analysis isn't needed.

Args:
    diff: The output of git diff --staged or similar.
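One way to produce the `diff` argument is to capture the staged changes with `git diff --staged` and reject an empty result before handing it to the tool. This is a minimal sketch; `staged_diff` and `commit_message_arguments` are hypothetical helper names, and it assumes `git` is on the PATH.

```python
import subprocess


def staged_diff(repo_dir: str = ".") -> str:
    """Return the output of `git diff --staged` for the given repository."""
    result = subprocess.run(
        ["git", "diff", "--staged"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return result.stdout


def commit_message_arguments(diff: str) -> dict:
    """Build the arguments for draft_commit_message_local from a diff."""
    if not diff.strip():
        raise ValueError("nothing staged: the diff is empty")
    return {"diff": diff}
```

A caller would typically run `commit_message_arguments(staged_diff())` and pass the result as the tool's arguments.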

extract_local

Extract specific information from a text block using the local model.

Good for pulling structured data out of unstructured text — function names, URLs, error messages, TODO comments, etc.

Args:
    text: The source text.
    what_to_extract: What to pull out, e.g. "all function definitions" or "every URL in the file".

list_models

List the Ollama models available locally.
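Ollama itself lists installed models at `GET /api/tags`, returning a JSON object with a `models` array whose entries carry a `name`. Assuming this tool relies on that endpoint (the page does not say), extracting the names could look like the sketch below. The sample response is illustrative only, not output captured from a real server.

```python
import json


def model_names(tags_response: str) -> list[str]:
    """Extract sorted model names from an Ollama /api/tags response body."""
    data = json.loads(tags_response)
    return sorted(entry["name"] for entry in data.get("models", []))


# Illustrative response body; real responses include more fields per model.
SAMPLE_RESPONSE = '{"models": [{"name": "qwen2.5-coder:14b"}, {"name": "llama3:8b"}]}'
names = model_names(SAMPLE_RESPONSE)
```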

server_info

Return the server's effective configuration (model, context size, etc.).

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Michael-WhiteCapData/ollama-handoff'
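The same request can be made from Python with the standard library. This sketch assumes the endpoint returns a JSON body, which the page does not state explicitly; `build_request` and `fetch_server_record` are hypothetical names.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/Michael-WhiteCapData/ollama-handoff"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Create the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_record(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the network call in one place, so the request can be inspected or logged before anything goes over the wire.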

If you have feedback or need assistance with the MCP directory API, please join our Discord server.