Schema | ollama-handoff

ollama-handoff

Server Configuration

Environment variables the server reads at startup. All are optional; unset variables fall back to the defaults listed.

Name	Required	Description	Default
`OLLAMA_URL`	No	Base URL of the Ollama server	http://localhost:11434
`OLLAMA_NUM_CTX`	No	Context window in tokens	32768
`OLLAMA_TIMEOUT_S`	No	Per-request timeout, seconds	600
`OLLAMA_KEEP_ALIVE`	No	How long to keep the model resident in VRAM	30m
`OLLAMA_DEFAULT_MODEL`	No	Default model for handoffs	qwen2.5-coder:14b
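
The table above can be read as a small configuration loader. The following is a minimal sketch, not the server's actual code: the `HandoffConfig` class and the `load_config` and `generate_body` helpers are hypothetical names introduced for illustration. The mapping onto an Ollama `/api/generate` request body (`options.num_ctx`, `keep_alive`) reflects Ollama's HTTP API, but whether this server uses that endpoint is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class HandoffConfig:
    """Hypothetical container for the five documented settings."""
    ollama_url: str
    num_ctx: int
    timeout_s: float
    keep_alive: str
    default_model: str


def load_config(env: Optional[Mapping[str, str]] = None) -> HandoffConfig:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return HandoffConfig(
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434").rstrip("/"),
        num_ctx=int(env.get("OLLAMA_NUM_CTX", "32768")),
        timeout_s=float(env.get("OLLAMA_TIMEOUT_S", "600")),
        keep_alive=env.get("OLLAMA_KEEP_ALIVE", "30m"),
        default_model=env.get("OLLAMA_DEFAULT_MODEL", "qwen2.5-coder:14b"),
    )


def generate_body(cfg: HandoffConfig, prompt: str,
                  model: Optional[str] = None) -> Dict[str, Any]:
    """Assumed mapping of the settings onto an Ollama /api/generate body."""
    return {
        "model": model or cfg.default_model,
        "prompt": prompt,
        "stream": False,                      # ask for a single JSON response
        "keep_alive": cfg.keep_alive,         # how long the model stays loaded
        "options": {"num_ctx": cfg.num_ctx},  # context window in tokens
    }
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test; `OLLAMA_TIMEOUT_S` would be applied by whatever HTTP client sends the request, so it does not appear in the body.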

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
`ask_local`	Send a one-shot prompt to a local Ollama model and return the response. Use for any handoff where the cloud model's full reasoning isn't needed: drafts, boilerplate, simple extractions, formatting, quick lookups. Runs on the user's own GPU and consumes no cloud-LLM usage. Args: `prompt`: the task or question. `model`: overrides the default model. `system`: optional system prompt to shape behavior.
`chat_local`	Multi-turn chat against a local Ollama model. Use when the handoff needs more than one turn of context. Args: `messages`: a list of `{"role": "user"|"assistant"|"system", "content": str}`.
`summarize_local`	Summarize a block of text using the local model. Cheap offload for long files, logs, transcripts, or docs the cloud model doesn't need to fully ingest. Returns a concise structured summary. Args: `text`: the content to summarize; it can be very long, since the context window is configurable (see `OLLAMA_NUM_CTX`). `focus`: optional focus hint, e.g. "errors and stack traces" or "API surface only".
`code_review_local`	Quick first-pass code review using the local coder model. Catches obvious bugs, style issues, and risky patterns. Use as a cheap pre-filter before asking the cloud model for a deeper review. Args: `diff_or_code`: a unified diff or a code block to review.
`draft_commit_message_local`	Draft a conventional-style commit message from a diff using the local model. Cheap and fast, so it suits routine commits where the cloud model's analysis isn't needed. Args: `diff`: the output of `git diff --staged` or similar.
`extract_local`	Extract specific information from a text block using the local model. Good for pulling structured data out of unstructured text: function names, URLs, error messages, TODO comments, etc. Args: `text`: the source text. `what_to_extract`: what to pull out, e.g. "all function definitions" or "every URL in the file".
`list_models`	List the Ollama models available locally.
`server_info`	Return the server's effective configuration (model, context size, etc.).
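
As an illustration of how a client might invoke these tools, the sketch below builds MCP `tools/call` JSON-RPC requests for the one-shot and multi-turn tools. It assumes the tool names are `ask_local` and `chat_local` and that the argument names match the `Args` listed in the table; the helper functions themselves are hypothetical and only construct the request payloads, they do not open a connection to the server.

```python
from typing import Any, Dict, List, Optional

# Roles accepted in a chat message, per the tool description above.
ALLOWED_ROLES = {"user", "assistant", "system"}


def tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request using the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def ask_local_call(request_id: int, prompt: str,
                   model: Optional[str] = None,
                   system: Optional[str] = None) -> Dict[str, Any]:
    """One-shot prompt; optional arguments are omitted when not given."""
    arguments: Dict[str, Any] = {"prompt": prompt}
    if model is not None:
        arguments["model"] = model
    if system is not None:
        arguments["system"] = system
    return tool_call(request_id, "ask_local", arguments)


def chat_local_call(request_id: int, messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Multi-turn chat; checks each message's shape before building the call."""
    for index, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {index}: unsupported role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {index}: content must be a string")
    return tool_call(request_id, "chat_local", {"messages": messages})
```

Validating `messages` on the client side is a design choice of this sketch; the server may apply its own checks, and how it reports a malformed message is not documented here.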

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Michael-WhiteCapData/ollama-handoff'