Server Configuration

Environment variables used to configure the server. All are optional; set only the ones for the backends you use.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| OLLAMA_URL | No | The URL for the Ollama backend (e.g., http://localhost:11434). Can be skipped if not using this backend. | http://localhost:11434 |
| CLIPROXYAPI_KEY | No | The local API key/passphrase for CLIProxyAPI, which must match the key defined in its config.yaml. | sk-my-local-key |
| CLIPROXYAPI_URL | No | The URL for the CLIProxyAPI backend (e.g., http://localhost:8317). Can be skipped if not using this backend. | http://localhost:8317 |
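
As a rough illustration of how a launcher script might resolve these settings, the sketch below reads the three variables and falls back to the values in the Default column. The helper name `load_backend_config` is hypothetical, and whether the server itself applies these defaults in the same way is an assumption.

```python
import os

# Values copied from the Default column of the table above.
DEFAULTS = {
    "OLLAMA_URL": "http://localhost:11434",
    "CLIPROXYAPI_KEY": "sk-my-local-key",
    "CLIPROXYAPI_URL": "http://localhost:8317",
}


def load_backend_config(env=None):
    """Return all three settings, using the documented default when a
    variable is unset or empty. `env` defaults to the process environment."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    config = load_backend_config()
    # Print only the URLs; the key is a secret even when it is local.
    print(config["OLLAMA_URL"])
    print(config["CLIPROXYAPI_URL"])
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.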

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | { "listChanged": true } |

Tools

Functions exposed to the LLM to take actions

list_models

List all available models across all providers. Run this first to see what you can query.

ask_model

Query any AI model with a prompt. Returns the model's response with metadata.

OUTPUT: Markdown with the model's response, latency, and token usage. If max_response_tokens is set and compression occurred, includes distillation metadata (original tokens, compressed tokens, compressor model, compressor latency). Shows "Saved: X tokens (Y% smaller)" when compression is active. Shows "(cached)" when response is served from cache.
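
The "Saved: X tokens (Y% smaller)" line is simple arithmetic on the original and compressed token counts. A minimal sketch, assuming the percentage is rounded to the nearest whole number (the rounding rule and the function name `savings_line` are assumptions, not taken from the server):

```python
def savings_line(original_tokens, compressed_tokens):
    """Format the savings summary from two token counts."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    saved = original_tokens - compressed_tokens
    # Share of the original response that compression removed, in percent.
    percent = round(100 * saved / original_tokens)
    return f"Saved: {saved} tokens ({percent}% smaller)"


if __name__ == "__main__":
    print(savings_line(2000, 500))
```

For example, a 2000-token response distilled to 500 tokens saves 1500 tokens, which is 75% of the original.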

WHEN TO USE: When you need another model's perspective, analysis, or capabilities. Set max_response_tokens to control how much of your context window this response consumes — the response will be distilled by a fast model to fit the budget while preserving code, file paths, errors, and actionable details. Set include_raw=true to see both compressed and original responses for quality verification.
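
To make the two options concrete, here is a sketch of the JSON-RPC `tools/call` request an MCP client could send for ask_model. The `max_response_tokens` and `include_raw` parameters come from the description above; the `model` and `prompt` argument names and the model id `example-model` are assumptions for illustration.

```python
import json


def build_ask_model_request(model, prompt, max_response_tokens=None,
                            include_raw=False, request_id=1):
    """Build an MCP tools/call request for ask_model as a plain dict."""
    # "model" and "prompt" are assumed argument names.
    arguments = {"model": model, "prompt": prompt}
    if max_response_tokens is not None:
        # Token budget that the distilled response should fit into.
        arguments["max_response_tokens"] = max_response_tokens
    if include_raw:
        # Ask for the original response alongside the compressed one.
        arguments["include_raw"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ask_model", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_ask_model_request(
        "example-model", "Review this function for race conditions.",
        max_response_tokens=800, include_raw=True)
    print(json.dumps(request, indent=2))
```

Leaving `max_response_tokens` unset omits it from the request, so no budget is requested.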

FAILURE MODES:

  • "Model query failed (4xx/5xx)" → The model or provider is unavailable. Try a different model or check that CLIProxyAPI/Ollama is running.

  • "circuit breaker open" → The model failed too many times recently. Try a different model or wait for automatic recovery.

  • Compression silently skipped → If the compressor model is unavailable or the response already fits the budget, the raw response is returned unchanged. This is not an error.
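
The "circuit breaker open" message describes a common resilience pattern: after repeated failures, calls to a model are refused for a while and then allowed again. The sketch below shows the general idea only. The threshold of 3 failures and the 60-second cooldown are assumptions, and the server's actual implementation may differ.

```python
import time


class CircuitBreaker:
    """Refuse requests after repeated failures, then recover after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown_seconds=60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.cooldown_seconds = cooldown_seconds    # assumed value
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow_request(self):
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()

    def record_success(self):
        self._failures = 0
        self._opened_at = None
```

The clock is injectable so the recovery behaviour can be checked without waiting for real time to pass.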

compare_models

Query 2-5 models in parallel with the same prompt. Returns a side-by-side comparison with latency and token metrics.
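
The fan-out described here can be sketched with a thread pool. This is an illustration of the pattern, not the server's code: `query_fn` is a hypothetical stand-in for whatever sends a prompt to one model, and token metrics are left out because they depend on the provider.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def compare(models, prompt, query_fn):
    """Send the same prompt to every model concurrently and time each call."""
    if not 2 <= len(models) <= 5:
        raise ValueError("compare_models takes between 2 and 5 models")

    def timed(model):
        start = time.perf_counter()
        response = query_fn(model, prompt)
        return {
            "model": model,
            "response": response,
            "latency_s": time.perf_counter() - start,
        }

    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # Executor.map returns results in the same order as `models`.
        return list(pool.map(timed, models))
```

Because the calls run concurrently, the total wall-clock time is close to that of the slowest model rather than the sum of all latencies.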

consensus

Query 3-7 models and aggregate their responses using a voting strategy (majority, supermajority, or unanimous). Returns a consensus answer with a confidence score.
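
A minimal sketch of how the three voting strategies could be evaluated. The thresholds (more than half, at least two thirds, everyone), the use of the winning share as the confidence score, and the comparison of answers by normalized string equality are all assumptions; the server may group free-form answers quite differently.

```python
from collections import Counter
from fractions import Fraction

# Assumed thresholds for each strategy.
THRESHOLDS = {
    "majority": Fraction(1, 2),       # strictly more than half
    "supermajority": Fraction(2, 3),  # at least two thirds
    "unanimous": Fraction(1, 1),      # every model agrees
}


def consensus(answers, strategy="majority"):
    """Pick the most common answer and check it against the strategy."""
    if not 3 <= len(answers) <= 7:
        raise ValueError("consensus takes between 3 and 7 answers")
    normalized = [answer.strip().lower() for answer in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    share = Fraction(votes, len(answers))
    threshold = THRESHOLDS[strategy]
    reached = share > threshold if strategy == "majority" else share >= threshold
    return {
        "answer": winner if reached else None,
        "confidence": float(share),
        "reached": reached,
    }
```

With answers "yes", "Yes", and "no", two of three votes agree: that satisfies majority and supermajority under these thresholds, but not unanimous.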

synthesize

Query 2-5 models in parallel, then combine their best ideas into one answer. Returns a single synthesized response intended to improve on any individual model's answer.

session_recap

Read previous Claude Code sessions from disk and generate a smart-sized recap using a large-context model. Claude never sees the raw session data — only the distilled summary.

OUTPUT: Returns markdown starting with "## Session Recap" containing sections: Project State, What Was Built, Key Decisions, Errors Resolved, Unfinished/In Progress, File Map. Empty sections are omitted. Output size is auto-calculated (1K-30K tokens) based on session density.

WHEN TO USE: At the start of a new session when the user asks to restore context, recall previous work, or continue where they left off.

FAILURE MODES:

  • "No recent project detected" + list of available projects → Retry with an explicit project path from the list.

  • "Project directory not found" + available projects → The project path was misspelled or encoded incorrectly. Retry with a path from the available list.

  • "No session files found" → The project directory exists but has no sessions. Try a different project.

  • "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.

  • "Session Recap Failed" with error details → Both summarization passes failed. Retry with fewer sessions (sessions=1) or a different model.

  • "Triage Only" heading → Partial success. The triage pass worked but the full recap failed. The output still contains useful structured data. Do not retry.
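
The failure modes above amount to a small decision table. The sketch below maps a session_recap result to a next step. The action labels are hypothetical, and the code assumes that the "Session Recap Failed" and "Triage Only" markers appear in the first non-empty line of the output.

```python
# Message fragments from the failure modes above, paired with hypothetical actions.
MESSAGE_ACTIONS = [
    ("No recent project detected", "retry_with_listed_project"),
    ("Project directory not found", "retry_with_listed_project"),
    ("No session files found", "try_different_project"),
    ("No models available", "ask_user_to_start_provider"),
]


def next_step(output):
    """Decide what to do with the text returned by session_recap."""
    lines = [line for line in output.splitlines() if line.strip()]
    heading = lines[0].strip() if lines else ""
    # Check the failure heading first: it also begins with "Session Recap".
    if "Session Recap Failed" in heading:
        return "retry_with_fewer_sessions_or_other_model"
    if "Triage Only" in heading:
        return "use_partial_result"  # partial success, do not retry
    if heading.startswith("## Session Recap"):
        return "use_recap"
    for fragment, action in MESSAGE_ACTIONS:
        if fragment in output:
            return action
    return "unknown"
```

Checking the heading before scanning the body avoids misreading a successful recap that merely quotes one of the error messages.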

analyze_file

Offload file analysis to a worker model. The file is read server-side — it never enters your context window. You send a file path and a question, and get back only the analysis.

OUTPUT: Markdown with the model's analysis of the file, including file metadata (path, lines, chars), latency, and token usage. If max_response_tokens is set and compression occurred, includes distillation metadata (original tokens, compressed tokens, compressor model, compressor latency).

WHEN TO USE: When you need to analyze, review, or search a file but want to avoid reading it yourself. Especially valuable for large files (1000+ lines) where reading would consume significant context. The file is sent to a large-context model (Gemini 1M) that can process the entire file at once.

FAILURE MODES:

  • "File not found" → The path is wrong. Retry with the correct absolute path.

  • "Binary file detected" → Only text files are supported. Do not retry with this file.

  • "File too large" → The file exceeds 800K chars. Try analyzing a specific section or ask the user to split the file.

  • "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.

  • "Model query failed" → Try a different model or check provider status with list_models.
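
Some of these failures can be anticipated before calling the tool. The sketch below is a client-side pre-check under stated assumptions: the 800K character limit comes from the text above, while treating a NUL byte or invalid UTF-8 as "binary" is a common heuristic and not necessarily the server's detection rule.

```python
import os

MAX_CHARS = 800_000  # limit stated in the failure modes above


def check_content(data):
    """Return the likely failure message for raw file bytes, or None if OK."""
    if b"\x00" in data:
        return "Binary file detected"  # heuristic: text files rarely contain NUL
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "Binary file detected"
    if len(text) > MAX_CHARS:
        return "File too large"
    return None


def preflight(path):
    """Run the checks against a file on disk."""
    if not os.path.isfile(path):
        return "File not found"
    with open(path, "rb") as handle:
        return check_content(handle.read())
```

A return value of `None` means none of the three file-related failures is expected; model availability still has to be checked separately, for example with list_models.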

smart_read

Surgical code extraction from files. Returns ONLY relevant code sections with line numbers — not analysis.

OUTPUT: Markdown with extracted code sections (verbatim, with line numbers), minimal annotations, file metadata, latency, and token usage. Shows a "Context saved" metric. Unlike analyze_file, which returns prose analysis, smart_read returns actual code you can act on directly.

WHEN TO USE: When you need to read a file but only care about specific sections. Use instead of the Read tool when you have a specific intent like "find the auth logic", "show error handling", "extract the database schema". Especially valuable for large files (1000+ lines) where reading the whole file wastes context tokens. For general questions about a file, use analyze_file instead.

FAILURE MODES:

  • "File not found" → The path is wrong. Retry with the correct absolute path.

  • "Binary file detected" → Only text files are supported. Do not retry with this file.

  • "File too large" → The file exceeds 800K chars. Try a specific section.

  • "No models available" → CLIProxyAPI or Ollama is not running. Tell the user to start their model provider.

  • "No relevant sections found" → Try a broader query, or use analyze_file for general analysis.

  • "Model query failed" → Try a different model or check provider status with list_models.
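
One way to act on the "No relevant sections found" guidance is to fall back from smart_read to analyze_file automatically. In this sketch, `call_tool` is a hypothetical callable that invokes a tool by name and returns its text output, and the argument names `file_path`, `query`, and `question` are assumptions.

```python
def read_with_fallback(call_tool, file_path, query):
    """Try smart_read first; fall back to analyze_file if nothing matched.

    Returns a (tool_name, output) pair so the caller knows which tool answered.
    """
    output = call_tool("smart_read", {"file_path": file_path, "query": query})
    if "No relevant sections found" in output:
        output = call_tool(
            "analyze_file", {"file_path": file_path, "question": query})
        return "analyze_file", output
    return "smart_read", output
```

Returning the tool name matters because the two outputs differ in kind: smart_read yields verbatim code sections, while analyze_file yields prose analysis.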

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Pickle-Pixel/HydraMCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.