Server Configuration

Environment variables that configure the server. All of them are optional; an unset variable falls back to the default listed.

Name | Required | Description | Default
--- | --- | --- | ---
LLM_MODEL | No | The LLM model to use, e.g., llama3:latest or mistral:latest | llama3:latest
CHUNK_SIZE | No | Tokens per chunk. Smaller chunks give more precise retrieval | 256
EMBED_MODEL | No | The embedding model to use, e.g., qwen3-embedding:0.6b or nomic-embed-text:latest | qwen3-embedding:0.6b
CHUNK_OVERLAP | No | Overlap between chunks. Helps preserve context at chunk boundaries | 25
RESPONSE_MODE | No | Response mode: compact, tree_summarize, or refine | compact
SIMILARITY_TOP_K | No | Number of chunks retrieved per query | 5
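
As an illustration of how these variables and their defaults fit together, here is a minimal Python sketch that reads them from the environment. It is not taken from the server's source: the `Settings` class, the `load_settings` function, and the validation rules (overlap smaller than chunk size, at least one retrieved chunk) are assumptions made for this example. Only the variable names, defaults, and allowed response modes come from the table above.

```python
import os
from dataclasses import dataclass

# Allowed values for RESPONSE_MODE, as listed in the table above.
RESPONSE_MODES = ("compact", "tree_summarize", "refine")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the six documented variables."""
    llm_model: str
    embed_model: str
    chunk_size: int
    chunk_overlap: int
    response_mode: str
    similarity_top_k: int


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    settings = Settings(
        llm_model=env.get("LLM_MODEL", "llama3:latest"),
        embed_model=env.get("EMBED_MODEL", "qwen3-embedding:0.6b"),
        chunk_size=int(env.get("CHUNK_SIZE", "256")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "25")),
        response_mode=env.get("RESPONSE_MODE", "compact"),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "5")),
    )
    # The checks below are assumptions for this sketch, not documented server behavior.
    if settings.response_mode not in RESPONSE_MODES:
        raise ValueError(f"RESPONSE_MODE must be one of {RESPONSE_MODES}")
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be non-negative and smaller than CHUNK_SIZE")
    if settings.similarity_top_k < 1:
        raise ValueError("SIMILARITY_TOP_K must be at least 1")
    return settings
```

Calling `load_settings()` with no arguments reads the process environment; passing a plain dict such as `load_settings({"CHUNK_SIZE": "512"})` makes it easy to try overrides without exporting anything.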

Capabilities

Server capabilities have not been inspected yet.

Tools

Functions exposed to the LLM to take actions

Name | Description

No tools

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Hassan-Butt4356/mcp-rag-assistant'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.