Schema | mcp-rag-assistant

Environment variables that configure the server. All are optional; any variable left unset falls back to the default shown.

Name	Required	Description	Default
`LLM_MODEL`	No	The LLM model to use, e.g., llama3:latest or mistral:latest	llama3:latest
`CHUNK_SIZE`	No	Tokens per chunk. Smaller = more precise retrieval	256
`EMBED_MODEL`	No	The embedding model to use, e.g., qwen3-embedding:0.6b or nomic-embed-text:latest	qwen3-embedding:0.6b
`CHUNK_OVERLAP`	No	Overlap between chunks. Helps preserve context at boundaries	25
`RESPONSE_MODE`	No	Response mode: compact, tree_summarize, or refine	compact
`SIMILARITY_TOP_K`	No	Number of chunks retrieved per query	5
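
As a rough illustration of how these variables and their defaults fit together, the sketch below reads them into a typed settings object. It is not taken from the server's source: the `RagSettings` class, the `load_settings` function, and the sanity checks (overlap smaller than chunk size, top-k of at least 1) are hypothetical. Only the variable names, defaults, and response modes come from the table above.

```python
import os
from dataclasses import dataclass

# Response modes listed in the table above.
RESPONSE_MODES = ("compact", "tree_summarize", "refine")


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical container for the documented settings."""
    llm_model: str
    embed_model: str
    chunk_size: int
    chunk_overlap: int
    response_mode: str
    similarity_top_k: int


def load_settings(env=None):
    """Build RagSettings from a mapping (os.environ by default).

    Unset variables fall back to the defaults documented in the table.
    """
    env = os.environ if env is None else env
    settings = RagSettings(
        llm_model=env.get("LLM_MODEL", "llama3:latest"),
        embed_model=env.get("EMBED_MODEL", "qwen3-embedding:0.6b"),
        chunk_size=int(env.get("CHUNK_SIZE", "256")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "25")),
        response_mode=env.get("RESPONSE_MODE", "compact"),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "5")),
    )
    # The checks below are assumptions, not documented server behavior.
    if settings.response_mode not in RESPONSE_MODES:
        raise ValueError(f"RESPONSE_MODE must be one of {RESPONSE_MODES}")
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    if settings.similarity_top_k < 1:
        raise ValueError("SIMILARITY_TOP_K must be at least 1")
    return settings
```

Calling `load_settings({})` yields the documented defaults, while `load_settings({"CHUNK_SIZE": "128"})` overrides only the chunk size.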

Server capabilities have not been inspected yet.

Tools: functions exposed to the LLM to take actions

Name	Description
No tools

Prompts: interactive templates invoked by user choice

Name	Description
No prompts

Resources: contextual data attached and managed by the client

Name	Description
No resources

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Hassan-Butt4356/mcp-rag-assistant'
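
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `server_url` and `fetch_server` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner, name, timeout=10.0):
    """GET the server entry and decode it, assuming a JSON response."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.load(resp)
```

For this server, the call would be `fetch_server("Hassan-Butt4356", "mcp-rag-assistant")`.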
