Schema | rageval-mcp

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
No arguments

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}

Tools

Functions exposed to the LLM to take actions

Name	Description
retrieve	Retrieve the top-k passages for a query from the bundled sample corpus.
evaluate_retrieval	Score one retrieval method against the labeled question set.
compare_methods	Benchmark every available retrieval method side by side at cutoff k.

retrieve

Retrieve the top-k passages for a query from the bundled sample corpus.

Use this to see *what context a RAG system would surface* for a question, and how that
changes with the retrieval method. The corpus is a fictional B2B SaaS knowledge base that
ships with the server, so no setup or data is required.

Args:
    query: The natural-language question or search string.
    k: How many passages to return (1 to 20).
    method: One of 'bm25', 'tfidf', 'dense', or 'hybrid' (default 'hybrid').

Returns:
    RetrieveResult with fields:
        - query (str), method (str), k (int), count (int)
        - passages: list of {rank, doc_id, chunk_id, score, text}, best first.

Raises:
    A tool error if method='dense' but the optional 'sentence-transformers' extra is not
    installed; the message explains how to enable it.
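
As a rough illustration of the arguments above, the sketch below builds the JSON-RPC 2.0 `tools/call` request that an MCP client would send for this tool. It assumes the tool is registered under the name `retrieve`; the helper `build_retrieve_call`, the request id, and the sample query are hypothetical and are not part of the server.

```python
import json

# Allowed values copied from the tool description above.
METHODS = {"bm25", "tfidf", "dense", "hybrid"}


def build_retrieve_call(query: str, k: int, method: str = "hybrid", request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the retrieve tool.

    Hypothetical helper: it only mirrors the documented argument limits
    client-side; the server remains the authority on validation.
    """
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "retrieve",  # assumed registered tool name
            "arguments": {"query": query, "k": k, "method": method},
        },
    }


# Sample query is made up for illustration.
request = build_retrieve_call("How do I rotate an API key?", k=3)
print(json.dumps(request, indent=2))
```

Checking `k` and `method` before sending is optional; it simply surfaces obvious mistakes without a round trip to the server.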

evaluate_retrieval

Score one retrieval method against the labeled question set.

Runs every labeled question through the chosen retriever and averages four ranking
metrics, so you can put a number on retrieval quality instead of eyeballing it.

Args:
    method: Which retrieval strategy to score (default 'hybrid').
    k: The cutoff for recall@k, precision@k, MRR@k, and nDCG@k (1 to 20, default 3).

Returns:
    MetricsOut with fields: method, k, n_questions, recall_at_k, precision_at_k,
    mrr_at_k, ndcg_at_k. Each metric is in the range 0.0 to 1.0.

Raises:
    A tool error if method='dense' and the optional dense extra is not installed.
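
For readers who want to see what the four averaged metrics measure, here is a minimal sketch using the common textbook definitions with binary relevance. Whether the server computes them exactly this way is an assumption; the function names, the `evaluate` helper, and the document ids are hypothetical.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant ids that appear in the top k."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant) if relevant else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k slots filled by relevant ids."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant id within the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def evaluate(runs, k, method="hybrid"):
    """Average the four metrics over (ranked_ids, relevant_ids) pairs."""
    metrics = {
        "recall_at_k": recall_at_k,
        "precision_at_k": precision_at_k,
        "mrr_at_k": mrr_at_k,
        "ndcg_at_k": ndcg_at_k,
    }
    n = len(runs)
    out = {"method": method, "k": k, "n_questions": n}
    for name, fn in metrics.items():
        total = sum(fn(ranked, relevant, k) for ranked, relevant in runs)
        out[name] = total / n if n else 0.0
    return out


# One hypothetical question: two relevant ids, one of them retrieved at rank 2.
runs = [(["d3", "d1", "d7"], {"d1", "d2"})]
result = evaluate(runs, k=3)
print(result)
```

Under these definitions every metric stays between 0.0 and 1.0, matching the range stated above, provided each ranked list contains no duplicate ids.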

compare_methods

Benchmark every available retrieval method side by side at cutoff k.

Use this to get a defensible answer to "which retrieval strategy should we use?". Each
method is scored over the full labeled question set and ranked by nDCG@k. Methods whose
optional dependencies are missing (currently only 'dense') are reported under ``skipped``
with a reason instead of failing the whole call.

Args:
    k: The cutoff applied to recall@k, precision@k, MRR@k, and nDCG@k (1 to 20, default 3).

Returns:
    CompareResult with fields:
        - k (int), n_questions (int)
        - rows: list of {method, recall_at_k, precision_at_k, mrr_at_k, ndcg_at_k}
        - best_method (str): the runnable method with the highest nDCG@k
        - skipped: list of {method, reason} for methods that could not run.
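
The skip-instead-of-fail behaviour and the nDCG@k ranking described above can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the `compare` helper, the use of `ImportError` as the signal for a missing dependency, and the toy scorers with placeholder numbers are all hypothetical.

```python
def compare(scorers, k, n_questions):
    """Score each method, collecting unavailable ones instead of failing."""
    rows, skipped = [], []
    for method, scorer in scorers.items():
        try:
            metrics = scorer(k)
        except ImportError as exc:
            # Assumed signal for a missing optional dependency.
            skipped.append({"method": method, "reason": str(exc)})
            continue
        rows.append({"method": method, **metrics})
    # Rank runnable methods by nDCG@k, best first.
    rows.sort(key=lambda row: row["ndcg_at_k"], reverse=True)
    best = rows[0]["method"] if rows else None
    return {
        "k": k,
        "n_questions": n_questions,
        "rows": rows,
        "best_method": best,
        "skipped": skipped,
    }


def fixed(ndcg):
    """Toy scorer returning placeholder metrics (not real benchmark results)."""
    def scorer(k):
        return {"recall_at_k": 0.0, "precision_at_k": 0.0, "mrr_at_k": 0.0, "ndcg_at_k": ndcg}
    return scorer


def dense_unavailable(k):
    """Toy scorer that simulates the missing optional extra."""
    raise ImportError("sentence-transformers is not installed")


scorers = {
    "bm25": fixed(0.25),
    "tfidf": fixed(0.125),
    "dense": dense_unavailable,
    "hybrid": fixed(0.5),
}
report = compare(scorers, k=3, n_questions=10)
print(report["best_method"], report["skipped"])
```

Catching the failure per method keeps one unavailable retriever from hiding the results of the others, which is the design choice the description above points to.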

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/phillipkaraya/rageval-mcp'
