Glama
phillipkaraya

rageval-mcp

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools | {"listChanged": false}
prompts | {"listChanged": false}
resources | {"subscribe": false, "listChanged": false}
experimental | {}

Tools

Functions exposed to the LLM to take actions

Name | Description
retrieve

Retrieve the top-k passages for a query from the bundled sample corpus.

Use this to see *what context a RAG system would surface* for a question, and how that
changes with the retrieval method. The corpus is a fictional B2B SaaS knowledge base that
ships with the server, so no setup or data is required.

Args:
    query: The natural-language question or search string.
    k: How many passages to return (1 to 20).
    method: One of 'bm25', 'tfidf', 'dense', or 'hybrid' (default 'hybrid').

Returns:
    RetrieveResult with fields:
        - query (str), method (str), k (int), count (int)
        - passages: list of {rank, doc_id, chunk_id, score, text}, best first.

Raises:
    A tool error if method='dense' but the optional 'sentence-transformers' extra is not
    installed; the message explains how to enable it.
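
As a rough illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC 2.0 `tools/call` message that MCP clients send. The helper name `build_retrieve_call` and the client-side validation are assumptions made for this example; the server performs its own checks, and most MCP client libraries build this message for you.

```python
import json

# Retrieval methods listed in the tool description above.
ALLOWED_METHODS = {"bm25", "tfidf", "dense", "hybrid"}


def build_retrieve_call(query, k, method="hybrid", request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' message for the 'retrieve' tool.

    Hypothetical helper: it mirrors the documented argument limits
    (k from 1 to 20, four method names) before anything is sent.
    """
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "retrieve",
            "arguments": {"query": query, "k": k, "method": method},
        },
    }


if __name__ == "__main__":
    # Example question is made up; any natural-language query works.
    message = build_retrieve_call("How do I reset my API key?", k=3)
    print(json.dumps(message, indent=2))
```

Validating on the client side is optional; it only surfaces an out-of-range `k` or a misspelled method before the round trip.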
evaluate_retrieval

Score one retrieval method against the labeled question set.

Runs every labeled question through the chosen retriever and averages four ranking
metrics, so you can put a number on retrieval quality instead of eyeballing it.

Args:
    method: Which retrieval strategy to score (default 'hybrid').
    k: The cutoff for recall@k, precision@k, MRR@k, and nDCG@k (1 to 20, default 3).

Returns:
    MetricsOut with fields: method, k, n_questions, recall_at_k, precision_at_k,
    mrr_at_k, ndcg_at_k. Each metric is in the range 0.0 to 1.0.

Raises:
    A tool error if method='dense' and the optional dense extra is not installed.
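
To make the four metrics concrete, here is a minimal sketch of their common textbook definitions with binary relevance. The listing does not state the exact formulas the server uses, so the function bodies and the sample document IDs below are assumptions for illustration only.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant) if relevant else 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k slots filled by relevant documents."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: DCG of this ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def mean(values):
    """Average per-question scores into a single number."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Made-up ranking and labels for a single question.
    ranked = ["d3", "d1", "d7"]
    relevant = {"d1", "d9"}
    for fn in (recall_at_k, precision_at_k, mrr_at_k, ndcg_at_k):
        print(fn.__name__, round(fn(ranked, relevant, 3), 4))
```

Under these definitions every score falls between 0.0 and 1.0, which matches the range stated for MetricsOut.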
compare_methods

Benchmark every available retrieval method side by side at cutoff k.

Use this to get a defensible, measured answer to "which retrieval strategy should we use?". Each method
is scored over the full labeled question set and ranked by nDCG@k. Methods whose optional
dependencies are missing (currently only 'dense') are reported under ``skipped`` with a
reason rather than failing the whole call.

Args:
    k: The cutoff applied to recall@k, precision@k, MRR@k, and nDCG@k (1 to 20, default 3).

Returns:
    CompareResult with fields:
        - k (int), n_questions (int)
        - rows: list of {method, recall_at_k, precision_at_k, mrr_at_k, ndcg_at_k}
        - best_method (str): the runnable method with the highest nDCG@k
        - skipped: list of {method, reason} for methods that could not run.
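
The ranking and skip behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the server's code: `summarize_comparison` is a hypothetical helper, and the metric values in the usage example are placeholder numbers, not measurements from the bundled corpus.

```python
def summarize_comparison(rows, skipped, k, n_questions):
    """Assemble a CompareResult-shaped dict from per-method scores.

    rows:    list of dicts with 'method' and the four *_at_k metrics.
    skipped: list of {'method', 'reason'} for methods that could not run.
    """
    # Rank runnable methods by nDCG@k, best first.
    ranked = sorted(rows, key=lambda row: row["ndcg_at_k"], reverse=True)
    return {
        "k": k,
        "n_questions": n_questions,
        "rows": ranked,
        "best_method": ranked[0]["method"] if ranked else None,
        "skipped": skipped,
    }


if __name__ == "__main__":
    # Placeholder scores, invented purely to show the data shape.
    rows = [
        {"method": "bm25", "recall_at_k": 0.70, "precision_at_k": 0.30,
         "mrr_at_k": 0.60, "ndcg_at_k": 0.62},
        {"method": "hybrid", "recall_at_k": 0.80, "precision_at_k": 0.35,
         "mrr_at_k": 0.70, "ndcg_at_k": 0.71},
    ]
    skipped = [{"method": "dense", "reason": "optional extra not installed"}]
    result = summarize_comparison(rows, skipped, k=3, n_questions=10)
    print(result["best_method"], [row["method"] for row in result["rows"]])
```

Reporting a missing dependency under `skipped` instead of raising keeps the comparison usable when only some methods can run.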

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/phillipkaraya/rageval-mcp'
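
The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which the listing does not state explicitly.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner, name):
    """Build a GET request for one server's directory entry."""
    url = f"{API_BASE}/{owner}/{name}"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server(owner, name, timeout=10.0):
    """Send the request and decode the body (assumed here to be JSON)."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network call when run as a script.
    entry = fetch_server("phillipkaraya", "rageval-mcp")
    print(sorted(entry) if isinstance(entry, dict) else entry)
```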

If you have feedback or need assistance with the MCP directory API, please join our Discord server.