compare_methods
Benchmark retrieval methods side by side at cutoff k. Compare recall@k, precision@k, MRR@k, and nDCG@k to identify the best strategy for your RAG pipeline.
Instructions
Benchmark every available retrieval method side by side at cutoff k.
Use it to get an honest, defensible answer to the question "which retrieval strategy should we use?" Each method
is scored over the full labeled question set, and the methods are then ranked by nDCG@k. Methods whose optional
dependencies are missing (currently only 'dense') are reported under ``skipped`` with a
reason rather than failing the whole call.
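To make the four metrics concrete, here is a minimal per-question sketch. It assumes binary relevance labels (a document is either relevant or not) and uses hypothetical helper names; the tool's actual implementation is not shown in this page and may differ, for example in how it aggregates per-question scores across the question set.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top k slots that hold a relevant document."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """DCG of the top k divided by the DCG of an ideal ranking (binary gains)."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

Each function scores a single question given the ranked document ids a method returned and the set of ids labeled relevant. A method-level score would then typically be the mean over all labeled questions, which is an assumption about the aggregation rather than something stated here.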
Args:
k: The cutoff applied to recall@k, precision@k, MRR@k, and nDCG@k (an integer from 1 to 20; default 3).
Returns:
CompareResult with fields:
- k (int), n_questions (int)
- rows: list of {method, recall_at_k, precision_at_k, mrr_at_k, ndcg_at_k}
- best_method (str): the runnable method with the highest nDCG@k
- skipped: list of {method, reason} for methods that could not run.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| k | No | Cutoff k applied to every metric (1 to 20). | 3 |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
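| k | Yes | Cutoff k (int). | |
| n_questions | Yes | Number of labeled questions scored (int). | |
| rows | Yes | List of {method, recall_at_k, precision_at_k, mrr_at_k, ndcg_at_k}. | |
| best_method | Yes | The runnable method with the highest nDCG@k (str). | |
| skipped | Yes | List of {method, reason} for methods that could not run. | |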