# Glossary
| **Term**                                 | Definition                                                                                                                                                                                                                                                                                                                                                                                                                       |
|------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Bi-encoder**                           | A model that encodes a document or a query, each independently of the other, into a single vector that compresses its meaning. Used in the first stage of retrieval. |
| **Context Stuffing**                     | Overloading the context window with too much information, which can degrade LLM performance.                                                                                                                                                                                                                                                                                                                                     |
| **Context Window**                       | The maximum amount of text, measured in tokens, that an LLM can process at once. |
| **LLM Recall**                           | The ability of an LLM to find specific information within its context window.                                                                                                                                                                                                                                                                                                                                                    |
| **Recall**                               | A metric that measures the fraction of all relevant documents that a search actually retrieves. |
| **Reranker (Cross-encoder)**             | A model that takes a query and a document as input and outputs a similarity score. This score is used to reorder documents by relevance.                                                                                                                                                                                                                                                                                         |
| **Retrieval Augmented Generation (RAG)** | A technique that combines the power of Large Language Models (LLMs) with external knowledge sources to generate more accurate and comprehensive responses.                                                                                                                                                                                                                                                                       |
| **Semantic Search**                      | Searching for information based on the meaning of words and phrases, rather than just matching keywords.                                                                                                                                                                                                                                                                                                                         |
| **Two-Stage Retrieval**                  | A system that first retrieves a large set of potentially relevant documents using a fast retriever (like a vector search) and then reranks them using a more accurate similarity score generated by a slower reranker before presenting them to an LLM. This approach combines the speed of the first-stage retrieval with the accuracy of the second-stage reranking, resulting in a more efficient and effective RAG pipeline. |
| **Vector Search**                        | A technique used to perform semantic search by converting text into numerical vectors and comparing their proximity in a vector space.                                                                                                                                                                                                                                                                                           |
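
The terms above fit together as a pipeline, and a small sketch may make the relationships easier to see. The code below is a toy illustration only: `embed` is a bag-of-words stand-in for a real bi-encoder, `rerank_score` is a word-pair overlap stand-in for a real cross-encoder, and every function name and sample document is hypothetical rather than taken from any library.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: maps one text to one bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query, docs, k):
    """First stage: rank documents by vector proximity and keep the top k indices."""
    q = embed(query)
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q, embed(docs[i])),
                    reverse=True)
    return ranked[:k]


def rerank_score(query, doc):
    """Toy stand-in for a cross-encoder: looks at query and document together.

    Counts adjacent word pairs of the query that also occur adjacently in the
    document, something the bag-of-words vectors above cannot see.
    """
    q = query.lower().split()
    d = doc.lower().split()
    return len(set(zip(q, q[1:])) & set(zip(d, d[1:])))


def two_stage(query, docs, first_k, final_k):
    """Retrieve first_k candidates quickly, then reorder them with the reranker."""
    candidates = vector_search(query, docs, first_k)
    reranked = sorted(candidates,
                      key=lambda i: rerank_score(query, docs[i]),
                      reverse=True)
    return reranked[:final_k]


def recall_at_k(retrieved, relevant):
    """Fraction of the relevant documents that appear in the retrieved list."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)


# Hypothetical sample data for illustration.
docs = [
    "reranking improves search results",
    "search vector vector database",
    "vector search finds similar text",
    "cooking pasta at home",
]
query = "vector search"
relevant = {1, 2}

first_stage = vector_search(query, docs, k=3)
final = two_stage(query, docs, first_k=3, final_k=2)
print("first stage:", first_stage)
print("after reranking:", final)
print("recall of final list:", recall_at_k(final, relevant))
```

In this sketch the first stage favors the document that merely repeats a query word, while the reranker promotes the document containing the exact phrase. Keeping `first_k` larger than `final_k` is what lets the slower second stage correct the ordering without scoring every document.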