
Vectara MCP server

Official
by vectara

ask_vectara

Query Vectara's RAG system against one or more corpus keys, returning relevant search results together with a generated, citation-backed answer.

Instructions

Run a RAG query using Vectara, returning search results with a generated response.

Args:
    query: str, The user query to run - required.
    corpus_keys: list[str], List of Vectara corpus keys to use for the search - required. Please ask the user to provide one or more corpus keys. 
    api_key: str, The Vectara API key - required.
    n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
    n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
    lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
    max_used_search_results: int, The maximum number of search results to use - optional, default is 10.
    generation_preset_name: str, The name of the generation preset to use - optional, default is "vectara-summary-table-md-query-ext-jan-2025-gpt-4o".
    response_language: str, The language of the response - optional, default is "eng".

Returns:
    The response from Vectara, including the generated answer and the search results.
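
For concreteness, a minimal argument set an MCP client might pass to this tool could look like the following sketch. The corpus key is a placeholder, and note that, depending on the server version, the API key is either passed as an `api_key` argument or configured up front via a setup tool:

```python
# Example arguments for an ask_vectara call. The corpus key is a
# placeholder -- substitute one of your own Vectara corpus keys.
arguments = {
    "query": "What is retrieval-augmented generation?",
    "corpus_keys": ["my-docs-corpus"],
    "n_sentences_before": 2,
    "n_sentences_after": 2,
    "lexical_interpolation": 0.005,
    "max_used_search_results": 10,
    "response_language": "eng",
}

# Only "query" and "corpus_keys" are required; everything else falls
# back to the defaults documented above.
required = {"query", "corpus_keys"}
assert required <= set(arguments)
```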

Input Schema

Name                     Required  Description  Default
api_key                  No
corpus_keys              No
generation_preset_name   No                     vectara-summary-table-md-query-ext-jan-2025-gpt-4o
lexical_interpolation    No
max_used_search_results  No
n_sentences_after        No
n_sentences_before       No
query                    Yes
response_language        No                     eng

Implementation Reference

  • The primary handler for the 'ask_vectara' tool, registered via the @mcp.tool() decorator. Its type-hinted parameters serve as the input schema and its docstring documents the arguments and return value; the body covers parameter validation, API payload construction, Vectara query execution, response processing with citation extraction, and error handling.
    # Imports assumed by this excerpt (the module's full import block is not shown):
    import json
    from mcp.server.fastmcp import Context

    @mcp.tool()
    async def ask_vectara(
        query: str,
        ctx: Context,
        corpus_keys: list[str],
        n_sentences_before: int = 2,
        n_sentences_after: int = 2,
        lexical_interpolation: float = 0.005,
        max_used_search_results: int = 10,
        generation_preset_name: str = "vectara-summary-table-md-query-ext-jan-2025-gpt-4o",
        response_language: str = "eng",
    ) -> dict:
        """
        Run a RAG query using Vectara, returning search results with a generated response.
    
        Args:
            query: str, The user query to run - required.
            corpus_keys: list[str], List of Vectara corpus keys to use for the search - required. Please ask the user to provide one or more corpus keys.
            n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
            n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
            lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
            max_used_search_results: int, The maximum number of search results to use - optional, default is 10.
            generation_preset_name: str, The name of the generation preset to use - optional, default is "vectara-summary-table-md-query-ext-jan-2025-gpt-4o".
            response_language: str, The language of the response - optional, default is "eng".
    
        Note: API key must be configured first using 'setup_vectara_api_key' tool
    
        Returns:
            dict: Structured response containing:
                - "summary": Generated AI summary with markdown citations
                - "citations": List of citation objects with score, text, and metadata
                - "factual_consistency_score": Score indicating factual consistency (if available)
            On error, returns dict with "error" key containing error message.
        """
        # Validate parameters
        validation_error = _validate_common_parameters(query, corpus_keys)
        if validation_error:
            return {"error": validation_error}
    
        if ctx:
            ctx.info(f"Running Vectara RAG query: {query}")
    
        try:
            payload = _build_query_payload(
                query=query,
                corpus_keys=corpus_keys,
                n_sentences_before=n_sentences_before,
                n_sentences_after=n_sentences_after,
                lexical_interpolation=lexical_interpolation,
                max_used_search_results=max_used_search_results,
                generation_preset_name=generation_preset_name,
                response_language=response_language,
                enable_generation=True
            )
    
            result = await _call_vectara_query(payload, ctx)
    
            # Extract the generated summary from the response
            summary_text = ""
            if "summary" in result:
                summary_text = result["summary"]
            elif "answer" in result:
                summary_text = result["answer"]
            else:
                return {"error": f"Unexpected response format: {json.dumps(result, indent=2)}"}
    
            # Build citations list
            citations = []
            if "search_results" in result and result["search_results"]:
                for i, search_result in enumerate(result["search_results"], 1):
                    citation = {
                        "id": i,
                        "score": search_result.get("score", 0.0),
                        "text": search_result.get("text", ""),
                        "document_metadata": search_result.get("document_metadata", {})
                    }
                    citations.append(citation)
    
            # Build response dict
            response = {
                "summary": summary_text,
                "citations": citations
            }
    
            # Add factual consistency score if available
            if "factual_consistency_score" in result:
                response["factual_consistency_score"] = result["factual_consistency_score"]
    
            return response
    
        except Exception as e:
            return {"error": _format_error("Vectara RAG query", e)}
  • Helper function that makes the actual API request to Vectara's /query endpoint, used by ask_vectara.
    async def _call_vectara_query(
        payload: dict,
        ctx: Context = None,
        api_key_override: str = None
    ) -> dict:
        """Make API call to Vectara query endpoint"""
        return await _make_api_request(
            f"{VECTARA_BASE_URL}/query",
            payload,
            ctx,
            api_key_override,
            "query"
        )
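
The `_make_api_request` helper itself is not included in this reference. As a rough, non-authoritative sketch, the request it issues presumably resembles the following; the `x-api-key` header and v2 base URL follow Vectara's public REST API, and the key value is a placeholder:

```python
import json
import urllib.request

VECTARA_BASE_URL = "https://api.vectara.io/v2"

def build_query_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Construct (but do not send) a POST to the Vectara /query endpoint."""
    return urllib.request.Request(
        f"{VECTARA_BASE_URL}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "x-api-key": api_key,
        },
        method="POST",
    )

req = build_query_request({"query": "hello"}, "zut_placeholder_key")
```

Sending the request (and the retry/error handling the real helper likely performs) is omitted here, since it depends on server-side state not shown in the excerpt.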
  • Helper function that constructs the detailed JSON payload for the Vectara query API, incorporating all parameters from ask_vectara.
    def _build_query_payload(
        query: str,
        corpus_keys: list[str],
        n_sentences_before: int = 2,
        n_sentences_after: int = 2,
        lexical_interpolation: float = 0.005,
        max_used_search_results: int = 10,
        generation_preset_name: str = "vectara-summary-table-md-query-ext-jan-2025-gpt-4o",
        response_language: str = "eng",
        enable_generation: bool = True
    ) -> dict:
        """Build the query payload for Vectara API"""
        payload = {
            "query": query,
            "search": {
                "limit": 100,
                "corpora": [
                    {
                        "corpus_key": corpus_key,
                        "lexical_interpolation": lexical_interpolation
                    } for corpus_key in corpus_keys
                ],
                "context_configuration": {
                    "sentences_before": n_sentences_before,
                    "sentences_after": n_sentences_after
                },
                "reranker": {
                    "type": "customer_reranker",
                    "reranker_name": "Rerank_Multilingual_v1",
                    "limit": 100,
                    "cutoff": 0.2
                }
            },
            "save_history": True,
        }
    
        if enable_generation:
            payload["generation"] = {
                "generation_preset_name": generation_preset_name,
                "max_used_search_results": max_used_search_results,
                "response_language": response_language,
                "citations": {
                    "style": "markdown",
                    "url_pattern": "{doc.url}",
                    "text_pattern": "{doc.title}"
                },
                "enable_factual_consistency_score": True
            }
    
        return payload
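
For a single corpus key and all-default arguments, the payload built above comes out to roughly the following, reproduced as a literal for illustration (the query and corpus key are placeholders):

```python
# Shape of the payload _build_query_payload produces with defaults and
# generation enabled; "my-docs-corpus" is a placeholder corpus key.
payload = {
    "query": "What is RAG?",
    "search": {
        "limit": 100,
        "corpora": [
            {"corpus_key": "my-docs-corpus", "lexical_interpolation": 0.005}
        ],
        "context_configuration": {"sentences_before": 2, "sentences_after": 2},
        "reranker": {
            "type": "customer_reranker",
            "reranker_name": "Rerank_Multilingual_v1",
            "limit": 100,
            "cutoff": 0.2,
        },
    },
    "save_history": True,
    "generation": {
        "generation_preset_name": "vectara-summary-table-md-query-ext-jan-2025-gpt-4o",
        "max_used_search_results": 10,
        "response_language": "eng",
        "citations": {
            "style": "markdown",
            "url_pattern": "{doc.url}",
            "text_pattern": "{doc.title}",
        },
        "enable_factual_consistency_score": True,
    },
}
```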
  • Helper function for parameter validation (query, corpus_keys, API key), called at the start of ask_vectara.
    def _validate_common_parameters(query: str = "", corpus_keys: list[str] = None) -> str | None:
        """Validate common parameters used across Vectara tools.
    
        Returns:
            str: Error message if validation fails, None if valid
        """
        if not query:
            return "Query is required."
        if not corpus_keys:
            return "Corpus keys are required. Please ask the user to provide one or more corpus keys."
    
        # Check API key availability
        api_key = _get_api_key()
        if not api_key:
            return API_KEY_ERROR_MESSAGE
    
        return None
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes the tool's function (RAG query with response generation) and mentions required parameters, but lacks details on authentication needs (though 'api_key' is implied), rate limits, error handling, or what happens if corpus keys are invalid. It adds some context but falls short of comprehensive behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear opening sentence, followed by an 'Args:' section detailing parameters and a 'Returns:' section. It is appropriately sized for a complex tool with many parameters, though some sentences could be more concise (e.g., the parameter explanations are verbose but necessary).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, no annotations, no output schema), the description is partially complete. It covers the purpose, parameters, and return value, but lacks information on output format, error cases, or dependencies. Without an output schema, more detail on the response structure would improve completeness for such a multifaceted tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds significant meaning beyond the input schema, which has 0% description coverage. It explains each parameter's purpose, required status, and default values (e.g., 'query: str, The user query to run - required'), compensating fully for the schema's lack of descriptions. This is essential given the 9 parameters with only 1 required.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Run a RAG query using Vectara') and resources ('returning search results with a generated response'). It distinguishes from the sibling tool 'search_vectara' by emphasizing the generation of a response alongside search results, which suggests 'search_vectara' might only return raw search results without generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (for RAG queries with Vectara) and includes guidance on required parameters like asking the user for corpus keys. However, it does not explicitly state when NOT to use it or mention alternatives like 'search_vectara' for non-generation searches, which would be needed for a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
