The Vectara MCP server provides a Model Context Protocol (MCP) compliant interface for AI systems to access Vectara's Trusted RAG platform, enabling fast, reliable RAG with reduced hallucination.
Core Capabilities:
RAG Query with Generation: Run semantic search queries with AI-generated responses using the ask_vectara tool
Semantic Search Only: Perform search queries without generation using the search_vectara tool
Hallucination Detection & Correction: Identify and fix hallucinations in generated text using Vectara's VHC API with the correct_hallucinations tool
Factual Consistency Evaluation: Assess how well generated text matches source documents using the eval_factual_consistency tool
API Key Management: Securely configure and manage Vectara API keys using setup_vectara_api_key and clear_vectara_api_key
Infrastructure Features:
Multiple Transport Modes: Support for HTTP (recommended), SSE, and STDIO transport protocols
Production-Ready Security: Built-in bearer token authentication, rate limiting, CORS protection, and HTTPS readiness
Flexible Configuration: Customizable through command-line arguments and environment variables
MCP Integration: Compatible with MCP clients including Claude Desktop
Vectara MCP Server
🔌 Compatible with Claude Desktop
Vectara MCP is also compatible with any other MCP client.
The Model Context Protocol (MCP) is an open standard that enables AI systems to interact seamlessly with various data sources and tools, facilitating secure, two-way connections.
Vectara-MCP provides any agentic application with access to fast, reliable RAG with reduced hallucination, powered by Vectara's Trusted RAG platform, through the MCP protocol.
Installation
You can install the package directly from PyPI:
Quick Start
Secure by Default (HTTP/SSE with Authentication)
Local Development Mode (STDIO)
Configuration Options
Transport Modes
HTTP Transport (Default - Recommended)
Security: Built-in authentication via bearer tokens
Encryption: HTTPS ready
Rate Limiting: 100 requests/minute by default
CORS Protection: Configurable origin validation
Use Case: Production deployments, cloud environments
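To make the default of 100 requests/minute concrete, here is a minimal sketch of a sliding-window limiter. It is illustrative only and is not the server's actual implementation; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sliding-window rate limiter (not Vectara's code)."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # arrival times of accepted requests

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is within the limit."""
        # Discard accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True
```

Passing the clock value in explicitly keeps the sketch deterministic; a real server would typically read a monotonic clock and track one window per client or token.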
SSE Transport
Streaming: Server-Sent Events for real-time updates
Authentication: Bearer token support
Compatibility: Works with legacy MCP clients
Use Case: Real-time streaming applications
STDIO Transport
⚠️ Security Warning: No transport-layer security
Performance: Low latency for local communication
Use Case: Local development, Claude Desktop
Requirement: Must be explicitly enabled with the --stdio flag
Environment Variables
Authentication
HTTP/SSE Transport
When using HTTP or SSE transport, authentication is required by default:
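On the client side, bearer authentication means sending the token in the standard `Authorization` header. The sketch below only builds that header; the helper name `build_auth_headers` and the token value are placeholders, and the server's endpoint path is not assumed here.

```python
def build_auth_headers(token: str) -> dict:
    """Return HTTP headers carrying a bearer token in the standard format."""
    token = token.strip()
    if not token:
        raise ValueError("a non-empty bearer token is required")
    return {"Authorization": f"Bearer {token}"}


# Placeholder token for illustration only.
headers = build_auth_headers("your-token-here")
```

The resulting dictionary can be passed to whichever HTTP client you use when calling the server over HTTP or SSE.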
Disabling Authentication (Development Only)
Available Tools
API Key Management
setup_vectara_api_key: Configure and validate your Vectara API key for the session (one-time setup).
Args:
api_key: str, Your Vectara API key - required.
Returns:
Success confirmation with masked API key or validation error.
clear_vectara_api_key: Clear the stored API key from server memory.
Returns:
Confirmation message.
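The exact masking format that setup_vectara_api_key returns is not specified here. As a rough sketch of what a masked key can look like, the hypothetical helper below keeps a few leading and trailing characters and hides the rest; the function name and the choice of four visible characters are assumptions.

```python
def mask_api_key(api_key: str, visible: int = 4) -> str:
    """Hide the middle of a key, keeping `visible` characters at each end."""
    if len(api_key) <= visible * 2:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(api_key)
    hidden = "*" * (len(api_key) - 2 * visible)
    return f"{api_key[:visible]}{hidden}{api_key[-visible:]}"
```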
Query Tools
ask_vectara: Run a RAG query using Vectara, returning search results with a generated response.
Args:
query: str, The user query to run - required.
corpus_keys: list[str], List of Vectara corpus keys to use for the search - required.
n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
max_used_search_results: int, The maximum number of search results to use - optional, default is 10.
generation_preset_name: str, The name of the generation preset to use - optional, default is "vectara-summary-table-md-query-ext-jan-2025-gpt-4o".
response_language: str, The language of the response - optional, default is "eng".
Returns:
The response from Vectara, including the generated answer and the search results.
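The arguments above can be assembled into a single dictionary before calling the tool from an MCP client. This is a minimal sketch: the helper name `build_ask_vectara_args`, the example query, and the corpus key are hypothetical, while the parameter names and defaults are the ones listed above.

```python
def build_ask_vectara_args(query: str, corpus_keys: list, **overrides) -> dict:
    """Assemble an ask_vectara argument dict using the documented defaults."""
    if not query:
        raise ValueError("query is required")
    if not corpus_keys:
        raise ValueError("corpus_keys is required")
    args = {
        "query": query,
        "corpus_keys": list(corpus_keys),
        "n_sentences_before": 2,
        "n_sentences_after": 2,
        "lexical_interpolation": 0.005,
        "max_used_search_results": 10,
        "generation_preset_name": "vectara-summary-table-md-query-ext-jan-2025-gpt-4o",
        "response_language": "eng",
    }
    # Reject names that are not documented ask_vectara parameters.
    unknown = set(overrides) - set(args)
    if unknown:
        raise TypeError(f"unknown ask_vectara arguments: {sorted(unknown)}")
    args.update(overrides)
    return args


# Hypothetical query and corpus key, overriding one optional default.
example_args = build_ask_vectara_args(
    "What is RAG?", ["my-corpus"], max_used_search_results=5
)
```

Because search_vectara accepts a subset of these parameters, the same pattern applies to it with the generation-related keys removed.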
search_vectara: Run a semantic search query using Vectara, without generation.
Args:
query: str, The user query to run - required.
corpus_keys: list[str], List of Vectara corpus keys to use for the search - required.
n_sentences_before: int, Number of sentences before the answer to include in the context - optional, default is 2.
n_sentences_after: int, Number of sentences after the answer to include in the context - optional, default is 2.
lexical_interpolation: float, The amount of lexical interpolation to use - optional, default is 0.005.
Returns:
The response from Vectara, including the matching search results.
Analysis Tools
correct_hallucinations: Identify and correct hallucinations in generated text using Vectara's VHC (Vectara Hallucination Correction) API.
Args:
generated_text: str, The generated text to analyze for hallucinations - required.
documents: list[str], List of source documents to compare against - required.
query: str, The original user query that led to the generated text - optional.
Returns:
JSON-formatted string containing corrected text and detailed correction information.
eval_factual_consistency: Evaluate the factual consistency of generated text against source documents using Vectara's dedicated factual consistency evaluation API.
Args:
generated_text: str, The generated text to evaluate for factual consistency - required.
documents: list[str], List of source documents to compare against - required.
query: str, The original user query that led to the generated text - optional.
Returns:
JSON-formatted string containing factual consistency evaluation results and scoring.
Note: An API key must be configured first, using either the setup_vectara_api_key tool or the VECTARA_API_KEY environment variable.
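Both analysis tools take the same arguments and return a JSON-formatted string, so a client usually builds the arguments and then decodes the result. The sketch below assumes nothing about the response schema: the helper names are hypothetical, and the sample payload is made up purely to show the decoding step.

```python
import json


def build_analysis_args(generated_text: str, documents: list, query=None) -> dict:
    """Arguments shared by correct_hallucinations and eval_factual_consistency."""
    if not generated_text:
        raise ValueError("generated_text is required")
    if not documents:
        raise ValueError("documents is required")
    args = {"generated_text": generated_text, "documents": list(documents)}
    if query is not None:
        args["query"] = query  # optional, so only sent when provided
    return args


def parse_tool_result(raw: str):
    """Decode the JSON-formatted string returned by either tool."""
    return json.loads(raw)


analysis_args = build_analysis_args(
    "The Eiffel Tower is in Berlin.",
    ["The Eiffel Tower is located in Paris, France."],
)
# Made-up payload: the real field names are not documented in this section.
sample_result = parse_tool_result('{"example_field": "example value"}')
```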
Configuration with Claude Desktop
To use with Claude Desktop, update your configuration to use STDIO transport:
Or using uv:
Note: Claude Desktop requires STDIO transport. While less secure than HTTP, it's acceptable for local desktop use.
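As a rough illustration of the shape of such a configuration, the sketch below builds a Claude Desktop `mcpServers` entry and prints it as JSON. The server label "Vectara", the `python -m vectara_mcp` launch command, and the placeholder key are assumptions; only the --stdio flag and the VECTARA_API_KEY variable come from this document.

```python
import json


def claude_desktop_config(api_key: str) -> dict:
    """Build a Claude Desktop config entry that launches the server over STDIO."""
    return {
        "mcpServers": {
            "Vectara": {  # assumed display name for the server
                "command": "python",  # assumed launcher
                "args": ["-m", "vectara_mcp", "--stdio"],  # module name assumed
                "env": {"VECTARA_API_KEY": api_key},
            }
        }
    }


print(json.dumps(claude_desktop_config("your-api-key"), indent=2))
```

The printed JSON would go into Claude Desktop's configuration file; adjust the command to match how the package is installed on your machine.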
Usage in Claude Desktop App
Once the installation is complete and the Claude Desktop app is configured, you must completely close and re-open the app to see the Vectara-mcp server. You should see a hammer icon in the bottom left of the app, indicating available MCP tools. Click the hammer icon to see more detail on the Vectara tools.
Claude will now have complete access to the Vectara-mcp server, including all six Vectara tools.
Secure Setup Workflow
First-time setup (one-time per session):
Configure your API key securely:
After setup, use any tools without exposing your API key:
Vectara Tool Examples
RAG Query with Generation:
Semantic Search Only:
Hallucination Detection & Correction:
Factual Consistency Evaluation:
Security Best Practices
Always use HTTP transport for production - Never expose STDIO transport to the network
Keep authentication enabled - Only disable with --no-auth for local testing
Use HTTPS in production - Deploy behind a reverse proxy with TLS termination
Configure CORS properly - Set VECTARA_ALLOWED_ORIGINS to restrict access
Rotate API keys regularly - Update VECTARA_API_KEY and VECTARA_AUTHORIZED_TOKENS
Monitor rate limits - Default 100 req/min, adjust based on your needs
See SECURITY.md for detailed security guidelines.
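To illustrate what origin validation involves, here is a minimal sketch of an allow-list check. It assumes VECTARA_ALLOWED_ORIGINS holds a comma-separated list of origins, which this document does not confirm; the helper names and example origins are hypothetical, and this is not the server's own implementation.

```python
def parse_allowed_origins(raw: str) -> set:
    """Split a comma-separated origin list, trimming spaces and trailing slashes."""
    return {item.strip().rstrip("/") for item in raw.split(",") if item.strip()}


def is_origin_allowed(origin: str, allowed: set) -> bool:
    """Return True only when the request origin exactly matches an allowed one."""
    return origin.rstrip("/") in allowed


# Hypothetical value for VECTARA_ALLOWED_ORIGINS (format assumed).
allowed = parse_allowed_origins(
    "https://app.example.com, https://admin.example.com/"
)
```

Exact matching is deliberately strict: substring or prefix checks would let an origin such as https://app.example.com.evil.test slip through.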
Support
For issues, questions, or contributions, please visit: https://github.com/vectara/vectara-mcp