# 💬 LiveKit RAG Assistant

**AI-powered semantic search and question-answering system for LiveKit documentation**

## 🎯 Features

- **Dual Search Modes**: Documentation (Pinecone) + real-time web search (Tavily)
- **Standard MCP Server**: Async LangChain integration with the Model Context Protocol
- **Fast Responses**: Groq LLM (llama-3.3-70b) with ultra-fast inference
- **Semantic Search**: HuggingFace embeddings (384-dim) with vector indexing
- **Source Attribution**: View the exact sources for every answer
- **Chat History**: Persistent conversation tracking with recent query access
- **Query Validation**: Prevents invalid inputs with helpful error messages
- **Copy-to-Clipboard**: One-click message sharing

## 🚀 Quick Start

### Prerequisites

- Python 3.10+ with conda
- API keys: GROQ, TAVILY, PINECONE, HuggingFace

### Installation

```bash
# 1. Clone and set up the environment
cd "c:\lg mcp ai"
conda create -n langmcp python=3.12
conda activate langmcp

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure the .env file
echo "GROQ_API_KEY=your_key" >> .env
echo "TAVILY_API_KEY=your_key" >> .env
echo "PINECONE_API_KEY=your_key" >> .env
echo "PINECONE_INDEX_NAME=livekit-docs" >> .env
```

### Running the Application

**Terminal 1** - Start the MCP server:

```bash
python mcp_server_standard.py
```

**Terminal 2** - Start the Streamlit UI:

```bash
streamlit run app.py
```

The app opens at `http://localhost:8501`.

## 📊 Architecture

```
Streamlit UI (app.py)
        ↓
Query Validation → Error Handling
        ↓
MCP Server (subprocess) → mcp_server_standard.py
        ↓
Dual Search Layer:
├─ Pinecone (3,007 vectors) - Semantic search
└─ Tavily API - Real-time web results
        ↓
LLM Layer (Groq):
├─ Temperature: 0.3 (detailed, focused)
├─ Max Tokens: 2048 (comprehensive answers)
└─ Model: llama-3.3-70b-versatile
        ↓
Response Display + Source Attribution
```

## 🔧 Tech Stack

| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Frontend** | Streamlit 1.28+ | Premium glassmorphism UI |
| **Backend** | MCP Standard | Async subprocess server |
| **LLM** | Groq API | Ultra-fast inference (free tier) |
| **Embeddings** | HuggingFace | sentence-transformers/all-MiniLM-L6-v2 |
| **Vector DB** | Pinecone Serverless | Ultra-fast similarity search (AWS us-east-1) |
| **Web Search** | Tavily API | Real-time internet search |
| **Framework** | LangChain | LLM orchestration & tools |
| **Language** | Python 3.12 | Modern syntax & features |

## 📚 Project Structure

```
c:\lg mcp ai\
├── app.py                   # Main Streamlit application (enhanced)
├── mcp_server_standard.py   # MCP server with LangChain tools
├── ingest_docs_quick.py     # Document ingestion to Pinecone
├── requirements.txt         # Python dependencies
├── .env                     # API keys & configuration
├── README.md                # This file
└── ingest_docs.py           # Legacy ingestion script
```

## 🎮 Usage

### Ask Questions

1. **Choose Search Mode**: Documentation (📚) or Web Search (🌐)
2. **Type Question**: Natural language queries work best
3. **Get Answer**: The AI responds with a detailed 3-5 sentence answer
4. **View Sources**: Click "View Sources" to see the cited documents
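Under the hood, each question becomes a tool call to the MCP server over stdio, as shown in the architecture diagram above. The snippet below is a minimal sketch of that round trip using the reference `mcp` Python SDK; the tool name `search_livekit_docs` is a placeholder (the server's real tool names come from `list_tools()`), so treat this as an illustration rather than the app's actual client code.

```python
# Minimal sketch of one question/answer round trip over MCP (stdio transport).
# Assumes the reference `mcp` Python SDK; the tool name below is a placeholder.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def ask(question: str) -> None:
    # Launch mcp_server_standard.py as a subprocess, the same way the UI does.
    server = StdioServerParameters(command="python", args=["mcp_server_standard.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover what the server actually exposes.
            tools = await session.list_tools()
            print("Available tools:", [t.name for t in tools.tools])

            # Replace "search_livekit_docs" with one of the names printed above.
            result = await session.call_tool("search_livekit_docs", {"query": question})
            for item in result.content:
                if hasattr(item, "text"):
                    print(item.text)


asyncio.run(ask("How do I set up LiveKit?"))
```

Run it from the project directory with the `langmcp` environment active so the server subprocess finds the same dependencies.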
### Features

- **Copy Messages**: 📋 Click the button on any message to copy it
- **Recent Queries**: ↻ Quick re-ask from history
- **Quick Help**: 💡 Expandable tips and usage guide
- **Performance Metrics**: 📊 Real-time statistics in System Status
- **Error Prevention**: ✅ Query validation with helpful feedback

## ⚡ Performance

- **First Query**: ~15-20 s (model initialization)
- **Subsequent Queries**: 3-8 s (cached LLM)
- **Copy/History**: Instant (client-side)
- **Metrics Update**: Real-time (no overhead)

## 📖 Example Queries

- "How do I set up LiveKit?"
- "What are the best practices for video conferencing?"
- "How do I implement real-time communication?"
- "What authentication methods are available?"
- "How do I handle bandwidth optimization?"
- "Deploy to Kubernetes - how does LiveKit handle it?"

## 🛠️ Configuration

Edit the `.env` file:

```env
GROQ_API_KEY=gsk_your_groq_key
TAVILY_API_KEY=tvly_your_tavily_key
PINECONE_API_KEY=your_pinecone_key
PINECONE_INDEX_NAME=livekit-docs
HF_TOKEN=optional_huggingface_token
```

## 🔄 Ingestion

Populate Pinecone with the LiveKit documentation:

```bash
python ingest_docs_quick.py
```

This creates 3,007 searchable vector chunks from the LiveKit docs (a rough sketch of this step is included at the end of this README).

## 📊 Status Checks

View system status in the Streamlit sidebar:

```
✅ MCP Server Ready - Status indicator
✅ Groq LLM - API connection
✅ Pinecone VectorDB - Index status
💬 Messages - Total message count
👤 Questions - User query count
🤖 Responses - AI response count
```

## 🚨 Troubleshooting

| Issue | Solution |
|-------|----------|
| "No relevant documentation found" | Try web search mode or different keywords |
| "MCP Server not found" | Ensure `mcp_server_standard.py` is running in Terminal 1 |
| Slow first response | Normal - the model loads on the first query (~15-20 s) |
| API key errors | Check the `.env` file and verify all keys are set |
| Empty Pinecone index | Run `python ingest_docs_quick.py` to populate it |

## 📝 Notes

- All chat history is saved in session state
- Supports semantic search with keyword fallback
- Responses are stored with source attribution
- Query validation prevents invalid inputs
- Performance optimized for fast inference

## 👨‍💻 Created By

**@THENABILMAN** - [GitHub](https://github.com/THENABILMAN)

## 📄 License

Built with ❤️ for developers. Feel free to modify and extend!

---

**Version**: Enhanced v1.0 | **Status**: ✅ Production Ready | **Date**: November 1, 2025
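For reference, the ingestion step described above might look roughly like the sketch below. It assumes the `langchain-community`, `langchain-huggingface`, `langchain-pinecone`, and `langchain-text-splitters` packages from the tech stack; the URL list and chunking parameters are placeholders, and the real `ingest_docs_quick.py` may differ.

```python
# Hypothetical ingestion sketch - not the project's actual script.
import os

from langchain_community.document_loaders import WebBaseLoader  # needs beautifulsoup4
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Placeholder URL list; substitute the LiveKit docs pages you want indexed.
urls = ["https://docs.livekit.io/home/"]

docs = WebBaseLoader(urls).load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 384-dimensional embeddings, matching the index described above.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Writes the chunks into the named Pinecone index.
PineconeVectorStore.from_documents(
    chunks,
    embedding=embeddings,
    index_name=os.environ.get("PINECONE_INDEX_NAME", "livekit-docs"),
)
print(f"Ingested {len(chunks)} chunks")
```

The Pinecone client picks up `PINECONE_API_KEY` from the environment, so export it (or load the `.env` file) before running.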
