# Knowledge RAG Overview
Knowledge RAG (Retrieval-Augmented Generation) is the document-processing and intelligent Q&A subsystem of the Code Graph Knowledge System. It combines vector search, a graph database (Neo4j), and LLM integration to provide context-aware answers to questions about your documents.
## What is Knowledge RAG?
Knowledge RAG transforms your documents into an intelligent knowledge base that can:
- **Understand Context**: Process documents and extract semantic meaning
- **Find Relevant Information**: Use vector similarity to find related content
- **Generate Intelligent Answers**: Use LLMs to synthesize information from multiple sources
- **Maintain Relationships**: Store knowledge as a graph with rich connections
## Architecture
```
Documents → Chunking → Embeddings → Neo4j Graph + Vector Index
                                                ↓
Query → Vector Search + Graph Traversal → LLM → Intelligent Answer
```
### Key Components
1. **Document Processing**
   - Chunking: Break documents into semantic chunks (configurable size)
   - Embedding: Convert text to vector representations
   - Graph Storage: Store chunks as nodes with relationships
2. **Query Engine** (see the end-to-end sketch after this list)
   - Vector Search: Find similar content using embeddings
   - Graph Traversal: Navigate relationships between nodes
   - LLM Generation: Synthesize answers from retrieved context
3. **Multi-Provider Support**
   - **LLM Providers**: Ollama, OpenAI, Google Gemini, OpenRouter
   - **Embedding Providers**: Ollama, OpenAI, Google Gemini, HuggingFace
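To make the two flows concrete, here is a minimal, self-contained sketch of ingestion and querying. Every helper and data structure below is an illustrative stand-in, not the system's actual API: the real pipeline chunks by tokens, stores chunks as Neo4j nodes, and calls the configured providers.
```python
# Illustrative only: in-memory stand-ins for Neo4j and the providers.
def embed(text: str) -> list[float]:
    # Stand-in for the embedding provider (Ollama, OpenAI, Gemini, HuggingFace).
    return [float(ord(c)) for c in text[:16]]

STORE: list[tuple[str, list[float]]] = []  # stand-in for chunk nodes + vector index

def ingest(document: str, size: int = 512, overlap: int = 50) -> None:
    # Character windows here for brevity; the real system chunks by tokens.
    step = size - overlap
    for start in range(0, max(len(document) - overlap, 1), step):
        piece = document[start:start + size]
        STORE.append((piece, embed(piece)))

def query(question: str, top_k: int = 5) -> list[str]:
    q = embed(question)
    def similarity(v: list[float]) -> float:
        # Negative squared distance as a crude similarity proxy.
        return -sum((a - b) ** 2 for a, b in zip(q, v))
    hits = sorted(STORE, key=lambda cv: similarity(cv[1]), reverse=True)[:top_k]
    # The real engine would also traverse graph relationships from these hits
    # and pass the combined context to the LLM to generate the final answer.
    return [chunk for chunk, _ in hits]
```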
## Feature Set
### Document Processing
- ✅ Text files (.txt, .md, .rst)
- ✅ Code files (all major languages)
- ✅ PDF documents
- ✅ Web pages (HTML)
- ✅ Batch directory processing
- ✅ Recursive subdirectory scanning
### Query Modes
- **Hybrid** (Default): Combines vector search + graph traversal for best results
- **Vector Only**: Pure similarity search using embeddings
- **Graph Only**: Uses only graph relationships
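As a rough illustration of what vector-only retrieval does under the hood, the sketch below queries a Neo4j 5.x vector index directly with the Python driver. The index name (`chunk_embeddings`), returned property, and credentials are assumptions for the example, not this system's actual schema; hybrid mode would additionally expand from the returned nodes along their graph relationships.
```python
# Hypothetical example: index name, node properties, and credentials are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def vector_only_search(query_embedding: list[float], top_k: int = 5):
    # Neo4j 5.x exposes k-nearest-neighbour lookup on a vector index
    # through the db.index.vector.queryNodes procedure.
    cypher = (
        "CALL db.index.vector.queryNodes($index, $k, $embedding) "
        "YIELD node, score "
        "RETURN node.title AS title, score"
    )
    with driver.session() as session:
        records = session.run(cypher, index="chunk_embeddings",
                              k=top_k, embedding=query_embedding)
        return [(r["title"], r["score"]) for r in records]
```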
### Intelligent Features
- **Semantic Search**: Find documents by meaning, not just keywords
- **Context-Aware Answers**: LLM generates answers using relevant sources
- **Source Attribution**: Every answer includes source nodes
- **Relationship Discovery**: Find connections between documents
## Deployment Modes
Knowledge RAG is available **only in Full mode** because it requires both an LLM and an embedding model.
### Full Mode Requirements
- ✅ Neo4j database with vector index support
- ✅ LLM provider (for answer generation)
- ✅ Embedding provider (for vector search)
### Not Available In
- ❌ Lite mode (no LLM/embeddings)
- ❌ Graph-only mode (no RAG features)
## Quick Start Example
### 1. Add Documents
```python
# Via MCP Tool
{
    "tool": "add_document",
    "input": {
        "content": "Machine learning is a subset of artificial intelligence...",
        "title": "ML Introduction",
        "metadata": {"type": "tutorial", "difficulty": "beginner"}
    }
}
```
### 2. Query Knowledge Base
```python
# Via MCP Tool
{
    "tool": "query_knowledge",
    "input": {
        "question": "What is machine learning?",
        "mode": "hybrid"
    }
}

# Response:
{
    "answer": "Machine learning is a subset of artificial intelligence that...",
    "sources": [
        {"title": "ML Introduction", "content": "...", "score": 0.92}
    ]
}
```
### 3. Search Similar Content
```python
# Via MCP Tool
{
    "tool": "search_similar_nodes",
    "input": {
        "query": "neural networks",
        "top_k": 5
    }
}
```
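Outside of an AI assistant, the same tools can be exercised programmatically with the MCP Python SDK. A minimal sketch, assuming the server is launched by a `code-graph-mcp` command (substitute whatever command starts your MCP server):
```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Assumed launch command; replace with your server's actual entry point.
    params = StdioServerParameters(command="code-graph-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "query_knowledge",
                arguments={"question": "What is machine learning?", "mode": "hybrid"},
            )
            print(result.content)

asyncio.run(main())
```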
## Use Cases
### 1. Documentation Search
Build searchable knowledge bases from your documentation:
- Technical documentation
- API references
- User manuals
- Internal wikis
### 2. Codebase Understanding
Index your codebase for intelligent code search:
- Find implementations by description
- Understand code context
- Discover related components
- Navigate large codebases
### 3. Research Assistant
Create research knowledge bases:
- Academic papers
- Research notes
- Literature reviews
- Citation discovery
### 4. Customer Support
Build intelligent support systems:
- Product documentation
- FAQ databases
- Troubleshooting guides
- Knowledge articles
### 5. Learning Platform
Create interactive learning experiences:
- Course materials
- Tutorial content
- Educational resources
- Study guides
## Configuration
Knowledge RAG is configured via environment variables. Key settings:
```bash
# Required for Knowledge RAG
DEPLOYMENT_MODE=full
ENABLE_KNOWLEDGE_RAG=true
# LLM Configuration
LLM_PROVIDER=ollama # ollama/openai/gemini/openrouter
OLLAMA_MODEL=llama3.2 # used when LLM_PROVIDER=ollama; other providers set their own model variables
# Embedding Configuration
EMBEDDING_PROVIDER=ollama # ollama/openai/gemini/huggingface
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# Processing Settings
CHUNK_SIZE=512 # Tokens per chunk
CHUNK_OVERLAP=50 # Overlap between chunks
TOP_K=5 # Number of results to retrieve
# Timeout Settings
OPERATION_TIMEOUT=120 # Standard operations (seconds)
LARGE_DOCUMENT_TIMEOUT=300 # Large document processing (seconds)
```
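`CHUNK_SIZE` and `CHUNK_OVERLAP` interact: each new chunk advances by `CHUNK_SIZE - CHUNK_OVERLAP` tokens, so consecutive chunks share `CHUNK_OVERLAP` tokens of context. A quick way to estimate how many chunks a document produces (token counts are tokenizer-dependent, so treat this as approximate):
```python
import math

def estimate_chunks(n_tokens: int, size: int = 512, overlap: int = 50) -> int:
    """Approximate chunk count for a document of n_tokens tokens."""
    if n_tokens <= size:
        return 1
    step = size - overlap            # each chunk advances this many tokens
    return math.ceil((n_tokens - overlap) / step)

print(estimate_chunks(5_000))        # 11 chunks with the defaults (step = 462)
```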
## System Requirements
### With Local LLM (Ollama)
- **CPU**: 8+ cores recommended
- **RAM**: 16GB minimum (32GB for large models)
- **GPU**: Optional but highly recommended (8GB+ VRAM)
- **Storage**: 50GB+ for models and data
### With Cloud LLM (OpenAI/Gemini)
- **CPU**: 4+ cores
- **RAM**: 8GB minimum
- **Storage**: 20GB+ for data
- **Network**: Stable internet connection
## Performance Characteristics
### Processing Speed
- **Small documents** (<10KB): Synchronous, <1s
- **Medium documents** (10-50KB): Async queue, 1-10s
- **Large documents** (>50KB): Async queue, 10-60s
- **Directories**: Async queue, varies by size
### Query Performance
- **Vector search**: 50-200ms
- **Hybrid mode**: 100-500ms
- **LLM generation**: 1-5s (local), 0.5-2s (cloud)
### Scaling Considerations
- **Document size**: Up to 10MB per document recommended
- **Total documents**: Scales to millions with proper Neo4j tuning
- **Concurrent queries**: 10-50 depending on hardware
- **Embedding cache**: Speeds up repeated queries
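The embedding cache in the last point is simple to picture: if the same query text arrives twice, its embedding is reused instead of recomputed. A minimal sketch of the idea, not the system's actual implementation:
```python
from functools import lru_cache

def compute_embedding(text: str) -> list[float]:
    # Stand-in for a round-trip to the embedding provider.
    return [float(len(word)) for word in text.split()]

@lru_cache(maxsize=4096)
def cached_embedding(text: str) -> tuple[float, ...]:
    # Tuples are hashable, so identical queries skip the provider call.
    return tuple(compute_embedding(text))
```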
## Integration Points
Knowledge RAG integrates with other system components:
### 1. Task Queue System
- Async processing for large documents
- Background directory ingestion
- Progress tracking
- Error handling and retries
### 2. MCP Tools
- 5 knowledge tools available via MCP
- Integration with Claude Desktop, VS Code
- Real-time query capabilities
### 3. Memory Store
- Suggest memories from Q&A sessions
- Auto-extract knowledge from queries
- Cross-reference with project memories
### 4. Code Graph
- Complement code-specific analysis
- Provide documentation context
- Enhance code understanding
## Limitations and Considerations
### Current Limitations
1. **Text-based only**: Images and binary files not supported
2. **Token limits**: Large documents must fit in LLM context window
3. **Language**: Best results with English (depends on embedding model)
4. **Real-time**: Not suitable for rapidly changing documents
### Best Practices
1. **Document size**: Keep documents focused and well-structured
2. **Chunking**: Adjust chunk size for your content type
3. **Metadata**: Add rich metadata for better filtering
4. **Updates**: Re-process documents when content changes
5. **Query formulation**: Ask specific, well-formed questions
## Security and Privacy
### Data Storage
- Documents stored in Neo4j database
- Embeddings stored as node properties
- No external data transmission (with local LLM)
### Privacy Options
- **Full privacy**: Use Ollama for local processing
- **Cloud processing**: OpenAI/Gemini send data to cloud
- **Hybrid**: Local embeddings + cloud LLM
### Access Control
- No built-in authentication (add via reverse proxy)
- Neo4j database access control
- MCP tool isolation per user
## Cost Considerations
### Local Deployment (Ollama)
- **Hardware**: $0-2000 one-time (GPU recommended)
- **Hosting**: $40-200/month (VPS/cloud)
- **LLM**: $0 (free)
- **Embeddings**: $0 (free)
- **Total ongoing**: $40-200/month
### Cloud Deployment (OpenAI)
- **Hosting**: $10-20/month (small VPS)
- **LLM**: $0.01-0.10 per query (GPT-4o-mini)
- **Embeddings**: $0.0001 per 1K tokens
- **Total**: $50-500/month (usage-dependent)
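As a back-of-envelope check on those numbers, here is one worked scenario; the monthly volumes are assumptions, so substitute your own:
```python
queries_per_month = 10_000
llm_cost_per_query = 0.03           # within the $0.01-0.10 range above
embedding_tokens = 2_000_000        # tokens embedded per month
embedding_rate = 0.0001 / 1_000     # $0.0001 per 1K tokens
hosting = 15.0                      # midpoint of the $10-20/month VPS

llm = queries_per_month * llm_cost_per_query       # $300.00
embeddings = embedding_tokens * embedding_rate     # $0.20
total = hosting + llm + embeddings
print(f"~${total:,.2f}/month")                     # ~$315.20/month
```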
### Hybrid Deployment
- **Hosting**: $10-20/month
- **LLM**: $0.01-0.10 per query
- **Embeddings**: $0 (local Ollama)
- **Total**: $30-300/month
## Next Steps
- **[Document Processing Guide](documents.md)**: Learn how to add and manage documents
- **[Query Guide](query.md)**: Master intelligent querying techniques
- **[MCP Integration](../mcp/overview.md)**: Connect to AI assistants
- **[Full Mode Deployment](../../deployment/full.md)**: Deploy with all features
## Additional Resources
- **Examples**: See `examples/` directory for code samples
- **API Reference**: HTTP REST API documentation
- **MCP Tools**: Tool definitions and schemas
- **Configuration**: Complete `.env` settings guide