# Complete Local-First AI Development Platform
**A fully local architecture: zero external dependencies, SOTA models end to end**
---
## 🧠 **Complete Local Stack**
### **Embeddings (Code Understanding)**
- **ONNX**: Fast, general-purpose embeddings
- **Ollama**: Code-specialized embeddings with nomic-embed-code
### **Analysis (Semantic Intelligence)**
- **Qwen2.5-Coder-14B-128K**: SOTA code analysis with 128K context
### **Infrastructure**
- **Local Models**: All processing on your hardware
- **Privacy First**: Code never leaves your machine
- **Performance**: Optimized for a 32 GB MacBook Pro
---
## 🚀 **Setup Commands**
### **1. Install Models**
```bash
# Install SOTA code analysis model
ollama pull hf.co/unsloth/Qwen2.5-Coder-14B-Instruct-128K-GGUF:Q4_K_M
# Install SOTA code embedding model
ollama pull hf.co/nomic-ai/nomic-embed-code-GGUF:Q4_K_M
```
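Both pulls are large (several GB at Q4_K_M quantization), so verify they completed before building. `ollama list` shows everything installed locally:

```bash
# Confirm both models are available locally
ollama list | grep -Ei "qwen2.5-coder|nomic-embed-code"
```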
### **2. Build with All Features**
```bash
# Build complete local stack
MACOSX_DEPLOYMENT_TARGET=11.0 cargo build -p codegraph-mcp \
--features "qwen-integration,faiss,embeddings,embeddings-ollama,codegraph-vector/onnx"
```
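A quick sanity check that the build produced the CLI binary used throughout the rest of this guide:

```bash
# Confirm the debug binary exists and is executable
test -x ./target/debug/codegraph && echo "codegraph built OK"
```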
---
## ⚡ **Performance Comparison**
### **ONNX Embeddings (Speed Optimized)**
```bash
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
# Index with speed-optimized embeddings
./target/debug/codegraph index . --force --languages typescript
# Expected: Fast indexing, good general semantic search
```
### **Ollama Embeddings (Code-Specialized)**
```bash
export CODEGRAPH_EMBEDDING_PROVIDER=ollama
export CODEGRAPH_EMBEDDING_MODEL=nomic-embed-code
# Index with code-specialized embeddings
./target/debug/codegraph index . --force --languages typescript
# Expected: Superior code understanding, better search relevance
```
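For concrete numbers on your own hardware, time both runs back to back. A minimal sketch reusing the exact commands above (re-indexing with `--force` each time so the comparison is fair):

```bash
# Time speed-optimized ONNX indexing
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
time ./target/debug/codegraph index . --force --languages typescript

# Time code-specialized Ollama indexing on the same tree
export CODEGRAPH_EMBEDDING_PROVIDER=ollama
export CODEGRAPH_EMBEDDING_MODEL=nomic-embed-code
time ./target/debug/codegraph index . --force --languages typescript
```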
---
## 🎯 **Environment Variables**
### **Embedding Provider Selection**
```bash
# Choose embedding provider
export CODEGRAPH_EMBEDDING_PROVIDER=ollama # or "onnx" or "local" or "openai"
# Ollama-specific settings
export CODEGRAPH_EMBEDDING_MODEL=nomic-embed-code
export CODEGRAPH_OLLAMA_URL=http://localhost:11434
# ONNX-specific settings (alternative)
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
```
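If you flip between providers often, a small shell helper keeps the paired variables consistent. `use_embeddings` is a hypothetical convenience function, not part of CodeGraph; it only sets the variables documented above:

```bash
# Switch embedding provider with one command: use_embeddings onnx|ollama
use_embeddings() {
  case "$1" in
    onnx)
      export CODEGRAPH_EMBEDDING_PROVIDER=onnx
      export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
      ;;
    ollama)
      export CODEGRAPH_EMBEDDING_PROVIDER=ollama
      export CODEGRAPH_EMBEDDING_MODEL=nomic-embed-code
      export CODEGRAPH_OLLAMA_URL=http://localhost:11434
      ;;
    *)
      echo "usage: use_embeddings onnx|ollama" >&2
      return 1
      ;;
  esac
}
```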
### **Analysis Model Settings**
```bash
# Qwen2.5-Coder configuration
export CODEGRAPH_MODEL="hf.co/unsloth/Qwen2.5-Coder-14B-Instruct-128K-GGUF:Q4_K_M"
export CODEGRAPH_CONTEXT_WINDOW=128000
export CODEGRAPH_TEMPERATURE=0.1
```
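Before indexing, it's worth confirming Ollama actually has the analysis model pulled. Ollama's local API exposes `GET /api/tags`, which lists installed models:

```bash
# List Ollama's installed models and print any Qwen2.5-Coder tags
curl -s "${CODEGRAPH_OLLAMA_URL:-http://localhost:11434}/api/tags" | grep -io "qwen2.5-coder[^\"]*"
```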
---
## 📊 **Expected Performance**
### **ONNX Provider**
```yaml
Strengths:
  - Fast indexing (thousands of embeddings per minute)
  - Low memory usage (~2 GB additional)
- General-purpose semantic understanding
Use Case:
- Quick prototyping and testing
- Resource-constrained environments
- General codebase exploration
```
### **Ollama Provider (nomic-embed-code)**
```yaml
Strengths:
- Code-specialized understanding
- Better semantic search relevance
- Superior code pattern recognition
- Designed specifically for source code
Use Case:
- Production development environments
- Critical code understanding tasks
- Maximum search quality and relevance
```
---
## 🔬 **Quality Comparison Test**
### **Test Semantic Search Quality**
```bash
# Test with authentication-related query
echo '{"jsonrpc":"2.0","id":1,"method":"codegraph.enhanced_search","params":{"query":"user authentication validation pattern"}}' | \
CODEGRAPH_EMBEDDING_PROVIDER=ollama ./target/debug/codegraph start stdio
# Compare results with different providers
echo '{"jsonrpc":"2.0","id":1,"method":"codegraph.enhanced_search","params":{"query":"user authentication validation pattern"}}' | \
CODEGRAPH_EMBEDDING_PROVIDER=onnx ./target/debug/codegraph start stdio
```
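To make the comparison repeatable, capture each provider's response and diff the results. A minimal sketch reusing the same request (the output paths are arbitrary):

```bash
# Run the same enhanced_search request against both providers and diff the output
REQ='{"jsonrpc":"2.0","id":1,"method":"codegraph.enhanced_search","params":{"query":"user authentication validation pattern"}}'

echo "$REQ" | CODEGRAPH_EMBEDDING_PROVIDER=ollama ./target/debug/codegraph start stdio > /tmp/results-ollama.json
echo "$REQ" | CODEGRAPH_EMBEDDING_PROVIDER=onnx   ./target/debug/codegraph start stdio > /tmp/results-onnx.json

diff /tmp/results-ollama.json /tmp/results-onnx.json
```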
### **Expected Quality Differences**
- **ONNX**: Good general matches, may miss code-specific nuances
- **Ollama**: Superior code understanding, better pattern recognition
---
## 🎉 **Revolutionary Achievement**
### **Complete Local-First Platform**
- ✅ **Zero External Dependencies**: Everything runs locally
- ✅ **SOTA Models**: Best-in-class for both embeddings and analysis
- ✅ **Choice of Optimization**: Speed vs. code-specialized quality
- ✅ **Privacy Preserving**: Code never leaves your machine
- ✅ **Cost Efficient**: No API costs for any operations
### **Competitive Advantages**
- **Complete Local Stack**: SOTA embeddings and analysis with no hosted services to integrate
- **Dual Optimization**: Speed when needed, code-specialized quality when critical
- **Future Proof**: Local models keep improving with no vendor lock-in
- **Platform Strategy**: Enhance any MCP-compatible LLM with locally computed code intelligence
---
## 🚀 **Usage Recommendations**
### **For Development/Testing**
```bash
export CODEGRAPH_EMBEDDING_PROVIDER=onnx # Fast iteration
```
### **For Production/Critical Work**
```bash
export CODEGRAPH_EMBEDDING_PROVIDER=ollama # Best quality
```
### **For Hybrid Approach**
Switch between providers based on task:
- Quick exploration: ONNX
- Deep analysis: Ollama
- Both providers feed the same Qwen2.5-Coder analysis model, so semantic intelligence stays consistent either way (see the sketch below)
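Because environment prefixes apply only to a single command, you can also pick the provider per invocation without touching your shell's global state:

```bash
# Quick exploration with ONNX, deep analysis with Ollama; no global exports needed
CODEGRAPH_EMBEDDING_PROVIDER=onnx   ./target/debug/codegraph index . --force --languages typescript
CODEGRAPH_EMBEDDING_PROVIDER=ollama ./target/debug/codegraph index . --force --languages typescript
```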
**You now have a complete local-first AI development platform: SOTA embeddings, SOTA analysis, and zero API costs.** 🎉