# LM Studio Setup Guide for CodeGraph
## Overview
CodeGraph now defaults to using **LM Studio** for superior MLX support and Flash Attention 2 performance on macOS. This guide covers setup for the recommended configuration:
- **Embeddings**: Jina Code Embeddings 1.5B (1536 dimensions)
- **LLM**: DeepSeek Coder v2 Lite Instruct Q4_K_M
- **Benefits**: MLX acceleration, Flash Attention 2, optimized for code understanding
## Why LM Studio?
**LM Studio Advantages:**
- ✅ Native MLX support (10x faster on Apple Silicon)
- ✅ Flash Attention 2 (2-3x memory efficiency + speedup)
- ✅ Superior quantization options (Q4_K_M over legacy Q4_0)
- ✅ Better model loading/management
- ✅ OpenAI-compatible API (easy integration)
**vs Ollama:**
- Ollama: Good for general use, simpler setup
- LM Studio: Better for production, higher performance on macOS
## Installation
### 1. Install LM Studio
Download from [lmstudio.ai](https://lmstudio.ai/)
```bash
# macOS (Apple Silicon recommended)
# Download the .dmg and install
open ~/Downloads/LMStudio-*.dmg
```
### 2. Download Models
**Option A: Via LM Studio UI (Recommended)**
1. Open LM Studio
2. Go to "Discover" tab
3. Search and download:
- **Embeddings**: `jinaai/jina-code-embeddings-1.5b` (1.5B parameters, 1536-dim)
- **LLM**: `lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF`
- Select: `DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf`
**Option B: Manual Download**
```bash
# Download embeddings model
huggingface-cli download jinaai/jina-code-embeddings-1.5b \
  --local-dir ~/.cache/lm-studio/models/jinaai/jina-code-embeddings-1.5b
# Download LLM model
huggingface-cli download lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF \
  DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf \
  --local-dir ~/.cache/lm-studio/models/lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF
```
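To confirm the downloads landed where LM Studio expects them, list the target directories. The paths below simply mirror the `--local-dir` values used above; adjust them if your LM Studio models folder lives elsewhere.
```bash
# List the downloaded model directories (paths match the --local-dir values above)
ls -lh ~/.cache/lm-studio/models/jinaai/jina-code-embeddings-1.5b
ls -lh ~/.cache/lm-studio/models/lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF
```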
### 3. Start LM Studio Server
**Load Embedding Model:**
1. In LM Studio, go to "Local Server" tab
2. Load model: `jinaai/jina-code-embeddings-1.5b`
3. Start server on port `1234` (default)
4. Verify: `http://localhost:1234/v1/embeddings`
**The same server handles both embedding and LLM requests on port 1234.**
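Before configuring CodeGraph, it is worth a quick smoke test against the embeddings endpoint from step 4. The request below follows the standard OpenAI-compatible schema; the `model` value is an assumption based on the repository name and may differ from the id LM Studio registers, so check `curl http://localhost:1234/v1/models` if the request is rejected.
```bash
# Smoke-test the OpenAI-compatible embeddings endpoint
# (the model id is assumed from the repo name; confirm it via /v1/models)
curl http://localhost:1234/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "model": "jinaai/jina-code-embeddings-1.5b",
        "input": "fn main() { println!(\"hello\"); }"
      }'
```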
## Configuration
### Quick Setup (Zero Config)
CodeGraph picks up the LM Studio defaults automatically:
```bash
# Just start indexing - it will use LM Studio defaults
codegraph index /path/to/project
```
### Custom Configuration
#### Option 1: Environment Variables
Create `.env` file:
```bash
# Embedding Configuration
CODEGRAPH_EMBEDDING_PROVIDER=lmstudio
CODEGRAPH_EMBEDDING_MODEL=jinaai/jina-code-embeddings-1.5b
CODEGRAPH_LMSTUDIO_URL=http://localhost:1234
CODEGRAPH_EMBEDDING_DIMENSION=1536
# LLM Configuration (optional, for insights)
CODEGRAPH_LLM_PROVIDER=lmstudio
CODEGRAPH_MODEL=lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF
CODEGRAPH_CONTEXT_WINDOW=32000
CODEGRAPH_TEMPERATURE=0.1
# Clean TUI output during indexing
RUST_LOG=warn
```
#### Option 2: Config File
Create `.codegraph.toml`:
```toml
[embedding]
provider = "lmstudio"
model = "jinaai/jina-code-embeddings-1.5b"
lmstudio_url = "http://localhost:1234"
dimension = 1536
batch_size = 64
[llm]
enabled = false # Set true for local insights
provider = "lmstudio"
model = "lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF"
lmstudio_url = "http://localhost:1234"
context_window = 32000
temperature = 0.1
insights_mode = "context-only"
[logging]
level = "warn" # Clean TUI output
```
## Usage
### Index a Codebase
```bash
# Basic indexing (uses LM Studio defaults)
codegraph index /path/to/project
# With clean output (recommended)
RUST_LOG=warn codegraph index /path/to/project
# With custom batch size for GPU optimization
codegraph index /path/to/project --batch-size 128
```
**Expected Output:**
```
⠁ Indexing project: /path/to/project
┌─────────────────────────────────────────────────────┐
│ 📦 Collecting files │
│ [████████████████████] 1250/1250 (100%) │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ 🌳 Parsing AST │
│ [████████████████████] 1250/1250 (100%) │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ 💾 Generating embeddings │
│ [████████████████████] 5432/5432 (100%) │
│ ⚡ 127.3 embeddings/sec │
└─────────────────────────────────────────────────────┘
🎉 INDEXING COMPLETE!
📊 Performance Summary
┌─────────────────────────────────────────────────┐
│ ⏱️ Total time: 42.7s │
│ ⚡ Throughput: 29.28 files/sec │
├─────────────────────────────────────────────────┤
│ 📄 Files: 1250 indexed, 12 skipped │
│ 📝 Lines: 87432 processed │
│ 💾 Embeddings: 5432 generated │
└─────────────────────────────────────────────────┘
⚙️ Configuration Summary
Workers: 8 | Batch Size: 64 | Languages: rust, typescript
✅ Excellent embedding success rate (>90%)
```
### Start MCP Server
```bash
# Start MCP server for Claude Desktop
cd /path/to/project
codegraph start stdio
```
### Search Codebase
```bash
# Semantic search with Jina embeddings
codegraph search "authentication middleware" --limit 10
# Search specific languages
codegraph search "database connection pool" --langs rust,go
# Search specific paths
codegraph search "API routes" --paths src/api,lib/routes
```
## Performance Optimization
### Embedding Performance
**Jina Code Embeddings 1.5B Performance:**
- **M1/M2/M3 Max**: ~100-150 embeddings/sec (MLX)
- **M1/M2/M3 Pro**: ~60-90 embeddings/sec (MLX)
- **M1/M2/M3 Base**: ~30-50 embeddings/sec (MLX)
- **Intel Mac**: ~15-25 embeddings/sec (CPU)
**Optimization Tips:**
```bash
# Increase batch size for larger GPU/Unified Memory
codegraph index . --batch-size 128 # For 32GB+ unified memory
codegraph index . --batch-size 64 # For 16-24GB unified memory (default)
codegraph index . --batch-size 32 # For 8-16GB unified memory
```
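If you are unsure which setting suits your machine, a rough sketch like the one below re-indexes the same project at each batch size so you can compare the reported embeddings/sec. It only uses the flags shown above and assumes repeatedly re-indexing the same path is acceptable on your setup.
```bash
# Rough batch-size comparison: re-index the same project at each setting
# and compare the embeddings/sec reported in the TUI summary
export RUST_LOG=warn
for bs in 32 64 128; do
  echo "--- batch size: $bs ---"
  time codegraph index /path/to/project --batch-size "$bs"
done
```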
### LLM Performance
**DeepSeek Coder v2 Lite Instruct Q4_K_M:**
- **Context Window**: 32,768 tokens as configured here (the model supports up to 128K)
- **Parameters**: 2.4B active / ~16B total (MoE); the Q4_K_M file is roughly 10GB
- **Speed**: ~40-60 tokens/sec (M2 Max)
- **Quality**: Comparable to 7B models due to architecture
**Memory Requirements:**
- **Minimum**: 16GB unified memory (model + a short context)
- **Recommended**: 24GB+ (for comfortable multi-tasking)
- **Optimal**: 32GB+ (for large context windows)
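As a convenience, the installed unified memory can be read with `sysctl` on macOS and mapped onto the batch-size guidance above. This is just a sketch of that mapping; tune the thresholds to your own workload.
```bash
# Pick a batch size from the installed unified memory (macOS only)
mem_gb=$(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))
if [ "$mem_gb" -ge 32 ]; then
  bs=128
elif [ "$mem_gb" -ge 16 ]; then
  bs=64
else
  bs=32
fi
echo "Detected ${mem_gb}GB unified memory -> using --batch-size ${bs}"
codegraph index . --batch-size "$bs"
```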
## Troubleshooting
### Issue: "Connection refused" error
**Solution:**
```bash
# Verify LM Studio server is running
curl http://localhost:1234/v1/models
# If not running, start LM Studio and:
# 1. Go to "Local Server" tab
# 2. Load embedding model
# 3. Click "Start Server"
```
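If you script indexing runs, a small wait loop against the same endpoint avoids racing the server startup:
```bash
# Wait for the LM Studio server to come up before indexing
until curl -sf http://localhost:1234/v1/models > /dev/null; do
  echo "Waiting for LM Studio on port 1234..."
  sleep 2
done
echo "LM Studio server is up"
```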
### Issue: Slow embedding generation
**Solution:**
```bash
# 1. Verify MLX is being used (LM Studio should show "MLX" in UI)
# 2. Reduce batch size for lower memory systems
RUST_LOG=info codegraph index . --batch-size 32
# 3. Close other applications to free up unified memory
```
### Issue: "Model not found" error
**Solution:**
```bash
# Download models via LM Studio UI or CLI
# Then verify in LM Studio → Models tab
# Or specify exact model path:
export CODEGRAPH_EMBEDDING_MODEL="jinaai/jina-code-embeddings-1.5b"
codegraph index .
```
### Issue: Too many logs cluttering output
**Solution:**
```bash
# Set RUST_LOG=warn for clean TUI output
export RUST_LOG=warn
codegraph index .
# Or add to .env file:
echo "RUST_LOG=warn" >> .env
```
## Comparison: LM Studio vs Ollama
| Feature | LM Studio | Ollama |
|---------|-----------|---------|
| **MLX Support** | Native (10x faster) | Basic |
| **Flash Attention 2** | Yes (2-3x faster) | No |
| **Quantization** | Q4_K_M (better) | Q4 (standard) |
| **Setup** | GUI + API | CLI only |
| **Model Management** | Excellent UI | Good CLI |
| **API** | OpenAI-compatible | Custom |
| **Performance (M2 Max)** | ~120 emb/sec | ~60 emb/sec |
| **Memory Efficiency** | Excellent | Good |
| **Best For** | Production, macOS | Development, Linux |
## Alternative: Ollama Setup
If you prefer Ollama:
```bash
# Install Ollama
brew install ollama
# Pull models
ollama pull nomic-embed-code # Embeddings
ollama pull qwen2.5-coder:14b # LLM
# Configure CodeGraph
export CODEGRAPH_EMBEDDING_PROVIDER=ollama
export CODEGRAPH_EMBEDDING_MODEL=nomic-embed-code
export CODEGRAPH_OLLAMA_URL=http://localhost:11434
export CODEGRAPH_EMBEDDING_DIMENSION=384 # nomic-embed-code uses 384-dim
# Index
codegraph index /path/to/project
```
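To confirm Ollama is reachable before indexing, query its local API and list the pulled models:
```bash
# Confirm Ollama is running and the models are pulled
curl -s http://localhost:11434/api/tags
ollama list
```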
## Best Practices
### 1. Clean TUI Output
```bash
# Always use RUST_LOG=warn during indexing
export RUST_LOG=warn
codegraph index .
```
### 2. Batch Size Tuning
```bash
# Start with default (64), adjust based on memory
codegraph index . --batch-size 64
# Monitor embedding speed in output:
# ⚡ 127.3 embeddings/sec (good)
# ⚡ 45.2 embeddings/sec (increase batch size if memory allows)
# ⚡ OOM error (decrease batch size)
```
### 3. Multi-Project Workflow
```bash
# Each project gets isolated .codegraph/ storage
cd ~/projects/api && codegraph index .
cd ~/projects/frontend && codegraph index .
cd ~/projects/mobile && codegraph index .
# Serve different projects in different terminals
cd ~/projects/api && codegraph start stdio # Terminal 1
cd ~/projects/frontend && codegraph start stdio # Terminal 2
```
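When you re-index regularly, the same workflow can be scripted. This sketch just loops the commands above over each project directory:
```bash
# Re-index several projects in one pass (same commands as above, looped)
for proj in ~/projects/api ~/projects/frontend ~/projects/mobile; do
  (cd "$proj" && RUST_LOG=warn codegraph index .)
done
```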
### 4. Claude Desktop Integration
```json
{
"mcpServers": {
"codegraph-api": {
"command": "codegraph",
"args": ["start", "stdio"],
"cwd": "/Users/you/projects/api",
"env": {
"RUST_LOG": "warn",
"CODEGRAPH_EMBEDDING_PROVIDER": "lmstudio",
"CODEGRAPH_LMSTUDIO_URL": "http://localhost:1234"
}
}
}
}
```
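Before restarting Claude Desktop, it helps to validate the edited JSON. The path below is the usual config location on macOS; adjust it if your installation differs.
```bash
# Validate the Claude Desktop config JSON (default macOS location)
python3 -m json.tool "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
```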
## Model Recommendations
### Embeddings
| Model | Dimensions | Size | Quality | Speed |
|-------|-----------|------|---------|-------|
| **jina-code-embeddings-1.5b** | 1536 | 1.5GB | Excellent | Fast |
| jinaai/jina-embeddings-v2-base-code | 768 | 500MB | Very Good | Very Fast |
| nomic-embed-code (Ollama) | 384 | 300MB | Good | Very Fast |
| all-MiniLM-L6-v2 (ONNX) | 384 | 90MB | Good | Ultra Fast |
**Recommendation**: jina-code-embeddings-1.5b (best quality/speed tradeoff)
### LLMs
| Model | Context | Size | Quality | Speed |
|-------|---------|------|---------|-------|
| **DeepSeek Coder v2 Lite Q4_K_M** | 128K | ~10GB | Excellent | Fast |
| Qwen2.5-Coder 7B Q4 | 128K | 4GB | Excellent | Medium |
| CodeLlama 13B Q4 | 16K | 7GB | Very Good | Slow |
| DeepSeek Coder 6.7B Q4 | 16K | 4GB | Very Good | Medium |
**Recommendation**: DeepSeek Coder v2 Lite (best for most use cases)
## Support
For issues or questions:
- GitHub Issues: [codegraph-rust/issues](https://github.com/Jakedismo/codegraph-rust/issues)
- Documentation: [docs/](../docs/)
- LM Studio Docs: [lmstudio.ai/docs](https://lmstudio.ai/docs)