# Fine-Tuning Pipeline for ZigNet

## Overview

This document outlines the process for fine-tuning a code LLM specifically for Zig language expertise to power ZigNet's `get_zig_docs` and `suggest_fix` tools.

## Model Selection

Based on benchmark results from `scripts/test-results/`:

| Model                | Pass Rate  | Avg Time | Recommendation       |
| -------------------- | ---------- | -------- | -------------------- |
| **Qwen2.5-Coder-7B** | 6/6 (100%) | 29.58s   | ✅ **BEST CHOICE**    |
| DeepSeek-Coder-6.7B  | 6/6 (100%) | 27.86s   | ✅ Good alternative   |
| CodeLlama-7B         | 6/6 (100%) | ~30s     | ✅ Viable option      |
| Llama3.2-3B          | 6/6 (100%) | ~28s     | ⚠️ Smaller, faster    |
| Mistral-7B           | Variable   | ~30s     | ⚠️ Less code-focused  |

**Selected**: **Qwen2.5-Coder-7B**

- Excellent Zig understanding out-of-the-box
- Strong code generation capabilities
- Good balance of size/performance
- Active development and support

---

## Phase 1: Data Collection

### 1.1 Run Documentation Scraper

```bash
cd /home/fulgidus/Projects/zignet
node scripts/scrape-zig-docs.js
```

**Output**:

- `data/zig-docs/zig-0.15.0-dataset.json`
- `data/zig-docs/zig-0.14.0-dataset.json`
- `data/zig-docs/zig-0.13.0-dataset.json`
- `data/zig-docs/zig-combined-dataset.json`
- `data/zig-docs/dataset-stats.json`

### 1.2 Dataset Format

```json
{
  "instruction": "Write Zig 0.15 code for: Error Handling",
  "context": "Zig uses error unions for explicit error handling...",
  "response": "const MyError = error{InvalidInput};\n\nfn divide(a: i32, b: i32) !f32 {\n    if (b == 0) return error.DivisionByZero;\n    return @as(f32, @floatFromInt(a)) / @as(f32, @floatFromInt(b));\n}",
  "metadata": {
    "version": "0.15.0",
    "topic": "Error Handling",
    "difficulty": "medium"
  }
}
```

### 1.3 Dataset Augmentation

Create additional examples from:

1. **Zig Standard Library**: Extract common patterns
2. **Community projects**: Parse popular Zig repos
3. **Synthetic examples**: Generate variants of existing code
4. **Error correction pairs**: Common mistakes → fixes

**Target**: 10,000+ high-quality instruction-response pairs

---

## Phase 2: Dataset Preparation

### 2.1 Clean and Validate

```bash
# Create cleaning script
node scripts/clean-dataset.js
```

Tasks (see the sketch after this list):

- Remove duplicates
- Validate Zig syntax (use our parser!)
- Filter out deprecated syntax
- Balance difficulty levels
- Ensure version coverage
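The cleaning pass itself lives in `scripts/clean-dataset.js`; purely as an illustration of the deduplication and syntax-validation steps, a minimal Python sketch could look like the following. It assumes the combined dataset file is a JSON array of entries in the 1.2 format, uses `zig ast-check` as a stand-in for the project's own parser, and writes to an arbitrary output path.

```python
# Illustrative only: the real cleaning pass is scripts/clean-dataset.js.
# Assumes data/zig-docs/zig-combined-dataset.json is a JSON array of entries in
# the format from section 1.2; `zig ast-check` stands in for the project's parser.
import json
import os
import subprocess
import tempfile

def zig_syntax_ok(code: str) -> bool:
    """Return True if `zig ast-check` accepts the snippet."""
    with tempfile.NamedTemporaryFile("w", suffix=".zig", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        return subprocess.run(["zig", "ast-check", path], capture_output=True).returncode == 0
    finally:
        os.unlink(path)

with open("data/zig-docs/zig-combined-dataset.json") as f:
    examples = json.load(f)

seen, clean = set(), []
for ex in examples:
    key = ex["response"].strip()
    if key in seen:                        # drop exact duplicates
        continue
    seen.add(key)
    if not zig_syntax_ok(ex["response"]):  # drop snippets that fail to parse
        continue
    clean.append(ex)

with open("data/zig-docs/zig-cleaned-dataset.json", "w") as f:  # placeholder output path
    json.dump(clean, f, indent=2)
```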
### 2.2 Split Dataset

```
Training:   70% (7,000 examples)
Validation: 15% (1,500 examples)
Test:       15% (1,500 examples)
```

### 2.3 Convert to Training Format

For **LoRA fine-tuning** (recommended for efficiency):

```json
{
  "text": "<|system|>\nYou are a Zig programming expert for version 0.15.0.\n<|user|>\nWrite Zig code for: Error Handling\nContext: Zig uses error unions for explicit error handling...\n<|assistant|>\nconst MyError = error{InvalidInput};\n\nfn divide(a: i32, b: i32) !f32 {\n    if (b == 0) return error.DivisionByZero;\n    return @as(f32, @floatFromInt(a)) / @as(f32, @floatFromInt(b));\n}"
}
```

---

## Phase 3: Fine-Tuning

### 3.1 Environment Setup

```bash
# Install dependencies
pip install torch transformers peft datasets bitsandbytes accelerate

# Or use Google Colab with GPU
```

### 3.2 Training Configuration

**Method**: QLoRA (Quantized Low-Rank Adaptation)

- Memory efficient (fits on a 24GB GPU)
- Fast training (a few hours)
- Preserves base model quality

**Hyperparameters**:

```python
training_args = {
    "model_name": "Qwen/Qwen2.5-Coder-7B-Instruct",
    "lora_r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "learning_rate": 2e-4,
    "num_epochs": 3,
    "batch_size": 4,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 100,
    "max_seq_length": 2048,
}
```

### 3.3 Training Script

```python
# scripts/train-zig-model.py
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

# Load base model in 4-bit (QLoRA)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    load_in_4bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

# Load dataset (the "text" format from 2.3), hold out a validation split, tokenize
dataset = load_dataset("json", data_files="data/zig-docs/zig-combined-dataset.json")
dataset = dataset["train"].train_test_split(test_size=0.15)
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
)

# Map the hyperparameters from 3.2 onto TrainingArguments
training_args = TrainingArguments(
    output_dir="checkpoints/zignet-qwen-7b",
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=100,
)

# Train
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

### 3.4 Training Monitoring

- Loss curve: should decrease smoothly
- Validation metrics: monitor perplexity
- Sample generation: test on held-out examples
- Early stopping: if validation loss plateaus

---

## Phase 4: Model Validation

### 4.1 Automated Tests

Run the benchmark suite:

```bash
node scripts/test-model-advanced.js fulgidus/zignet-qwen-7b
```

**Success Criteria**:

- Pass rate: ≥ 95% (better than base model)
- Response time: < 30s average
- Zig syntax validity: 100% (see the sketch at the end of this phase)
- Version-specific accuracy: > 90%

### 4.2 Manual Review

Test edge cases:

- Comptime evaluation
- Generic functions
- Error unions
- Async/await
- Build system features
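Beyond the benchmark suite, the syntax-validity criterion can be spot-checked directly. The sketch below is illustrative only: the test-split path is a placeholder, the generation helper is a plain `transformers` loop rather than project code, and `zig_syntax_ok()` is the check from the Phase 2 cleaning sketch.

```python
# Illustrative spot check for the "Zig syntax validity" criterion in 4.1.
# Placeholder assumptions: test-split path, generation helper, and the
# zig_syntax_ok() function defined in the Phase 2 cleaning sketch.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("fulgidus/zignet-qwen-7b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("fulgidus/zignet-qwen-7b")

def generate_response(instruction: str, context: str) -> str:
    """Generate a completion for one test example (simple greedy decoding)."""
    prompt = f"{instruction}\nContext: {context}"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

with open("data/zig-docs/test-split.json") as f:  # placeholder path for the held-out split
    test_set = json.load(f)

valid = sum(zig_syntax_ok(generate_response(ex["instruction"], ex["context"])) for ex in test_set)
print(f"Zig syntax validity: {valid / len(test_set):.1%}")
```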
---

## Phase 5: HuggingFace Upload

### 5.1 Prepare Model Card

````markdown
---
language:
  - zig
license: wtfpl
tags:
  - code
  - zig
  - programming
  - fine-tuned
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
---

# ZigNet Qwen 7B - Zig Language Expert

Fine-tuned version of Qwen2.5-Coder-7B specifically for the Zig programming language (versions 0.13-0.15).

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("fulgidus/zignet-qwen-7b")
tokenizer = AutoTokenizer.from_pretrained("fulgidus/zignet-qwen-7b")

prompt = "Write a Zig function that handles errors"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0]))
```

## Training Data

- 10,000+ Zig code examples from official documentation
- Covers Zig 0.13, 0.14, 0.15
- Balanced across difficulty levels and topics

## Performance

- Benchmark: 100% pass rate on Zig-specific tasks
- Syntax accuracy: 100%
- Average response time: 25s
````

### 5.2 Upload

```bash
huggingface-cli login
huggingface-cli repo create zignet-qwen-7b --type model
python scripts/upload-model.py
```

---

## Phase 6: Integration with ZigNet

### 6.1 Update node-llama-cpp Integration

```typescript
// src/llm/zig-expert.ts
// Sketch against the node-llama-cpp v3 API
import { getLlama, LlamaChatSession, type LlamaModel, type LlamaContext } from 'node-llama-cpp';

export class ZigExpertLLM {
    private model: LlamaModel;
    private context: LlamaContext;
    private session: LlamaChatSession;

    async init() {
        const llama = await getLlama();

        // Model downloaded from HuggingFace and converted to GGUF (see 6.2)
        this.model = await llama.loadModel({
            modelPath: 'models/zignet-qwen-7b.gguf',
            gpuLayers: 33, // Adjust based on GPU
        });
        this.context = await this.model.createContext();
        this.session = new LlamaChatSession({ contextSequence: this.context.getSequence() });
    }

    async getDocs(topic: string): Promise<string> {
        return await this.session.prompt(`Explain ${topic} in Zig 0.15`);
    }

    async suggestFix(error: string, code: string): Promise<string> {
        return await this.session.prompt(`Fix this Zig error:\n${error}\n\nCode:\n${code}`);
    }
}
```

### 6.2 Model Format Conversion

Convert to GGUF for node-llama-cpp:

```bash
python scripts/convert-to-gguf.py \
  --model fulgidus/zignet-qwen-7b \
  --output models/zignet-qwen-7b.gguf \
  --quantize q4_k_m
```

---

## Maintenance & Updates

### When a New Zig Version Is Released

1. **Update scraper** with the new version URL
2. **Run scraper**: `node scripts/scrape-zig-docs.js`
3. **Merge datasets**: Combine with existing data
4. **Incremental training**: Fine-tune on the new data only (faster)
5. **Validate**: Run the benchmark suite
6. **Upload**: Push to HuggingFace
7. **Update ZigNet**: Download the new model

### Monitoring Model Quality

- Track user feedback in MCP conversations
- Log failed generations
- Periodic re-evaluation on the test set
- Version-specific accuracy tracking

---

## Cost & Resources

### Training Costs (Estimated)

- **Google Colab Pro**: ~$10/month (A100 GPU)
- **RunPod/Lambda**: ~$1/hour (A100 80GB)
- **Training time**: 3-6 hours
- **Total cost**: < $20 per iteration

### Storage

- Base model: ~15GB
- Fine-tuned model: +500MB (LoRA adapters)
- GGUF quantized: ~4GB
- Dataset: ~50MB

### Inference

- GPU: RTX 3060 (12GB) or better
- CPU: possible but slow (2-3 minutes/response)
- Recommended: M1/M2 Mac or a modern GPU

---

## Alternative: Distillation (Future)

If the fine-tuned model is too large:

1. Use the fine-tuned Qwen-7B as the **teacher**
2. Distill to **Phi-2.7B** (student)
3. Result: faster, smaller model (3GB vs 15GB)
4. Trade-off: slight quality decrease

---

## Resources

- [Qwen2.5-Coder](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Zig Documentation](https://ziglang.org/documentation/)
- [node-llama-cpp](https://github.com/withcatai/node-llama-cpp)
