# Fine-Tuning Pipeline for ZigNet
## Overview
This document outlines the process for fine-tuning a code LLM specifically for Zig language expertise to power ZigNet's `get_zig_docs` and `suggest_fix` tools.
## Model Selection
Based on benchmark results from `scripts/test-results/`:
| Model | Pass Rate | Avg Time | Recommendation |
| -------------------- | ---------- | -------- | ------------------- |
| **Qwen2.5-Coder-7B** | 6/6 (100%) | 29.58s | ✅ **BEST CHOICE** |
| DeepSeek-Coder-6.7B | 6/6 (100%) | 27.86s | ✅ Good alternative |
| CodeLlama-7B | 6/6 (100%) | ~30s | ✅ Viable option |
| Llama3.2-3B | 6/6 (100%) | ~28s | ⚠️ Smaller, faster |
| Mistral-7B | Variable | ~30s | ⚠️ Less code-focused |
**Selected**: **Qwen2.5-Coder-7B**
- Excellent Zig understanding out-of-the-box
- Strong code generation capabilities
- Good balance of size/performance
- Active development and support
---
## Phase 1: Data Collection
### 1.1 Run Documentation Scraper
```bash
cd /home/fulgidus/Projects/zignet
node scripts/scrape-zig-docs.js
```
**Output**:
- `data/zig-docs/zig-0.15.0-dataset.json`
- `data/zig-docs/zig-0.14.0-dataset.json`
- `data/zig-docs/zig-0.13.0-dataset.json`
- `data/zig-docs/zig-combined-dataset.json`
- `data/zig-docs/dataset-stats.json`
### 1.2 Dataset Format
```json
{
  "instruction": "Write Zig 0.15 code for: Error Handling",
  "context": "Zig uses error unions for explicit error handling...",
  "response": "fn divide(a: i32, b: i32) !f32 {\n    if (b == 0) return error.DivisionByZero;\n    return @as(f32, @floatFromInt(a)) / @as(f32, @floatFromInt(b));\n}",
  "metadata": {
    "version": "0.15.0",
    "topic": "Error Handling",
    "difficulty": "medium"
  }
}
```
### 1.3 Dataset Augmentation
Create additional examples from:
1. **Zig Standard Library**: Extract common patterns
2. **Community projects**: Parse popular Zig repos
3. **Synthetic examples**: Generate variants of existing code
4. **Error-correction pairs**: common mistakes → fixes (sketched below)
**Target**: 10,000+ high-quality instruction-response pairs
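One way to produce error-correction pairs is to inject a known-bad pattern into a validated snippet and keep the original as the fix. The sketch below is illustrative Python (the repo's tooling is Node); the substitution table and the output filename are assumptions, not existing project files.

```python
# Hypothetical augmentation sketch: derive a (broken, fixed) pair from a
# known-good example by injecting a common mistake. The substitutions and
# output path are illustrative, not part of the existing pipeline.
import json

# (correct, broken): Zig rejects `var` locals that are never mutated,
# so swapping `const` for `var` yields a realistic compile error.
MISTAKES = [("const ", "var ")]

def make_error_pair(example: dict) -> dict | None:
    good = example["response"]
    for correct, broken in MISTAKES:
        if correct in good:
            return {
                "instruction": "Fix this Zig code:",
                "context": good.replace(correct, broken),
                "response": good,
                "metadata": {**example.get("metadata", {}), "topic": "Error correction"},
            }
    return None

with open("data/zig-docs/zig-combined-dataset.json") as f:
    dataset = json.load(f)
pairs = [p for p in (make_error_pair(e) for e in dataset) if p is not None]
with open("data/zig-docs/error-pairs.json", "w") as f:
    json.dump(pairs, f, indent=2)
```

Pairs generated this way should still go through the Phase 2 validation step so that only examples whose `response` actually compiles survive.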
---
## Phase 2: Dataset Preparation
### 2.1 Clean and Validate
```bash
# Create cleaning script
node scripts/clean-dataset.js
```
Tasks (see the sketch below):
- Remove duplicates
- Validate Zig syntax (use our parser!)
- Filter out deprecated syntax
- Balance difficulty levels
- Ensure version coverage
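The repo's cleaning script is Node (`scripts/clean-dataset.js`); as an illustration of the dedup and syntax-validation steps, here is a minimal Python sketch. It leans on `zig ast-check`, which parses a file without compiling it; the output filename is an assumption.

```python
# Minimal cleaning sketch: dedupe on response text and drop examples whose
# code fails `zig ast-check` (parse-only validation, no compilation).
import json
import os
import subprocess
import tempfile

def is_valid_zig(code: str) -> bool:
    with tempfile.NamedTemporaryFile(suffix=".zig", mode="w", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        return subprocess.run(["zig", "ast-check", path], capture_output=True).returncode == 0
    finally:
        os.unlink(path)

with open("data/zig-docs/zig-combined-dataset.json") as f:
    examples = json.load(f)

seen: set[str] = set()
clean = []
for ex in examples:
    key = ex["response"].strip()
    if key not in seen and is_valid_zig(ex["response"]):
        seen.add(key)
        clean.append(ex)

with open("data/zig-docs/zig-clean-dataset.json", "w") as f:
    json.dump(clean, f, indent=2)
```

Note that `ast-check` only catches parse errors; deprecated-but-parseable constructs still need the version filters above.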
### 2.2 Split Dataset
```
Training: 70% (7,000 examples)
Validation: 15% (1,500 examples)
Test: 15% (1,500 examples)
```
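A sketch of the split, assuming the cleaned file from 2.1; a fixed seed keeps the split reproducible across runs.

```python
# Deterministic 70/15/15 split of the cleaned dataset.
import json
import random

with open("data/zig-docs/zig-clean-dataset.json") as f:
    examples = json.load(f)

random.Random(42).shuffle(examples)
n = len(examples)
splits = {
    "train": examples[: int(n * 0.70)],
    "validation": examples[int(n * 0.70) : int(n * 0.85)],
    "test": examples[int(n * 0.85) :],
}
for name, subset in splits.items():
    with open(f"data/zig-docs/{name}.json", "w") as f:
        json.dump(subset, f, indent=2)
```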
### 2.3 Convert to Training Format
For **LoRA fine-tuning** (recommended for efficiency):
```json
{
  "text": "<|system|>\nYou are a Zig programming expert for version 0.15.0.\n<|user|>\nWrite Zig code for: Error Handling\nContext: Zig uses error unions for explicit error handling...\n<|assistant|>\nfn divide(a: i32, b: i32) !f32 {\n    if (b == 0) return error.DivisionByZero;\n    return @as(f32, @floatFromInt(a)) / @as(f32, @floatFromInt(b));\n}"
}
```
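Converting from the 1.2 format into this single `text` field is mechanical; a sketch follows. The `<|system|>`-style markers mirror the example above and are an assumption about the template; in practice, `tokenizer.apply_chat_template` is the safer route, since it emits exactly the markers the base model was trained on.

```python
# Sketch: flatten one scraped example (format 1.2) into the single-field
# training format above. Marker strings mirror the example and may need to
# be replaced by the tokenizer's actual chat template.
def to_training_text(ex: dict) -> dict:
    system = f"You are a Zig programming expert for version {ex['metadata']['version']}."
    user = f"Write Zig code for: {ex['metadata']['topic']}\nContext: {ex['context']}"
    return {"text": f"<|system|>\n{system}\n<|user|>\n{user}\n<|assistant|>\n{ex['response']}"}
```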
---
## Phase 3: Fine-Tuning
### 3.1 Environment Setup
```bash
# Install dependencies
pip install torch transformers peft datasets bitsandbytes accelerate
# Or use Google Colab with GPU
```
### 3.2 Training Configuration
**Method**: QLoRA (Quantized Low-Rank Adaptation)
- Memory-efficient (fits on a 24 GB GPU)
- Fast training (a few hours)
- Preserves base model quality
**Hyperparameters**:
```python
training_args = {
"model_name": "Qwen/Qwen2.5-Coder-7B-Instruct",
"lora_r": 16,
"lora_alpha": 32,
"lora_dropout": 0.05,
"learning_rate": 2e-4,
"num_epochs": 3,
"batch_size": 4,
"gradient_accumulation_steps": 4,
"warmup_steps": 100,
"max_seq_length": 2048,
}
```
### 3.3 Training Script
```python
# scripts/train-zig-model.py
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    Trainer,
    TrainingArguments,
)

# Load base model in 4-bit for QLoRA
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
# Needed to tokenize the dataset (see 2.3 for the text format)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

# Load dataset; load_dataset("json", ...) yields a single "train" split,
# so carve out a validation set explicitly
dataset = load_dataset("json", data_files="data/zig-docs/zig-combined-dataset.json")
splits = dataset["train"].train_test_split(test_size=0.15, seed=42)

training_args = TrainingArguments(
    output_dir="checkpoints/zignet-qwen-7b",
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=100,
)

# Train (examples must be tokenized to input_ids first)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
)
trainer.train()
```
### 3.4 Training Monitoring
- Loss curve: Should decrease smoothly
- Validation metrics: Monitor perplexity
- Sample generation: Test on held-out examples
- Early stopping: halt if validation loss plateaus (callback sketch below)
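transformers ships a built-in callback that covers the early-stopping item above; a minimal sketch, assuming the `model`, `training_args`, and `splits` names from the training script:

```python
# Early stopping via transformers' built-in callback. Requires
# TrainingArguments configured with matching eval/save strategies,
# load_best_model_at_end=True, and metric_for_best_model set.
from transformers import EarlyStoppingCallback, Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    # Stop after 3 evaluations without improvement
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```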
---
## Phase 4: Model Validation
### 4.1 Automated Tests
Run benchmark suite:
```bash
node scripts/test-model-advanced.js fulgidus/zignet-qwen-7b
```
**Success Criteria**:
- Pass rate: ≥ 95% (better than base model)
- Response time: < 30s average
- Zig syntax validity: 100%
- Version-specific accuracy: > 90%
### 4.2 Manual Review
Test edge cases:
- Comptime evaluation
- Generic functions
- Error unions
- Async/await
- Build system features
---
## Phase 5: HuggingFace Upload
### 5.1 Prepare Model Card
````markdown
---
language:
- zig
license: wtfpl
tags:
- code
- zig
- programming
- fine-tuned
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
---
# ZigNet Qwen 7B - Zig Language Expert
Fine-tuned version of Qwen2.5-Coder-7B, specialized for the Zig programming language (versions 0.13–0.15).
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("fulgidus/zignet-qwen-7b")
tokenizer = AutoTokenizer.from_pretrained("fulgidus/zignet-qwen-7b")
prompt = "Write a Zig function that handles errors"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0]))
```
## Training Data
- 10,000+ Zig code examples from official documentation
- Covers Zig 0.13, 0.14, 0.15
- Balanced across difficulty levels and topics
## Performance
- Benchmark: 100% pass rate on Zig-specific tasks
- Syntax accuracy: 100%
- Average response time: 25s
````
### 5.2 Upload
```bash
huggingface-cli login
huggingface-cli repo create zignet-qwen-7b --type model
python scripts/upload-model.py
```
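The contents of `scripts/upload-model.py` aren't shown here; one plausible shape, using `huggingface_hub`'s `upload_folder` API and an assumed local folder path:

```python
# scripts/upload-model.py — hedged sketch; the folder path is an assumption
from huggingface_hub import HfApi

api = HfApi()  # uses the token stored by `huggingface-cli login`
api.upload_folder(
    folder_path="models/zignet-qwen-7b-merged",
    repo_id="fulgidus/zignet-qwen-7b",
    repo_type="model",
)
```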
---
## Phase 6: Integration with ZigNet
### 6.1 Update node-llama-cpp Integration
```typescript
// src/llm/zig-expert.ts
// Note: this uses the node-llama-cpp v3 API; older releases expose a
// different surface (LlamaModel/LlamaContext constructors).
import { getLlama, LlamaChatSession } from 'node-llama-cpp';

export class ZigExpertLLM {
    private session!: LlamaChatSession;

    async init() {
        const llama = await getLlama();
        const model = await llama.loadModel({
            modelPath: 'models/zignet-qwen-7b.gguf', // GGUF conversion: see 6.2
            gpuLayers: 33, // Adjust based on GPU VRAM
        });
        const context = await model.createContext();
        this.session = new LlamaChatSession({ contextSequence: context.getSequence() });
    }

    async getDocs(topic: string): Promise<string> {
        return await this.session.prompt(`Explain ${topic} in Zig 0.15`);
    }

    async suggestFix(error: string, code: string): Promise<string> {
        return await this.session.prompt(`Fix this Zig error:\n${error}\n\nCode:\n${code}`);
    }
}
```
### 6.2 Model Format Conversion
Convert to GGUF for node-llama-cpp:
```bash
python scripts/convert-to-gguf.py \
--model fulgidus/zignet-qwen-7b \
--output models/zignet-qwen-7b.gguf \
--quantize q4_k_m
```
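Whatever `scripts/convert-to-gguf.py` does internally, llama.cpp's converters expect a plain Hugging Face checkpoint, so the LoRA adapters generally need to be merged into the base weights first. A sketch using peft's `merge_and_unload()`; the checkpoint path matches the training sketch above and is an assumption:

```python
# Merge LoRA adapters into the base model before GGUF conversion
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "checkpoints/zignet-qwen-7b").merge_and_unload()
merged.save_pretrained("models/zignet-qwen-7b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct").save_pretrained(
    "models/zignet-qwen-7b-merged"
)
```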
---
## Maintenance & Updates
### When New Zig Version Released
1. **Update scraper** with new version URL
2. **Run scraper**: `node scripts/scrape-zig-docs.js`
3. **Merge datasets**: Combine with existing data
4. **Incremental training**: Fine-tune on new data only (faster)
5. **Validate**: Run benchmark suite
6. **Upload**: Push to HuggingFace
7. **Update ZigNet**: Download new model
### Monitoring Model Quality
- Track user feedback in MCP conversations
- Log failed generations
- Periodic re-evaluation on test set
- Version-specific accuracy tracking
---
## Cost & Resources
### Training Costs (Estimated)
- **Google Colab Pro**: ~$10/month (A100 GPU)
- **RunPod/Lambda**: ~$1/hour (A100 80GB)
- **Training time**: 3-6 hours
- **Total cost**: < $20 per iteration
### Storage
- Base model: ~15GB
- Fine-tuned model: +500MB (LoRA adapters)
- GGUF quantized: ~4GB
- Dataset: ~50MB
### Inference
- GPU: RTX 3060 (12GB) or better
- CPU: Possible but slow (2-3 minutes/response)
- Recommended: M1/M2 Mac or modern GPU
---
## Alternative: Distillation (Future)
If the fine-tuned model is too large:
1. Use the fine-tuned Qwen-7B as the **teacher**
2. Distill to **Phi-2** (2.7B parameters) as the **student** (loss sketch below)
3. Result: a faster, smaller model (~3GB vs ~15GB)
4. Trade-off: a slight quality decrease
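For reference, the core of a distillation objective is small; a minimal PyTorch sketch, independent of any particular student model:

```python
# Temperature-scaled KL distillation loss: the student matches the
# teacher's softened token distribution. The T^2 rescaling keeps gradient
# magnitudes comparable across temperatures.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```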
---
## Resources
- [Qwen2.5-Coder](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
- [PEFT Documentation](https://huggingface.co/docs/peft)
- [Zig Documentation](https://ziglang.org/documentation/)
- [node-llama-cpp](https://github.com/withcatai/node-llama-cpp)