create_job
Fine-tune an LLM on a GitHub repository to learn code patterns and conventions. Choose a training agent: Cody for code autocomplete or SIERA for bug-fix specialization.
Instructions
Fine-tune an LLM on a GitHub repository using Tuning Engines. This trains a custom model that learns from the code patterns, style, and conventions in the repo. Choose an agent to control the training approach:
AVAILABLE AGENTS:
agent='code_repo' (Cody): Code fine-tuning with QLoRA (4-bit quantized LoRA) via the Axolotl framework. Trains on your repo's code patterns, naming conventions, and project structure to produce a fast, lightweight adapter. Best for: code autocomplete, inline suggestions, tab-complete, code style matching.
agent='sera_code_repo' (SIERA): Bug-fix specialist using the Open Coding Agents approach from AllenAI. Generates synthetic error-resolution training pairs from your repo, producing a model that understands your codebase's failure patterns and fix conventions. Best for: debugging, error resolution, patch generation, root cause analysis. Supports quality_tier='low' (faster) or quality_tier='high' (deeper analysis, more training data).
SUPPORTED BASE MODELS (by size):
3B: Qwen/Qwen2.5-Coder-3B-Instruct
7-8B: codellama/CodeLlama-7b-hf, deepseek-ai/deepseek-coder-7b-instruct-v1.5, Qwen/Qwen2.5-Coder-7B-Instruct, Qwen/Qwen3-8B
13-15B: codellama/CodeLlama-13b-Instruct-hf, bigcode/starcoder2-15b, Qwen/Qwen2.5-Coder-14B-Instruct, Qwen/Qwen3-14B
22-27B: mistralai/Codestral-22B-v0.1, google/gemma-2-27b
30-34B: deepseek-ai/deepseek-coder-33b-instruct, codellama/CodeLlama-34b-Instruct-hf, Qwen/Qwen2.5-Coder-32B-Instruct, Qwen/Qwen3-Coder-30B-A3B, Qwen/Qwen3-32B
70-72B: codellama/CodeLlama-70b-Instruct-hf, meta-llama/Llama-3.1-70B-Instruct, Qwen/Qwen2.5-72B-Instruct
TYPICAL WORKFLOW: estimate_job first to check cost, then create_job, then job_status to monitor progress.
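The workflow above can be sketched as a small helper. This is a minimal illustration, not the Tuning Engines client: `call_tool` is a hypothetical callable standing in for whatever transport invokes the tools, and the result fields `estimated_cost` and `job_id`, the `job_status` argument name, and the idea that `estimate_job` accepts the same arguments as `create_job` are all assumptions made for the example.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that takes (tool_name, arguments)
# and returns the tool's result as a dict. Not part of Tuning Engines.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_workflow(
    call_tool: ToolCaller, args: Dict[str, Any], max_cost: float
) -> Dict[str, Any]:
    """Estimate first, create the job only if the estimate is acceptable,
    then fetch the job status once."""
    # Assumption: estimate_job takes the same arguments as create_job.
    estimate = call_tool("estimate_job", args)

    # Assumption: the estimate result exposes an "estimated_cost" field.
    if estimate["estimated_cost"] > max_cost:
        return {"created": False, "estimate": estimate}

    job = call_tool("create_job", args)

    # Assumption: create_job returns a "job_id" that job_status accepts.
    status = call_tool("job_status", {"job_id": job["job_id"]})
    return {"created": True, "estimate": estimate, "job": job, "status": status}
```

Checking the estimate before calling create_job keeps an unexpectedly expensive run from starting. In practice job_status would be polled repeatedly rather than called once.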
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| base_model | No | HuggingFace model ID to fine-tune (e.g. 'Qwen/Qwen2.5-Coder-7B-Instruct'). Required unless base_user_model_id is provided. Use list_supported_models to see all options. | |
| base_user_model_id | No | ID of a previously trained model to fine-tune further (iterative training). The base model is resolved automatically. Use list_models to find IDs. | |
| output_name | Yes | Name for the resulting fine-tuned model (e.g. 'my-project-cody-7b') | |
| repo_url | Yes | GitHub repository URL to train on (e.g. 'https://github.com/org/repo') | |
| branch | No | Git branch to use | main |
| num_epochs | No | Number of training epochs. More epochs raise cost and can improve quality, though too many may overfit the repo. | |
| max_examples | No | Maximum training examples to extract from the repo (minimum: 2) | |
| agent | No | Training agent to use. 'code_repo' (Cody) = QLoRA-based fine-tuning for code autocomplete and inline suggestions. 'sera_code_repo' (SIERA) = bug-fix specialist using AllenAI's Open Coding Agents approach. | 'code_repo' |
| quality_tier | No | Quality tier (SIERA agent only). 'low' = faster, fewer synthetic pairs. 'high' = deeper analysis, more training data, better results. | 'low' |
| s3_output_bucket | No | S3 bucket to export the trained model to. If omitted, model is stored in Tuning Engines cloud storage. | |
| s3_access_key_id | No | AWS access key ID for S3 export | |
| s3_secret_access_key | No | AWS secret access key for S3 export | |
| s3_region | No | AWS region for S3 export (e.g. us-east-1) | |
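As an illustration of the constraints in the table, the sketch below assembles a create_job argument dict and checks it locally before sending. The helper name `build_create_job_args` is hypothetical, and the checks only mirror what the table states (one of base_model or base_user_model_id is required, max_examples has a minimum of 2, quality_tier applies to the SIERA agent only). The service may enforce these differently or apply further rules.

```python
from typing import Any, Dict, Optional

AGENTS = {"code_repo", "sera_code_repo"}
QUALITY_TIERS = {"low", "high"}


def build_create_job_args(
    output_name: str,
    repo_url: str,
    base_model: Optional[str] = None,
    base_user_model_id: Optional[str] = None,
    agent: str = "code_repo",
    quality_tier: Optional[str] = None,
    max_examples: Optional[int] = None,
    **extra: Any,  # e.g. branch, num_epochs, s3_* fields, passed through as-is
) -> Dict[str, Any]:
    """Hypothetical helper: validate and assemble create_job arguments."""
    # The table requires base_model unless base_user_model_id is given.
    if not base_model and not base_user_model_id:
        raise ValueError("provide base_model or base_user_model_id")
    if agent not in AGENTS:
        raise ValueError(f"unknown agent: {agent!r}")
    if quality_tier is not None:
        # quality_tier is documented for the SIERA agent only.
        if agent != "sera_code_repo":
            raise ValueError("quality_tier applies only to agent='sera_code_repo'")
        if quality_tier not in QUALITY_TIERS:
            raise ValueError(f"unknown quality_tier: {quality_tier!r}")
    if max_examples is not None and max_examples < 2:
        raise ValueError("max_examples must be at least 2")

    args: Dict[str, Any] = {
        "output_name": output_name,
        "repo_url": repo_url,
        "agent": agent,
    }
    # Include optional fields only when set, so server defaults still apply.
    optional = {
        "base_model": base_model,
        "base_user_model_id": base_user_model_id,
        "quality_tier": quality_tier,
        "max_examples": max_examples,
    }
    args.update({k: v for k, v in optional.items() if v is not None})
    args.update(extra)
    return args


example = build_create_job_args(
    output_name="my-project-cody-7b",
    repo_url="https://github.com/org/repo",
    base_model="Qwen/Qwen2.5-Coder-7B-Instruct",
)
```

The table does not say what happens when both base_model and base_user_model_id are supplied, so the sketch passes both through rather than guessing.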