finetune_model
Optimize and fine-tune large language models with Unsloth's enhancements for faster training and reduced memory usage. Specify the model, dataset, and training parameters such as LoRA rank, batch size, and learning rate for efficient customization.
Instructions
Fine-tune a model with Unsloth optimizations
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| batch_size | No | Batch size for training | 2 |
| dataset_name | Yes | Name of the dataset to use for fine-tuning | |
| dataset_text_field | No | Field in the dataset containing the text | `text` |
| gradient_accumulation_steps | No | Number of gradient accumulation steps | 4 |
| learning_rate | No | Learning rate for training | 2e-4 |
| load_in_4bit | No | Whether to use 4-bit quantization | `true` |
| lora_alpha | No | Alpha for LoRA fine-tuning | 16 |
| lora_rank | No | Rank for LoRA fine-tuning | 16 |
| max_seq_length | No | Maximum sequence length for training | 2048 |
| max_steps | No | Maximum number of training steps | 100 |
| model_name | Yes | Name of the model to fine-tune | |
| output_dir | Yes | Directory to save the fine-tuned model | |
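To make the parameters concrete, here is a hypothetical argument payload for this tool. Only the three required fields must be supplied; the model and dataset names below are placeholders, and the optional fields shown simply restate the handler's defaults.

```typescript
// Hypothetical arguments for a finetune_model call; only model_name,
// dataset_name, and output_dir are required — the rest override defaults.
const finetuneArgs = {
  model_name: 'unsloth/Llama-3.2-1B-Instruct', // placeholder model
  dataset_name: 'yahma/alpaca-cleaned',        // placeholder dataset
  output_dir: './outputs/llama-3.2-1b-alpaca',
  max_seq_length: 2048,
  lora_rank: 16,
  lora_alpha: 16,
  batch_size: 2,
  gradient_accumulation_steps: 4,
  learning_rate: 2e-4,
  max_steps: 100,
  dataset_text_field: 'text',
  load_in_4bit: true,
};
```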
Implementation Reference
- src/index.ts:363-483 (handler): The handler for the `finetune_model` tool. It destructures the input arguments, builds a Python script that loads the model with Unsloth, applies LoRA, configures an `SFTTrainer` on the dataset, runs training, saves the model, and prints a JSON status. The script is executed via `executeUnslothScript`, and the parsed result is returned to the caller.

```typescript
case 'finetune_model': {
  const {
    model_name,
    dataset_name,
    output_dir,
    max_seq_length = 2048,
    lora_rank = 16,
    lora_alpha = 16,
    batch_size = 2,
    gradient_accumulation_steps = 4,
    learning_rate = 2e-4,
    max_steps = 100,
    dataset_text_field = 'text',
    load_in_4bit = true,
  } = args as {
    model_name: string;
    dataset_name: string;
    output_dir: string;
    max_seq_length?: number;
    lora_rank?: number;
    lora_alpha?: number;
    batch_size?: number;
    gradient_accumulation_steps?: number;
    learning_rate?: number;
    max_steps?: number;
    dataset_text_field?: string;
    load_in_4bit?: boolean;
  };

  const script = `
import json
import os

try:
    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from trl import SFTTrainer, SFTConfig
    import torch

    # Create output directory if it doesn't exist
    os.makedirs("${output_dir}", exist_ok=True)

    # Load the model
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="${model_name}",
        max_seq_length=${max_seq_length},
        load_in_4bit=${load_in_4bit ? 'True' : 'False'},
        use_gradient_checkpointing="unsloth"
    )

    # Load the dataset
    dataset = load_dataset("${dataset_name}")

    # Patch the model with LoRA
    model = FastLanguageModel.get_peft_model(
        model,
        r=${lora_rank},
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
        lora_alpha=${lora_alpha},
        use_gradient_checkpointing="unsloth",
        random_state=3407,
        max_seq_length=${max_seq_length},
        use_rslora=False,
        loftq_config=None
    )

    # Configure the trainer
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset["train"],
        tokenizer=tokenizer,
        args=SFTConfig(
            dataset_text_field="${dataset_text_field}",
            max_seq_length=${max_seq_length},
            per_device_train_batch_size=${batch_size},
            gradient_accumulation_steps=${gradient_accumulation_steps},
            warmup_steps=10,
            max_steps=${max_steps},
            learning_rate=${learning_rate},
            logging_steps=1,
            output_dir="${output_dir}",
            optim="adamw_8bit",
            seed=3407,
        ),
    )

    # Train the model
    trainer.train()

    # Save the model
    trainer.save_model()

    print(json.dumps({
        "success": True,
        "output_dir": "${output_dir}",
        "model_name": "${model_name}",
        "dataset_name": "${dataset_name}",
        "max_steps": ${max_steps}
    }))
except Exception as e:
    print(json.dumps({"error": str(e), "success": False}))
`;

  const result = await this.executeUnslothScript(script);
  try {
    const trainingResult = JSON.parse(result);
    if (!trainingResult.success) {
      throw new Error(trainingResult.error);
    }
    return {
      content: [
        {
          type: 'text',
          text: `Successfully fine-tuned model: ${model_name} with dataset: ${dataset_name}\n\n${JSON.stringify(trainingResult, null, 2)}`,
        },
      ],
    };
  } catch (error: any) {
    throw new Error(`Error fine-tuning model: ${error.message}`);
  }
}
```
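The handler's success path hinges on the single JSON object the embedded Python script prints: it is parsed with `JSON.parse`, and a `success: false` payload is turned into a thrown error. A minimal sketch of that contract follows; the `FinetuneResult` name and the helper function are illustrative, not part of the source.

```typescript
// Hypothetical shape of the JSON status printed by the embedded Python
// script and parsed by the handler (the type and function names are
// illustrative; only the JSON keys come from the script itself).
interface FinetuneResult {
  success: boolean;
  // Present on success:
  output_dir?: string;
  model_name?: string;
  dataset_name?: string;
  max_steps?: number;
  // Present on failure:
  error?: string;
}

function parseFinetuneResult(stdout: string): FinetuneResult {
  const parsed = JSON.parse(stdout) as FinetuneResult;
  if (!parsed.success) {
    // Mirrors the handler: a failed run surfaces the Python exception text.
    throw new Error(parsed.error ?? 'Unknown fine-tuning error');
  }
  return parsed;
}
```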
- src/index.ts:115-167 (schema): Input schema for the `finetune_model` tool, defining each parameter with its type and description and listing the required fields.

```typescript
inputSchema: {
  type: 'object',
  properties: {
    model_name: {
      type: 'string',
      description: 'Name of the model to fine-tune',
    },
    dataset_name: {
      type: 'string',
      description: 'Name of the dataset to use for fine-tuning',
    },
    output_dir: {
      type: 'string',
      description: 'Directory to save the fine-tuned model',
    },
    max_seq_length: {
      type: 'number',
      description: 'Maximum sequence length for training',
    },
    lora_rank: {
      type: 'number',
      description: 'Rank for LoRA fine-tuning',
    },
    lora_alpha: {
      type: 'number',
      description: 'Alpha for LoRA fine-tuning',
    },
    batch_size: {
      type: 'number',
      description: 'Batch size for training',
    },
    gradient_accumulation_steps: {
      type: 'number',
      description: 'Number of gradient accumulation steps',
    },
    learning_rate: {
      type: 'number',
      description: 'Learning rate for training',
    },
    max_steps: {
      type: 'number',
      description: 'Maximum number of training steps',
    },
    dataset_text_field: {
      type: 'string',
      description: 'Field in the dataset containing the text',
    },
    load_in_4bit: {
      type: 'boolean',
      description: 'Whether to use 4-bit quantization',
    },
  },
  required: ['model_name', 'dataset_name', 'output_dir'],
```
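Since the schema marks only `model_name`, `dataset_name`, and `output_dir` as required, a client can check arguments before invoking the tool. Below is a minimal sketch using the Ajv JSON Schema validator (an assumed external library, not a dependency of this server); the schema copy is trimmed to a few fields for brevity.

```typescript
import Ajv from 'ajv';

// Trimmed copy of the tool's input schema; the full version appears above.
const finetuneSchema = {
  type: 'object',
  properties: {
    model_name: { type: 'string' },
    dataset_name: { type: 'string' },
    output_dir: { type: 'string' },
    max_steps: { type: 'number' },
  },
  required: ['model_name', 'dataset_name', 'output_dir'],
};

const ajv = new Ajv();
const validate = ajv.compile(finetuneSchema);

// Placeholder values; any Unsloth-compatible model and text dataset would do.
const candidateArgs = {
  model_name: 'unsloth/Llama-3.2-1B-Instruct',
  dataset_name: 'yahma/alpaca-cleaned',
  output_dir: './outputs/run-1',
};

console.log(validate(candidateArgs) ? 'arguments valid' : validate.errors);
```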
- src/index.ts:112-169 (registration): Registration of the `finetune_model` tool in the `ListToolsRequestSchema` response, including its name, description, and input schema. The `inputSchema` body is identical to the schema excerpt above and is elided here.

```typescript
{
  name: 'finetune_model',
  description: 'Fine-tune a model with Unsloth optimizations',
  inputSchema: {
    type: 'object',
    properties: {
      // ...identical to the input schema excerpt above (src/index.ts:115-167)
    },
    required: ['model_name', 'dataset_name', 'output_dir'],
  },
},
```
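Once registered, the tool can be discovered and invoked from any MCP client. The sketch below uses the TypeScript MCP SDK; the launch command, client name, and argument values are assumptions about a typical setup rather than details taken from this repository.

```typescript
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

// Assumed launch command for the server; adjust to however this server is
// actually built and started in your environment.
const transport = new StdioClientTransport({
  command: 'node',
  args: ['build/index.js'],
});

const client = new Client(
  { name: 'example-client', version: '1.0.0' },
  { capabilities: {} }
);
await client.connect(transport); // assumes an ES module with top-level await

// Confirm the tool is registered, then call it.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name)); // should include 'finetune_model'

const result = await client.callTool({
  name: 'finetune_model',
  arguments: {
    model_name: 'unsloth/Llama-3.2-1B-Instruct', // placeholder model
    dataset_name: 'yahma/alpaca-cleaned',        // placeholder dataset
    output_dir: './outputs/run-1',
    max_steps: 60,
  },
});
console.log(result.content);
```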