unsloth-mcp-server by OtotaO

finetune_model

Fine-tune large language models with Unsloth optimizations for faster training and lower memory use. Specify the model, dataset, and training parameters such as LoRA rank, batch size, and learning rate.

Instructions

Fine-tune a model with Unsloth optimizations

Input Schema

Name                        | Required | Description                                 | Default
batch_size                  | No       | Batch size for training                     | 2
dataset_name                | Yes      | Name of the dataset to use for fine-tuning  |
dataset_text_field          | No       | Field in the dataset containing the text    | "text"
gradient_accumulation_steps | No       | Number of gradient accumulation steps       | 4
learning_rate               | No       | Learning rate for training                  | 2e-4
load_in_4bit                | No       | Whether to use 4-bit quantization           | true
lora_alpha                  | No       | Alpha for LoRA fine-tuning                  | 16
lora_rank                   | No       | Rank for LoRA fine-tuning                   | 16
max_seq_length              | No       | Maximum sequence length for training        | 2048
max_steps                   | No       | Maximum number of training steps            | 100
model_name                  | Yes      | Name of the model to fine-tune              |
output_dir                  | Yes      | Directory to save the fine-tuned model      |

(Defaults shown are the fallback values applied by the tool handler below.)
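
For orientation, a hypothetical set of arguments for this tool might look like the sketch below. The model and dataset identifiers are placeholders, and any optional field that is omitted falls back to the defaults listed above.

    // Hypothetical arguments for a 'finetune_model' call (identifiers are placeholders).
    const finetuneArgs = {
      model_name: 'unsloth/Llama-3.2-1B-Instruct',  // any model Unsloth can load
      dataset_name: 'yahma/alpaca-cleaned',         // any dataset exposing a text field
      output_dir: './outputs/my-finetune',
      // Optional overrides; anything omitted uses the handler defaults.
      lora_rank: 16,
      batch_size: 2,
      max_steps: 100,
    };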

Implementation Reference

  • The handler for the 'finetune_model' tool. It destructures the input arguments and builds a Python script that loads the model with Unsloth, applies LoRA, configures an SFTTrainer on the dataset, trains and saves the model, and prints a JSON status. The script is executed via executeUnslothScript, and the parsed result is returned to the client.
    case 'finetune_model': {
      const {
        model_name,
        dataset_name,
        output_dir,
        max_seq_length = 2048,
        lora_rank = 16,
        lora_alpha = 16,
        batch_size = 2,
        gradient_accumulation_steps = 4,
        learning_rate = 2e-4,
        max_steps = 100,
        dataset_text_field = 'text',
        load_in_4bit = true,
      } = args as {
        model_name: string;
        dataset_name: string;
        output_dir: string;
        max_seq_length?: number;
        lora_rank?: number;
        lora_alpha?: number;
        batch_size?: number;
        gradient_accumulation_steps?: number;
        learning_rate?: number;
        max_steps?: number;
        dataset_text_field?: string;
        load_in_4bit?: boolean;
      };

      const script = `
import json
import os

try:
    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from trl import SFTTrainer, SFTConfig
    import torch

    # Create output directory if it doesn't exist
    os.makedirs("${output_dir}", exist_ok=True)

    # Load the model
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="${model_name}",
        max_seq_length=${max_seq_length},
        load_in_4bit=${load_in_4bit ? 'True' : 'False'},
        use_gradient_checkpointing="unsloth"
    )

    # Load the dataset
    dataset = load_dataset("${dataset_name}")

    # Patch the model with LoRA
    model = FastLanguageModel.get_peft_model(
        model,
        r=${lora_rank},
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        lora_alpha=${lora_alpha},
        use_gradient_checkpointing="unsloth",
        random_state=3407,
        max_seq_length=${max_seq_length},
        use_rslora=False,
        loftq_config=None
    )

    # Configure the trainer
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset["train"],
        tokenizer=tokenizer,
        args=SFTConfig(
            dataset_text_field="${dataset_text_field}",
            max_seq_length=${max_seq_length},
            per_device_train_batch_size=${batch_size},
            gradient_accumulation_steps=${gradient_accumulation_steps},
            warmup_steps=10,
            max_steps=${max_steps},
            learning_rate=${learning_rate},
            logging_steps=1,
            output_dir="${output_dir}",
            optim="adamw_8bit",
            seed=3407,
        ),
    )

    # Train the model
    trainer.train()

    # Save the model
    trainer.save_model()

    print(json.dumps({
        "success": True,
        "output_dir": "${output_dir}",
        "model_name": "${model_name}",
        "dataset_name": "${dataset_name}",
        "max_steps": ${max_steps}
    }))
except Exception as e:
    print(json.dumps({"error": str(e), "success": False}))
`;

      const result = await this.executeUnslothScript(script);

      try {
        const trainingResult = JSON.parse(result);
        if (!trainingResult.success) {
          throw new Error(trainingResult.error);
        }
        return {
          content: [
            {
              type: 'text',
              text: `Successfully fine-tuned model: ${model_name} with dataset: ${dataset_name}\n\n${JSON.stringify(trainingResult, null, 2)}`,
            },
          ],
        };
      } catch (error: any) {
        throw new Error(`Error fine-tuning model: ${error.message}`);
      }
    }
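
The handler above delegates execution to this.executeUnslothScript, which is not excerpted on this page. A minimal sketch of what such a helper might do, assuming it writes the generated script to a temporary file, runs it with a local python3 interpreter, and returns the captured stdout (the real helper may differ, for example by extracting only the final JSON line):

    import { execFile } from 'node:child_process';
    import { mkdtemp, writeFile, rm } from 'node:fs/promises';
    import { tmpdir } from 'node:os';
    import { join } from 'node:path';

    // Hypothetical sketch of executeUnslothScript: write the script to a temp file,
    // run it with python3, and resolve with whatever the process printed to stdout.
    async function executeUnslothScript(script: string): Promise<string> {
      const dir = await mkdtemp(join(tmpdir(), 'unsloth-'));
      const scriptPath = join(dir, 'script.py');
      await writeFile(scriptPath, script, 'utf8');
      try {
        return await new Promise<string>((resolve, reject) => {
          execFile('python3', [scriptPath], { maxBuffer: 64 * 1024 * 1024 }, (err, stdout, stderr) => {
            if (err) reject(new Error(stderr || err.message));
            else resolve(stdout.trim());
          });
        });
      } finally {
        await rm(dir, { recursive: true, force: true });
      }
    }
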
  • Input schema for the 'finetune_model' tool, defining all parameters with types, descriptions, and required fields.
    inputSchema: {
      type: 'object',
      properties: {
        model_name: { type: 'string', description: 'Name of the model to fine-tune' },
        dataset_name: { type: 'string', description: 'Name of the dataset to use for fine-tuning' },
        output_dir: { type: 'string', description: 'Directory to save the fine-tuned model' },
        max_seq_length: { type: 'number', description: 'Maximum sequence length for training' },
        lora_rank: { type: 'number', description: 'Rank for LoRA fine-tuning' },
        lora_alpha: { type: 'number', description: 'Alpha for LoRA fine-tuning' },
        batch_size: { type: 'number', description: 'Batch size for training' },
        gradient_accumulation_steps: { type: 'number', description: 'Number of gradient accumulation steps' },
        learning_rate: { type: 'number', description: 'Learning rate for training' },
        max_steps: { type: 'number', description: 'Maximum number of training steps' },
        dataset_text_field: { type: 'string', description: 'Field in the dataset containing the text' },
        load_in_4bit: { type: 'boolean', description: 'Whether to use 4-bit quantization' },
      },
      required: ['model_name', 'dataset_name', 'output_dir'],
    },
  • src/index.ts:112-169 (registration)
    Registration of the 'finetune_model' tool in the ListToolsRequestSchema response, including name, description, and input schema.
    {
      name: 'finetune_model',
      description: 'Fine-tune a model with Unsloth optimizations',
      inputSchema: {
        type: 'object',
        properties: {
          model_name: { type: 'string', description: 'Name of the model to fine-tune' },
          dataset_name: { type: 'string', description: 'Name of the dataset to use for fine-tuning' },
          output_dir: { type: 'string', description: 'Directory to save the fine-tuned model' },
          max_seq_length: { type: 'number', description: 'Maximum sequence length for training' },
          lora_rank: { type: 'number', description: 'Rank for LoRA fine-tuning' },
          lora_alpha: { type: 'number', description: 'Alpha for LoRA fine-tuning' },
          batch_size: { type: 'number', description: 'Batch size for training' },
          gradient_accumulation_steps: { type: 'number', description: 'Number of gradient accumulation steps' },
          learning_rate: { type: 'number', description: 'Learning rate for training' },
          max_steps: { type: 'number', description: 'Maximum number of training steps' },
          dataset_text_field: { type: 'string', description: 'Field in the dataset containing the text' },
          load_in_4bit: { type: 'boolean', description: 'Whether to use 4-bit quantization' },
        },
        required: ['model_name', 'dataset_name', 'output_dir'],
      },
    },
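
For context, invoking this tool from an MCP client with the TypeScript SDK could look roughly like the sketch below. The server command, build path, and argument values are assumptions for illustration, not taken from this server's documentation.

    import { Client } from '@modelcontextprotocol/sdk/client/index.js';
    import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

    // Sketch of calling 'finetune_model' over stdio; adjust the command and path to
    // however the unsloth-mcp-server is actually built and launched.
    const transport = new StdioClientTransport({ command: 'node', args: ['build/index.js'] });
    const client = new Client({ name: 'example-client', version: '1.0.0' }, { capabilities: {} });
    await client.connect(transport);

    const result = await client.callTool({
      name: 'finetune_model',
      arguments: {
        model_name: 'unsloth/Llama-3.2-1B-Instruct',  // placeholder
        dataset_name: 'yahma/alpaca-cleaned',         // placeholder
        output_dir: './outputs/demo-run',
      },
    });
    console.log(result);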

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/OtotaO/unsloth-mcp-server'
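
The same lookup from TypeScript, assuming a runtime with a global fetch (Node 18+ or a browser) and treating the response as opaque JSON, might be:

    // Fetch this server's directory entry from the Glama MCP API.
    const res = await fetch('https://glama.ai/api/mcp/v1/servers/OtotaO/unsloth-mcp-server');
    if (!res.ok) throw new Error(`Request failed: ${res.status}`);
    const server = await res.json();
    console.log(server);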

If you have feedback or need assistance with the MCP directory API, please join our Discord server.