Unsloth MCP Server

An MCP server for Unsloth, a library that makes LLM fine-tuning 2x faster while using 80% less memory.

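What is Unsloth?

Unsloth is a library that dramatically improves the efficiency of fine-tuning large language models.

Speed: 2x faster fine-tuning than standard methods
Memory: 80% less VRAM usage, which makes it possible to fine-tune larger models on consumer GPUs
Context length: up to 13x longer context (e.g. 89K tokens for Llama 3.3 on an 80GB GPU)
Accuracy: no loss in model quality or performance

Unsloth achieves these improvements through custom GPU kernels written in OpenAI's Triton language, optimized backpropagation, and dynamic 4-bit quantization.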

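Features

Optimized fine-tuning for Llama, Mistral, Phi, Gemma, and other models
4-bit quantization for efficient training
Extended context length support
A simple API for model loading, fine-tuning, and inference
Export to various formats (GGUF, Hugging Face, etc.)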

Quick Start

Install Unsloth: pip install unsloth
Install and build the server:
cd unsloth-server
npm install
npm run build
Add the server to your MCP settings (HUGGINGFACE_TOKEN is optional):
{
  "mcpServers": {
    "unsloth-server": {
      "command": "node",
      "args": ["/path/to/unsloth-server/build/index.js"],
      "env": {
        "HUGGINGFACE_TOKEN": "your_token_here"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
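The settings entry above can also be generated from a script, which avoids hand-editing mistakes. The following is a minimal sketch: `buildServerEntry` is a hypothetical helper (it is not part of unsloth-server), and the path is the placeholder from the quick start. It adds HUGGINGFACE_TOKEN only when a token is supplied, since the token is optional.

```javascript
// Hypothetical helper (not part of unsloth-server): builds the "mcpServers"
// entry from the quick start. The token is optional, so it is only added
// to "env" when a value is passed in.
function buildServerEntry(indexJsPath, huggingFaceToken) {
  const entry = {
    command: "node",
    args: [indexJsPath],
    env: {},
    disabled: false,
    autoApprove: [],
  };
  if (huggingFaceToken) {
    entry.env.HUGGINGFACE_TOKEN = huggingFaceToken;
  }
  return { mcpServers: { "unsloth-server": entry } };
}

// Placeholder path; replace it with the real location of build/index.js.
const config = buildServerEntry("/path/to/unsloth-server/build/index.js");
console.log(JSON.stringify(config, null, 2));
```

The printed JSON can be merged into the MCP settings file of whichever client is in use.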

Available Tools

check_installation

Checks whether Unsloth is installed correctly on the system.

Parameters: none

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "check_installation",
  arguments: {}
});

list_supported_models

Returns a list of all models supported by Unsloth, including Llama, Mistral, Phi, and Gemma variants.

Parameters: none

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "list_supported_models",
  arguments: {}
});

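load_model

Loads a pretrained model with Unsloth optimizations for faster inference and fine-tuning.

Parameters:

model_name (required): name of the model to load (e.g. "unsloth/Llama-3.2-1B")
max_seq_length (optional): maximum sequence length for the model (default: 2048)
load_in_4bit (optional): whether to load the model with 4-bit quantization (default: true)
use_gradient_checkpointing (optional): whether to use gradient checkpointing to save memory (default: true)

Example: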

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "load_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    max_seq_length: 4096,
    load_in_4bit: true
  }
});
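Because three of the four load_model parameters are optional, it can help to make the documented defaults explicit before sending the request. The sketch below assumes a hypothetical helper, `withLoadModelDefaults`, which is not part of the server; the default values are the ones listed above.

```javascript
// Defaults as documented for load_model in this README.
const LOAD_MODEL_DEFAULTS = {
  max_seq_length: 2048,
  load_in_4bit: true,
  use_gradient_checkpointing: true,
};

// Hypothetical helper (not part of the server): checks the one required
// parameter and fills in documented defaults for anything left out.
function withLoadModelDefaults(args) {
  if (typeof args.model_name !== "string" || args.model_name === "") {
    throw new Error("model_name is required");
  }
  // Caller-supplied values override the defaults.
  return { ...LOAD_MODEL_DEFAULTS, ...args };
}

const loadArgs = withLoadModelDefaults({
  model_name: "unsloth/Llama-3.2-1B",
  max_seq_length: 4096,
});
console.log(loadArgs);
```

The resulting object can be passed as `arguments` in the load_model call shown above.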

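finetune_model

Fine-tunes a model with Unsloth optimizations using LoRA/QLoRA techniques.

Parameters:

model_name (required): name of the model to fine-tune
dataset_name (required): name of the dataset to use for fine-tuning
output_dir (required): directory in which to save the fine-tuned model
max_seq_length (optional): maximum sequence length for training (default: 2048)
lora_rank (optional): rank for LoRA fine-tuning (default: 16)
lora_alpha (optional): alpha for LoRA fine-tuning (default: 16)
batch_size (optional): batch size for training (default: 2)
gradient_accumulation_steps (optional): number of gradient accumulation steps (default: 4)
learning_rate (optional): learning rate for training (default: 2e-4)
max_steps (optional): maximum number of training steps (default: 100)
dataset_text_field (optional): field in the dataset that contains the text (default: 'text')
load_in_4bit (optional): whether to use 4-bit quantization (default: true)

Example: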

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "tatsu-lab/alpaca",
    output_dir: "./fine-tuned-model",
    max_steps: 100,
    batch_size: 2,
    learning_rate: 2e-4
  }
});
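The batch_size, gradient_accumulation_steps, and max_steps parameters interact: gradients from several small batches are accumulated before each optimizer update. The arithmetic can be sketched as follows, assuming a single GPU and assuming that max_steps counts optimizer updates (as it does in the Hugging Face trainers); the function names are illustrative.

```javascript
// Effective batch size = per-step batch size x gradient accumulation steps.
function effectiveBatchSize(batchSize, gradientAccumulationSteps) {
  return batchSize * gradientAccumulationSteps;
}

// Training examples consumed over maxSteps optimizer updates
// (assumption: one update per accumulated batch, single GPU).
function examplesSeen(batchSize, gradientAccumulationSteps, maxSteps) {
  return effectiveBatchSize(batchSize, gradientAccumulationSteps) * maxSteps;
}

// Documented defaults: batch_size 2, gradient_accumulation_steps 4, max_steps 100.
console.log(effectiveBatchSize(2, 4)); // 8
console.log(examplesSeen(2, 4, 100)); // 800
```

Under these assumptions the default settings touch about 800 examples, which is a short run relative to most datasets; max_steps is the value to raise for longer training.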

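generate_text

Generates text using a fine-tuned Unsloth model.

Parameters:

model_path (required): path to the fine-tuned model
prompt (required): prompt for text generation
max_new_tokens (optional): maximum number of tokens to generate (default: 256)
temperature (optional): sampling temperature for text generation (default: 0.7)
top_p (optional): top-p (nucleus sampling) value for text generation (default: 0.9)

Example: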

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "generate_text",
  arguments: {
    model_path: "./fine-tuned-model",
    prompt: "Write a short story about a robot learning to paint:",
    max_new_tokens: 512,
    temperature: 0.8
  }
});
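To give a feel for what temperature and top_p control, here is a generic sketch of temperature scaling followed by top-p (nucleus) filtering over a toy four-token vocabulary. It illustrates the general technique only; it is not the server's or Unsloth's implementation, and the function name and logit values are made up for the example.

```javascript
// Generic illustration of temperature scaling and top-p (nucleus) filtering.
// Not the server's implementation; the inputs below are toy values.
function nucleusDistribution(logits, temperature, topP) {
  // Temperature: divide logits before softmax. Values below 1 sharpen the
  // distribution, values above 1 flatten it.
  const scaled = logits.map((x) => x / temperature);
  const maxLogit = Math.max(...scaled);
  const exps = scaled.map((x) => Math.exp(x - maxLogit)); // stable softmax
  const total = exps.reduce((a, b) => a + b, 0);
  const probs = exps.map((e, index) => ({ index, p: e / total }));

  // Top-p: keep the smallest set of most likely tokens whose cumulative
  // probability reaches topP.
  probs.sort((a, b) => b.p - a.p);
  const kept = [];
  let mass = 0;
  for (const token of probs) {
    kept.push(token);
    mass += token.p;
    if (mass >= topP) break;
  }
  // Renormalize the surviving tokens so their probabilities sum to 1.
  return kept.map(({ index, p }) => ({ index, p: p / mass }));
}

const dist = nucleusDistribution([2.0, 1.0, 0.1, -1.0], 0.7, 0.9);
console.log(dist);
```

With these toy inputs only the two most likely tokens survive the filter, so the next token would be sampled from those two. Lower temperature or lower top_p narrows the choice further and makes output more deterministic.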

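export_model

Exports a fine-tuned Unsloth model to various formats for deployment.

Parameters:

model_path (required): path to the fine-tuned model
export_format (required): format to export to (gguf, ollama, vllm, huggingface)
output_path (required): path where the exported model is saved
quantization_bits (optional): number of quantization bits, used for GGUF export (default: 4)

Example: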

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "export_model",
  arguments: {
    model_path: "./fine-tuned-model",
    export_format: "gguf",
    output_path: "./exported-model.gguf",
    quantization_bits: 4
  }
});
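Since export_model has three required parameters and a fixed set of formats, a small pre-flight check can catch mistakes before a long export starts. The sketch below uses a hypothetical `validateExportArgs` helper that is not part of the server; the format list is copied from the parameter description above.

```javascript
// Export formats listed for export_model in this README.
const EXPORT_FORMATS = ["gguf", "ollama", "vllm", "huggingface"];

// Hypothetical pre-flight check (not part of the server): returns a list of
// problems found in the arguments, or an empty list if they look valid.
function validateExportArgs(args) {
  const problems = [];
  for (const key of ["model_path", "export_format", "output_path"]) {
    if (!args[key]) problems.push(`missing required parameter: ${key}`);
  }
  if (args.export_format && !EXPORT_FORMATS.includes(args.export_format)) {
    problems.push(`unsupported export_format: ${args.export_format}`);
  }
  return problems;
}

console.log(
  validateExportArgs({
    model_path: "./fine-tuned-model",
    export_format: "gguf",
    output_path: "./exported-model.gguf",
  })
);
```

Returning a list instead of throwing on the first problem lets a caller report every issue at once.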

Advanced Usage

Custom Datasets

You can use a custom dataset by formatting it appropriately and hosting it on Hugging Face, or by providing a local path.

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "json",
    data_files: {"train": "path/to/your/data.json"},
    output_dir: "./fine-tuned-model"
  }
});
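The README does not spell out what "formatted appropriately" means for a local file. Given that dataset_text_field defaults to 'text', one plausible layout is a JSON array of records that each carry a `text` field. The sketch below builds such a file body; the raw field names and the prompt template are assumptions chosen for illustration, not a format required by the server.

```javascript
// Hypothetical raw examples; these field names are illustrative only.
const rawExamples = [
  { instruction: "Translate to French: Hello", response: "Bonjour" },
  { instruction: "Add 2 and 3.", response: "5" },
];

// Assumed prompt template (Alpaca-like). The README does not prescribe one.
function toTextRecord({ instruction, response }) {
  return {
    text: `### Instruction:\n${instruction}\n\n### Response:\n${response}`,
  };
}

// One record per example, each with the single "text" field that
// dataset_text_field points to by default.
const records = rawExamples.map(toTextRecord);
const json = JSON.stringify(records, null, 2);
console.log(json);
```

Saving the printed JSON to the path given in data_files (path/to/your/data.json in the example above) should then line up with the default dataset_text_field.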

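Memory Optimization

For large models on limited hardware:

Reduce the batch size and increase the gradient accumulation steps
Use 4-bit quantization
Enable gradient checkpointing
Reduce the sequence length if possible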
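The first tip trades memory for time without changing the optimization: if batch_size is halved and gradient_accumulation_steps is doubled, their product stays the same. A minimal sketch of that adjustment, using a hypothetical helper name, might look like this:

```javascript
// Halve the per-step batch size and double gradient accumulation so that
// their product (the effective batch size) is unchanged while fewer
// examples are held in GPU memory at once.
function halveBatchKeepEffective({ batch_size, gradient_accumulation_steps }) {
  if (batch_size < 2 || batch_size % 2 !== 0) {
    throw new Error("batch_size must be an even number of at least 2");
  }
  return {
    batch_size: batch_size / 2,
    gradient_accumulation_steps: gradient_accumulation_steps * 2,
  };
}

const before = { batch_size: 2, gradient_accumulation_steps: 4 };
const after = halveBatchKeepEffective(before);
console.log(after); // { batch_size: 1, gradient_accumulation_steps: 8 }
```

The adjusted values can be merged into the finetune_model arguments; each optimizer update then takes more forward and backward passes, so training is slower per step.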

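Troubleshooting

Common Issues

CUDA out of memory: reduce the batch size, use 4-bit quantization, or try a smaller model
Import errors: make sure the correct versions of torch, transformers, and unsloth are installed
Model not found: check that you are using a supported model name and that you have access to private models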

Version Compatibility

Python: 3.10, 3.11, or 3.12 (3.13 is not supported)
CUDA: 11.8 or 12.1+ recommended
PyTorch: 2.0+ recommended
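The Python constraint is easy to get wrong with string comparison, because "3.9" sorts after "3.10" as text. The sketch below parses the version numerically; `isSupportedPython` is a hypothetical helper, and the accepted range is the one stated above.

```javascript
// Supported Python versions per this README: 3.10, 3.11, 3.12 (not 3.13).
// The minor version is compared as a number, not as text.
function isSupportedPython(version) {
  const match = /^(\d+)\.(\d+)/.exec(version);
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major === 3 && minor >= 10 && minor <= 12;
}

console.log(isSupportedPython("3.11.4")); // true
console.log(isSupportedPython("3.13.0")); // false
console.log(isSupportedPython("3.9.18")); // false
```

The version string would typically come from the output of `python --version` with the leading "Python " removed.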

Performance Benchmarks

Model	VRAM	Unsloth speed	VRAM reduction	Context length
Llama 3.3 (70B)	80GB	2x faster	>75%	13x longer
Llama 3.1 (8B)	80GB	2x faster	>70%	12x longer
Mistral v0.3 (7B)	80GB	2.2x faster	75% less	-

Requirements

Python 3.10-3.12
CUDA-capable NVIDIA GPU (recommended)
Node.js and npm

License

Apache 2.0
