Unsloth MCP Server

An MCP server for Unsloth, a library that makes LLM fine-tuning 2x faster with 80% less memory.

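What is Unsloth?

Unsloth is a library that dramatically improves the efficiency of fine-tuning large language models (LLMs):

Speed: 2x faster fine-tuning than standard methods
Memory: 80% less VRAM usage, which allows fine-tuning larger models on consumer GPUs
Context length: up to 13x longer context (e.g., 89K tokens for Llama 3.3 on an 80GB GPU)
Accuracy: no loss in model quality or performance

Unsloth achieves these improvements through custom CUDA kernels written in OpenAI's Triton language, optimized backpropagation, and dynamic 4-bit quantization.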
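
To give a feel for why 4-bit quantization matters, here is a back-of-the-envelope sketch of the memory needed for model weights alone at different bit widths. The function name and the 8-billion-parameter figure are illustrative assumptions. The estimate ignores activations, gradients, optimizer state, and quantization overhead, so it is not a reproduction of the 80% figure above.

```javascript
// Rough estimate of the memory needed just to hold model weights.
// Assumption: every parameter is stored at the given bit width. Real 4-bit
// formats add a small per-block overhead, and training needs extra memory
// for activations, gradients, and optimizer state.
function estimateWeightMemoryGB(numParams, bitsPerParam) {
  const bytes = numParams * (bitsPerParam / 8);
  return bytes / 1e9; // decimal gigabytes
}

const exampleParams = 8e9; // hypothetical 8B-parameter model
const weights16bit = estimateWeightMemoryGB(exampleParams, 16);
const weights4bit = estimateWeightMemoryGB(exampleParams, 4);
console.log(`16-bit: ${weights16bit} GB, 4-bit: ${weights4bit} GB`);
```

Under these assumptions, storing the weights in 4 bits takes a quarter of the space of 16-bit weights.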

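Features

Optimized fine-tuning for Llama, Mistral, Phi, Gemma, and other models
4-bit quantization for efficient training
Extended context length support
Simple API for model loading, fine-tuning, and inference
Export to multiple formats (GGUF, Hugging Face, etc.)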

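Quick Start

Install Unsloth: pip install unsloth
Install the server's dependencies and build it (the configuration below expects the build output at build/index.js).
Add the server to your MCP settings (HUGGINGFACE_TOKEN is optional):

{
  "mcpServers": {
    "unsloth-server": {
      "command": "node",
      "args": ["/path/to/unsloth-server/build/index.js"],
      "env": {
        "HUGGINGFACE_TOKEN": "your_token_here"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}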
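
Because strict JSON does not allow comments, it can be convenient to generate the settings entry from a small script rather than edit it by hand. The following is a minimal sketch. `buildServerEntry` is a hypothetical helper, not part of the server, and the path is the same placeholder used above.

```javascript
// Build the "unsloth-server" entry for an MCP settings file.
// buildServerEntry is a hypothetical helper for illustration only.
function buildServerEntry(indexJsPath, huggingFaceToken) {
  const entry = {
    command: "node",
    args: [indexJsPath],
    env: {},
    disabled: false,
    autoApprove: [],
  };
  // The token is optional, so only include it when one is supplied.
  if (huggingFaceToken) {
    entry.env.HUGGINGFACE_TOKEN = huggingFaceToken;
  }
  return { mcpServers: { "unsloth-server": entry } };
}

const exampleSettings = buildServerEntry("/path/to/unsloth-server/build/index.js");
console.log(JSON.stringify(exampleSettings, null, 2));
```

The printed JSON can be merged into the MCP settings file of your client.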

Available Tools

check_installation

Check whether Unsloth is correctly installed on your system.

Parameters: None

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "check_installation",
  arguments: {}
});
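
The examples in this document call `use_mcp_tool`, which is supplied by the MCP client. Under the Model Context Protocol, a tool invocation travels as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below shows what such a message might look like. `buildToolCallRequest` is a hypothetical helper, and how a given client maps `use_mcp_tool` onto this message is an assumption.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request as defined by the MCP spec.
// buildToolCallRequest is a hypothetical helper; a real MCP client
// constructs and sends this message for you.
let nextRequestId = 1;

function buildToolCallRequest(toolName, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

console.log(JSON.stringify(buildToolCallRequest("check_installation"), null, 2));
```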

list_supported_models

Get a list of all models supported by Unsloth, including Llama, Mistral, Phi, and Gemma variants.

Parameters: None

Example:

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "list_supported_models",
  arguments: {}
});

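load_model

Load a pretrained model with Unsloth optimizations for faster inference and fine-tuning.

Parameters:

model_name (required): Name of the model to load (e.g., "unsloth/Llama-3.2-1B")
max_seq_length (optional): Maximum sequence length for the model (default: 2048)
load_in_4bit (optional): Whether to load the model with 4-bit quantization (default: true)
use_gradient_checkpointing (optional): Whether to use gradient checkpointing to save memory (default: true)

Example: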

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "load_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    max_seq_length: 4096,
    load_in_4bit: true
  }
});
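
The call above omits use_gradient_checkpointing, so the documented default applies. The following sketch makes the effective arguments explicit on the client side. `withLoadModelDefaults` is a hypothetical helper that mirrors the defaults listed above; the server applies its own defaults regardless.

```javascript
// Defaults copied from the load_model parameter list above.
const LOAD_MODEL_DEFAULTS = {
  max_seq_length: 2048,
  load_in_4bit: true,
  use_gradient_checkpointing: true,
};

// Hypothetical client-side helper: checks the required field and fills in
// documented defaults so the effective arguments are visible before sending.
function withLoadModelDefaults(args) {
  if (!args || typeof args.model_name !== "string" || args.model_name === "") {
    throw new Error("model_name is required");
  }
  // Caller-supplied values override the defaults.
  return { ...LOAD_MODEL_DEFAULTS, ...args };
}

console.log(
  withLoadModelDefaults({ model_name: "unsloth/Llama-3.2-1B", max_seq_length: 4096 })
);
```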

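finetune_model

Fine-tune a model with Unsloth optimizations using LoRA/QLoRA techniques.

Parameters:

model_name (required): Name of the model to fine-tune
dataset_name (required): Name of the dataset to use for fine-tuning
output_dir (required): Directory in which to save the fine-tuned model
max_seq_length (optional): Maximum sequence length for training (default: 2048)
lora_rank (optional): Rank for LoRA fine-tuning (default: 16)
lora_alpha (optional): Alpha for LoRA fine-tuning (default: 16)
batch_size (optional): Batch size for training (default: 2)
gradient_accumulation_steps (optional): Number of gradient accumulation steps (default: 4)
learning_rate (optional): Learning rate for training (default: 2e-4)
max_steps (optional): Maximum number of training steps (default: 100)
dataset_text_field (optional): Field in the dataset that contains the text (default: 'text')
load_in_4bit (optional): Whether to use 4-bit quantization (default: true)

Example: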

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "tatsu-lab/alpaca",
    output_dir: "./fine-tuned-model",
    max_steps: 100,
    batch_size: 2,
    learning_rate: 2e-4
  }
});
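
Several of these parameters interact, so it can help to work out what a configuration implies before starting a run. The sketch below computes the effective batch size, the number of examples processed, and the LoRA scaling factor. It assumes a single GPU, that max_steps counts optimizer updates, and the standard LoRA scaling of alpha divided by rank. `summarizeTrainingBudget` is a hypothetical helper, and whether the server's trainer follows exactly these conventions is an assumption.

```javascript
// Hypothetical helper: derive a few quantities implied by the fine-tuning
// arguments. Defaults are the ones documented for finetune_model.
function summarizeTrainingBudget({
  batch_size = 2,
  gradient_accumulation_steps = 4,
  max_steps = 100,
  lora_rank = 16,
  lora_alpha = 16,
} = {}) {
  // Gradients from several small batches are summed before each update.
  const effectiveBatchSize = batch_size * gradient_accumulation_steps;
  return {
    effectiveBatchSize,
    // Assumes one optimizer update per step on a single device.
    examplesProcessed: effectiveBatchSize * max_steps,
    // Standard LoRA scales the adapter update by alpha / rank.
    loraScale: lora_alpha / lora_rank,
  };
}

console.log(summarizeTrainingBudget());
```

With the documented defaults, this gives an effective batch size of 8, 800 examples processed over 100 steps, and a LoRA scale of 1.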

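generate_text

Generate text using a fine-tuned Unsloth model.

Parameters:

model_path (required): Path to the fine-tuned model
prompt (required): Prompt for text generation
max_new_tokens (optional): Maximum number of tokens to generate (default: 256)
temperature (optional): Sampling temperature for text generation (default: 0.7)
top_p (optional): Top-p (nucleus sampling) value for text generation (default: 0.9)

Example: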

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "generate_text",
  arguments: {
    model_path: "./fine-tuned-model",
    prompt: "Write a short story about a robot learning to paint:",
    max_new_tokens: 512,
    temperature: 0.8
  }
});
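
For readers unfamiliar with temperature and top_p, the sketch below illustrates the general technique on a toy three-token vocabulary: temperature rescales the logits before the softmax, and top-p keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold. This is a generic illustration with hypothetical function names, not the server's or Unsloth's implementation.

```javascript
// Convert logits to probabilities after dividing by the temperature.
// Lower temperatures sharpen the distribution; higher ones flatten it.
function softmaxWithTemperature(logits, temperature) {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Keep the most likely tokens until their cumulative probability reaches
// topP, then renormalize the kept probabilities so they sum to 1.
function topPFilter(probs, topP) {
  const ranked = probs
    .map((p, index) => ({ p, index }))
    .sort((a, b) => b.p - a.p);
  const kept = [];
  let cumulative = 0;
  for (const item of ranked) {
    kept.push(item);
    cumulative += item.p;
    if (cumulative >= topP) break;
  }
  const total = kept.reduce((acc, item) => acc + item.p, 0);
  return kept.map(({ p, index }) => ({ index, p: p / total }));
}

const toyLogits = [2, 1, 0]; // toy vocabulary of three tokens
const toyProbs = softmaxWithTemperature(toyLogits, 1);
console.log(topPFilter(toyProbs, 0.9));
```

In this toy case the two most likely tokens together cover just over 90% of the probability mass, so the third token is dropped before sampling.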

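export_model

Export a fine-tuned Unsloth model to various formats for deployment.

Parameters:

model_path (required): Path to the fine-tuned model
export_format (required): Format to export to (gguf, ollama, vllm, huggingface)
output_path (required): Path at which to save the exported model
quantization_bits (optional): Number of bits for quantization, used for GGUF export (default: 4)

Example: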

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "export_model",
  arguments: {
    model_path: "./fine-tuned-model",
    export_format: "gguf",
    output_path: "./exported-model.gguf",
    quantization_bits: 4
  }
});
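
Since export_format accepts only the four values listed above, a quick client-side check can catch typos before a long export starts. The following is a minimal sketch. `validateExportArgs` is a hypothetical helper, and the server performs its own validation independently.

```javascript
// Formats taken from the export_model parameter list above.
const EXPORT_FORMATS = ["gguf", "ollama", "vllm", "huggingface"];

// Hypothetical client-side check: returns a list of problems, which is
// empty when the arguments look valid.
function validateExportArgs(args) {
  const errors = [];
  for (const key of ["model_path", "export_format", "output_path"]) {
    if (!args[key]) errors.push(`${key} is required`);
  }
  if (args.export_format && !EXPORT_FORMATS.includes(args.export_format)) {
    errors.push(`export_format must be one of: ${EXPORT_FORMATS.join(", ")}`);
  }
  return errors;
}

console.log(
  validateExportArgs({
    model_path: "./fine-tuned-model",
    export_format: "ggml", // typo for "gguf"
    output_path: "./exported-model.gguf",
  })
);
```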

Advanced Usage

Custom Datasets

You can use a custom dataset by formatting it correctly and either hosting it on Hugging Face or providing a local path.

const result = await use_mcp_tool({
  server_name: "unsloth-server",
  tool_name: "finetune_model",
  arguments: {
    model_name: "unsloth/Llama-3.2-1B",
    dataset_name: "json",
    data_files: {"train": "path/to/your/data.json"},
    output_dir: "./fine-tuned-model"
  }
});
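
As an illustration of what a correctly formatted local file might contain, the sketch below turns instruction/response pairs into JSON Lines with a single text field, which matches the default dataset_text_field. The prompt template and the `toJsonLines` helper are hypothetical; use whatever template suits your model. One JSON object per line is a layout that the Hugging Face `json` dataset loader accepts.

```javascript
// Hypothetical helper: convert instruction/response pairs into JSON Lines,
// one object per line, each with a single "text" field.
function toJsonLines(examples) {
  return (
    examples
      .map(({ instruction, response }) =>
        JSON.stringify({
          // Illustrative template only; adapt it to your model.
          text: `### Instruction:\n${instruction}\n\n### Response:\n${response}`,
        })
      )
      .join("\n") + "\n"
  );
}

const sampleData = [
  { instruction: "Translate 'hello' to French.", response: "bonjour" },
  { instruction: "What is 2 + 2?", response: "4" },
];

// Save this string to the file referenced by data_files.
console.log(toJsonLines(sampleData));
```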

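Memory Optimization

For large models on limited hardware:

Reduce the batch size and increase the number of gradient accumulation steps
Use 4-bit quantization
Enable gradient checkpointing
Reduce the sequence length if possible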
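
The first tip can be sketched as a small calculation: lower batch_size to reduce peak memory, and raise gradient_accumulation_steps so that each optimizer update still sees at least as many examples. `shrinkBatch` is a hypothetical helper, not part of the server.

```javascript
// Hypothetical helper: lower batch_size while keeping the effective batch
// size (batch_size * gradient_accumulation_steps) at least as large.
function shrinkBatch(config, newBatchSize) {
  const { batch_size, gradient_accumulation_steps } = config;
  if (!Number.isInteger(newBatchSize) || newBatchSize < 1 || newBatchSize > batch_size) {
    throw new RangeError("newBatchSize must be an integer between 1 and batch_size");
  }
  const effective = batch_size * gradient_accumulation_steps;
  return {
    ...config,
    batch_size: newBatchSize,
    // Round up so the effective batch size never shrinks.
    gradient_accumulation_steps: Math.ceil(effective / newBatchSize),
  };
}

console.log(shrinkBatch({ batch_size: 2, gradient_accumulation_steps: 4 }, 1));
// → { batch_size: 1, gradient_accumulation_steps: 8 }
```

The trade-off is speed: each update now takes more forward and backward passes.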

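Troubleshooting

Common Issues

CUDA out of memory: Reduce the batch size, use 4-bit quantization, or try a smaller model.
Import errors: Make sure the correct versions of torch, transformers, and unsloth are installed.
Model not found: Check that you are using a supported model name and that you have access to private models.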

Version Compatibility

Python: 3.10, 3.11, or 3.12 (not 3.13)
CUDA: 11.8 or 12.1+ recommended
PyTorch: 2.0 or later recommended
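
The Python constraint is easy to check programmatically before launching the server. The following is a minimal sketch that tests a version string, such as the one printed by python --version, against the range above. `isSupportedPython` is a hypothetical helper.

```javascript
// Hypothetical helper: true when the version string is Python 3.10 to 3.12.
function isSupportedPython(version) {
  const match = /^(\d+)\.(\d+)/.exec(version.trim());
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return major === 3 && minor >= 10 && minor <= 12;
}

console.log(isSupportedPython("3.11.4")); // true
console.log(isSupportedPython("3.13.0")); // false
```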

Performance Benchmarks

Model	VRAM	Unsloth Speed	VRAM Reduction	Context Length
Llama 3.3 (70B)	80GB	2x faster	>75%	13x longer
Llama 3.1 (8B)	80GB	2x faster	>70%	12x longer
Mistral v0.3 (7B)	80GB	2.2x faster	75% less	-

Requirements

Python 3.10-3.12
NVIDIA GPU with CUDA support (recommended)
Node.js and npm

License

Apache-2.0
