# ML Lab MCP
A comprehensive MCP (Model Context Protocol) server for ML model training, fine-tuning, and experimentation. Transform your AI assistant into a full ML engineering environment.
## Features

### Unified Credential Management

- Encrypted vault for API keys (Lambda Labs, RunPod, Mistral, OpenAI, Together AI, etc.)
- PBKDF2 key derivation with AES encryption
- Never stores credentials in plaintext
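The key-derivation step (PBKDF2-SHA256, per the Security section's stated 480,000 iterations) can be sketched with the standard library alone. The function name and salt handling below are illustrative, not the server's actual code; the real vault layers Fernet encryption on top of this derived key.

```python
import base64
import hashlib
import os

def derive_vault_key(password: str, salt: bytes, iterations: int = 480_000) -> bytes:
    """Derive a Fernet-compatible key from the vault password via PBKDF2-SHA256."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Fernet expects a urlsafe-base64-encoded 32-byte key.
    return base64.urlsafe_b64encode(raw)

salt = os.urandom(16)  # the salt is stored alongside the vault; it need not be secret
key = derive_vault_key("correct horse battery staple", salt)
```

Because the salt is random per vault, the same password yields a different key for every vault file, so a leaked vault cannot be attacked with precomputed tables.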
### Dataset Management

- Register datasets from local files (JSONL, CSV, Parquet)
- Automatic schema inference and statistics
- Train/val/test splitting
- Template-based transformations
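Train/val/test splitting amounts to a seeded shuffle followed by slicing. A minimal sketch (the record fields and split fractions are hypothetical, not the server's defaults):

```python
import random

def split_dataset(records, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle records with a fixed seed, then slice into train/val/test."""
    rng = random.Random(seed)
    shuffled = records[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        "test": shuffled[:n_test],
        "val": shuffled[n_test:n_test + n_val],
        "train": shuffled[n_test + n_val:],
    }

rows = [{"prompt": f"q{i}", "completion": f"a{i}"} for i in range(100)]
splits = split_dataset(rows)
```

Pinning the seed makes splits reproducible, so re-registering the same file always yields the same partitions.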
### Experiment Tracking

- SQLite-backed experiment storage
- Version control and comparison
- Fork experiments with config modifications
- Full metrics history
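A minimal sketch of how SQLite-backed storage with forking might work; the schema and function below are assumptions for illustration, not the server's actual tables. Forking copies the row, records the parent, and merges config overrides:

```python
import json
import sqlite3

# In-memory DB for illustration; a real tracker would use a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE experiments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES experiments(id),
    config TEXT NOT NULL)""")

def fork_experiment(conn, exp_id, overrides):
    """Copy an experiment, recording its parent and merging config overrides."""
    name, config = conn.execute(
        "SELECT name, config FROM experiments WHERE id = ?", (exp_id,)).fetchone()
    merged = {**json.loads(config), **overrides}
    cur = conn.execute(
        "INSERT INTO experiments (name, parent_id, config) VALUES (?, ?, ?)",
        (name + "-fork", exp_id, json.dumps(merged)))
    return cur.lastrowid

base_id = conn.execute(
    "INSERT INTO experiments (name, config) VALUES (?, ?)",
    ("baseline", json.dumps({"lr": 2e-4, "epochs": 3}))).lastrowid
fork_id = fork_experiment(conn, base_id, {"lr": 1e-4})
```

Keeping `parent_id` in the table is what makes the fork lineage queryable for later comparison.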
### Multi-Backend Training

- **Local**: transformers + peft + trl for local GPU training
- **Mistral API**: Native fine-tuning for Mistral models
- **Together AI**: Hosted fine-tuning service
- **OpenAI**: GPT model fine-tuning
### Cloud GPU Provisioning

- **Lambda Labs**: H100, A100 instances
- **RunPod**: Spot and on-demand GPUs
- Automatic price comparison across providers
- Smart routing based on cost and availability
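At its core, price comparison with availability-aware routing is a filter plus a minimum over provider offers. A sketch under assumed data shapes (the offer dictionaries and prices below are hypothetical, not live quotes):

```python
def cheapest_available(offers, gpu_type):
    """Return the lowest-priced available offer for a GPU type, or None."""
    candidates = [o for o in offers if o["gpu"] == gpu_type and o["available"]]
    return min(candidates, key=lambda o: o["price_per_hour"], default=None)

# Hypothetical snapshot of provider pricing.
offers = [
    {"provider": "lambda", "gpu": "H100", "price_per_hour": 2.49, "available": True},
    {"provider": "runpod", "gpu": "H100", "price_per_hour": 2.79, "available": True},
    {"provider": "runpod", "gpu": "A100", "price_per_hour": 1.19, "available": True},
]
best = cheapest_available(offers, "H100")
```

Returning `None` rather than raising when nothing matches lets the caller fall back to another GPU type or a VPS.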
### Remote VPS Support

- Use any SSH-accessible machine (Hetzner, Hostinger, OVH, home server, university cluster)
- Automatic environment setup
- Dataset sync via rsync
- Training runs in tmux (persistent across disconnects)
- Amortized hourly cost calculation from monthly fees
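The amortized hourly cost is just the flat monthly fee spread over the average hours in a month (365 × 24 / 12 ≈ 730). A sketch, with a hypothetical 40-per-month VPS:

```python
def amortized_hourly_cost(monthly_fee: float, hours_per_month: float = 730.0) -> float:
    """Spread a flat monthly VPS fee over the average hours in a month."""
    return monthly_fee / hours_per_month

# A hypothetical 40/month VPS works out to roughly 0.055/hour.
rate = amortized_hourly_cost(40.0)
```

This makes a fixed-fee VPS directly comparable with per-hour cloud GPU pricing in the cost estimates.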
### Cost Estimation

- Up-front cost estimates across all providers, before a run starts
- Real-time pricing queries
- Time estimates based on model and dataset size
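A time-and-cost estimate of this kind reduces to token count, an assumed throughput, and an hourly price. The numbers and function below are illustrative assumptions, not the server's actual estimator:

```python
def estimate_training(tokens: int, gpu_tokens_per_sec: float,
                      price_per_hour: float, epochs: int = 3):
    """Rough wall-clock and cost estimate from throughput and hourly price."""
    hours = tokens * epochs / gpu_tokens_per_sec / 3600
    return {"hours": hours, "cost": hours * price_per_hour}

# Hypothetical: 10M training tokens, 3 epochs, 5k tok/s on one GPU at $2.49/h.
est = estimate_training(10_000_000, 5_000.0, 2.49)
```

Real throughput varies with model size, sequence length, and batch size, so estimates like this are order-of-magnitude guides rather than quotes.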
### Ollama Integration

- Deploy fine-tuned GGUF models to Ollama
- Pull models from the Ollama registry
- Chat/inference testing directly from MCP
- Model management (list, delete, copy)
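Deploying a GGUF to Ollama generally means writing a Modelfile and running `ollama create`. The file name, parameters, and system prompt below are placeholders:

```
FROM ./merged-model.Q4_K_M.gguf
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant fine-tuned for support tickets."""
```

Then `ollama create my-finetune -f Modelfile` registers the model locally, and `ollama run my-finetune` starts an interactive session against it.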
### Open WebUI Integration

- Create model presets with system prompts
- Knowledge base management (RAG)
- Chat through Open WebUI (applies configs + knowledge)
- Seamless Ollama ↔ Open WebUI workflow
## Installation

## Quick Start

### 1. Initialize and Create Vault

### 2. Add Provider Credentials

### 3. Configure with Claude Code / Claude Desktop

Add to your MCP configuration:
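A typical entry looks like the following; the server key and launch command are placeholders, since this README does not show the package's actual entry point:

```json
{
  "mcpServers": {
    "ml-lab": {
      "command": "python",
      "args": ["-m", "ml_lab_mcp"]
    }
  }
}
```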
## MCP Tools

### Credentials

| Tool | Description |
|------|-------------|
| | Create encrypted credential vault |
| | Unlock vault with password |
| | Add provider credentials |
| | List configured providers |
| | Verify credentials work |
### Datasets

| Tool | Description |
|------|-------------|
| | Register a local dataset file |
| | List all datasets |
| | View schema and statistics |
| | Preview samples |
| | Create train/val/test splits |
| | Apply template transformations |
### Experiments

| Tool | Description |
|------|-------------|
| | Create new experiment |
| | List experiments |
| | Get experiment details |
| | Compare multiple experiments |
| | Fork with modifications |
### Training

| Tool | Description |
|------|-------------|
| | Estimate cost/time across providers |
| | Start training run |
| | Check run status |
| | Stop training |
### Infrastructure

| Tool | Description |
|------|-------------|
| | List available GPUs with pricing |
| | Provision cloud instance |
| | Terminate instance |
### Remote VPS

| Tool | Description |
|------|-------------|
| | Register a VPS (host, user, key, GPU info, monthly cost) |
| | List all registered VPS machines |
| | Check VPS status (online, GPU, running jobs) |
| | Remove a VPS from registry |
| | Install training dependencies on VPS |
| | Sync dataset to VPS |
| | Run command on VPS |
| | Get training logs from VPS |
### Ollama

| Tool | Description |
|------|-------------|
| | Check Ollama status (running, version, GPU) |
| | List models in Ollama |
| | Pull model from registry |
| | Deploy GGUF to Ollama |
| | Chat with a model |
| | Delete a model |
### Open WebUI

| Tool | Description |
|------|-------------|
| | Check Open WebUI connection |
| | List model configurations |
| | Create model preset (system prompt, params) |
| | Delete model configuration |
| | List knowledge bases |
| | Create knowledge base |
| | Add file to knowledge base |
| | Chat through Open WebUI |
### Security

| Tool | Description |
|------|-------------|
| | View recent audit log entries |
| | Get audit activity summary |
| | Check Tailscale VPN connection |
| | Rotate SSH key for a VPS |
| | Check credential expiry status |
| | Rotate credentials for a provider |
## Example Workflow

## Architecture
## Security

- Credentials encrypted with Fernet (AES-128-CBC)
- PBKDF2-SHA256 key derivation (480,000 iterations)
- Vault file permissions set to 600 (owner read/write only)
- API keys never logged or transmitted unencrypted
- **Audit logging**: All sensitive operations logged to `~/.cache/ml-lab/audit.log`
- **Credential expiry**: Automatic tracking with rotation reminders
- **Tailscale support**: Optional VPN requirement for VPS connections
- **SSH key rotation**: Automated rotation with rollback on failure
## Supported Providers

### Compute Providers

- Lambda Labs (H100, A100, A10)
- RunPod (H100, A100, RTX 4090)
- Modal (serverless GPU functions)

### Fine-Tuning APIs

- Mistral AI (Mistral, Mixtral, Codestral)
- Together AI (Llama, Mistral, Qwen)
- OpenAI (GPT-4o, GPT-3.5)

### Model Hubs

- Hugging Face Hub
- Replicate
## Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

## License

PolyForm Noncommercial 1.0.0 - free for personal use; contact for commercial licensing.

See LICENSE for details.