⚡ WarpGBM MCP Service
GPU-accelerated gradient boosting as a cloud MCP service
Train on A10G GPUs • Get an `artifact_id` for <100ms cached predictions • Download portable artifacts
🌐 Live Service • 📖 API Docs • 🤖 Agent Guide • 🐍 Python Package
🎯 What is This?
Outsource your GBDT workload to the world's fastest GPU implementation.
WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), this service handles training on NVIDIA A10G GPUs while you receive portable model artifacts and benefit from smart 5-minute caching.
🏗️ How It Works (The Smart Cache Workflow)
1. **Train**: POST your data → train on an A10G GPU → get an `artifact_id` + a portable artifact
2. **Fast Path**: send the `artifact_id` → sub-100ms cached predictions (5min TTL)
3. **Slow Path**: take `model_artifact_joblib` → download and use anywhere
Architecture: 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts
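The fast/slow split above boils down to a simple client-side policy: reuse the `artifact_id` while the 5-minute cache is warm, and fall back to the downloaded joblib artifact after it expires. A minimal sketch of that decision (the function and constants here are illustrative, not part of the service API):

```python
# Client-side view of the service's 5-minute artifact cache:
# while the cache is warm, send the artifact_id (fast path);
# once it expires, use the downloaded joblib artifact (slow path).
CACHE_TTL_SECONDS = 5 * 60

def choose_path(trained_at: float, now: float) -> str:
    """Return which prediction path a client should take."""
    if now - trained_at < CACHE_TTL_SECONDS:
        return "fast"   # send artifact_id, expect <100ms cached inference
    return "slow"       # load model_artifact_joblib locally instead

t0 = 1_000.0
print(choose_path(t0, t0 + 30))    # fast  (30s after training)
print(choose_path(t0, t0 + 360))   # slow  (6min after training)
```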
⚡ Quick Start
For AI Agents (MCP)
Add to your MCP settings (e.g., `.cursor/mcp.json`):
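A minimal configuration sketch. The server name and URL path are assumptions based on the service's domain; check the Agent Guide for the exact values:

```json
{
  "mcpServers": {
    "warpgbm": {
      "url": "https://warpgbm.ai/mcp"
    }
  }
}
```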
For Developers (REST API)
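A hedged sketch of a training call using only the standard library. The endpoint path (`/train`) and field names (`model_type`, `X`, `y`) are assumptions based on the workflow described above; consult the live API docs for the authoritative schema:

```python
import json
import urllib.request

BASE_URL = "https://warpgbm.ai"

# Assumed request schema -- verify against the API docs.
payload = {
    "model_type": "warpgbm",                      # or "lightgbm" for the CPU backend
    "X": [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]],    # feature rows
    "y": [0, 0, 1],                               # labels
}
body = json.dumps(payload).encode()

req = urllib.request.Request(
    f"{BASE_URL}/train",
    data=body,
    headers={"Content-Type": "application/json"},
)
# resp = json.load(urllib.request.urlopen(req))   # requires network access
# artifact_id = resp["artifact_id"]               # reuse within 5min for fast predictions
```

The returned `artifact_id` goes into subsequent prediction calls while the cache is warm; the joblib artifact in the same response is yours to keep.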
🚀 Key Features
| Feature | Description |
|---|---|
| 🎯 Multi-Model | WarpGBM (GPU) + LightGBM (CPU) |
| ⚡ Smart Caching | `artifact_id` → 5min cache → <100ms inference |
| 📦 Portable Artifacts | Download joblib models, use anywhere |
| 🤖 MCP Native | Direct tool integration for AI agents |
| 💰 X402 Payments | Optional micropayments (Base network) |
| 🔒 Stateless | No data storage, you own your models |
| 🌐 Production Ready | Deployed on Modal with custom domain |
🐍 Python Package vs MCP Service
This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:
| Feature | MCP Service (This Repo) | Python Package |
|---|---|---|
| Installation | None needed | `pip install warpgbm` |
| GPU | Cloud (pay-per-use) | Your GPU (free) |
| Control | REST API parameters | Full Python API |
| Features | Train, predict, upload | + Cross-validation, callbacks, feature importance |
| Best For | Quick experiments, demos | Production pipelines, research |
| Cost | $0.01 per training | Free (your hardware) |
Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control
📡 Available Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | | List available model backends |
| POST | | Train model, get `artifact_id` + model |
| POST | | Fast predictions (`artifact_id` or model) |
| POST | | Probability predictions |
| POST | | Upload CSV/Parquet for training |
| POST | | Submit feedback to improve service |
| GET | | Health check with GPU status |
MCP Integration
| Method | Endpoint | Description |
|---|---|---|
| GET | | MCP Server-Sent Events endpoint |
| GET | | MCP capability manifest |
| GET | `/.well-known/x402` | X402 pricing manifest |
💡 Complete Example: Iris Dataset
⚠️ Important: WarpGBM uses quantile binning, which requires 60+ samples for proper training. With fewer samples, the model can't learn proper decision boundaries.
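The sketch below builds an Iris-style request body from synthetic data (150 samples in three classes, comfortably above the 60-sample minimum) using only the standard library. The field names are assumptions; check the API docs for the real schema:

```python
import json
import random

random.seed(0)

# Three synthetic classes with different feature means, 50 samples each --
# a stand-in for the Iris dataset so the example is self-contained.
X, y = [], []
for cls, (mu_a, mu_b) in enumerate([(5.0, 1.5), (5.9, 4.3), (6.6, 5.5)]):
    for _ in range(50):
        X.append([random.gauss(mu_a, 0.3), random.gauss(mu_b, 0.3)])
        y.append(cls)

assert len(X) >= 60, "WarpGBM's quantile binning needs 60+ samples"

# Assumed request schema -- POST this to the training endpoint, then reuse
# the returned artifact_id for cached predictions while the 5min TTL lasts.
payload = json.dumps({"model_type": "warpgbm", "X": X, "y": y})
print(len(X), len(set(y)))  # 150 3
```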
🏠 Self-Hosting
Local Development
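A rough local-run sketch; the repository owner placeholder and the ASGI module path are assumptions about the project layout:

```shell
# Clone the service (replace <owner> with the actual GitHub owner)
git clone https://github.com/<owner>/mcp-warpgbm.git
cd mcp-warpgbm
pip install -r requirements.txt
# Module path is an assumption -- check the repo for the real entrypoint
uvicorn app.main:app --reload --port 8000
```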
Deploy to Modal (Production)
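Deployment follows the standard Modal CLI flow; the app filename here is an assumption:

```shell
pip install modal
modal setup                # authenticate once
modal deploy modal_app.py  # filename is an assumption -- use the repo's Modal app file
```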
Deploy to Other Platforms
🧪 Testing
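Assuming a conventional pytest layout:

```shell
pip install -r requirements.txt
pytest -v
```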
📦 Project Structure
💰 Pricing (X402)
Optional micropayments on Base network:
| Endpoint | Price | Description |
|---|---|---|
| | $0.01 | Train model on GPU, get artifacts |
| | $0.001 | Batch predictions |
| | $0.001 | Probability predictions |
| | Free | Help us improve! |
Note: Payment is optional for demo/testing. See `/.well-known/x402` for details.
🔐 Security & Privacy
✅ Stateless: No training data or models persisted
✅ Sandboxed: Runs in temporary isolated directories
✅ Size Limited: Max 50 MB request payload
✅ No Code Execution: Only structured JSON parameters
✅ Rate Limited: Per-IP throttling to prevent abuse
✅ Read-Only FS: Modal deployment uses immutable filesystem
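The 50 MB payload cap is easy to respect with a small client-side guard. This helper is illustrative, not part of the service:

```python
import json

MAX_PAYLOAD_BYTES = 50 * 1024 * 1024  # service rejects request bodies over 50 MB

def payload_ok(obj) -> bool:
    """Check a JSON-serializable payload against the service's size limit
    before sending, so oversized requests fail fast on the client."""
    return len(json.dumps(obj).encode()) <= MAX_PAYLOAD_BYTES

small = {"X": [[0.0, 1.0]] * 1000, "y": [0] * 1000}
print(payload_ok(small))  # True
```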
🌍 Available Models
🚀 WarpGBM (GPU)
Acceleration: NVIDIA A10G GPUs
Speed: 13× faster than LightGBM
Best For: Time-series, financial modeling, temporal data
Special: Era-aware splitting, invariant learning
Min Samples: 60+ recommended
⚡ LightGBM (CPU)
Acceleration: Highly optimized CPU
Speed: 10-100× faster than sklearn
Best For: General tabular data, large datasets
Special: Categorical features, low memory
Min Samples: 20+
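The minimum-sample guidance above can be encoded as a pre-flight check; the backend keys are illustrative:

```python
# Minimum rows per backend, per the model list above: WarpGBM's quantile
# binning wants 60+, LightGBM is usable from about 20.
MIN_SAMPLES = {"warpgbm": 60, "lightgbm": 20}

def backend_ok(backend: str, n_rows: int) -> bool:
    """True if the dataset is large enough for the chosen backend."""
    return n_rows >= MIN_SAMPLES[backend]

print(backend_ok("warpgbm", 150))   # True
print(backend_ok("warpgbm", 30))    # False
print(backend_ok("lightgbm", 30))   # True
```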
🗺️ Roadmap
Core training + inference endpoints
Smart artifact caching (5min TTL)
MCP Server-Sent Events integration
X402 payment verification
Modal deployment with GPU
Custom domain (warpgbm.ai)
Smithery marketplace listing
ONNX export support
Async job queue for large datasets
S3/IPFS dataset URL support
Python client library (`warpgbm-client`)
Additional model backends (XGBoost, CatBoost)
💬 Feedback & Support
Help us make this service better for AI agents!
Submit feedback about:
Missing features that would unlock new use cases
Confusing documentation or error messages
Performance issues or timeout problems
Additional model types you'd like to see
Or via:
GitHub Issues: mcp-warpgbm/issues
GitHub Discussions: warpgbm/discussions
Email: support@warpgbm.ai
📚 Learn More
🐍 WarpGBM Python Package - The core library (91+ ⭐)
🤖 Agent Guide - Complete usage guide for AI agents
📖 API Docs - Interactive OpenAPI documentation
🔌 Model Context Protocol - MCP specification
💰 X402 Specification - Payment protocol for agents
☁️ Modal Docs - Serverless GPU platform
📄 License
GPL-3.0 (same as WarpGBM core)
This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.
🙏 Credits
Built with:
WarpGBM - GPU-accelerated GBDT library
Modal - Serverless GPU infrastructure
FastAPI - Modern Python web framework
LightGBM - Microsoft's GBDT library
Built with ❤️ for the open agent economy