⚡ WarpGBM MCP Service
GPU-accelerated gradient boosting as a cloud MCP service
Train on A10G GPUs • Get an `artifact_id` for <100ms cached predictions • Download portable artifacts
🌐 Live Service • 📖 API Docs • 🤖 Agent Guide • 🐍 Python Package
🎯 What is This?
Outsource your GBDT workload to the world's fastest GPU implementation.
WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), this service handles training on NVIDIA A10G GPUs while you receive portable model artifacts and benefit from smart 5-minute caching.
🏗️ How It Works (The Smart Cache Workflow)
1. **Train**: POST your data → train on an A10G GPU → get an `artifact_id` + a portable artifact
2. **Fast Path**: send the `artifact_id` → sub-100ms cached predictions (5min TTL)
3. **Slow Path**: take `model_artifact_joblib` → download and use anywhere
Architecture: 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts
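The fast/slow split above boils down to a simple client-side policy: reuse the `artifact_id` while the 5-minute cache is warm, and fall back to the downloaded joblib artifact after it expires. A minimal sketch of that decision (the function and constants here are illustrative, not part of the service API):

```python
# Client-side view of the service's 5-minute artifact cache:
# while the cache is warm, send the artifact_id (fast path);
# once it expires, use the downloaded joblib artifact (slow path).
CACHE_TTL_SECONDS = 5 * 60

def choose_path(trained_at: float, now: float) -> str:
    """Return which prediction path a client should take."""
    if now - trained_at < CACHE_TTL_SECONDS:
        return "fast"   # send artifact_id, expect <100ms cached inference
    return "slow"       # load model_artifact_joblib locally instead

t0 = 1_000.0
print(choose_path(t0, t0 + 30))    # fast  (30s after training)
print(choose_path(t0, t0 + 360))   # slow  (6min after training)
```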
⚡ Quick Start
For AI Agents (MCP)
Add to your MCP settings (e.g., `.cursor/mcp.json`):
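A minimal configuration sketch. The server name and URL path are assumptions based on the service's domain; check the Agent Guide for the exact values:

```json
{
  "mcpServers": {
    "warpgbm": {
      "url": "https://warpgbm.ai/mcp"
    }
  }
}
```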
For Developers (REST API)
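A hedged sketch of a training call using only the standard library. The endpoint path (`/train`) and field names (`model_type`, `X`, `y`) are assumptions based on the workflow described above; consult the live API docs for the authoritative schema:

```python
import json
import urllib.request

BASE_URL = "https://warpgbm.ai"

# Assumed request schema -- verify against the API docs.
payload = {
    "model_type": "warpgbm",                      # or "lightgbm" for the CPU backend
    "X": [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]],    # feature rows
    "y": [0, 0, 1],                               # labels
}
body = json.dumps(payload).encode()

req = urllib.request.Request(
    f"{BASE_URL}/train",
    data=body,
    headers={"Content-Type": "application/json"},
)
# resp = json.load(urllib.request.urlopen(req))   # requires network access
# artifact_id = resp["artifact_id"]               # reuse within 5min for fast predictions
```

The returned `artifact_id` goes into subsequent prediction calls while the cache is warm; the joblib artifact in the same response is yours to keep.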
🚀 Key Features
| Feature | Description |
|---|---|
| 🎯 Multi-Model | WarpGBM (GPU) + LightGBM (CPU) |
| ⚡ Smart Caching | `artifact_id` → 5min cache → <100ms inference |
| 📦 Portable Artifacts | Download joblib models, use anywhere |
| 🤖 MCP Native | Direct tool integration for AI agents |
| 💰 X402 Payments | Optional micropayments (Base network) |
| 🔒 Stateless | No data storage, you own your models |
| 🌐 Production Ready | Deployed on Modal with custom domain |
🐍 Python Package vs MCP Service
This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:
| Feature | MCP Service (This Repo) | Python Package |
|---|---|---|
| Installation | None needed | `pip install warpgbm` |
| GPU | Cloud (pay-per-use) | Your GPU (free) |
| Control | REST API parameters | Full Python API |
| Features | Train, predict, upload | + Cross-validation, callbacks, feature importance |
| Best For | Quick experiments, demos | Production pipelines, research |
| Cost | $0.01 per training | Free (your hardware) |
Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control
📡 Available Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | | List available model backends |
| POST | | Train model, get `artifact_id` + model |
| POST | | Fast predictions (`artifact_id` or model) |
| POST | | Probability predictions |
| POST | | Upload CSV/Parquet for training |
| POST | | Submit feedback to improve service |
| GET | | Health check with GPU status |
MCP Integration
| Method | Endpoint | Description |
|---|---|---|
| GET | | MCP Server-Sent Events endpoint |
| GET | | MCP capability manifest |
| GET | `/.well-known/x402` | X402 pricing manifest |
💡 Complete Example: Iris Dataset
⚠️ Important: WarpGBM uses quantile binning, which requires 60+ samples for proper training. With fewer samples, the model can't learn proper decision boundaries.
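The sketch below builds an Iris-style request body from synthetic data (150 samples in three classes, comfortably above the 60-sample minimum) using only the standard library. The field names are assumptions; check the API docs for the real schema:

```python
import json
import random

random.seed(0)

# Three synthetic classes with different feature means, 50 samples each --
# a stand-in for the Iris dataset so the example is self-contained.
X, y = [], []
for cls, (mu_a, mu_b) in enumerate([(5.0, 1.5), (5.9, 4.3), (6.6, 5.5)]):
    for _ in range(50):
        X.append([random.gauss(mu_a, 0.3), random.gauss(mu_b, 0.3)])
        y.append(cls)

assert len(X) >= 60, "WarpGBM's quantile binning needs 60+ samples"

# Assumed request schema -- POST this to the training endpoint, then reuse
# the returned artifact_id for cached predictions while the 5min TTL lasts.
payload = json.dumps({"model_type": "warpgbm", "X": X, "y": y})
print(len(X), len(set(y)))  # 150 3
```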
🏠 Self-Hosting
Local Development
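A rough local-run sketch; the repository owner placeholder and the ASGI module path are assumptions about the project layout:

```shell
# Clone the service (replace <owner> with the actual GitHub owner)
git clone https://github.com/<owner>/mcp-warpgbm.git
cd mcp-warpgbm
pip install -r requirements.txt
# Module path is an assumption -- check the repo for the real entrypoint
uvicorn app.main:app --reload --port 8000
```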
Deploy to Modal (Production)
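Deployment follows the standard Modal CLI flow; the app filename here is an assumption:

```shell
pip install modal
modal setup                # authenticate once
modal deploy modal_app.py  # filename is an assumption -- use the repo's Modal app file
```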
Deploy to Other Platforms
🧪 Testing
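Assuming a conventional pytest layout:

```shell
pip install -r requirements.txt
pytest -v
```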
📦 Project Structure
💰 Pricing (X402)
Optional micropayments on Base network:
| Endpoint | Price | Description |
|---|---|---|
| | $0.01 | Train model on GPU, get artifacts |
| | $0.001 | Batch predictions |
| | $0.001 | Probability predictions |
| | Free | Help us improve! |
Note: Payment is optional for demo/testing. See `/.well-known/x402` for details.
🔐 Security & Privacy
✅ Stateless: No training data or models persisted
✅ Sandboxed: Runs in temporary isolated directories
✅ Size Limited: Max 50 MB request payload
✅ No Code Execution: Only structured JSON parameters
✅ Rate Limited: Per-IP throttling to prevent abuse
✅ Read-Only FS: Modal deployment uses immutable filesystem
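The 50 MB payload cap is easy to respect with a small client-side guard. This helper is illustrative, not part of the service:

```python
import json

MAX_PAYLOAD_BYTES = 50 * 1024 * 1024  # service rejects request bodies over 50 MB

def payload_ok(obj) -> bool:
    """Check a JSON-serializable payload against the service's size limit
    before sending, so oversized requests fail fast on the client."""
    return len(json.dumps(obj).encode()) <= MAX_PAYLOAD_BYTES

small = {"X": [[0.0, 1.0]] * 1000, "y": [0] * 1000}
print(payload_ok(small))  # True
```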
🌍 Available Models
🚀 WarpGBM (GPU)
Acceleration: NVIDIA A10G GPUs
Speed: 13× faster than LightGBM
Best For: Time-series, financial modeling, temporal data
Special: Era-aware splitting, invariant learning
Min Samples: 60+ recommended
⚡ LightGBM (CPU)
Acceleration: Highly optimized CPU
Speed: 10-100× faster than sklearn
Best For: General tabular data, large datasets
Special: Categorical features, low memory
Min Samples: 20+
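The minimum-sample guidance above can be encoded as a pre-flight check; the backend keys are illustrative:

```python
# Minimum rows per backend, per the model list above: WarpGBM's quantile
# binning wants 60+, LightGBM is usable from about 20.
MIN_SAMPLES = {"warpgbm": 60, "lightgbm": 20}

def backend_ok(backend: str, n_rows: int) -> bool:
    """True if the dataset is large enough for the chosen backend."""
    return n_rows >= MIN_SAMPLES[backend]

print(backend_ok("warpgbm", 150))   # True
print(backend_ok("warpgbm", 30))    # False
print(backend_ok("lightgbm", 30))   # True
```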
🗺️ Roadmap
Core training + inference endpoints
Smart artifact caching (5min TTL)
MCP Server-Sent Events integration
X402 payment verification
Modal deployment with GPU
Custom domain (warpgbm.ai)
Smithery marketplace listing
ONNX export support
Async job queue for large datasets
S3/IPFS dataset URL support
Python client library (`warpgbm-client`)
Additional model backends (XGBoost, CatBoost)
💬 Feedback & Support
Help us make this service better for AI agents!
Submit feedback about:
Missing features that would unlock new use cases
Confusing documentation or error messages
Performance issues or timeout problems
Additional model types you'd like to see
Or via:
GitHub Issues: mcp-warpgbm/issues
GitHub Discussions: warpgbm/discussions
Email: support@warpgbm.ai
📚 Learn More
🐍 WarpGBM Python Package - The core library (91+ ⭐)
🤖 Agent Guide - Complete usage guide for AI agents
📖 API Docs - Interactive OpenAPI documentation
🔌 Model Context Protocol - MCP specification
💰 X402 Specification - Payment protocol for agents
☁️ Modal Docs - Serverless GPU platform
📄 License
GPL-3.0 (same as WarpGBM core)
This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.
🙏 Credits
Built with:
WarpGBM - GPU-accelerated GBDT library
Modal - Serverless GPU infrastructure
FastAPI - Modern Python web framework
LightGBM - Microsoft's GBDT library
Built with ❤️ for the open agent economy