Skip to main content
Glama

⚡ WarpGBM MCP Service

GPU-accelerated gradient boosting as a cloud MCP service
Train on A10G GPUs • Get artifact_id for <100ms cached predictions • Download portable artifacts

License: GPL v3 Modal MCP X402

🌐 Live Service📖 API Docs🤖 Agent Guide🐍 Python Package


🎯 What is This?

Outsource your GBDT workload to the world's fastest GPU implementation.

WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), this service handles training on NVIDIA A10G GPUs while you receive portable model artifacts and benefit from smart 5-minute caching.

🏗️ How It Works (The Smart Cache Workflow)

graph LR
    A[Train on GPU] --> B[Get artifact_id + model]
    B --> C[5min Cache]
    C --> D[<100ms Predictions]
    B --> E[Download Artifact]
    E --> F[Use Anywhere]
  1. Train: POST your data → Train on A10G GPU → Get artifact_id + portable artifact

  2. Fast Path: Use artifact_id → Sub-100ms cached predictions (5min TTL)

  3. Slow Path: Use model_artifact_joblib → Download and use anywhere

Architecture: 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts


⚡ Quick Start

For AI Agents (MCP)

Add to your MCP settings (e.g., .cursor/mcp.json):

{
  "mcpServers": {
    "warpgbm": {
      "url": "https://warpgbm.ai/mcp/sse"
    }
  }
}

For Developers (REST API)

# 1. Train a model
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
    "X": [[5.1,3.5,1.4,0.2], [6.7,3.1,4.4,1.4], ...],
    "y": [0, 1, 2, ...],
    "model_type": "warpgbm",
    "objective": "multiclass"
  }'

# Response includes artifact_id for fast predictions
# {"artifact_id": "abc-123", "model_artifact_joblib": "H4sIA..."}

# 2. Make fast predictions (cached, <100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
    "artifact_id": "abc-123",
    "X": [[5.0,3.4,1.5,0.2]]
  }'

🚀 Key Features

Feature

Description

🎯 Multi-Model

WarpGBM (GPU) + LightGBM (CPU)

Smart Caching

artifact_id → 5min cache → <100ms inference

📦 Portable Artifacts

Download joblib models, use anywhere

🤖 MCP Native

Direct tool integration for AI agents

💰 X402 Payments

Optional micropayments (Base network)

🔒 Stateless

No data storage, you own your models

🌐 Production Ready

Deployed on Modal with custom domain


🐍 Python Package vs MCP Service

This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:

Feature

MCP Service (This Repo)

Python Package

Installation

None needed

pip install git+https://...

GPU

Cloud (pay-per-use)

Your GPU (free)

Control

REST API parameters

Full Python API

Features

Train, predict, upload

+ Cross-validation, callbacks, feature importance

Best For

Quick experiments, demos

Production pipelines, research

Cost

$0.01 per training

Free (your hardware)

Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control


📡 Available Endpoints

Core Endpoints

Method

Endpoint

Description

GET

/models

List available model backends

POST

/train

Train model, get artifact_id + model

POST

/predict_from_artifact

Fast predictions (artifact_id or model)

POST

/predict_proba_from_artifact

Probability predictions

POST

/upload_data

Upload CSV/Parquet for training

POST

/feedback

Submit feedback to improve service

GET

/healthz

Health check with GPU status

MCP Integration

Method

Endpoint

Description

SSE

/mcp/sse

MCP Server-Sent Events endpoint

GET

/.well-known/mcp.json

MCP capability manifest

GET

/.well-known/x402

X402 pricing manifest


💡 Complete Example: Iris Dataset

# 1. Train WarpGBM on Iris (60 samples recommended for proper binning)
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
  "X": [[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
        [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
        [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5]],
  "y": [0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
        0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
        0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2],
  "model_type": "warpgbm",
  "objective": "multiclass",
  "n_estimators": 100
}'

# Response:
{
  "artifact_id": "abc123-def456-ghi789",
  "model_artifact_joblib": "H4sIA...",
  "training_time_seconds": 0.0
}

# 2. Fast inference with cached artifact_id (<100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
  "artifact_id": "abc123-def456-ghi789",
  "X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2]]
}'

# Response: {"predictions": [0, 1, 2], "inference_time_seconds": 0.05}
# Perfect classification! ✨

⚠️ Important: WarpGBM uses quantile binning which requires 60+ samples for proper training. With fewer samples, the model can't learn proper decision boundaries.


🏠 Self-Hosting

Local Development

# Clone repo
git clone https://github.com/jefferythewind/mcp-warpgbm.git
cd mcp-warpgbm

# Setup environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (GPU optional for dev)
uvicorn local_dev:app --host 0.0.0.0 --port 8000 --reload

# Test
curl http://localhost:8000/healthz

Deploy to Modal (Production)

# Install Modal
pip install modal

# Authenticate
modal token new

# Deploy
modal deploy modal_app.py

# Service will be live at your Modal URL

Deploy to Other Platforms

# Docker (requires GPU)
docker build -t warpgbm-mcp .
docker run --gpus all -p 8000:8000 warpgbm-mcp

# Fly.io, Railway, Render, etc.
# See their respective GPU deployment docs

🧪 Testing

# Install dev dependencies
pip install -r requirements-dev.txt

# Run all tests
./run_tests.sh

# Or use pytest directly
pytest tests/ -v

# Test specific functionality
pytest tests/test_train.py -v
pytest tests/test_integration.py -v

📦 Project Structure

mcp-warpgmb/
├── app/
│   ├── main.py              # FastAPI app + routes
│   ├── mcp_sse.py           # MCP Server-Sent Events
│   ├── model_registry.py    # Model backend registry
│   ├── models.py            # Pydantic schemas
│   ├── utils.py             # Serialization, caching
│   ├── x402.py              # Payment verification
│   └── feedback_storage.py  # Feedback persistence
├── .well-known/
│   ├── mcp.json             # MCP capability manifest
│   └── x402                 # X402 pricing manifest
├── docs/
│   ├── AGENT_GUIDE.md       # Comprehensive agent docs
│   ├── MODEL_SUPPORT.md     # Model parameter reference
│   └── WARPGBM_PYTHON_GUIDE.md
├── tests/
│   ├── test_train.py
│   ├── test_predict.py
│   ├── test_integration.py
│   └── conftest.py
├── examples/
│   ├── simple_train.py
│   └── compare_models.py
├── modal_app.py             # Modal deployment config
├── local_dev.py             # Local dev server
├── requirements.txt
└── README.md

💰 Pricing (X402)

Optional micropayments on Base network:

Endpoint

Price

Description

/train

$0.01

Train model on GPU, get artifacts

/predict_from_artifact

$0.001

Batch predictions

/predict_proba_from_artifact

$0.001

Probability predictions

/feedback

Free

Help us improve!

Note: Payment is optional for demo/testing. See /.well-known/x402 for details.


🔐 Security & Privacy

Stateless: No training data or models persisted
Sandboxed: Runs in temporary isolated directories
Size Limited: Max 50 MB request payload
No Code Execution: Only structured JSON parameters
Rate Limited: Per-IP throttling to prevent abuse
Read-Only FS: Modal deployment uses immutable filesystem


🌍 Available Models

🚀 WarpGBM (GPU)

  • Acceleration: NVIDIA A10G GPUs

  • Speed: 13× faster than LightGBM

  • Best For: Time-series, financial modeling, temporal data

  • Special: Era-aware splitting, invariant learning

  • Min Samples: 60+ recommended

⚡ LightGBM (CPU)

  • Acceleration: Highly optimized CPU

  • Speed: 10-100× faster than sklearn

  • Best For: General tabular data, large datasets

  • Special: Categorical features, low memory

  • Min Samples: 20+


🗺️ Roadmap

  • Core training + inference endpoints

  • Smart artifact caching (5min TTL)

  • MCP Server-Sent Events integration

  • X402 payment verification

  • Modal deployment with GPU

  • Custom domain (warpgbm.ai)

  • Smithery marketplace listing

  • ONNX export support

  • Async job queue for large datasets

  • S3/IPFS dataset URL support

  • Python client library (warpgbm-client)

  • Additional model backends (XGBoost, CatBoost)


💬 Feedback & Support

Help us make this service better for AI agents!

Submit feedback about:

  • Missing features that would unlock new use cases

  • Confusing documentation or error messages

  • Performance issues or timeout problems

  • Additional model types you'd like to see

# Via API
curl -X POST https://warpgbm.ai/feedback \
  -H "Content-Type: application/json" \
  -d '{
    "feedback_type": "feature_request",
    "message": "Add support for XGBoost backend",
    "severity": "medium"
  }'

Or via:


📚 Learn More


📄 License

GPL-3.0 (same as WarpGBM core)

This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.


🙏 Credits

Built with:

  • WarpGBM - GPU-accelerated GBDT library

  • Modal - Serverless GPU infrastructure

  • FastAPI - Modern Python web framework

  • LightGBM - Microsoft's GBDT library


Built with ❤️ for the open agent economy

⭐ Star on GitHub🚀 Try Live Service📖 Read the Docs

-
security - not tested
F
license - not found
-
quality - not tested

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jefferythewind/warpgbm-mcp-service'

If you have feedback or need assistance with the MCP directory API, please join our Discord server