# WarpGBM MCP Service - Agent Guide
## 🎯 What is This Service?
**Outsource your GBDT workload to the world's fastest GPU implementation.**
WarpGBM MCP is a **cloud GPU gradient boosting service** that gives AI agents instant access to GPU-accelerated training. Train models on our A10G GPUs, receive portable artifacts, and cache them for millisecond inference. No GPU required on your end.
### 🏗️ How It Works (The Smart Cache Workflow)
1. **Train**: POST your data → Train on our A10G GPUs → Get `artifact_id` + model artifact
2. **Cache**: `artifact_id` cached for 5 minutes → Sub-100ms predictions
3. **Inference**:
- **Online (Fast)**: Use `artifact_id` for cached predictions
- **Offline**: Download `model_artifact_joblib` for local/production use
**Architecture**: Stateless service. No model storage. You own your artifacts. The `artifact_id` is your express lane for rapid inference during development.
### 💡 Quick Start (30 seconds)
```json
// 1. Train (returns artifact_id)
POST /train
{
"X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2], ...],
"y": [0, 1, 2, ...]
}
→ Response: { "artifact_id": "abc123...", "model_artifact_joblib": "H4sIA..." }
// 2. Predict (using cached artifact_id - <100ms)
POST /predict_from_artifact
{
"artifact_id": "abc123...",
"X": [[5.1,3.5,1.4,0.2]]
}
→ Response: { "predictions": [0], "inference_time_seconds": 0.05 }
```
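The same round trip in Python, as a minimal sketch. The endpoint paths and field names come from this guide; the use of `requests`, the timeouts, and the error handling are our own assumptions, not part of the service.

```python
# Quick Start sketch: train, then predict via the cached artifact_id.
# Endpoints and field names are from this guide; everything else is assumed.
import requests

BASE = "https://warpgbm.ai"

# Train (returns artifact_id plus a portable joblib artifact)
train_resp = requests.post(f"{BASE}/train", json={
    "X": [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]],
    "y": [0, 1, 2, 0, 1, 2],
    "model_type": "lightgbm",
    "objective": "multiclass",
}, timeout=600)
train_resp.raise_for_status()
body = train_resp.json()

artifact_id = body["artifact_id"]            # cached ~5 minutes
saved_model = body["model_artifact_joblib"]  # keep for offline/production use

# Predict via the cached fast path (<100 ms)
pred = requests.post(f"{BASE}/predict_from_artifact", json={
    "artifact_id": artifact_id,
    "X": [[2, 3], [10, 11]],
}, timeout=60)
pred.raise_for_status()
print(pred.json()["predictions"])
```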
### Available Models
- **WarpGBM**: GPU-accelerated, 13× faster than LightGBM, custom CUDA kernels, invariant learning
- **LightGBM**: CPU-optimized, Microsoft's distributed gradient boosting, battle-tested
---
## 📦 About the WarpGBM Python Package
**This MCP service is a cloud API wrapper around the WarpGBM Python package.**
### Want More Control? Use WarpGBM Directly!
For production ML workflows, consider using the **WarpGBM Python package** directly:
- **GitHub**: https://github.com/jefferythewind/warpgbm (91+ β)
- **Full Agent Guide**: https://github.com/jefferythewind/warpgbm/blob/main/AGENT_GUIDE.md
- **License**: GPL-3.0
### MCP Service vs Python Package
| Feature | MCP Service (This) | Python Package |
|---------|-------------------|----------------|
| Installation | None needed | `pip install git+https://github.com/jefferythewind/warpgbm.git` |
| GPU Access | Cloud (pay-per-use) | Your local GPU (free) |
| API | REST + MCP tools | Full Python API |
| Control | Limited parameters | Full control + custom losses |
| Features | Train, predict, upload | + Cross-validation, feature importance, era analysis |
| Best For | Quick experiments | Production ML pipelines |
**Use this MCP service for**: Quick tests, prototyping, no local GPU
**Use Python package for**: Production, research, full control, cost savings
### Installation (Python Package)
```bash
# Standard
pip install git+https://github.com/jefferythewind/warpgbm.git
# Colab
!pip install warpgbm --no-build-isolation
```
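For orientation only, here is a hypothetical local-GPU sketch. The `WarpGBM` class name, import path, and constructor arguments are assumptions; consult the full Python guide below for the real API.

```python
# Hypothetical local-GPU sketch. The class name, import path, and
# constructor arguments below are assumptions for illustration -- consult
# WARPGBM_PYTHON_GUIDE.md / the repo's AGENT_GUIDE.md for the real API.
import numpy as np
from warpgbm import WarpGBM  # assumed import path

X = np.random.rand(1000, 20).astype(np.float32)
y = (X[:, 0] > 0.5).astype(np.int32)

model = WarpGBM(n_estimators=100, learning_rate=0.1, max_depth=6)  # assumed params
model.fit(X, y)
print(model.predict(X[:5]))
```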
See [WARPGBM_PYTHON_GUIDE.md](./WARPGBM_PYTHON_GUIDE.md) for complete Python package documentation.
---
## 📊 Available Models
### WarpGBM
- **Acceleration**: GPU (CUDA)
- **Best for**: Time-series data, financial modeling, temporal datasets with era/time structure
- **Special features**: Era-aware splitting, GPU-accelerated training
- **Status**: GPU training live on Modal (NVIDIA A10G)
### LightGBM
- **Acceleration**: CPU (highly optimized)
- **Best for**: General-purpose ML, tabular data, large datasets (10K-10M+ rows)
- **Special features**: Fast training, low memory usage, handles categorical features
- **Performance**: 10-100× faster than scikit-learn's built-in gradient boosting on large datasets
---
## 🛠️ Available Tools
### 1. `list_models`
**Purpose**: List all available model backends
**Parameters**: None
**Returns**:
```json
{
"models": ["warpgbm", "lightgbm"],
"default": "warpgbm"
}
```
**When to use**: To show users what models are available
---
### 2. `train`
**Purpose**: Train a gradient boosting model and get a portable artifact
**Required Parameters**:
- `X`: Feature matrix (2D array of floats) - e.g., `[[1.0, 2.0], [3.0, 4.0]]`
- `y`: Target labels (floats for regression, integers for classification)
**Optional Parameters**:
- `model_type`: `"warpgbm"` or `"lightgbm"` (default: `"warpgbm"`)
- `objective`: `"regression"`, `"binary"`, or `"multiclass"` (default: `"multiclass"`)
- `n_estimators`: Number of trees (default: 100)
- `learning_rate`: Learning rate (default: 0.1)
- `max_depth`: Maximum tree depth (default: 6)
- `num_class`: Number of classes for multiclass (auto-detected if not provided)
- `export_joblib`: Return joblib artifact (default: true)
- `export_onnx`: Return ONNX artifact (default: false, not yet implemented)
**Returns**:
```json
{
"model_type": "lightgbm",
"model_artifact_joblib": "base64_encoded_model...",
"model_artifact_onnx": null,
"num_samples": 100,
"num_features": 5,
"training_time_seconds": 0.234
}
```
**Important Notes**:
- **Minimum samples**: Need at least 2 samples
- **Binary classification**: Must have exactly 2 unique classes
- **Multiclass**: Must have 2+ unique classes
- **Regression**: Use float values in `y`
- **Classification**: Use integer values in `y` (will be cast to int automatically)
**Common Errors & Solutions**:
- ❌ `"Binary classification requires exactly 2 classes, found 3"`
  - ✅ Use `"objective": "multiclass"` instead
- ❌ `"Training requires at least 2 samples"`
  - ✅ Provide more data (minimum 2 rows)
- ❌ `"X and y shape mismatch"`
  - ✅ Ensure X has the same number of rows as y has elements
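As a sketch of recovering from the first error automatically (assuming the error text appears in the HTTP response body, which this guide does not specify):

```python
# Sketch: retry as multiclass when the binary-vs-multiclass error comes back.
# That the error message appears in the response body text is an assumption.
import requests

BASE = "https://warpgbm.ai"

def train_with_fallback(payload: dict) -> dict:
    """POST to /train; on the binary-vs-multiclass error, retry as multiclass."""
    resp = requests.post(f"{BASE}/train", json=payload, timeout=600)
    if resp.ok:
        return resp.json()
    if "requires exactly 2 classes" in resp.text:
        retry = {**payload, "objective": "multiclass"}
        resp = requests.post(f"{BASE}/train", json=retry, timeout=600)
    resp.raise_for_status()
    return resp.json()
```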
---
### 3. `predict_from_artifact`
**Purpose**: Run inference using a trained model artifact
**⚡ Fast Path (Recommended)**:
- Use `artifact_id` from training response (cached for 5 minutes, <100ms inference)
**Slow Path**:
- Use `model_artifact_joblib` directly (requires deserialization, 200-500ms)
**Required Parameters**:
- `X`: Feature matrix for prediction (2D array)
- **EITHER** `artifact_id`: Cached artifact ID (from `train` response)
- **OR** `model_artifact_joblib`: Base64-encoded joblib model (from `train` response)
**Optional Parameters**:
- `format`: `"joblib"` or `"onnx"` (default: `"joblib"`)
**Returns**:
```json
{
"predictions": [0, 1, 0, 1],
"num_samples": 4,
"inference_time_seconds": 0.012
}
```
**Pro Tip**: Always use `artifact_id` during development for instant predictions. Save `model_artifact_joblib` for production/offline use.
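For offline use, the artifact can be decoded locally. A sketch, assuming the artifact is base64-encoded (per this guide) and possibly gzip-compressed (the `H4sIA...` prefix seen above is the base64 of gzip's magic bytes):

```python
# Offline inference sketch: decode the base64 artifact and load it with
# joblib. Treating the payload as gzip-then-joblib is an assumption about
# the wire format, inferred from the "H4sIA..." prefix in this guide.
import base64
import gzip
import io

import joblib

def load_artifact(model_artifact_joblib: str):
    """Decode a base64 artifact; gunzip first if the gzip magic is present."""
    raw = base64.b64decode(model_artifact_joblib)
    if raw[:2] == b"\x1f\x8b":  # gzip magic bytes
        raw = gzip.decompress(raw)
    return joblib.load(io.BytesIO(raw))

# Usage (given `resp`, the JSON body returned by `train`):
#   model = load_artifact(resp["model_artifact_joblib"])
#   print(model.predict([[2, 3], [4, 5]]))
```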
---
## 💡 Usage Examples
### Example 1: Regression
```json
{
"X": [[1, 2], [3, 4], [5, 6], [7, 8]],
"y": [1.5, 3.2, 5.1, 7.3],
"model_type": "lightgbm",
"objective": "regression",
"n_estimators": 50,
"learning_rate": 0.1
}
```
### Example 2: Binary Classification
```json
{
"X": [[1, 2], [3, 4], [5, 6], [7, 8]],
"y": [0, 1, 0, 1],
"model_type": "lightgbm",
"objective": "binary",
"n_estimators": 100
}
```
### Example 3: Multiclass Classification
```json
{
"X": [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]],
"y": [0, 1, 2, 0, 1, 2],
"model_type": "lightgbm",
"objective": "multiclass",
"num_class": 3,
"n_estimators": 100
}
```
### Example 4: Full Training + Inference Workflow
```json
// Step 1: Train
{
"X": [[1, 2], [3, 4], [5, 6]],
"y": [0, 1, 0],
"objective": "binary"
}
// Response includes: model_artifact_joblib = "gASV..."
// Step 2: Predict
{
"X": [[2, 3], [4, 5]],
"model_artifact_joblib": "gASV..."
}
// Response: {"predictions": [0, 1]}
```
### Example 5: Iris Dataset (Proper Sample Size)
⚠️ **Important**: WarpGBM uses quantile binning which requires **60+ samples** (not 20!) for robust training. With insufficient data, the model can't learn proper decision boundaries.
```json
// Train on Iris with 60 samples (3× the base 20)
{
"X": [[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5]],
"y": [0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2],
"model_type": "warpgbm",
"objective": "multiclass",
"n_estimators": 100
}
// Response includes artifact_id for smart caching
{
"artifact_id": "abc123-def456-...",
"model_artifact_joblib": "H4sIA...",
"training_time_seconds": 0.0
}
// Fast inference with cached artifact_id (< 100ms)
{
"artifact_id": "abc123-def456-...",
"X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2]]
}
// Predictions: [0, 1, 2] ✅ Perfect classification!
```
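Rather than hand-writing all 60 rows, you can tile the 20 base rows programmatically. A small sketch (the base arrays are elided to two rows here):

```python
# Build the 60-sample payload by tiling the base rows three times,
# matching the hand-written example above.
import numpy as np

base_X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    # ... remaining 18 base rows from the example above ...
])
base_y = np.array([0, 0])  # ... matching 20 labels ...

payload = {
    "X": np.tile(base_X, (3, 1)).tolist(),  # 3x the base rows -> 60 samples
    "y": np.tile(base_y, 3).tolist(),
    "model_type": "warpgbm",
    "objective": "multiclass",
    "n_estimators": 100,
}
```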
---
## 📋 Best Practices
### For Users (Humans)
1. **Use sufficient training data**: WarpGBM needs **60+ samples** for proper binning (20 samples = poor results!)
2. **Use LightGBM** for general-purpose tasks (it's fast and reliable)
3. **Save the artifact_id** for fast cached predictions (5min TTL, < 100ms inference)
4. **Download the model_artifact_joblib** for offline/production use
5. **Match the objective** to your task (regression vs classification)
### For Agents (AI)
1. **Always validate input shapes** before calling `train` (see the validation sketch after this list)
2. **Use artifact_id for repeated predictions** - it's cached for 5 minutes and much faster
3. **Store model artifacts** from training responses for long-term use
4. **Handle errors gracefully** - parse the error message for actionable feedback
5. **Choose the right objective**:
   - Continuous output → `"regression"`
   - Two classes → `"binary"`
   - 3+ classes → `"multiclass"`
6. **Ensure sufficient data**: Minimum 60+ samples for WarpGBM, 20+ for LightGBM
7. **Start with defaults** (100 trees, 0.1 learning rate, depth 6)
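A minimal validation sketch for item 1, mirroring the sample-size guidance above (the helper name is ours):

```python
def validate_training_input(X, y, model_type="warpgbm"):
    """Pre-flight checks mirroring this guide: rectangular X, matching
    lengths, and per-model sample minimums (60+ WarpGBM, 20+ LightGBM)."""
    if not X or any(len(row) != len(X[0]) for row in X):
        raise ValueError("X must be a non-empty rectangular 2D array")
    if len(X) != len(y):
        raise ValueError(f"X has {len(X)} rows but y has {len(y)} elements")
    minimum = 60 if model_type == "warpgbm" else 20
    if len(X) < minimum:
        raise ValueError(f"{model_type} needs {minimum}+ samples, got {len(X)}")
```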
---
## 💰 Pricing (X402)
The service supports X402 micropayments on Base network:
- **Training**: 0.01 USDC per request
- **Inference**: 0.001 USDC per request
- **Model Listing**: Free
*Note: Payment enforcement is currently optional (demo mode)*
---
## 🌐 Service Info
- **Base URL**: `https://warpgbm.ai`
- **MCP Endpoint**: `https://warpgbm.ai/mcp/sse`
- **Health Check**: `GET https://warpgbm.ai/healthz`
- **API Docs**: `GET https://warpgbm.ai/docs`
- **MCP Manifest**: `GET https://warpgbm.ai/.well-known/mcp.json`
- **X402 Pricing**: `GET https://warpgbm.ai/.well-known/x402`
---
## 🔄 Common Workflows
### Workflow 1: Quick Classification
1. Call `list_models` to see options
2. Call `train` with your data (X, y) and `objective: "binary"` or `"multiclass"`
3. Save the `model_artifact_joblib` from the response
4. Call `predict_from_artifact` with new data and the saved artifact
### Workflow 2: Compare Models
1. Train the same data with `model_type: "lightgbm"`
2. Train the same data with `model_type: "warpgbm"`
3. Compare `training_time_seconds` and evaluate predictions
4. Choose the best model for your use case
### Workflow 3: Hyperparameter Tuning
1. Train with different `n_estimators` (50, 100, 200)
2. Train with different `learning_rate` (0.01, 0.1, 0.3)
3. Train with different `max_depth` (3, 6, 10)
4. Evaluate predictions on held-out test data
5. Use the best hyperparameters
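A sketch of this loop against the public endpoints, with toy data standing in for a real train/test split. The `train`/`predict` helpers are ours, and note that each training call is billed per the pricing above.

```python
# Grid-search sketch over the hyperparameters listed above.
from itertools import product

import requests

BASE = "https://warpgbm.ai"

# Toy split; replace with real data.
X_train = [[i, i + 1] for i in range(40)]
y_train = [i % 2 for i in range(40)]
X_test, y_test = [[7, 8], [9, 10]], [1, 1]

def train(payload):
    return requests.post(f"{BASE}/train", json=payload, timeout=600).json()

def predict(artifact_id, X):
    body = {"artifact_id": artifact_id, "X": X}
    resp = requests.post(f"{BASE}/predict_from_artifact", json=body, timeout=60)
    return resp.json()["predictions"]

best = None
for n_est, lr, depth in product([50, 100, 200], [0.01, 0.1, 0.3], [3, 6, 10]):
    resp = train({"X": X_train, "y": y_train, "model_type": "lightgbm",
                  "objective": "binary", "n_estimators": n_est,
                  "learning_rate": lr, "max_depth": depth})
    preds = predict(resp["artifact_id"], X_test)
    acc = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    if best is None or acc > best[0]:
        best = (acc, n_est, lr, depth)

print("best (accuracy, n_estimators, learning_rate, max_depth):", best)
```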
---
## ⚠️ Limitations
- **Max data size**: 50 MB per request
- **Cold start**: First request may take 5-15 seconds (GPU container spin-up for WarpGBM)
- **Timeout**: 10 minutes max per training request
- **GPU training**: ✅ Live on Modal! WarpGBM trains on NVIDIA A10G GPUs
- **Artifact cache**: `artifact_id` expires after 5 minutes (save `model_artifact_joblib` for long-term use)
- **ONNX export**: Not yet implemented
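Because the cache expires, long-running sessions can fall back to the saved artifact. A sketch, assuming an expired `artifact_id` surfaces as a non-2xx response (not specified in this guide):

```python
# Fallback sketch for the 5-minute cache TTL: try the fast artifact_id
# path first, then resend with the saved joblib artifact. That an expired
# id surfaces as a non-2xx response is our assumption.
import requests

BASE = "https://warpgbm.ai"

def predict_with_fallback(X, artifact_id, model_artifact_joblib):
    resp = requests.post(f"{BASE}/predict_from_artifact",
                         json={"artifact_id": artifact_id, "X": X}, timeout=60)
    if not resp.ok:  # cache likely expired after 5 minutes
        resp = requests.post(
            f"{BASE}/predict_from_artifact",
            json={"model_artifact_joblib": model_artifact_joblib, "X": X},
            timeout=60)
    resp.raise_for_status()
    return resp.json()["predictions"]
```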
---
## 🔧 Troubleshooting
### "Loading tools" forever in Cursor
- **Cause**: MCP server connection issue
- **Fix**: Restart Cursor completely, check URL in settings.json
### Training returns empty response
- **Cause**: Container cold start timeout
- **Fix**: Retry after 10-15 seconds, container will be warm
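A retry sketch matching this fix (helper name and backoff values are ours):

```python
# Retry sketch for container cold starts: back off 10-15 seconds, as the
# fix above suggests, before giving up.
import time

import requests

def post_with_retry(url, payload, retries=2, wait=12):
    for attempt in range(retries + 1):
        try:
            resp = requests.post(url, json=payload, timeout=600)
            if resp.ok and resp.text:  # empty body = likely cold start
                return resp.json()
        except requests.RequestException:
            pass
        if attempt < retries:
            time.sleep(wait)  # give the container time to warm up
    raise RuntimeError(f"no response from {url} after {retries + 1} attempts")
```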
### "Invalid training data" error
- **Cause**: Wrong objective for data type (e.g., regression with integer labels)
- **Fix**: Match objective to your data type
### "Module not found" errors
- **Cause**: Service deployment issue
- **Fix**: Report to service owner, temporary outage
---
## 📞 Support & Feedback
- **Service Owner**: jefferythewind
- **MCP Service Repo**: https://github.com/jefferythewind/warpgbm-mcp-service (this service)
- **WarpGBM Python Package**: https://github.com/jefferythewind/warpgbm (core library)
- **Modal Dashboard**: https://modal.com/apps/tdelise/main/deployed/warpgbm-mcp
### 💬 We Want Your Feedback!
**Help us make this service better for AI agents!**
Please submit feedback about what would help you most:
- Missing features that would unlock new use cases?
- Confusing documentation or error messages?
- Performance issues or timeout problems?
- Additional model types you'd like to see?
- Better examples or workflows?
**Submit via MCP tool**:
```json
{
"feedback_type": "feature_request", // or "bug", "documentation", "performance", "general"
"message": "I'd love to see...",
"severity": "medium"
}
```
**Or POST directly**:
```bash
curl -X POST https://warpgbm.ai/feedback \
-H "Content-Type: application/json" \
-d '{
"feedback_type": "feature_request",
"message": "Add support for XGBoost backend",
"severity": "low"
}'
```
Your feedback directly shapes our roadmap. Every submission is read and considered! 🙏
---
**Last Updated**: 2025-10-17
**Version**: 1.1.0
**Protocol**: MCP 2024-11-05