timesfm-mcp

Local MCP server for GPU-backed TimesFM 2.5 forecasting.

This package exposes Google's TimesFM model to MCP clients over stdio, sse, or streamable-http. It is intended for local or trusted-network serving where agents need zero-shot time-series forecasts, prediction intervals, covariate forecasting, CSV forecasting, and interval-based anomaly scoring.

What It Provides

  • guidance: agent-facing usage guide for safe TimesFM forecasting.

  • health: package, CUDA, system, and model-state report with a CUDA matmul probe.

  • estimate_memory: rough dataset memory estimate before loading large jobs.

  • warmup: lazy-load and compile TimesFM on the GPU before live requests.

  • forecast_values: forecast in-memory numeric series.

  • forecast_csv: forecast numeric columns in a local CSV and write CSV or JSON.

  • forecast_with_covariates_values: TimesFM 2.5 XReg forecasts with known future covariates.

  • detect_anomalies: compare future actuals against q20/q80 and q10/q90 forecast bands.

  • timesfm://forecasting/guide: MCP resource with the same operational guidance.

  • timesfm_forecasting_guide: MCP prompt for agents before planning a forecast.

The model is a process singleton. It is loaded on first warmup or forecast call and remains in memory until the MCP server process exits. If a request changes model settings such as max_context, max_horizon, batch_size, or infer_is_positive, the server reloads the model with the new settings.
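
The load-once, reload-on-change behaviour described above can be pictured with a small sketch. This is not the server's actual code: `ModelSettings`, `ModelHolder`, and the `loader` callable are hypothetical names used only for illustration, and the defaults mirror the documented environment defaults.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class ModelSettings:
    """Hypothetical bundle of the settings that trigger a reload."""
    max_context: int = 1024
    max_horizon: int = 256
    batch_size: int = 64
    infer_is_positive: bool = True


class ModelHolder:
    """Keeps at most one loaded model and reloads it when settings change."""

    def __init__(self, loader: Callable[[ModelSettings], Any]) -> None:
        self._loader = loader  # assumption: any callable that builds a model
        self._settings: Optional[ModelSettings] = None
        self._model: Any = None
        self.load_count = 0

    def get(self, settings: ModelSettings) -> Any:
        # First call, or a call with different settings, triggers a (re)load.
        if self._settings != settings:
            self._model = self._loader(settings)
            self._settings = settings
            self.load_count += 1
        return self._model


# Stand-in loader: a real one would load and compile TimesFM.
holder = ModelHolder(loader=lambda s: {"settings": s})
first = holder.get(ModelSettings())
again = holder.get(ModelSettings())               # same settings: reused
other = holder.get(ModelSettings(batch_size=32))  # changed: reloaded
print(holder.load_count)
```

The practical consequence for clients is that alternating between two sets of model settings forces a reload on every switch, so it is cheaper to group requests that share settings.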

When To Use

Use this server for zero-shot univariate time-series forecasting:

  • Sales, demand, revenue, traffic, inventory, and capacity planning.

  • Sensor readings, vitals, load, weather, prices, and measurements.

  • Probabilistic forecasts where q10 through q90 prediction bands matter.

  • Known-future-covariate forecasts, such as price, promotion, holiday, weather, store attributes, product family, or region effects.

  • Forecast-vs-actual anomaly review using prediction intervals.

Do not use it for classification, clustering, causal interpretation, coefficient analysis, general tabular prediction, or model fine-tuning. Fine-tuning is a training workflow and is intentionally not exposed by this inference MCP server.

GPU Setup With uv

For RTX 5090 and other new NVIDIA GPUs, install a PyTorch wheel that supports the GPU architecture before installing this package. CUDA 12.8 wheels are the recommended starting point for these GPUs.

git clone https://github.com/chokukil/timesfm-mcp.git
cd timesfm-mcp

uv venv .venv-gpu --python 3.10
source .venv-gpu/bin/activate

uv pip install --upgrade --reinstall \
  torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/cu128

uv pip install -e ".[gpu]"
uv pip check

.[gpu] installs the TimesFM torch, XReg, and Flax-related extras, including einshape. If you only need standard torch forecasting without XReg/Flax dependencies, install .[torch]. If you need XReg but not Flax, install .[xreg].

Validate CUDA before starting MCP:

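uv run --python .venv-gpu/bin/python python - <<'PY'
import torch

print(torch.__version__)
print(torch.cuda.is_available())
# Stop with a clear message instead of failing inside get_device_name.
if not torch.cuda.is_available():
    raise SystemExit("CUDA is not available to this PyTorch build")
print(torch.cuda.get_device_name(0))
# Run a small matmul on the GPU to confirm kernels execute on this architecture.
x = torch.randn((512, 512), device="cuda")
y = x @ x
torch.cuda.synchronize()
print("cuda matmul ok")
PY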

Start The Server

Start with a conservative batch size. Do not set CUDA_VISIBLE_DEVICES to an empty value unless you intentionally want to hide the GPU from the server.

export TIMESFM_BATCH_SIZE=64
export PYTHONNOUSERSITE=1

timesfm-mcp --transport sse --host 0.0.0.0 --port 8765

The SSE endpoint will be:

http://<host>:8765/sse

For local stdio clients:

timesfm-mcp --transport stdio

For streamable HTTP:

timesfm-mcp --transport streamable-http --host 0.0.0.0 --port 8765

Environment Variables

| Variable | Default | Meaning |
| --- | --- | --- |
| TIMESFM_MODEL_ID | google/timesfm-2.5-200m-pytorch | Hugging Face model id or local model path. |
| TIMESFM_MAX_CONTEXT | 1024 | Maximum context points used by the compiled model. |
| TIMESFM_MAX_HORIZON | 256 | Maximum forecast horizon. |
| TIMESFM_BATCH_SIZE | 64 | TimesFM per_core_batch_size; raise only after memory is stable. |
| TIMESFM_NORMALIZE_INPUTS | true | Normalize each input series before forecasting. |
| TIMESFM_CONTINUOUS_QUANTILE_HEAD | true | Use continuous quantile head for better bands. |
| TIMESFM_FORCE_FLIP_INVARIANCE | true | Enforce sign symmetry. |
| TIMESFM_INFER_IS_POSITIVE | true | Clamp positive-only series to nonnegative outputs. |
| TIMESFM_FIX_QUANTILE_CROSSING | true | Enforce monotonic quantiles. |
| TIMESFM_RETURN_BACKCAST | false | Internal default; XReg requests force this to true. |
| TIMESFM_TORCH_COMPILE | false | Enable PyTorch compile when loading the model. |
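
As a rough illustration of how such variables map to typed settings, the sketch below reads a few of them with the defaults listed above. The helper names (`env_bool`, `env_int`, `read_settings`) and the set of accepted truthy strings are assumptions made for this example; the server's own parsing may differ.

```python
import os
from typing import Mapping


def env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Parse a boolean variable; the accepted spellings are an assumption."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Parse an integer variable, falling back to the default when unset or empty."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect a subset of the documented variables with their documented defaults."""
    return {
        "model_id": env.get("TIMESFM_MODEL_ID", "google/timesfm-2.5-200m-pytorch"),
        "max_context": env_int(env, "TIMESFM_MAX_CONTEXT", 1024),
        "max_horizon": env_int(env, "TIMESFM_MAX_HORIZON", 256),
        "batch_size": env_int(env, "TIMESFM_BATCH_SIZE", 64),
        "infer_is_positive": env_bool(env, "TIMESFM_INFER_IS_POSITIVE", True),
        "torch_compile": env_bool(env, "TIMESFM_TORCH_COMPILE", False),
    }


# Example with an explicit mapping instead of the real process environment.
print(read_settings({"TIMESFM_BATCH_SIZE": "32", "TIMESFM_INFER_IS_POSITIVE": "false"}))
```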

Set infer_is_positive=false per request, or TIMESFM_INFER_IS_POSITIVE=false for the process, when the metric can go below zero: returns, residuals, PnL, temperature anomalies, z-scores, or signed deltas.
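
A client can run a simple pre-flight check before building a request. The sketch below, with the hypothetical helper `suggest_infer_is_positive`, only inspects the observed history. A history without negative values does not prove the metric can never go below zero, so domain knowledge should override the suggestion.

```python
from typing import Sequence


def suggest_infer_is_positive(series_batch: Sequence[Sequence[float]]) -> bool:
    """Return False as soon as any observed value is below zero."""
    for series in series_batch:
        for value in series:
            if value < 0:
                return False
    return True


sales = [[10, 12, 11, 13]]           # counts: clamping to nonnegative is safe
returns = [[0.4, -0.2, 0.1, -0.7]]   # signed metric: clamping would distort forecasts

print(suggest_infer_is_positive(sales))
print(suggest_infer_is_positive(returns))
```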

Agent Workflow

  1. Read guidance or the timesfm://forecasting/guide resource.

  2. Call health. If cuda_probe.passed is false, fix PyTorch/CUDA before loading.

  3. Call estimate_memory for large workloads.

  4. Call warmup once if latency matters.

  5. Use the forecasting tool that matches the input shape.

  6. Interpret quantiles carefully: index 0 is the mean, followed by q10 through q90.

q10 and q90 form the central 80 percent prediction interval. q20 and q80 form the central 60 percent interval.
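
The index layout can be made explicit with a small helper. The sketch assumes each forecast step arrives as a list of ten numbers in the order described above (mean, then q10 through q90); the function names are illustrative and not part of the server.

```python
from typing import Dict, Sequence, Tuple

# Index 0 is the mean; indices 1..9 are q10, q20, ..., q90.
QUANTILE_LABELS = ["mean"] + [f"q{p}" for p in range(10, 100, 10)]


def label_quantile_row(row: Sequence[float]) -> Dict[str, float]:
    """Attach names to one forecast step's ten values."""
    if len(row) != len(QUANTILE_LABELS):
        raise ValueError(f"expected {len(QUANTILE_LABELS)} values, got {len(row)}")
    return dict(zip(QUANTILE_LABELS, row))


def central_interval(row: Sequence[float], coverage: int) -> Tuple[float, float]:
    """Return (lower, upper) for the central 80 or 60 percent interval."""
    bounds = {80: ("q10", "q90"), 60: ("q20", "q80")}
    if coverage not in bounds:
        raise ValueError("coverage must be 80 or 60")
    lower, upper = bounds[coverage]
    labelled = label_quantile_row(row)
    return labelled[lower], labelled[upper]


# Made-up numbers for a single forecast step.
step = [50, 40, 43, 45, 47, 50, 53, 55, 57, 60]
print(central_interval(step, 80))
print(central_interval(step, 60))
```

A common mistake is to treat index 0 as q10 or as the median. Labelling the row first avoids off-by-one errors when reading bands.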

Tool Examples

Forecast Values

{
  "inputs": [[10, 12, 11, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27, 28, 30, 31, 33, 34, 36, 37, 39, 40, 42, 43, 45, 46, 48, 49, 51, 52, 54, 55]],
  "horizon": 7,
  "names": ["sales"],
  "infer_is_positive": true
}

Forecast With Covariates

Dynamic covariates must have length len(input_series) + horizon for each series. The tail values are known future covariates.

{
  "inputs": [[100, 101, 103, 105, 104, 106, 108, 109, 111, 113, 112, 114, 116, 118, 119, 120, 122, 124, 123, 125, 127, 129, 130, 132, 133, 135, 137, 138, 140, 141, 143, 145]],
  "horizon": 4,
  "dynamic_numerical_covariates": {
    "price": [[9.9, 9.9, 9.8, 9.8, 9.7, 9.7, 9.7, 9.6, 9.6, 9.6, 9.5, 9.5, 9.5, 9.5, 9.4, 9.4, 9.4, 9.3, 9.3, 9.3, 9.2, 9.2, 9.2, 9.1, 9.1, 9.1, 9.0, 9.0, 9.0, 8.9, 8.9, 8.9, 8.8, 8.8, 8.8, 8.8]]
  },
  "static_categorical_covariates": {
    "region": ["seoul"]
  },
  "xreg_mode": "xreg + timesfm"
}
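
Because a covariate array of the wrong length is an easy mistake, a client may want to check lengths before sending the request. The following sketch applies the len(input_series) + horizon rule to the request fields shown above; `check_dynamic_covariates` is a hypothetical client-side helper, not a server tool.

```python
from typing import Dict, List, Sequence


def check_dynamic_covariates(
    inputs: Sequence[Sequence[float]],
    horizon: int,
    dynamic_covariates: Dict[str, Sequence[Sequence[float]]],
) -> List[str]:
    """Return a list of problems; an empty list means every length matches."""
    problems: List[str] = []
    for name, per_series in dynamic_covariates.items():
        # One covariate array is expected per input series.
        if len(per_series) != len(inputs):
            problems.append(
                f"{name}: expected {len(inputs)} series, got {len(per_series)}"
            )
            continue
        for i, (series, covariate) in enumerate(zip(inputs, per_series)):
            # History plus the known future values for the forecast window.
            expected = len(series) + horizon
            if len(covariate) != expected:
                problems.append(
                    f"{name}[{i}]: expected length {expected}, got {len(covariate)}"
                )
    return problems


history = [[100, 101, 103]]
print(check_dynamic_covariates(history, 2, {"price": [[9.9, 9.9, 9.8, 9.8, 9.7]]}))
print(check_dynamic_covariates(history, 2, {"price": [[9.9, 9.9, 9.8]]}))
```

In the request above, the input series has 32 points and the horizon is 4, so the price array carries 36 values.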

Detect Anomalies

{
  "inputs": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]],
  "actuals": [[33, 60]],
  "horizon": 2,
  "names": ["metric"]
}

Severity rules:

  • normal: actual is inside q20 to q80.

  • warning: actual is outside q20 to q80 but inside q10 to q90.

  • critical: actual is outside q10 to q90.
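
The three rules translate directly into a comparison against the two bands. This is a minimal sketch rather than the server's implementation; in particular, treating the band edges as inside the band is an assumption.

```python
def classify_severity(
    actual: float, q10: float, q20: float, q80: float, q90: float
) -> str:
    """Map one actual value to normal, warning, or critical."""
    # Assumption: values exactly on a band edge count as inside the band.
    if q20 <= actual <= q80:
        return "normal"
    if q10 <= actual <= q90:
        return "warning"
    return "critical"


# Made-up bands for one forecast step.
bands = {"q10": 40, "q20": 43, "q80": 57, "q90": 60}
for actual in (50, 58, 61):
    print(actual, classify_severity(actual, **bands))
```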

Security

The SSE and streamable HTTP transports do not add authentication by themselves. Bind to 127.0.0.1 for local-only use. Bind to 0.0.0.0 only on a trusted network or behind your own authentication, firewall, or reverse proxy.

Development

uv pip install -e ".[gpu,dev]"
pytest -q
ruff check .

License

Apache-2.0. This repository wraps TimesFM and depends on the upstream timesfm Python package and model weights.
