
prime-intellect-mcp

by kvrancic

Server Configuration

Describes the environment variables required to run the server.

Name                  Required  Description                                      Default
PRIME_API_KEY         Yes       Your Prime Intellect API key (starts with pit_)  —
PRIME_MAX_TOTAL_USD   No        Hard cap on total spend in USD for a single pod  40
PRIME_MAX_HOURLY_USD  No        Hard cap on hourly spend in USD                  5
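As a sketch of how these variables might be consumed (the helper name and parsing below are assumptions for illustration, not the server's actual code; variable names and defaults come from the table above):

```python
import os

def load_caps():
    """Read the spend-cap configuration from the environment.

    Variable names and defaults follow the table above; this helper
    itself is illustrative, not the server's implementation.
    """
    api_key = os.environ.get("PRIME_API_KEY", "")
    if not api_key.startswith("pit_"):
        raise RuntimeError("PRIME_API_KEY must be set and start with pit_")
    max_total = float(os.environ.get("PRIME_MAX_TOTAL_USD", "40"))
    max_hourly = float(os.environ.get("PRIME_MAX_HOURLY_USD", "5"))
    return api_key, max_total, max_hourly
```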

Capabilities

Features and capabilities supported by this server

Capability    Details
tools         { "listChanged": true }
logging       {}
prompts       { "listChanged": false }
resources     { "subscribe": false, "listChanged": false }
extensions    { "io.modelcontextprotocol/ui": {} }
experimental  {}

Tools

Functions exposed to the LLM to take actions

list_gpu_types

List every GPU type Prime Intellect currently offers (e.g. "H100_80GB", "A100_80GB").

Use this when the user is vague about what they want. Pass the result into list_availability or pod_quote.

list_availability

List currently available GPU pods that match the filters.

Returns the SDK's GPUAvailability rows (cloud_id, gpu_type, gpu_count, prices, disk/vcpu/memory bounds, stock_status, ...). Use this to pick a target before pod_quote, or to show the user options.

get_wallet_balance

Return the current Prime Intellect wallet balance and recent billings.

Use this to estimate how long a quoted pod can run, or to check why pod_create returned an insufficient-funds error.

pod_quote

Get a non-binding price quote plus a reserved provisioning payload.

Returns a quote_token (TTL=60s) that you pass to pod_create with confirm=True to actually provision. This tool has NO side effects.

The server picks the cheapest matching GPUAvailability row that satisfies the requested disk/vcpu/memory. If none matches, returns an error explaining what's available.
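The cheapest-row selection described above can be sketched as follows (field names mimic GPUAvailability rows but are illustrative, not the SDK's exact schema):

```python
def pick_cheapest(rows, min_disk_gb=0, min_vcpu=0, min_memory_gb=0):
    """Pick the cheapest availability row that satisfies the resource
    minimums, mirroring the selection rule described above."""
    candidates = [
        r for r in rows
        if r["disk_gb"] >= min_disk_gb
        and r["vcpu"] >= min_vcpu
        and r["memory_gb"] >= min_memory_gb
    ]
    if not candidates:
        raise LookupError("no availability row satisfies the requested resources")
    return min(candidates, key=lambda r: r["hourly_usd"])
```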

pod_create

Provision a Prime Intellect GPU pod (or preview the provisioning).

With confirm=False: returns a dry-run preview describing what would happen. With confirm=True: validates spend caps + quote freshness, then provisions.

The server enforces:

  • quote_token must be fresh (TTL 60s)

  • hourly_usd ≤ PRIME_MAX_HOURLY_USD

  • hourly_usd × max_lifetime_hours ≤ PRIME_MAX_TOTAL_USD

  • estimated total ≤ wallet balance

On success, the pod is recorded in local state.json so pod_check_runaway can warn about overdue pods later.
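The enforcement rules above can be sketched as a pure function (parameter names are illustrative; the default caps follow the documented values):

```python
import time

QUOTE_TTL_S = 60  # quote_token freshness window documented above

def validate_create(quote_issued_at, hourly_usd, max_lifetime_hours,
                    wallet_balance_usd, max_hourly_usd=5.0,
                    max_total_usd=40.0, now=None):
    """Apply the documented pre-provisioning checks in order;
    return "ok" or the first failing rule."""
    now = time.time() if now is None else now
    if now - quote_issued_at > QUOTE_TTL_S:
        return "quote expired (older than 60s)"
    if hourly_usd > max_hourly_usd:
        return "hourly_usd exceeds PRIME_MAX_HOURLY_USD"
    estimated_total = hourly_usd * max_lifetime_hours
    if estimated_total > max_total_usd:
        return "estimated total exceeds PRIME_MAX_TOTAL_USD"
    if estimated_total > wallet_balance_usd:
        return "insufficient wallet balance"
    return "ok"
```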

pod_list

List every pod the API key can see (active + provisioning + stopped).

pod_status

Get the current status (provisioning / active / failed) for a pod.

With wait_for_ssh=True, blocks (polls every 5s) until ssh_connection is available — that's when you can SSH in. Returns the SSH connection string in ssh_connection (e.g. "root@1.2.3.4 -p 22000"). Use it from your Bash tool: ssh -o StrictHostKeyChecking=no <ssh_connection> "<cmd>".
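A small parser for the connection string can be sketched like this (the "user@host -p PORT" shape is an assumption inferred from the example above):

```python
def parse_ssh_connection(conn):
    """Split an ssh_connection string into (user, host, port).

    The "user@host -p PORT" shape is inferred from the example above;
    the port defaults to 22 when no -p flag is present.
    """
    parts = conn.split()
    user, host = parts[0].split("@", 1)
    port = int(parts[parts.index("-p") + 1]) if "-p" in parts else 22
    return user, host, port
```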

pod_terminate

Destroy (terminate) a pod. Idempotent on already-deleted pods.

Without confirm=True, it returns a no-op preview so you can review the decision before committing.

pod_check_runaway

Return locally tracked pods that have run past max_lifetime_hours OR whose accumulated cost is approaching PRIME_MAX_TOTAL_USD.

Call this at the start of long-running sessions to catch forgotten pods.
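The runaway criteria can be sketched as follows (warn_fraction is a hypothetical knob for what counts as "approaching" the cap; the server's actual threshold is not documented here):

```python
import time

def is_runaway(started_at, hourly_usd, max_lifetime_hours,
               max_total_usd=40.0, warn_fraction=0.8, now=None):
    """True when a pod ran past max_lifetime_hours OR its accumulated
    cost reaches warn_fraction of PRIME_MAX_TOTAL_USD."""
    now = time.time() if now is None else now
    elapsed_hours = (now - started_at) / 3600.0
    accumulated_usd = elapsed_hours * hourly_usd
    return (elapsed_hours > max_lifetime_hours
            or accumulated_usd >= warn_fraction * max_total_usd)
```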

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kvrancic/prime-intellect-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.