Server Configuration

Describes the environment variables the server reads. All of them are optional.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| MODELCOSTSAVER_LEDGER | No | Enable the local record_usage write. | off |
| MODELCOSTSAVER_REFRESH | No | Enable the opt-in remote catalog refresh. | off |
| MODELCOSTSAVER_PROVIDER | No | Bias select_optimal_model toward a specific provider. | none |
| MODELCOSTSAVER_PROVIDERS | No | Allowlist for recommendations (e.g., 'anthropic,openai'). | client-derived |
| MODELCOSTSAVER_TELEMETRY | No | Telemetry setting (kept off for transparency). | off |
| MODELCOSTSAVER_FAST_MODEL | No | Pin a preferred model for the fast tier. | catalog cheapest |
| MODELCOSTSAVER_CATALOG_URL | No | Override the refresh source URL. | bundled |
| MODELCOSTSAVER_INCLUDE_LOCAL | No | Surface self-hosted / $0 models. | off |
| MODELCOSTSAVER_TRIVIAL_MODEL | No | Pin a preferred model for the trivial tier. | catalog cheapest |
| MODELCOSTSAVER_STANDARD_MODEL | No | Pin a preferred model for the standard tier. | catalog cheapest |
| MODELCOSTSAVER_CHARS_PER_TOKEN | No | Tune the token estimator. | 4 |
| MODELCOSTSAVER_REASONING_MODEL | No | Pin a preferred model for the reasoning tier. | catalog cheapest |
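
As a rough illustration of how a few of these settings might be interpreted, here is a minimal Python sketch. It is not the server's actual code: `load_settings` and `estimate_tokens` are hypothetical helpers, the rule that any value other than 'off' enables a flag is an assumption, and so is the idea that the token estimator divides the character count by MODELCOSTSAVER_CHARS_PER_TOKEN and rounds up.

```python
import math
import os


def load_settings(env=None):
    """Read a few MODELCOSTSAVER_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    raw_providers = env.get("MODELCOSTSAVER_PROVIDERS", "")
    return {
        # Comma-separated allowlist; an empty list stands in for "client-derived".
        "providers": [p.strip() for p in raw_providers.split(",") if p.strip()],
        # Documented default is 4 characters per token.
        "chars_per_token": float(env.get("MODELCOSTSAVER_CHARS_PER_TOKEN", "4")),
        # Assumption: any value other than 'off' enables the flag.
        "include_local": env.get("MODELCOSTSAVER_INCLUDE_LOCAL", "off") != "off",
    }


def estimate_tokens(text, chars_per_token=4.0):
    """Hypothetical estimator: character count / chars-per-token, rounded up."""
    return math.ceil(len(text) / chars_per_token)


settings = load_settings({"MODELCOSTSAVER_PROVIDERS": "anthropic, openai"})
print(settings["providers"])
print(estimate_tokens("Summarize this paragraph.", settings["chars_per_token"]))
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without changing the real environment.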

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | { "listChanged": true } |
| prompts | { "listChanged": true } |
| resources | { "listChanged": true } |

Tools

Functions exposed to the LLM to take actions

| Name | Description |
| --- | --- |
| estimate_cost (A) | Estimate the cost of a single LLM call for one model from known or estimated token counts. Offline, no keys. |
| predict_cost (A) | Forecast the cost of a prompt across candidate models before the call. Returns a cheapest-first ranking with assumptions. Offline. |
| select_optimal_model (B) | Pick the single cheapest model that meets the task tier, capabilities, and budget, with full reasoning and a fallbackChain. Offline. |
| compare_models (B) | Compare models side by side for a fixed token shape, cheapest first, with the multiple of the cheapest. Offline. |
| list_models (B) | Return the model catalog with pricing, optionally filtered. capabilities are arrays. Offline. |
| get_pricing (C) | Return the model catalog with pricing, optionally filtered. capabilities are arrays. Offline. |
| optimize_request (A) | Check whether a cheaper capable model exists for a call you plan to make, and report the savings. Offline. |
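
The arithmetic behind tools such as estimate_cost, compare_models, and optimize_request can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's implementation: the catalog entries, model names, and prices are made up, pricing is assumed to be quoted per million tokens, and the Python functions only mirror the tool names for readability. Capability and tier filtering are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_mtok: float   # USD per million input tokens (made-up figure)
    output_per_mtok: float  # USD per million output tokens (made-up figure)


# Hypothetical catalog; names and prices are placeholders, not real pricing.
CATALOG = [
    Model("model-large", 15.00, 75.00),
    Model("model-small", 0.50, 1.50),
    Model("model-medium", 3.00, 15.00),
]


def estimate_cost(model, input_tokens, output_tokens):
    """Cost of one call: tokens times the per-million-token rate, per direction."""
    return (input_tokens * model.input_per_mtok
            + output_tokens * model.output_per_mtok) / 1_000_000


def compare_models(models, input_tokens, output_tokens):
    """Rank models cheapest first with each cost's multiple of the cheapest."""
    costs = sorted(
        ((m.name, estimate_cost(m, input_tokens, output_tokens)) for m in models),
        key=lambda pair: pair[1],
    )
    cheapest = costs[0][1]
    # A $0 cheapest model makes the multiple undefined, so report None instead.
    return [(name, cost, cost / cheapest if cheapest > 0 else None)
            for name, cost in costs]


def optimize_request(current, models, input_tokens, output_tokens):
    """Return (cheapest model name, savings versus the current model)."""
    name, cost, _ = compare_models(models, input_tokens, output_tokens)[0]
    return name, estimate_cost(current, input_tokens, output_tokens) - cost


ranked = compare_models(CATALOG, input_tokens=2000, output_tokens=500)
for name, cost, multiple in ranked:
    print(f"{name}: ${cost:.5f} ({multiple:.1f}x cheapest)")
```

Fixing the token shape (2,000 input and 500 output tokens here) is what makes the side-by-side multiples comparable across models.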

Prompts

Interactive templates invoked by user choice

| Name | Description |
| --- | --- |
| modelcostsaver | How and when to use ModelCostSaver: forecast cost and pick the cheapest capable model before an LLM call. |
| modelcostsaver-setup | Self-configure ModelCostSaver in this client. Hand this to your agent to set it up. |

Resources

Contextual data attached and managed by the client

| Name | Description |
| --- | --- |
| catalog | The full model catalog with per-token pricing, tiers, and capabilities. Offline, versioned. |

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sachinuppal/modelcostsaver'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.