Which integrations are available for this server?

Enables cost tracking and quality-based routing for Google's Gemini models through API integration and budget monitoring. Integrates with local Ollama instances to detect available models, assess routing intelligence, and compare local inference performance with cloud providers. Provides real-time spend tracking, usage logging, and cost optimization for OpenAI models like GPT-4o and o1.

How do I use ComputeGauge MCP?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@ComputeGauge MCP show me my total AI spend and current budget status"

That's it! The server will respond to your query, and you can continue using it as needed.

ComputeGauge MCP

Official

by ComputeGauge


TypeScript

Hybrid

@computegauge/mcp

The cost intelligence + agent credibility layer for AI agents. Install once, and every session is cost-aware and credibility-building.

npm install -g @computegauge/mcp

Why This Exists

AI agents are powerful but expensive. A single Claude Code session can cost $5-50+. A multi-agent workflow can burn through $100 in minutes. And nobody tells the agent — or the human — until the bill arrives.

ComputeGauge MCP makes agents cost-aware AND gives them a credibility score. Any MCP-compatible agent (Claude, Cursor, Windsurf, custom agents) gets:

Cost Intelligence — Know which model is optimal for every task, track spend in real-time
Agent Credibility — Build a reputation score (0-1000) by making smart decisions. Compete on a leaderboard.
Local→Cloud Routing — Detect when local inference isn't good enough, route to cloud, earn credibility for smart routing

The result: agents that spend 40-70% less, build visible credibility, and know when to route to cloud.

Setup — 30 Seconds

Claude Desktop / Claude Code

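Add to ~/.config/claude/claude_desktop_config.json (on macOS, Claude Desktop typically reads ~/Library/Application Support/Claude/claude_desktop_config.json instead):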

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"]
    }
  }
}

That's it. Restart Claude. Every conversation now has cost intelligence + credibility tracking.

With Provider API Keys (Enhanced)

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "OPENAI_API_KEY": "sk-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}

With Local Inference (Ollama, vLLM, etc.)

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "OLLAMA_MODELS": "llama3.3:70b,qwen2.5:7b,deepseek-r1:14b",
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}
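
If the config file already lists other MCP servers, the entry can be merged in rather than pasted over the file. The following is a minimal sketch of that merge; `withComputeGauge`, `McpConfig`, and `McpServerEntry` are hypothetical names used for illustration and are not part of @computegauge/mcp. Only the entry shape (command, args, env) comes from the configs above.

```typescript
// Hypothetical types mirroring the JSON config shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: { [name: string]: string };
}

interface McpConfig {
  mcpServers?: { [name: string]: McpServerEntry };
}

// Returns a new config with a "computegauge" entry added.
// Existing servers are kept; the input object is not mutated.
// Illustrative helper only, not an API of the package.
function withComputeGauge(
  config: McpConfig,
  env: { [name: string]: string } = {}
): McpConfig {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "@computegauge/mcp"],
  };
  // Attach "env" only when at least one variable is provided.
  if (Object.keys(env).length > 0) {
    entry.env = { ...env };
  }
  return {
    ...config,
    mcpServers: { ...(config.mcpServers || {}), computegauge: entry },
  };
}
```

Calling `withComputeGauge(existingConfig, { COMPUTEGAUGE_BUDGET_TOTAL: "50" })` and serializing the result with `JSON.stringify(result, null, 2)` should give a file equivalent to the enhanced configs above, with any other servers left in place.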

Cursor

Add to Cursor MCP settings (if your settings file already has an "mcpServers" object, place this entry inside it):

{
  "computegauge": {
    "command": "npx",
    "args": ["-y", "@computegauge/mcp"]
  }
}

Tools Reference

Agent-Native Tools (use automatically every session)

Tool	When to Call	What It Does	Credibility
`pick_model`	Before any API request	Returns the optimal model for a task	+8 Routing Intelligence
`log_request`	After any API request	Logs the request cost	+3 Honest Reporting
`session_cost`	Every 5-10 requests	Shows cumulative cost and budget	—
`rate_recommendation`	After completing a task	Rate how well the model performed	+5 Quality Contribution
`model_ratings`	When curious about quality	View model quality leaderboard	—
`improvement_cycle`	At session end	Run continuous improvement engine	+15 Quality Contribution
`integrity_report`	For transparency	View rating acceptance/rejection stats	—

Credibility Tools (the reputation protocol)

Tool	When to Call	What It Does	Credibility
`credibility_profile`	Anytime	View your 0-1000 credibility score, tier, badges	—
`credibility_leaderboard`	To compete	See how you rank vs other agents	—
`route_to_cloud`	After local→cloud routing	Report smart routing decision	+70 Cloud Routing
`assess_routing`	Before choosing local vs cloud	Should this task stay local?	—
`cluster_status`	To check local capabilities	View local endpoints, models, hardware	—

Intelligence Tools (for user questions)

Tool	Description
`get_spend_summary`	User's total AI spend across all providers
`get_budget_status`	Budget utilization and alerts
`get_model_pricing`	Current pricing for any model
`get_cost_comparison`	Compare costs for specific workloads
`suggest_savings`	Actionable cost optimization recommendations
`get_usage_trend`	Spend trends and anomaly detection

Resources

Resource	URI	Description
Config	`computegauge://config`	Current server configuration
Session	`computegauge://session`	Real-time session cost data
Ratings	`computegauge://ratings`	Model quality leaderboard
Credibility	`computegauge://credibility`	Agent credibility profile + leaderboard
Cluster	`computegauge://cluster`	Local inference cluster status
Quickstart	`computegauge://quickstart`	Agent onboarding guide

Prompts

Prompt	Description
`cost_aware_system`	System prompt that makes any agent cost-aware + credibility-building
`daily_cost_report`	Generate a quick daily cost report
`optimize_workflow`	Analyze and optimize a described AI workflow

Agent Credibility System

Every smart decision earns credibility points on a 0-1000 scale:

Category	How to Earn	Points
🧠 Routing Intelligence	Using `pick_model` wisely, avoiding overspec	+8 to +15 per event
💰 Cost Efficiency	Staying under budget, significant savings	+5 to +30 per event
✅ Task Success	Completing tasks successfully	+10 to +25 per event
📊 Honest Reporting	Logging requests, reporting failures honestly	+3 to +10 per event
☁️ Cloud Routing	Smart local→cloud routing via ComputeGauge	+25 to +70 per event
⭐ Quality Contribution	Rating models, running improvement cycles	+5 to +15 per event
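
To make the table concrete, here is a minimal sketch of how a score could accumulate from events. The point ranges are taken from the table above; the category keys, the `applyEvent` helper, and the choice to cap the score at 1000 are assumptions made for illustration, not the server's actual implementation.

```typescript
// Category keys are hypothetical identifiers for the rows of the table above.
type Category =
  | "routing_intelligence"
  | "cost_efficiency"
  | "task_success"
  | "honest_reporting"
  | "cloud_routing"
  | "quality_contribution";

// [min, max] points per event, copied from the table.
const POINT_RANGES: { [K in Category]: [number, number] } = {
  routing_intelligence: [8, 15],
  cost_efficiency: [5, 30],
  task_success: [10, 25],
  honest_reporting: [3, 10],
  cloud_routing: [25, 70],
  quality_contribution: [5, 15],
};

// Adds one event to a score. Rejects point values outside the category's
// range and caps the result at 1000 (the cap is an assumption).
function applyEvent(score: number, category: Category, points: number): number {
  const range = POINT_RANGES[category];
  if (points < range[0] || points > range[1]) {
    throw new Error(`${category} events award ${range[0]} to ${range[1]} points`);
  }
  return Math.min(1000, score + points);
}
```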

Credibility Tiers

Tier	Score	What It Means
⚪ Unrated	0-99	Just getting started
🥉 Bronze	100-299	Learning the ropes
🥈 Silver	300-499	Competent and cost-aware
🥇 Gold	500-699	Skilled optimizer
💎 Platinum	700-849	Elite decision-maker
👑 Diamond	850-1000	Best in class
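
The tier boundaries above can be expressed as a simple lookup. This is a sketch only; `tierForScore` is a hypothetical helper name, and the boundaries are the ones listed in the table.

```typescript
// Lower bound of each tier, in ascending order, from the table above.
const TIERS: { name: string; min: number }[] = [
  { name: "Unrated", min: 0 },
  { name: "Bronze", min: 100 },
  { name: "Silver", min: 300 },
  { name: "Gold", min: 500 },
  { name: "Platinum", min: 700 },
  { name: "Diamond", min: 850 },
];

// Returns the tier name for a score on the 0-1000 scale.
function tierForScore(score: number): string {
  if (!(score >= 0 && score <= 1000)) {
    throw new RangeError("score must be between 0 and 1000");
  }
  let tier = "Unrated";
  // Keep the last tier whose lower bound the score has reached.
  for (const t of TIERS) {
    if (score >= t.min) tier = t.name;
  }
  return tier;
}
```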

Earnable Badges

Badge	How to Earn
🌱 First Steps	Complete first session
💰 Cost Optimizer	Save >$10 through smart model selection
📊 Transparency Champion	Log 50+ requests accurately
☁️ Smart Router	Successfully route 10+ tasks to cloud
⭐ Quality Pioneer	Submit 25+ model ratings
🔥 Streak Master	20+ consecutive successful tasks
🥇 Gold Agent	Reach Gold tier (500+ score)
💎 Platinum Agent	Reach Platinum tier (700+ score)
👑 Diamond Agent	Reach Diamond tier (850+ score)
🌐 Hybrid Intelligence	Use both local and cloud models in one session
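
The badge criteria can be read as a list of threshold rules. The sketch below encodes them that way; the `AgentStats` fields and the `earnedBadges` helper are hypothetical, and the "accurately" condition on Transparency Champion is not modeled because the source does not say how accuracy is judged.

```typescript
// Hypothetical per-agent counters; field names are assumptions.
interface AgentStats {
  sessionsCompleted: number;
  dollarsSaved: number;
  requestsLogged: number;
  cloudRoutes: number;
  ratingsSubmitted: number;
  successStreak: number;
  score: number;
  usedLocalAndCloudInOneSession: boolean;
}

// One rule per row of the badge table above.
const BADGE_RULES: { badge: string; earned: (s: AgentStats) => boolean }[] = [
  { badge: "First Steps", earned: (s) => s.sessionsCompleted >= 1 },
  { badge: "Cost Optimizer", earned: (s) => s.dollarsSaved > 10 },
  { badge: "Transparency Champion", earned: (s) => s.requestsLogged >= 50 },
  { badge: "Smart Router", earned: (s) => s.cloudRoutes >= 10 },
  { badge: "Quality Pioneer", earned: (s) => s.ratingsSubmitted >= 25 },
  { badge: "Streak Master", earned: (s) => s.successStreak >= 20 },
  { badge: "Gold Agent", earned: (s) => s.score >= 500 },
  { badge: "Platinum Agent", earned: (s) => s.score >= 700 },
  { badge: "Diamond Agent", earned: (s) => s.score >= 850 },
  { badge: "Hybrid Intelligence", earned: (s) => s.usedLocalAndCloudInOneSession },
];

// Returns the names of all badges whose rule is satisfied, in table order.
function earnedBadges(stats: AgentStats): string[] {
  return BADGE_RULES.filter((rule) => rule.earned(stats)).map((rule) => rule.badge);
}
```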

Local Cluster Integration

ComputeGauge auto-detects local inference endpoints:

Platform	Environment Variable	Default
Ollama	`OLLAMA_HOST`	`http://localhost:11434`
vLLM	`VLLM_HOST`	—
llama.cpp	`LLAMACPP_HOST`	—
TGI	`TGI_HOST`	—
LocalAI	`LOCALAI_HOST`	—
Custom	`LOCAL_LLM_ENDPOINT`	—

Set OLLAMA_MODELS="llama3.3:70b,qwen2.5:7b" (comma-separated) to declare available models.
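
As a rough illustration of the table above, the sketch below resolves which endpoint URLs are configured and parses the comma-separated model list. It only reads environment values passed in as an object; it does not probe the network, so it is not the server's actual auto-detection. The helper names and the `Env` type are hypothetical, and treating Ollama's default URL as always present is an assumption.

```typescript
// Environment passed in explicitly so the sketch has no Node.js dependency.
type Env = { [name: string]: string | undefined };

interface LocalEndpoint {
  platform: string;
  url: string;
}

// Variables and defaults copied from the table above.
const ENDPOINT_VARS: { platform: string; variable: string; fallback?: string }[] = [
  { platform: "Ollama", variable: "OLLAMA_HOST", fallback: "http://localhost:11434" },
  { platform: "vLLM", variable: "VLLM_HOST" },
  { platform: "llama.cpp", variable: "LLAMACPP_HOST" },
  { platform: "TGI", variable: "TGI_HOST" },
  { platform: "LocalAI", variable: "LOCALAI_HOST" },
  { platform: "Custom", variable: "LOCAL_LLM_ENDPOINT" },
];

// Lists endpoints that have a URL, either from the environment or a default.
function configuredEndpoints(env: Env): LocalEndpoint[] {
  const found: LocalEndpoint[] = [];
  for (const spec of ENDPOINT_VARS) {
    const url = env[spec.variable] || spec.fallback;
    if (url) found.push({ platform: spec.platform, url });
  }
  return found;
}

// Splits OLLAMA_MODELS on commas, trimming blanks and dropping empty items.
function parseOllamaModels(env: Env): string[] {
  return (env["OLLAMA_MODELS"] || "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}
```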

The Local→Cloud Routing Flow

1. Agent calls assess_routing("code_generation", quality="good")
2. ComputeGauge checks: local llama3.3:70b quality for code_generation = 80/100
3. "Good" quality threshold = 78 → Local model is sufficient!
4. Agent uses local model → saves money → earns credibility for honest assessment

OR:

1. Agent calls assess_routing("complex_reasoning", quality="excellent")
2. ComputeGauge checks: local llama3.3:70b quality for complex_reasoning = 78/100
3. "Excellent" quality threshold = 88 → Quality gap of 10 points → Route to cloud!
4. Agent calls pick_model → gets Claude Sonnet 4 → executes → calls route_to_cloud
5. Agent earns +70 credibility points for smart routing decision
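
The two flows above boil down to comparing a local quality score with a threshold. The following sketch reproduces that comparison under stated assumptions: the thresholds 78 ("good") and 88 ("excellent") come from the examples above, while the `decideRoute` helper, its return shape, and the rule that a score equal to the threshold stays local are illustrative guesses, not the behavior of the real `assess_routing` tool.

```typescript
// Only the two quality levels that appear in the examples above.
type QualityLevel = "good" | "excellent";

const THRESHOLDS: { [K in QualityLevel]: number } = {
  good: 78,
  excellent: 88,
};

interface RoutingDecision {
  route: "local" | "cloud";
  gap: number; // points by which the local model falls short (0 if sufficient)
}

// Stays local when the local score meets the threshold, otherwise routes to cloud.
function decideRoute(localQuality: number, level: QualityLevel): RoutingDecision {
  const gap = Math.max(0, THRESHOLDS[level] - localQuality);
  return { route: gap === 0 ? "local" : "cloud", gap };
}
```

With the numbers from the flows above, `decideRoute(80, "good")` stays local, and `decideRoute(78, "excellent")` routes to cloud with a gap of 10.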

How `pick_model` Works

The decision engine scores every model across three dimensions:

Quality — Per-task-type scores for 14 task types
Cost — Real pricing from 8 providers, 20+ models, calculated per-call (log-scale normalization)
Speed — Relative inference speed scores

Priority	Quality	Cost	Speed
`cheapest`	20%	70%	10%
`balanced`	45%	35%	20%
`best_quality`	70%	10%	20%
`fastest`	25%	15%	60%
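
A minimal sketch of weighted scoring with these priorities follows. The weights are the ones in the table above. Everything else is an assumption for illustration: the `Candidate` shape, the 0-100 scales, the exact log-scale cost formula (cheapest candidate scores 100, most expensive scores 0), and the `pickByPriority` helper. The real decision engine may differ.

```typescript
type Priority = "cheapest" | "balanced" | "best_quality" | "fastest";

interface Weights { quality: number; cost: number; speed: number }

// Weights copied from the priority table above.
const WEIGHTS: { [P in Priority]: Weights } = {
  cheapest: { quality: 0.2, cost: 0.7, speed: 0.1 },
  balanced: { quality: 0.45, cost: 0.35, speed: 0.2 },
  best_quality: { quality: 0.7, cost: 0.1, speed: 0.2 },
  fastest: { quality: 0.25, cost: 0.15, speed: 0.6 },
};

// Hypothetical candidate: quality and speed on 0-100, cost in USD per call (> 0).
interface Candidate { name: string; quality: number; costPerCall: number; speed: number }

// Assumed log-scale normalization: lower cost maps to a higher 0-100 score.
function costScores(candidates: Candidate[]): number[] {
  const logs = candidates.map((c) => Math.log(c.costPerCall));
  const lo = Math.min(...logs);
  const hi = Math.max(...logs);
  return logs.map((l) => (hi === lo ? 100 : (100 * (hi - l)) / (hi - lo)));
}

// Returns the name of the candidate with the highest weighted score.
function pickByPriority(candidates: Candidate[], priority: Priority): string {
  if (candidates.length === 0) throw new Error("no candidates");
  const w = WEIGHTS[priority];
  const costs = costScores(candidates);
  let bestName = "";
  let bestScore = -Infinity;
  candidates.forEach((c, i) => {
    const total = w.quality * c.quality + w.cost * (costs[i] as number) + w.speed * c.speed;
    if (total > bestScore) {
      bestScore = total;
      bestName = c.name;
    }
  });
  return bestName;
}

// Made-up candidates, not real model data.
const sampleCandidates: Candidate[] = [
  { name: "big", quality: 98, costPerCall: 0.1, speed: 60 },
  { name: "mid", quality: 85, costPerCall: 0.01, speed: 75 },
  { name: "small", quality: 30, costPerCall: 0.001, speed: 95 },
];
```

With these made-up candidates, `best_quality` selects "big", `balanced` selects "mid", and both `cheapest` and `fastest` select "small", which shows how the same three scores lead to different picks as the weights shift.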

Model Coverage

Provider	Models	Tier Range
Anthropic	Claude Opus 4, Sonnet 4, Sonnet 3.5, Haiku 3.5	Frontier → Budget
OpenAI	o1, GPT-4o, o3-mini, GPT-4o-mini	Frontier → Budget
Google	Gemini 2.0 Pro, 1.5 Pro, 2.0 Flash	Premium → Budget
DeepSeek	Reasoner, Chat	Value → Budget
Groq	Llama 3.3 70B, Llama 3.1 8B	Value → Budget
Together	Llama 3.3 70B Turbo, Qwen 2.5 72B	Value
Mistral	Large, Small	Premium → Budget

Local Models Supported

Model	Quality (general)	Best For
llama3.3:70b	79/100	General tasks, code
qwen2.5:72b	81/100	Code, math, translation
deepseek-r1:70b	80/100	Reasoning, math, code
deepseek-r1:14b	68/100	Budget reasoning
phi3:14b	60/100	Simple tasks
llama3.1:8b	58/100	Classification, simple QA
mistral:7b	58/100	Simple tasks

Environment Variables

Variable	Required	Description
`COMPUTEGAUGE_DASHBOARD_URL`	No	URL of ComputeGauge dashboard
`COMPUTEGAUGE_API_KEY`	No	API key for dashboard access
`COMPUTEGAUGE_BUDGET_TOTAL`	No	Session budget limit in USD
`COMPUTEGAUGE_BUDGET_ANTHROPIC`	No	Per-provider monthly budget
`COMPUTEGAUGE_BUDGET_OPENAI`	No	Per-provider monthly budget
`ANTHROPIC_API_KEY`	No	Enables Anthropic provider detection
`OPENAI_API_KEY`	No	Enables OpenAI provider detection
`GOOGLE_API_KEY`	No	Enables Google provider detection
`OLLAMA_HOST`	No	Ollama inference endpoint
`OLLAMA_MODELS`	No	Comma-separated local model names
`VLLM_HOST`	No	vLLM inference endpoint
`COMPUTEGAUGE_GPU`	No	GPU name for hardware detection
`COMPUTEGAUGE_VRAM_GB`	No	VRAM in GB
`COMPUTEGAUGE_COST_PER_HOUR`	No	Amortized hardware cost/hr
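
Because these values arrive as strings, budget settings need parsing and validation before use. The sketch below shows one way to do that; `readBudgets`, `parseUsd`, and the `Budgets` shape are hypothetical, and rejecting negative or non-numeric values is an assumption about sensible behavior rather than documented server behavior.

```typescript
// Environment passed in explicitly so the sketch has no Node.js dependency.
type BudgetEnv = { [name: string]: string | undefined };

interface Budgets {
  total: number | undefined;     // COMPUTEGAUGE_BUDGET_TOTAL (session, USD)
  anthropic: number | undefined; // COMPUTEGAUGE_BUDGET_ANTHROPIC (monthly)
  openai: number | undefined;    // COMPUTEGAUGE_BUDGET_OPENAI (monthly)
}

// Parses a USD amount; unset or blank values mean "no budget configured".
function parseUsd(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  if (!isFinite(value) || value < 0) {
    throw new Error(`Invalid budget value: ${raw}`);
  }
  return value;
}

// Reads the three budget variables from the table above.
function readBudgets(env: BudgetEnv): Budgets {
  return {
    total: parseUsd(env["COMPUTEGAUGE_BUDGET_TOTAL"]),
    anthropic: parseUsd(env["COMPUTEGAUGE_BUDGET_ANTHROPIC"]),
    openai: parseUsd(env["COMPUTEGAUGE_BUDGET_OPENAI"]),
  };
}
```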

For Agent Developers

If you're building AI agents (via Claude Agent SDK, LangChain, CrewAI, AutoGen, etc.), ComputeGauge MCP is the easiest way to add cost awareness AND agent credibility:

Zero integration effort — Just add the MCP server to your agent's config
No code changes — The agent discovers all 18 tools automatically via MCP
Immediate value — pick_model returns recommendations on first call, credibility tracking starts automatically
Session tracking built-in — Full cost visibility per agent run
Credibility system — Your agent earns a visible reputation score that users can see
Local cluster support — Auto-detect and leverage on-prem inference
Budget guardrails — Warnings when approaching limits

Pattern: Cost-Aware + Credibility-Building Agent Loop

1. Agent receives task
2. Agent calls assess_routing(task_type) → local or cloud?
3. Agent calls pick_model(task_type, priority="balanced")
4. Agent uses recommended model for the task
5. Agent calls log_request(provider, model, tokens)
6. Agent calls rate_recommendation(model, rating, success)
7. If cloud-routed: agent calls route_to_cloud(task_type, reason, model)
8. Every 5 requests, agent calls session_cost()
9. If session cost > 80% of budget, switch to priority="cheapest"
10. At session end: check credibility_profile()

This pattern reduces costs by 40-70% while building a credibility score that makes users trust the agent more.
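
Steps 2 through 9 of the pattern can be sketched as a single function. This is an illustration under several assumptions: `ToolCaller` is a hypothetical synchronous stand-in for an MCP client (real MCP tool calls are asynchronous), and the argument names, the `route`, `model`, `provider`, `total`, and `budget` result fields, and the 1-5 rating scale are guesses, since the source does not document the tool schemas.

```typescript
// Hypothetical synchronous tool caller; inject a real client adapter in practice.
type ToolCaller = (
  tool: string,
  args: { [key: string]: unknown }
) => { [key: string]: unknown };

interface LoopState {
  requests: number;
  priority: "balanced" | "cheapest";
}

// Runs steps 2-9 for one task and returns the updated loop state.
function runTask(
  call: ToolCaller,
  taskType: string,
  state: LoopState,
  execute: (model: string) => { tokens: number; success: boolean }
): LoopState {
  const routing = call("assess_routing", { task_type: taskType }); // step 2
  const pick = call("pick_model", { task_type: taskType, priority: state.priority }); // step 3
  const model = String(pick["model"]);
  const provider = String(pick["provider"]);

  const result = execute(model); // step 4: the agent's own work

  call("log_request", { provider, model, tokens: result.tokens }); // step 5
  call("rate_recommendation", {
    model,
    rating: result.success ? 5 : 1, // assumed 1-5 scale
    success: result.success,
  }); // step 6
  if (routing["route"] === "cloud") {
    call("route_to_cloud", { task_type: taskType, reason: "quality gap", model }); // step 7
  }

  const requests = state.requests + 1;
  let priority = state.priority;
  if (requests % 5 === 0) {
    const cost = call("session_cost", {}); // step 8
    const spent = Number(cost["total"]);
    const budget = Number(cost["budget"]);
    // Step 9: past 80% of the budget, prefer the cheapest models.
    if (budget > 0 && spent > 0.8 * budget) priority = "cheapest";
  }
  return { requests, priority };
}
```

A caller would start from `{ requests: 0, priority: "balanced" }`, pass the returned state into the next `runTask` call, and check `credibility_profile` once the session ends (step 10).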

License

Apache-2.0 — Free to use, modify, and distribute.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ComputeGauge/mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.
