# Token Tracking
Track response sizes and estimate token usage for cost analysis.
---
## Overview
MCP servers don't have direct access to LLM token counts: tokens are reported to the client, not the server. mcpstat therefore supports two approaches:
1. **Server-side estimation** - Estimate tokens from response character count
2. **Client-side injection** - Report actual tokens from LLM API responses
---
## Basic Usage (Estimation)
Track response sizes for automatic token estimation:
```python
@app.call_tool()
async def handle_tool(name: str, arguments: dict):
    result = await my_logic(arguments)
    # Record with response size for token estimation
    await stat.record(
        name, "tool",
        response_chars=len(str(result))
    )
    return result
```
mcpstat estimates tokens using ~3.5 characters per token (conservative for mixed content).
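The same heuristic is easy to reproduce outside the library. A minimal sketch of the character-count estimate (`estimate_tokens` is a hypothetical helper for illustration, not part of mcpstat's public API; the 3.5 ratio matches the figure above):

```python
import math

# Conservative chars-per-token ratio for mixed prose/code, per the note above
CHARS_PER_TOKEN = 3.5

def estimate_tokens(response_chars: int) -> int:
    """Estimate token count from a response's character length."""
    return math.ceil(response_chars / CHARS_PER_TOKEN)

print(estimate_tokens(8000))  # 8000 / 3.5 ≈ 2285.7, rounded up to 2286
```

Note that 8,000 response characters yield an estimate of 2,286 tokens, matching the `estimated_tokens` value in the stats example below.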
---
## Actual Token Tracking
If you have access to actual token counts from your LLM provider:
### Method 1: Record with Tokens
```python
await stat.record(
    name, "tool",
    input_tokens=100,
    output_tokens=250
)
```
### Method 2: Deferred Reporting
```python
# Record the call first
await stat.record("my_tool", "tool")

# Later, when tokens are available
response = await anthropic.messages.create(...)
await stat.report_tokens(
    "my_tool",
    response.usage.input_tokens,
    response.usage.output_tokens
)
```
---
## Token Statistics
`get_stats()` includes comprehensive token information:
```python
stats = await stat.get_stats()
```
### Response Structure
```python
{
    "token_summary": {
        "total_input_tokens": 5000,      # Sum across all tools
        "total_output_tokens": 12000,    # Sum across all tools
        "total_estimated_tokens": 3500,  # From response_chars
        "has_actual_tokens": True        # True if any actual tokens recorded
    },
    "stats": [
        {
            "name": "my_tool",
            "call_count": 10,
            "total_input_tokens": 1000,
            "total_output_tokens": 2500,
            "total_response_chars": 8000,
            "estimated_tokens": 2286,
            "avg_tokens_per_call": 350,  # (input + output) / calls
            ...
        }
    ]
}
```
### Token Fields
| Field | Description |
|-------|-------------|
| `total_input_tokens` | Cumulative input tokens (if tracked) |
| `total_output_tokens` | Cumulative output tokens (if tracked) |
| `total_response_chars` | Cumulative response characters |
| `estimated_tokens` | Tokens estimated from response size |
| `avg_tokens_per_call` | Average tokens per invocation |
---
## Estimation vs. Actual
mcpstat prioritizes actual tokens over estimates:
```python
# Priority for avg_tokens_per_call:
if total_input_tokens + total_output_tokens > 0:
    avg = (total_input_tokens + total_output_tokens) / call_count  # Use actual
else:
    avg = estimated_tokens / call_count  # Fall back to estimate
```
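Wrapped as a runnable function (a sketch mirroring the priority logic above; the field names follow the per-tool stats structure shown earlier):

```python
def avg_tokens_per_call(row: dict) -> float:
    """Prefer actual token totals; fall back to the character-based estimate."""
    actual = row["total_input_tokens"] + row["total_output_tokens"]
    if actual > 0:
        return actual / row["call_count"]
    return row["estimated_tokens"] / row["call_count"]

row = {"call_count": 10, "total_input_tokens": 1000,
       "total_output_tokens": 2500, "estimated_tokens": 2286}
print(avg_tokens_per_call(row))  # (1000 + 2500) / 10 = 350.0
```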
---
## Use Cases
### Cost Analysis
Track token usage to estimate API costs:
```python
stats = await stat.get_stats()
summary = stats["token_summary"]
total_tokens = summary["total_input_tokens"] + summary["total_output_tokens"]
estimated_cost = total_tokens * 0.00001 # Example rate
print(f"Estimated cost: ${estimated_cost:.4f}")
```
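Most providers price input and output tokens at different rates, so a per-direction calculation is usually more accurate than a single blended rate. A sketch with hypothetical rates (the dollar figures below are illustrative placeholders, not real pricing; substitute your provider's published rates):

```python
# Hypothetical per-token rates -- replace with your provider's actual pricing
INPUT_RATE = 3.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Combine directional token counts into a dollar estimate."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Using the token_summary values from the stats example above
cost = estimate_cost(5000, 12000)
print(f"Estimated cost: ${cost:.4f}")  # prints "Estimated cost: $0.1950"
```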
### Identifying Token-Heavy Tools
Find tools that consume the most tokens:
```python
stats = await stat.get_stats()
# Sort by total tokens
by_tokens = sorted(
    stats["stats"],
    key=lambda s: s["total_input_tokens"] + s["total_output_tokens"],
    reverse=True
)
for tool in by_tokens[:5]:
    total = tool["total_input_tokens"] + tool["total_output_tokens"]
    print(f"{tool['name']}: {total} tokens")
```
### Optimization Recommendations
```python
stats = await stat.get_stats()
for tool in stats["stats"]:
    avg = tool["avg_tokens_per_call"]
    if avg > 1000:
        print(f"⚠️ {tool['name']}: {avg} avg tokens/call - consider optimization")
```
---
## Database Schema
Token tracking adds these columns to `mcpstat_usage`:
| Column | Type | Description |
|--------|------|-------------|
| `total_input_tokens` | INTEGER | Cumulative input tokens |
| `total_output_tokens` | INTEGER | Cumulative output tokens |
| `total_response_chars` | INTEGER | Cumulative response characters |
| `estimated_tokens` | INTEGER | Tokens estimated from response size |
---
## Migration
> **Since v0.2.1**: The token tracking columns were added in version 0.2.1. Existing databases are migrated automatically to include them; all existing data is preserved, and the new columns default to `0`.
---
## Best Practices
### 1. Track Response Sizes
Even without actual tokens, tracking response sizes provides useful estimates:
```python
import json

await stat.record(
    name, "tool",
    response_chars=len(json.dumps(result))
)
```
### 2. Use Deferred Reporting for Accuracy
When actual tokens are available, use `report_tokens()`:
```python
# In your client code
response = await client.messages.create(...)
await stat.report_tokens(
    tool_name,
    response.usage.input_tokens,
    response.usage.output_tokens
)
```
### 3. Monitor High-Token Tools
Regularly check for tools with high average token usage:
```python
for tool in stats["stats"]:
    if tool["avg_tokens_per_call"] > 500:
        print(f"Review: {tool['name']}")
```