Announcing an API for calculating the cost of different AI models

We are excited to share a new API for calculating the cost of different conversational AI models. This API is free to use and it covers a wide range of LLM providers, including OpenAI, Anthropic, Google, Perplexity, Replicate, OpenRouter, and 30 others.

How to use the API

Use the following code to calculate the cost of a chat with a given model. The first example uses fetch in JavaScript; the second uses curl with jq:

fetch('https://glama.ai/api/cost-calculator/calculate-chat-cost', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'YOUR_API_KEY',
  },
  body: JSON.stringify({
    messages: [
      {
        content: 'Hello, world!',
        role: 'user',
      },
    ],
    model: 'gpt-4o',
  }),
})
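  .then((response) => {
    // Fail early on non-2xx responses instead of treating an error body as a result
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then((data) => {
    // data.totalCost is the total cost of the chat in USD
    console.log(data.totalCost);
  })
  .catch((error) => console.error(error));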

curl -X POST https://glama.ai/api/cost-calculator/calculate-chat-cost \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "messages": [
      {
        "content": "Hello, world!",
        "role": "user"
      }
    ],
    "model": "gpt-4o"
  }' | jq .totalCost

We plan to keep the API simple, but if you find any issues or have any suggestions, please let us know by emailing frank@glama.ai.

What's the payload format

The request payload must specify the model and a messages array. Each item in the messages array is an object with the following shape:

{
  content: string;
  role: 'user' | 'assistant' | 'system';
}
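
As a rough illustration of the shape above, the sketch below checks a messages array before it is sent. The helper name buildCostRequestBody and its error messages are hypothetical and not part of the API; the API may well perform its own validation and report errors differently.

```javascript
// Roles listed in the documented message shape.
const ALLOWED_ROLES = ['user', 'assistant', 'system'];

// Hypothetical helper (not part of the API): validates the inputs against the
// documented shape and returns the JSON string to use as the request body.
function buildCostRequestBody(model, messages) {
  if (typeof model !== 'string' || model.length === 0) {
    throw new TypeError('model must be a non-empty string');
  }
  if (!Array.isArray(messages)) {
    throw new TypeError('messages must be an array');
  }
  messages.forEach((message, index) => {
    if (message === null || typeof message !== 'object') {
      throw new TypeError(`messages[${index}] must be an object`);
    }
    if (typeof message.content !== 'string') {
      throw new TypeError(`messages[${index}].content must be a string`);
    }
    if (!ALLOWED_ROLES.includes(message.role)) {
      throw new TypeError(`messages[${index}].role must be one of ${ALLOWED_ROLES.join(', ')}`);
    }
  });
  // Keep only the two documented fields of each message.
  return JSON.stringify({
    messages: messages.map(({ content, role }) => ({ content, role })),
    model,
  });
}
```

The returned string can be passed as the body option of the fetch call shown earlier.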

How to obtain a list of supported models

https://glama.ai/api/cost-calculator/models
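
A minimal sketch of reading that list from JavaScript follows. It assumes the endpoint answers GET requests with JSON. This post does not say whether the endpoint requires the x-api-key header or what the response looks like, so the header is sent only when a key is passed and the parsed JSON is returned as-is. The function name listSupportedModels and the fetchImpl parameter are hypothetical.

```javascript
const MODELS_URL = 'https://glama.ai/api/cost-calculator/models';

// Hypothetical helper: fetches the supported models and returns the parsed JSON.
// fetchImpl can be swapped out, e.g. for testing; it defaults to the global fetch.
async function listSupportedModels({ apiKey, fetchImpl = globalThis.fetch } = {}) {
  // Assumption: the key, if needed at all, goes in the same header as for cost requests.
  const headers = apiKey ? { 'x-api-key': apiKey } : {};
  const response = await fetchImpl(MODELS_URL, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

For example, listSupportedModels({ apiKey: 'YOUR_API_KEY' }).then(console.log) prints whatever the endpoint returns.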

How to obtain the API key

To use the API, you need to sign up for a free account on glama.ai. Once you have signed up, you can get your API key from https://glama.ai/settings/gateway/api-keys.

We require an API key only so that we can prevent abuse of the API and have a way to inform you about any changes to it.

How it works

The API pulls the latest model prices from the litellm repository. We also extend the price data obtained from litellm with our own data, e.g. if we become aware of recent price changes before they are made available upstream. The API then matches the model name against the price datasets, tokenizes the messages using the model's tokenizer, and uses decimal.js to calculate the cost based on the number of input and output tokens.
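
The last step, turning token counts into a cost, can be sketched as follows. This is not the API's actual implementation: the price table, model name, and function names are hypothetical, the prices are made up for illustration, and tokenization and the upstream price lookup are left out (token counts are passed in directly). To stay dependency-free, the sketch swaps decimal.js for fixed-point BigInt arithmetic, which serves the same purpose of avoiding floating-point rounding errors.

```javascript
// Hypothetical price table in USD per token; the figures are illustrative only.
const PRICES = {
  'example-model': { inputCostPerToken: '0.000005', outputCostPerToken: '0.000015' },
};

// Number of fractional digits kept in the fixed-point representation.
const SCALE = 12;

// Parses a non-negative decimal string such as "0.000005" into a BigInt scaled by 10^SCALE.
function toScaled(decimalString) {
  if (!/^\d+(\.\d+)?$/.test(decimalString)) {
    throw new TypeError(`Not a decimal string: ${decimalString}`);
  }
  const [whole, fraction = ''] = decimalString.split('.');
  if (fraction.length > SCALE) {
    throw new RangeError(`At most ${SCALE} fractional digits are supported`);
  }
  return BigInt(whole + fraction.padEnd(SCALE, '0'));
}

// Formats a non-negative scaled BigInt back into a decimal string without trailing zeros.
function fromScaled(value) {
  const digits = value.toString().padStart(SCALE + 1, '0');
  const whole = digits.slice(0, -SCALE);
  const fraction = digits.slice(-SCALE).replace(/0+$/, '');
  return fraction ? `${whole}.${fraction}` : whole;
}

// Looks up the model's prices and multiplies them by the integer token counts.
function calculateCost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) {
    throw new Error(`Unknown model: ${model}`);
  }
  const inputCost = toScaled(price.inputCostPerToken) * BigInt(inputTokens);
  const outputCost = toScaled(price.outputCostPerToken) * BigInt(outputTokens);
  return {
    inputCost: fromScaled(inputCost),
    outputCost: fromScaled(outputCost),
    totalCost: fromScaled(inputCost + outputCost),
  };
}
```

With the made-up prices above, calculateCost('example-model', 1000, 500).totalCost evaluates to '0.0125'. Costs are returned as strings so that no precision is lost when they are passed on.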

Accuracy

Internally, Glama uses the same API to calculate its own costs, so we are motivated to keep the API as accurate as possible. However, the API comes without any guarantees. If you find any inaccuracies, please let us know by emailing frank@glama.ai.

Why an API and not a library

There are a few existing libraries for calculating the cost of different models (e.g. tokencost), but the associated datasets tend to go out of date quickly. Meanwhile, we needed something that we ourselves could rely on to calculate costs as accurately as possible. So we decided to build our own API and make it available to everyone.

Why is the API free

We are not in the business of APIs. However, our target audience is tech-savvy organizations, so ultimately this is a promotion for our platform. We calculated the cost of running the API, and even if we were to serve millions of requests per month, it would be a small price to pay for the distribution it achieves.