Glama supports prompt caching to optimize costs across supported providers and models. Here's how it works with different providers:
Cache effectiveness can be monitored via the analytics and logs dashboards or the /gateway/v1/completion-requests/:id
API.
cache_control
breakpointsNote: Cache pricing and features are subject to change. Check our API documentation for the most current information.