Code Execution
Our built-in MCP server has gained the ability to execute code.
You can now upload more images in a single conversation without performance issues.
We just added the ability to view MCP server usage over different time spans, such as the last 24 hours or the last 7 days, along with a breakdown of tool usage over the same period.
Our code editor now supports syntax highlighting.
We heard your feedback and rebuilt search from scratch.
What's new:
Lightning-fast results
Smarter, more relevant matches
See for yourself → https://glama.ai/search
We just launched MCP Inspector.
It is fully spec-compliant and ships with an accompanying test server, but you can use it with just about any MCP server.
Other highlights:
No login required to use it
Supports OAuth, bearer token, and custom header auth
State persists in the URL (shareable!)
You can now upload PDF documents and have AI-powered conversations about their contents. Whether it's research papers, technical documentation, contracts, or reports—simply upload your PDF and start asking questions.
Glama now supports MCP sampling, allowing MCP servers to request LLM completions during tool execution.
What's new:
MCP tools can now request AI-generated content mid-execution
Human-in-the-loop approval flow — you control when LLM calls are made
View the server's request (messages, max tokens) before approving
Reject requests you don't want to process
How it works:
When an MCP server sends a sampling request, you'll see a prompt in the chat with the request details. Click Approve & Generate to call your configured LLM and return the response to the server, or Reject to decline.
This enables more sophisticated MCP tools that can leverage AI capabilities for dynamic content generation, multi-step reasoning, and intelligent data transformation — all while keeping you in control.
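Under the hood, sampling is a JSON-RPC request from the server back to the client, per the MCP specification's `sampling/createMessage` method. The message text and token limit below are made-up values, but the shape is a minimal sketch of the request details (messages, max tokens) you review before approving:

```python
import json

# Illustrative sampling/createMessage request, per the MCP spec.
# The prompt text, id, and maxTokens are placeholder values.
sampling_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {
                "role": "user",
                "content": {"type": "text", "text": "Summarize the fetched page."},
            }
        ],
        "maxTokens": 256,
    },
}

# This is roughly what the approval prompt surfaces for review.
print(json.dumps(sampling_request["params"], indent=2))
```

Approving the request runs your configured LLM against `params.messages`; rejecting returns an error response to the server instead.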
Long-running MCP tools can now report progress updates that display in the playground chat. When a tool sends progress notifications, a visual progress bar shows the current status (e.g., "3/5") alongside the tool call, giving users real-time feedback on task completion.
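Progress reporting uses the MCP spec's `notifications/progress` notification, keyed by the progress token from the original tool call. The token value below is hypothetical; the `render_progress` helper is only an illustration of how a "3/5" status line can be derived:

```python
# Illustrative progress notification, per the MCP spec.
# "tool-call-42" is a placeholder progress token.
notification = {
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {
        "progressToken": "tool-call-42",
        "progress": 3,
        "total": 5,
    },
}

def render_progress(params: dict) -> str:
    # Mirrors the "3/5" status shown alongside the tool call.
    return f"{params['progress']}/{params['total']}"

print(render_progress(notification["params"]))  # → 3/5
```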
MCP tools can now return audio content that renders directly in the playground chat. Audio responses display as playable audio players with built-in playback controls, enabling tools like text-to-speech to deliver audio output inline.
MCP tools can now return image content that renders directly in the playground chat. Images are displayed inline, enabling tools like image generators to show visual output without requiring external viewers.
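Both of these entries map to standard MCP tool-result content types: `image` and `audio` items carry base64-encoded data plus a MIME type. The byte strings below are placeholders, not real media; this is a sketch of the result shape, not Glama's rendering code:

```python
import base64

# Illustrative tool results with media content, per the MCP spec.
# The payload bytes are placeholders standing in for real PNG/WAV data.
image_result = {
    "content": [
        {
            "type": "image",
            "data": base64.b64encode(b"<png bytes>").decode("ascii"),
            "mimeType": "image/png",
        }
    ]
}

audio_result = {
    "content": [
        {
            "type": "audio",
            "data": base64.b64encode(b"<wav bytes>").decode("ascii"),
            "mimeType": "audio/wav",
        }
    ]
}

for result in (image_result, audio_result):
    item = result["content"][0]
    print(item["type"], item["mimeType"])
```

The playground inspects `type` and `mimeType` to decide whether to render an inline image or an audio player.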
Upgraded infrastructure with new hardware and load balancing. Average response times reduced from 225ms to 152ms — a 32% improvement.
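The 32% figure follows directly from the two averages:

```python
before_ms, after_ms = 225, 152

# Relative improvement: (225 - 152) / 225 ≈ 0.324
improvement = (before_ms - after_ms) / before_ms
print(f"{improvement:.0%}")  # → 32%
```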
We now support MCP elicitations.
Elicitation enables servers to request specific information from users during interactions.
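In the MCP specification, an elicitation is an `elicitation/create` request carrying a message for the user and a flat JSON schema describing the fields to collect. The prompt text and field below are hypothetical; this is a minimal sketch of the exchange:

```python
# Illustrative elicitation/create request, per the MCP spec.
# The message and "region" field are made-up examples.
elicitation_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "elicitation/create",
    "params": {
        "message": "Which region should the deployment use?",
        "requestedSchema": {
            "type": "object",
            "properties": {
                "region": {"type": "string", "description": "Deployment region"},
            },
            "required": ["region"],
        },
    },
}

# The user's reply carries an action ("accept", "decline", or "cancel")
# plus the collected content when accepted.
elicitation_response = {
    "action": "accept",
    "content": {"region": "us-east-1"},
}

print(elicitation_response["action"], elicitation_response["content"]["region"])
```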
The web fetch tool allows Glama to retrieve full content from specified web pages and PDF documents.
Try this now: Type "@glama get the contents of the HN front page"
The web search tool gives Glama direct access to real-time web content, allowing it to answer questions with up-to-date information beyond its knowledge cutoff.
Try this now: Type "@glama What's the weather in NYC?"
We just added persistent volumes for MCP servers. This makes us the first one-click MCP hosting solution to offer them.
To get started, read the announcement post.
Added new model: claude-opus-4-5-20251101
claude-opus-4-5-20251101 is a premium model combining maximum intelligence with practical performance.
Over the past few days, we've been focused on improving Chat and Gateway performance. Our latest updates have reduced response times by up to 40%. We hope you enjoy the faster, smoother experience.
You asked, we listened.
Many of you told us it was hard to keep up with new capabilities in Glama. While discovering features organically has its charm, we wanted to make it easier to know what's possible. Release notes now show you exactly what we've added.