Glama now supports MCP sampling, allowing MCP servers to request LLM completions during tool execution.
What's new:
- MCP tools can now request AI-generated content mid-execution
- Human-in-the-loop approval flow — you control when LLM calls are made
- View the server's request (messages, max tokens) before approving
- Reject requests you don't want to process
How it works:
When an MCP server sends a sampling request, you'll see a prompt in the chat with the request details. Click Approve & Generate to call your configured LLM and return the response to the server, or Reject to decline.
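On the wire, a sampling request is a JSON-RPC `sampling/createMessage` call from the server to the client. A minimal sketch of rendering that request for human review — the field names follow the MCP specification, but the `summarize_sampling_request` helper is illustrative, not part of any SDK:

```python
def summarize_sampling_request(request: dict) -> str:
    """Render an MCP sampling/createMessage request so a human
    can review the messages and token budget before approving."""
    params = request["params"]
    lines = [f"Sampling request (max tokens: {params['maxTokens']})"]
    for msg in params["messages"]:
        lines.append(f"  {msg['role']}: {msg['content']['text']}")
    return "\n".join(lines)


# Example sampling/createMessage request, shaped per the MCP spec.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {
                "role": "user",
                "content": {"type": "text", "text": "Summarize this changelog."},
            }
        ],
        "maxTokens": 200,
    },
}

print(summarize_sampling_request(request))
```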
This enables more sophisticated MCP tools that can leverage AI capabilities for dynamic content generation, multi-step reasoning, and intelligent data transformation — all while keeping you in control.
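The human-in-the-loop flow described above can be sketched as a simple gate: the LLM is only called after explicit approval, and a rejection is reported back to the server as a JSON-RPC error. This is an illustrative sketch, not Glama's implementation; the `call_llm` callback, the error code, and the model identifier are assumptions:

```python
def handle_sampling_request(request: dict, user_approved: bool, call_llm) -> dict:
    """Gate an MCP sampling request on explicit user approval.

    `call_llm` is a hypothetical callback that takes the request params
    and returns completion text from the user's configured model.
    """
    if not user_approved:
        # Rejected: the server receives a JSON-RPC error instead of text.
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -1, "message": "User rejected sampling request"},
        }
    text = call_llm(request["params"])
    # Approved: return a createMessage result to the server.
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "role": "assistant",
            "content": {"type": "text", "text": text},
            "model": "user-configured-model",  # placeholder identifier
        },
    }
```

Keeping the approval decision outside the transport layer means the same gate works regardless of which LLM the user has configured.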
