by sjquant

run_llm

Process prompts through LLMs from multiple providers, including OpenAI, Anthropic, and Google, with configurable parameters such as temperature and max tokens. Gives AI agents a single, unified interface to all supported providers.

Instructions

Run a prompt through an LLM and return the response.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| max_tokens | No | Maximum number of tokens to generate | |
| model_name | No | Specific model name. Available models: anthropic:claude-3-7-sonnet-latest, anthropic:claude-3-5-haiku-latest, anthropic:claude-3-5-sonnet-latest, anthropic:claude-3-opus-latest, claude-3-7-sonnet-latest, claude-3-5-haiku-latest, bedrock:amazon.titan-tg1-large, bedrock:amazon.titan-text-lite-v1, bedrock:amazon.titan-text-express-v1, bedrock:us.amazon.nova-pro-v1:0, bedrock:us.amazon.nova-lite-v1:0, bedrock:us.amazon.nova-micro-v1:0, bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0, bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0, bedrock:anthropic.claude-3-5-haiku-20241022-v1:0, bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0, bedrock:anthropic.claude-instant-v1, bedrock:anthropic.claude-v2:1, bedrock:anthropic.claude-v2, bedrock:anthropic.claude-3-sonnet-20240229-v1:0, bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0, bedrock:anthropic.claude-3-haiku-20240307-v1:0, bedrock:us.anthropic.claude-3-haiku-20240307-v1:0, bedrock:anthropic.claude-3-opus-20240229-v1:0, bedrock:us.anthropic.claude-3-opus-20240229-v1:0, bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0, bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0, bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0, bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0, bedrock:cohere.command-text-v14, bedrock:cohere.command-r-v1:0, bedrock:cohere.command-r-plus-v1:0, bedrock:cohere.command-light-text-v14, bedrock:meta.llama3-8b-instruct-v1:0, bedrock:meta.llama3-70b-instruct-v1:0, bedrock:meta.llama3-1-8b-instruct-v1:0, bedrock:us.meta.llama3-1-8b-instruct-v1:0, bedrock:meta.llama3-1-70b-instruct-v1:0, bedrock:us.meta.llama3-1-70b-instruct-v1:0, bedrock:meta.llama3-1-405b-instruct-v1:0, bedrock:us.meta.llama3-2-11b-instruct-v1:0, bedrock:us.meta.llama3-2-90b-instruct-v1:0, bedrock:us.meta.llama3-2-1b-instruct-v1:0, bedrock:us.meta.llama3-2-3b-instruct-v1:0, bedrock:us.meta.llama3-3-70b-instruct-v1:0, bedrock:mistral.mistral-7b-instruct-v0:2, bedrock:mistral.mixtral-8x7b-instruct-v0:1, bedrock:mistral.mistral-large-2402-v1:0, bedrock:mistral.mistral-large-2407-v1:0, claude-3-5-sonnet-latest, claude-3-opus-latest, cohere:c4ai-aya-expanse-32b, cohere:c4ai-aya-expanse-8b, cohere:command, cohere:command-light, cohere:command-light-nightly, cohere:command-nightly, cohere:command-r, cohere:command-r-03-2024, cohere:command-r-08-2024, cohere:command-r-plus, cohere:command-r-plus-04-2024, cohere:command-r-plus-08-2024, cohere:command-r7b-12-2024, deepseek:deepseek-chat, deepseek:deepseek-reasoner, google-gla:gemini-1.0-pro, google-gla:gemini-1.5-flash, google-gla:gemini-1.5-flash-8b, google-gla:gemini-1.5-pro, google-gla:gemini-2.0-flash-exp, google-gla:gemini-2.0-flash-thinking-exp-01-21, google-gla:gemini-exp-1206, google-gla:gemini-2.0-flash, google-gla:gemini-2.0-flash-lite-preview-02-05, google-gla:gemini-2.0-pro-exp-02-05, google-vertex:gemini-1.0-pro, google-vertex:gemini-1.5-flash, google-vertex:gemini-1.5-flash-8b, google-vertex:gemini-1.5-pro, google-vertex:gemini-2.0-flash-exp, google-vertex:gemini-2.0-flash-thinking-exp-01-21, google-vertex:gemini-exp-1206, google-vertex:gemini-2.0-flash, google-vertex:gemini-2.0-flash-lite-preview-02-05, google-vertex:gemini-2.0-pro-exp-02-05, gpt-3.5-turbo, gpt-3.5-turbo-0125, gpt-3.5-turbo-0301, gpt-3.5-turbo-0613, gpt-3.5-turbo-1106, gpt-3.5-turbo-16k, gpt-3.5-turbo-16k-0613, gpt-4, gpt-4-0125-preview, gpt-4-0314, gpt-4-0613, gpt-4-1106-preview, gpt-4-32k, gpt-4-32k-0314, gpt-4-32k-0613, gpt-4-turbo, gpt-4-turbo-2024-04-09, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4.5-preview, gpt-4.5-preview-2025-02-27, gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, gpt-4o-2024-11-20, gpt-4o-audio-preview, gpt-4o-audio-preview-2024-10-01, gpt-4o-audio-preview-2024-12-17, gpt-4o-mini, gpt-4o-mini-2024-07-18, gpt-4o-mini-audio-preview, gpt-4o-mini-audio-preview-2024-12-17, groq:gemma2-9b-it, groq:llama-3.1-8b-instant, groq:llama-3.2-11b-vision-preview, groq:llama-3.2-1b-preview, groq:llama-3.2-3b-preview, groq:llama-3.2-90b-vision-preview, groq:llama-3.3-70b-specdec, groq:llama-3.3-70b-versatile, groq:llama3-70b-8192, groq:llama3-8b-8192, groq:mixtral-8x7b-32768, mistral:codestral-latest, mistral:mistral-large-latest, mistral:mistral-moderation-latest, mistral:mistral-small-latest, o1, o1-2024-12-17, o1-mini, o1-mini-2024-09-12, o1-preview, o1-preview-2024-09-12, o3-mini, o3-mini-2025-01-31, openai:chatgpt-4o-latest, openai:gpt-3.5-turbo, openai:gpt-3.5-turbo-0125, openai:gpt-3.5-turbo-0301, openai:gpt-3.5-turbo-0613, openai:gpt-3.5-turbo-1106, openai:gpt-3.5-turbo-16k, openai:gpt-3.5-turbo-16k-0613, openai:gpt-4, openai:gpt-4-0125-preview, openai:gpt-4-0314, openai:gpt-4-0613, openai:gpt-4-1106-preview, openai:gpt-4-32k, openai:gpt-4-32k-0314, openai:gpt-4-32k-0613, openai:gpt-4-turbo, openai:gpt-4-turbo-2024-04-09, openai:gpt-4-turbo-preview, openai:gpt-4-vision-preview, openai:gpt-4.5-preview, openai:gpt-4.5-preview-2025-02-27, openai:gpt-4o, openai:gpt-4o-2024-05-13, openai:gpt-4o-2024-08-06, openai:gpt-4o-2024-11-20, openai:gpt-4o-audio-preview, openai:gpt-4o-audio-preview-2024-10-01, openai:gpt-4o-audio-preview-2024-12-17, openai:gpt-4o-mini, openai:gpt-4o-mini-2024-07-18, openai:gpt-4o-mini-audio-preview, openai:gpt-4o-mini-audio-preview-2024-12-17, openai:o1, openai:o1-2024-12-17, openai:o1-mini, openai:o1-mini-2024-09-12, openai:o1-preview, openai:o1-preview-2024-09-12, openai:o3-mini, openai:o3-mini-2025-01-31, test | openai:gpt-4o-mini |
| prompt | Yes | | |
| system_prompt | No | Optional system prompt to guide the model's behavior | |
| temperature | No | Controls randomness (0.0 to 1.0) | |
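
As a rough illustration of how the parameters in the input schema above fit together, the sketch below assembles an MCP `tools/call` request for `run_llm`. The helper name `build_run_llm_call` and the `request_id` value are hypothetical, and the checks on `temperature` and `max_tokens` are assumptions that simply mirror the parameter descriptions; the server itself may or may not enforce them.

```python
import json
from typing import Any, Optional


def build_run_llm_call(
    prompt: str,
    model_name: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    system_prompt: Optional[str] = None,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for run_llm (hypothetical helper)."""
    # `prompt` is the only required parameter in the schema.
    if not prompt:
        raise ValueError("prompt is required")
    # Assumed client-side checks, based on the parameter descriptions.
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature should be between 0.0 and 1.0")
    if max_tokens is not None and max_tokens <= 0:
        raise ValueError("max_tokens should be positive")

    arguments: dict[str, Any] = {"prompt": prompt}
    optional = {
        "model_name": model_name,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "system_prompt": system_prompt,
    }
    # Omit unset optional parameters so the server falls back to its defaults.
    arguments.update({k: v for k, v in optional.items() if v is not None})

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "run_llm", "arguments": arguments},
    }


# Example: serialize a request that overrides only the temperature.
payload = json.dumps(build_run_llm_call("Summarize MCP in one sentence.", temperature=0.2))
```

Leaving unset parameters out of `arguments`, rather than sending nulls, keeps the request minimal and lets the tool apply whatever defaults it defines.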

Implementation Reference

  • Registration of the 'run_llm' tool using the @mcp.tool() decorator.
    @mcp.tool()
  • The main handler function for the 'run_llm' tool. It creates an Agent from pydantic_ai, runs the prompt with specified settings, constructs an LLMResponse, and returns its JSON dump.
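    # Imports assumed by this excerpt (not shown on the page).
    from pydantic import Field
    from pydantic_ai import Agent
    from pydantic_ai.models import KnownModelName

    async def run_llm(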
        prompt: str,
        model_name: KnownModelName = Field(
            default="openai:gpt-4o-mini",
            description=f"Specific model name. Available models: {', '.join(KnownModelName.__args__)}",
        ),
        temperature: float = Field(
            default=0.7,
            description="Controls randomness (0.0 to 1.0)",
        ),
        max_tokens: int = Field(
            default=8192,
            description="Maximum number of tokens to generate",
        ),
        system_prompt: str = Field(
            default="",
            description="Optional system prompt to guide the model's behavior",
        ),
    ) -> str:
        """Run a prompt through an LLM and return the response."""
        agent = Agent(
            model=model_name,
            system_prompt=system_prompt,
        )
        response = await agent.run(
            prompt,
            model_settings={
                "temperature": temperature,
                "max_tokens": max_tokens,
            },
        )
        res = LLMResponse(
            content=response.data,
            model_name=model_name,
            usage=response.usage(),
            temperature=temperature,
        )
        return res.model_dump_json()
  • Pydantic schema/model for the LLM response used internally by the 'run_llm' handler.
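    # Imports assumed by this excerpt (not shown on the page).
    from pydantic import BaseModel
    from pydantic_ai.usage import Usage

    class LLMResponse(BaseModel):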
        """Response from an LLM."""
    
        content: str
        model_name: str
        usage: Usage
        temperature: float
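
Because the handler returns `res.model_dump_json()`, a caller receives a JSON string rather than an object. The following is a minimal client-side sketch, using only the standard library, of decoding that string. The dataclass `ParsedLLMResponse`, the function `parse_run_llm_result`, and the sample payload (including the key inside `usage`) are made up for illustration; the actual `usage` layout depends on the installed pydantic_ai version, so it is kept here as an opaque dictionary.

```python
import json
from dataclasses import dataclass
from typing import Any

REQUIRED_FIELDS = {"content", "model_name", "usage", "temperature"}


@dataclass
class ParsedLLMResponse:
    """Client-side mirror of the LLMResponse fields (hypothetical helper type)."""

    content: str
    model_name: str
    usage: dict[str, Any]  # kept opaque: its keys are not specified here
    temperature: float


def parse_run_llm_result(raw: str) -> ParsedLLMResponse:
    """Decode the JSON string returned by run_llm into a typed structure."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return ParsedLLMResponse(
        content=data["content"],
        model_name=data["model_name"],
        usage=dict(data["usage"]),
        temperature=float(data["temperature"]),
    )


# Illustrative payload only; real responses will carry different values.
sample = (
    '{"content": "Hi there", "model_name": "openai:gpt-4o-mini", '
    '"usage": {"total_tokens": 12}, "temperature": 0.7}'
)
parsed = parse_run_llm_result(sample)
```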

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sjquant/llm-bridge-mcp'
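
For scripted access, the curl command above could be mirrored in Python roughly as follows. This is a sketch using only the standard library; the function names are hypothetical, and treating the response body as a JSON object is an assumption rather than something stated on this page.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/sjquant/llm-bridge-mcp"


def build_server_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare a GET request for the same URL the curl command targets."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body (assumes the API answers with a JSON object)."""
    with urllib.request.urlopen(build_server_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_server_request()` is kept separate so the request can be inspected without sending it.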

If you have feedback or need assistance with the MCP directory API, please join our Discord server.