execute_prompt_with_llm

Execute prompts with LLMs by retrieving prompts from MCP servers, filling template variables, and returning responses with metadata for testing workflows.

Instructions

Execute a prompt with an LLM and return the response.

This tool performs the complete workflow:

  1. Retrieves the prompt from the connected MCP server with prompt_arguments

  2. Optionally fills template variables in the prompt messages

  3. Sends the prompt messages to an LLM

  4. Returns the LLM's response along with metadata

Supports two prompt patterns:

  • Standard MCP prompts: Pass arguments via prompt_arguments, server handles substitution

  • Template variables: Use fill_variables to replace {variable} placeholders in messages
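
For example (the prompt names and values below are hypothetical), the same tool call takes either shape:

    # Pattern 1: standard MCP prompt -- the server substitutes the arguments.
    standard_call = {
        "prompt_name": "summarize_document",
        "prompt_arguments": {"document_id": "doc-123"},
    }

    # Pattern 2: template variables -- {variable} placeholders in the returned
    # messages are replaced client-side; non-string values are JSON-serialized.
    template_call = {
        "prompt_name": "analyze_results",
        "fill_variables": {"results": {"passed": 12, "failed": 1}},
    }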

Args:

  • prompt_name: Name of the prompt to execute
  • prompt_arguments: Dictionary of arguments to pass to the MCP prompt (default: {})
  • fill_variables: Dictionary of template variables to fill in prompt messages (default: None). Used for manual string replacement of {variable_name} patterns. Values are JSON-serialized before substitution if they're not strings.
  • llm_config: Optional LLM configuration with keys:
      - url: LLM endpoint URL (default: from LLM_URL env var)
      - model: Model name (default: from LLM_MODEL_NAME env var)
      - api_key: API key (default: from LLM_API_KEY env var)
      - max_tokens: Maximum tokens in response (default: 1000)
      - temperature: Sampling temperature (default: 0.7)

Returns: Dictionary with execution results including:

  • success: True if execution succeeded
  • prompt: Original prompt information
  • llm_request: The request sent to the LLM
  • llm_response: The LLM's response
  • parsed_response: Attempted JSON parsing if response looks like JSON
  • metadata: Timing and configuration information
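
A successful result might be shaped roughly as sketched below; the field values are illustrative, not real output:

    example_success = {
        "success": True,
        "prompt": {"name": "summarize_document", "arguments": {}, "message_count": 1},
        "llm_request": {"model": "example-model", "messages": [], "max_tokens": 1000, "temperature": 0.7},
        "llm_response": {"text": "The document describes...", "usage": {}, "model": "example-model"},
        "parsed_response": None,  # populated only when the reply looks like JSON
        "metadata": {"prompt_retrieval_ms": 12.3, "llm_execution_ms": 845.6, "total_time_ms": 860.1},
    }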

Raises: Returns an error dict for various failure scenarios:

  • not_connected: No active MCP connection
  • prompt_not_found: Prompt doesn't exist
  • llm_config_error: Missing or invalid LLM configuration
  • llm_request_error: LLM request failed
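
An error result uses the same envelope with success set to False. The sketch below shows a missing-configuration failure; timing values are illustrative:

    example_error = {
        "success": False,
        "error": {
            "error_type": "llm_config_error",
            "message": "Missing LLM configuration. Provide llm_config or set LLM_URL, LLM_MODEL_NAME, and LLM_API_KEY environment variables",
            "details": {"has_url": False, "has_model": False, "has_api_key": False},
            "suggestion": "Set LLM_URL, LLM_MODEL_NAME, and LLM_API_KEY in your .env file",
        },
        "metadata": {"request_time_ms": 0.4},
    }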

Input Schema

  • prompt_name (required): Name of the prompt to execute
  • prompt_arguments (optional): Arguments to pass to the MCP prompt (JSON object or string)
  • fill_variables (optional): Template variables to fill in prompt messages (JSON object or string)
  • llm_config (optional): LLM configuration (url, model, api_key, etc.)
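
Each optional parameter accepts either a JSON object or its string encoding (string values are parsed with json.loads before use). A hypothetical call mixing both forms:

    import json

    call_args = {
        "prompt_name": "summarize_document",                          # required
        "prompt_arguments": json.dumps({"document_id": "doc-123"}),   # string form
        "llm_config": {"max_tokens": 500, "temperature": 0.2},        # object form
    }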

Implementation Reference

  • The core handler function for the 'execute_prompt_with_llm' tool. Decorated with @mcp.tool, it retrieves the prompt from the MCP server, fills template variables, calls the LLM API, and returns a structured response with error handling.
    @mcp.tool
    async def execute_prompt_with_llm(
        prompt_name: Annotated[str, "Name of the prompt to execute"],
        ctx: Context,
        prompt_arguments: Annotated[dict[str, Any] | str | None, "Arguments to pass to the MCP prompt (JSON object or string)"] = None,
        fill_variables: Annotated[dict[str, Any] | str | None, "Template variables to fill in prompt messages (JSON object or string)"] = None,
        llm_config: Annotated[dict[str, Any] | str | None, "LLM configuration (url, model, api_key, etc.)"] = None
    ) -> dict[str, Any]:
        """Execute a prompt with an LLM and return the response.

        This tool performs the complete workflow:
        1. Retrieves the prompt from the connected MCP server with prompt_arguments
        2. Optionally fills template variables in the prompt messages
        3. Sends the prompt messages to an LLM
        4. Returns the LLM's response along with metadata

        Supports two prompt patterns:
        - Standard MCP prompts: Pass arguments via prompt_arguments, server handles substitution
        - Template variables: Use fill_variables to replace {variable} placeholders in messages

        Args:
            prompt_name: Name of the prompt to execute
            prompt_arguments: Dictionary of arguments to pass to the MCP prompt (default: {})
            fill_variables: Dictionary of template variables to fill in prompt messages (default: None)
                Used for manual string replacement of {variable_name} patterns.
                Values are JSON-serialized before substitution if they're not strings.
            llm_config: Optional LLM configuration with keys:
                - url: LLM endpoint URL (default: from LLM_URL env var)
                - model: Model name (default: from LLM_MODEL_NAME env var)
                - api_key: API key (default: from LLM_API_KEY env var)
                - max_tokens: Maximum tokens in response (default: 1000)
                - temperature: Sampling temperature (default: 0.7)

        Returns:
            Dictionary with execution results including:
            - success: True if execution succeeded
            - prompt: Original prompt information
            - llm_request: The request sent to the LLM
            - llm_response: The LLM's response
            - parsed_response: Attempted JSON parsing if response looks like JSON
            - metadata: Timing and configuration information

        Raises:
            Returns error dict for various failure scenarios:
            - not_connected: No active MCP connection
            - prompt_not_found: Prompt doesn't exist
            - llm_config_error: Missing or invalid LLM configuration
            - llm_request_error: LLM request failed
        """
        start_time = time.perf_counter()

        try:
            # Parse JSON string parameters if needed
            if isinstance(prompt_arguments, str):
                try:
                    prompt_arguments = json.loads(prompt_arguments)
                except json.JSONDecodeError as e:
                    return {
                        "success": False,
                        "error": {
                            "error_type": "invalid_arguments",
                            "message": f"prompt_arguments is not valid JSON: {str(e)}",
                            "details": {"raw_value": prompt_arguments[:200]},
                            "suggestion": "Provide a valid JSON object or dictionary",
                        },
                        "metadata": {"request_time_ms": 0},
                    }

            if isinstance(fill_variables, str):
                try:
                    fill_variables = json.loads(fill_variables)
                except json.JSONDecodeError as e:
                    return {
                        "success": False,
                        "error": {
                            "error_type": "invalid_arguments",
                            "message": f"fill_variables is not valid JSON: {str(e)}",
                            "details": {"raw_value": fill_variables[:200]},
                            "suggestion": "Provide a valid JSON object or dictionary",
                        },
                        "metadata": {"request_time_ms": 0},
                    }

            if isinstance(llm_config, str):
                try:
                    llm_config = json.loads(llm_config)
                except json.JSONDecodeError as e:
                    return {
                        "success": False,
                        "error": {
                            "error_type": "invalid_arguments",
                            "message": f"llm_config is not valid JSON: {str(e)}",
                            "details": {"raw_value": llm_config[:200]},
                            "suggestion": "Provide a valid JSON object or dictionary",
                        },
                        "metadata": {"request_time_ms": 0},
                    }

            # Set default for prompt_arguments
            if prompt_arguments is None:
                prompt_arguments = {}

            # Verify connection exists
            client, state = ConnectionManager.require_connection()

            # User-facing progress update
            await ctx.info(f"Executing prompt '{prompt_name}' with LLM")

            # Detailed technical log
            logger.info(
                f"Executing prompt '{prompt_name}' with LLM",
                extra={
                    "prompt_name": prompt_name,
                    "arguments": prompt_arguments,
                    "has_fill_variables": fill_variables is not None,
                },
            )

            # Get the prompt from the MCP server
            prompt_start = time.perf_counter()
            result = await client.get_prompt(prompt_name, prompt_arguments)
            prompt_elapsed_ms = (time.perf_counter() - prompt_start) * 1000

            # Extract messages
            messages: list[dict[str, Any]] = []
            if hasattr(result, "messages") and result.messages:
                for message in result.messages:
                    msg_dict: dict[str, Any] = {"role": message.role}

                    # Extract content
                    if hasattr(message, "content"):
                        content = message.content
                        if hasattr(content, "text"):
                            msg_dict["content"] = content.text
                        elif (
                            hasattr(content, "type")
                            and content.type == "text"
                            and hasattr(content, "text")
                        ):
                            msg_dict["content"] = content.text
                        else:
                            msg_dict["content"] = str(content)

                    messages.append(msg_dict)

            # Fill template variables if provided
            if fill_variables:
                logger.debug(f"Filling template variables: {list(fill_variables.keys())}")
                for msg in messages:
                    if "content" in msg and isinstance(msg["content"], str):
                        content_str = msg["content"]

                        # Fill each variable
                        for var_name, var_value in fill_variables.items():
                            placeholder = "{" + var_name + "}"

                            # Convert value to string (JSON serialize if not a string)
                            if isinstance(var_value, str):
                                replacement = var_value
                            else:
                                replacement = json.dumps(var_value, indent=2)

                            content_str = content_str.replace(placeholder, replacement)

                        msg["content"] = content_str

            # Get LLM configuration
            if llm_config is None:
                llm_config = {}

            llm_url = llm_config.get("url") or os.getenv("LLM_URL")
            llm_model = llm_config.get("model") or os.getenv("LLM_MODEL_NAME")
            llm_api_key = llm_config.get("api_key") or os.getenv("LLM_API_KEY")
            max_tokens = llm_config.get("max_tokens", 1000)
            temperature = llm_config.get("temperature", 0.7)

            if not all([llm_url, llm_model, llm_api_key]):
                return {
                    "success": False,
                    "error": {
                        "error_type": "llm_config_error",
                        "message": "Missing LLM configuration. Provide llm_config or set LLM_URL, LLM_MODEL_NAME, and LLM_API_KEY environment variables",
                        "details": {
                            "has_url": bool(llm_url),
                            "has_model": bool(llm_model),
                            "has_api_key": bool(llm_api_key),
                        },
                        "suggestion": "Set LLM_URL, LLM_MODEL_NAME, and LLM_API_KEY in your .env file",
                    },
                    "metadata": {
                        "request_time_ms": round((time.perf_counter() - start_time) * 1000, 2),
                    },
                }

            # Prepare LLM request
            llm_request = {
                "model": llm_model,
                "messages": messages,
                "max_tokens": max_tokens,
                "temperature": temperature,
            }

            # User-facing progress update
            await ctx.info(f"Sending request to LLM endpoint: {llm_url}")

            # Send to LLM
            llm_start = time.perf_counter()
            async with httpx.AsyncClient(timeout=60.0) as http_client:
                response = await http_client.post(
                    f"{llm_url}/chat/completions",
                    headers={
                        "Content-Type": "application/json",
                        "Authorization": f"Bearer {llm_api_key}",
                    },
                    json=llm_request,
                )

            llm_elapsed_ms = (time.perf_counter() - llm_start) * 1000
            total_elapsed_ms = (time.perf_counter() - start_time) * 1000

            if response.status_code != 200:
                logger.error(
                    f"LLM request failed with status {response.status_code}",
                    extra={
                        "status_code": response.status_code,
                        "response_text": response.text[:500],
                    },
                )
                return {
                    "success": False,
                    "error": {
                        "error_type": "llm_request_error",
                        "message": f"LLM request failed with status {response.status_code}",
                        "details": {
                            "status_code": response.status_code,
                            "response_text": response.text[:500],
                        },
                        "suggestion": "Check LLM endpoint configuration and API key",
                    },
                    "metadata": {
                        "request_time_ms": round(total_elapsed_ms, 2),
                    },
                }

            # Parse LLM response
            llm_result = response.json()
            llm_response_text = llm_result["choices"][0]["message"]["content"]

            # Try to extract and parse JSON if present
            parsed_response = None
            json_match = re.search(r"```json\s*(.*?)\s*```", llm_response_text, re.DOTALL)
            if json_match:
                try:
                    parsed_response = json.loads(json_match.group(1))
                except json.JSONDecodeError as e:
                    logger.warning(f"Failed to parse extracted JSON: {e}")
            elif llm_response_text.strip().startswith("{"):
                try:
                    parsed_response = json.loads(llm_response_text)
                except json.JSONDecodeError:
                    pass  # Not valid JSON, leave as None

            # User-facing success update
            await ctx.info(f"Prompt '{prompt_name}' executed successfully with LLM")

            # Detailed technical log
            logger.info(
                f"Prompt '{prompt_name}' executed successfully with LLM",
                extra={
                    "prompt_name": prompt_name,
                    "prompt_ms": prompt_elapsed_ms,
                    "llm_ms": llm_elapsed_ms,
                    "total_ms": total_elapsed_ms,
                },
            )

            return {
                "success": True,
                "prompt": {
                    "name": prompt_name,
                    "arguments": prompt_arguments,
                    "message_count": len(messages),
                },
                "llm_request": llm_request,
                "llm_response": {
                    "text": llm_response_text,
                    "usage": llm_result.get("usage", {}),
                    "model": llm_result.get("model"),
                },
                "parsed_response": parsed_response,
                "metadata": {
                    "prompt_retrieval_ms": round(prompt_elapsed_ms, 2),
                    "llm_execution_ms": round(llm_elapsed_ms, 2),
                    "total_time_ms": round(total_elapsed_ms, 2),
                    "server_url": state.server_url,
                    "llm_endpoint": llm_url,
                    "llm_model": llm_model,
                },
            }

        except ConnectionError as e:
            elapsed_ms = (time.perf_counter() - start_time) * 1000

            # User-facing error update
            await ctx.error(f"Not connected when executing prompt '{prompt_name}': {str(e)}")

            # Detailed technical log
            logger.error(
                f"Not connected when executing prompt '{prompt_name}': {str(e)}",
                extra={"prompt_name": prompt_name, "duration_ms": elapsed_ms},
            )

            return {
                "success": False,
                "error": {
                    "error_type": "not_connected",
                    "message": str(e),
                    "details": {"prompt_name": prompt_name},
                    "suggestion": "Use connect_to_server() to establish a connection first",
                },
                "metadata": {
                    "request_time_ms": round(elapsed_ms, 2),
                },
            }

        except Exception as e:
            elapsed_ms = (time.perf_counter() - start_time) * 1000

            # Determine error type
            error_type = "execution_error"
            suggestion = "Check the prompt name, arguments, and LLM configuration"

            error_msg = str(e).lower()
            if "not found" in error_msg or "unknown prompt" in error_msg:
                error_type = "prompt_not_found"
                suggestion = f"Prompt '{prompt_name}' does not exist on the server"
            elif "timeout" in error_msg or "connection" in error_msg:
                error_type = "llm_request_error"
                suggestion = "LLM request timed out or connection failed"

            # User-facing error update
            await ctx.error(f"Failed to execute prompt '{prompt_name}' with LLM: {str(e)}")

            # Detailed technical log
            logger.error(
                f"Failed to execute prompt '{prompt_name}' with LLM: {str(e)}",
                extra={
                    "prompt_name": prompt_name,
                    "error_type": error_type,
                    "duration_ms": elapsed_ms,
                },
            )

            ConnectionManager.increment_stat("errors")

            return {
                "success": False,
                "error": {
                    "error_type": error_type,
                    "message": f"Failed to execute prompt with LLM: {str(e)}",
                    "details": {
                        "prompt_name": prompt_name,
                        "exception_type": type(e).__name__,
                    },
                    "suggestion": suggestion,
                },
                "metadata": {
                    "request_time_ms": round(elapsed_ms, 2),
                },
            }
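  • Usage sketch (not part of the repository): one way to call the registered tool from a FastMCP client. The script path, prompt name, and argument values are placeholders; adjust them to however the mcp-test-mcp server is actually run.
    import asyncio
    from fastmcp import Client

    async def main() -> None:
        # Placeholder target: a local server script run over stdio.
        async with Client("server.py") as client:
            result = await client.call_tool(
                "execute_prompt_with_llm",
                {
                    "prompt_name": "summarize_document",
                    "prompt_arguments": {"document_id": "doc-123"},
                    "llm_config": {"max_tokens": 500},
                },
            )
            print(result)

    asyncio.run(main())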
  • Import statement in the main server.py that loads the llm module, triggering automatic tool registration via the @mcp.tool decorator.
    from .tools import connection, tools, resources, prompts, llm
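  • Registration sketch (illustrative, not project code): decorating a module-level function with @mcp.tool registers it as a side effect of importing the module, which is why the import above is enough to expose the tool. The module layout and server name here are assumptions.
    from fastmcp import FastMCP

    mcp = FastMCP("mcp-test-mcp")  # the real project shares one instance across tool modules

    @mcp.tool
    async def execute_prompt_with_llm() -> dict:
        """Registered with `mcp` when this module is imported."""
        return {"success": True}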
  • Debug log listing registered tools, confirming 'execute_prompt_with_llm' is registered.
    "execute_prompt_with_llm" ]
  • Function signature with Annotated types defining the input schema for the tool parameters.
    async def execute_prompt_with_llm(
        prompt_name: Annotated[str, "Name of the prompt to execute"],
        ctx: Context,
        prompt_arguments: Annotated[dict[str, Any] | str | None, "Arguments to pass to the MCP prompt (JSON object or string)"] = None,
        fill_variables: Annotated[dict[str, Any] | str | None, "Template variables to fill in prompt messages (JSON object or string)"] = None,
        llm_config: Annotated[dict[str, Any] | str | None, "LLM configuration (url, model, api_key, etc.)"] = None
    ) -> dict[str, Any]:
