# vllm_status
Monitor the health and operational status of the vLLM server to verify that it is functioning and to detect issues.
## Instructions
Check the health and status of the vLLM server
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| _No arguments_ | | | |
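Because the schema declares no properties, an MCP `tools/call` request for this tool passes an empty `arguments` object. A JSON-RPC sketch (the `id` value and transport details here are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "vllm_status",
    "arguments": {}
  }
}
```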
## Implementation Reference
- Main handler that formats and returns the vLLM server status as text
```python
async def get_server_status_text() -> str:
    """
    Get formatted server status as text.

    Returns:
        Formatted string with server status.
    """
    status = await get_server_status()

    # Status emoji
    status_emoji = {
        "healthy": "✅",
        "unhealthy": "⚠️",
        "offline": "❌",
        "unknown": "❓",
    }
    emoji = status_emoji.get(status["status"], "❓")

    lines = [
        f"## vLLM Server Status {emoji}",
        "",
        f"**Status:** {status['status']}",
        f"**Base URL:** {status['base_url']}",
    ]

    if status["models"]:
        lines.append(f"**Models:** {', '.join(status['models'])}")

    if status.get("error"):
        lines.append(f"**Error:** {status['error']}")

    if status.get("models_error"):
        lines.append(f"**Models Error:** {status['models_error']}")

    return "\n".join(lines)
```
- Core logic that checks the vLLM server health and retrieves available models
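The emoji lookup and markdown assembly above can be exercised with a canned status dict, with no server involved (the values below are hypothetical):

```python
# Sketch: build the same markdown report from a canned status dict.
# The base_url and model name are made-up example values.
status = {
    "status": "healthy",
    "base_url": "http://localhost:8000",
    "models": ["meta-llama/Llama-3.1-8B-Instruct"],
}

status_emoji = {"healthy": "✅", "unhealthy": "⚠️", "offline": "❌", "unknown": "❓"}
emoji = status_emoji.get(status["status"], "❓")

lines = [
    f"## vLLM Server Status {emoji}",
    "",
    f"**Status:** {status['status']}",
    f"**Base URL:** {status['base_url']}",
]
if status["models"]:
    lines.append(f"**Models:** {', '.join(status['models'])}")

text = "\n".join(lines)
```

The optional `error` and `models_error` keys are simply appended as extra lines when present, so a healthy report stays short.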
```python
async def get_server_status() -> dict[str, Any]:
    """
    Get the current status of the vLLM server.

    Returns:
        Dictionary with server status information:
        - status: "healthy", "unhealthy", or "offline"
        - base_url: The configured vLLM base URL
        - models: List of available models (if healthy)
        - error: Error message (if any)
    """
    result: dict[str, Any] = {
        "status": "unknown",
        "base_url": "",
        "models": [],
        "error": None,
    }

    try:
        async with VLLMClient() as client:
            result["base_url"] = client.settings.base_url

            # Check health
            health = await client.health_check()

            if health.get("status") == "healthy":
                result["status"] = "healthy"

                # Get available models
                try:
                    models = await client.list_models()
                    result["models"] = [m.get("id", "unknown") for m in models]
                except VLLMClientError as e:
                    result["models_error"] = str(e)
            else:
                result["status"] = "unhealthy"
                result["error"] = f"Server returned status code: {health.get('code')}"
    except VLLMClientError as e:
        result["status"] = "offline"
        result["error"] = str(e)

    return result
```
- src/vllm_mcp_server/server.py:154-161 (registration): Tool registration defining the vllm_status tool with its schema
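The branching above yields exactly three outcomes: `healthy` when the health check succeeds, `unhealthy` when the server answers with a non-healthy result, and `offline` when the connection itself fails. A minimal sketch of that mapping, using a hypothetical `classify` helper (not part of the source) driven by fake health-check results:

```python
# classify is a hypothetical helper mirroring the status branching;
# a health value of None stands in for a raised VLLMClientError.
def classify(health):
    """Map a health_check()-style result to one of the three status strings."""
    if health is None:
        return "offline"  # connection failed before any response arrived
    if health.get("status") == "healthy":
        return "healthy"
    return "unhealthy"  # server responded, but not with a healthy status
```

Note that a models-listing failure does not flip the status: the server is still reported `healthy`, and the problem is surfaced separately under `models_error`.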
```python
Tool(
    name="vllm_status",
    description="Check the health and status of the vLLM server",
    inputSchema={
        "type": "object",
        "properties": {},
    },
),
```
- src/vllm_mcp_server/server.py:350-352 (registration): Tool handler routing in call_tool that dispatches to get_server_status_text()
```python
elif name == "vllm_status":
    status_text = await get_server_status_text()
    return [TextContent(type="text", text=status_text)]
```
- Helper method that performs the actual health check HTTP request to the vLLM server
```python
async def health_check(self) -> dict[str, Any]:
    """Check if the vLLM server is healthy."""
    session = await self._get_session()
    try:
        async with session.get(
            f"{self.settings.base_url}/health",
            headers=self.headers,
        ) as response:
            if response.status == 200:
                return {"status": "healthy", "code": 200}
            return {"status": "unhealthy", "code": response.status}
    except aiohttp.ClientConnectorError as e:
        raise VLLMConnectionError(f"Cannot connect to vLLM server: {e}") from e
    except asyncio.TimeoutError as e:
        raise VLLMConnectionError("Connection to vLLM server timed out") from e
```
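The contract here is simply "HTTP 200 on `/health` means healthy". That contract can be sketched with only the standard library, against a throwaway local endpoint (the handler below is a stand-in for vLLM, not vLLM itself):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Fake /health endpoint: always answers 200, like a healthy server would.
class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if self.path == "/health" else 404)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

# Port 0 lets the OS pick a free port for the demo.
server = ThreadingHTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

base_url = f"http://127.0.0.1:{server.server_address[1]}"
with urllib.request.urlopen(f"{base_url}/health") as response:
    result = (
        {"status": "healthy", "code": 200}
        if response.status == 200
        else {"status": "unhealthy", "code": response.status}
    )

server.shutdown()
```

The real client uses aiohttp and additionally converts connection and timeout failures into `VLLMConnectionError`, which `get_server_status()` then reports as `offline`.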