Provides tools for querying Kubernetes clusters to retrieve GPU node information, including node status, labels, and InfiniBand topology details for Azure HPC/AI environments
Azure HPC/AI MCP Server
A minimal Model Context Protocol (MCP) server tailored for Azure HPC/AI clusters. It provides tools that query Kubernetes for GPU node information using kubectl. The server is implemented with fastmcp and uses synchronous subprocess calls (no asyncio).
Tools
- list_nodes: Lists nodes in the GPU pool with name, labels, and Ready/NotReady status.
- get_node_topologies: Returns InfiniBand-related topology labels per node: agentpool, pkey, torset.
Both tools shell out to kubectl and return JSON-serializable Python structures (lists of dicts).
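As a rough illustration of what "shell out to kubectl and return lists of dicts" can look like, here is a minimal sketch. It is not the server's actual code: the function names and the label keys in TOPOLOGY_LABELS are hypothetical, and only the shape of the `kubectl get nodes -o json` output (items, metadata, status.conditions) is taken from Kubernetes itself.

```python
import json
import subprocess

# Hypothetical label keys for illustration only; the keys the real server
# reads for agentpool, pkey, and torset are not stated in this README.
TOPOLOGY_LABELS = {
    "agentpool": "agentpool",
    "pkey": "ib/pkey",
    "torset": "ib/torset",
}


def run_kubectl_get_nodes(selector=None):
    """Run `kubectl get nodes -o json` synchronously and return the parsed JSON."""
    cmd = ["kubectl", "get", "nodes", "-o", "json"]
    if selector:
        cmd += ["-l", selector]
    # check=True raises CalledProcessError when kubectl exits non-zero.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def summarize_nodes(node_list):
    """Reduce a NodeList document to name, labels, and Ready/NotReady status."""
    nodes = []
    for item in node_list.get("items", []):
        metadata = item.get("metadata", {})
        conditions = item.get("status", {}).get("conditions", [])
        # A node counts as Ready only if its Ready condition is exactly "True".
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        nodes.append({
            "name": metadata.get("name"),
            "labels": metadata.get("labels") or {},
            "status": "Ready" if ready else "NotReady",
        })
    return nodes


def extract_topologies(node_list, label_keys=TOPOLOGY_LABELS):
    """Return one dict per node with the topology labels; missing labels become None."""
    topologies = []
    for item in node_list.get("items", []):
        metadata = item.get("metadata", {})
        labels = metadata.get("labels") or {}
        entry = {"name": metadata.get("name")}
        for field, key in label_keys.items():
            entry[field] = labels.get(key)
        topologies.append(entry)
    return topologies
```

With a reachable cluster, `summarize_nodes(run_kubectl_get_nodes("agentpool=gpu"))` would return the node list; the selector shown is a placeholder, not the one the server uses.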
Run the server
Prerequisites:
- Python 3.10+
- kubectl configured to access your cluster
Installation
It’s recommended to use a virtual environment.
Create and activate a venv (Linux/macOS):
Install dependencies with pip:
Notes:
- fastmcp is required to run the server and is installed via requirements.txt. Tests don’t need it (they stub it).
- If fastmcp isn’t on PyPI for your environment, install it from its source per its documentation.
Run:
The server runs over stdio for MCP hosts. You can connect to it with an MCP-compatible client or call the tools locally with the helper script below.
invoke_local helper
The invoke_local.py script lets you execute server tools in-process without an MCP host. It discovers exported tools from server.py, calls them synchronously, and prints the results as pretty-printed JSON.
Examples:
Implementation notes:
- No asyncio is used; tool functions call subprocess.run directly and return plain Python data.
- The script unwraps simple function tools or FastMCP FunctionTool-like wrappers and invokes them with kwargs from --params when provided.
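The unwrapping step described in the notes above might look roughly like the following. This is a sketch under assumptions, not the contents of invoke_local.py: the helper names are hypothetical, and it assumes a wrapper object exposes the original function through an `fn` attribute and that `--params` carries a JSON object.

```python
import json


def resolve_callable(tool):
    """Return the plain function behind a tool object."""
    # Assumption: FunctionTool-like wrappers keep the original function in `.fn`.
    fn = getattr(tool, "fn", None)
    if callable(fn):
        return fn
    # Otherwise the tool may already be a plain function.
    if callable(tool):
        return tool
    raise TypeError(f"Cannot find a callable for tool {tool!r}")


def invoke_tool(tool, params_json=None):
    """Call a tool with kwargs decoded from a JSON string and return pretty JSON."""
    kwargs = json.loads(params_json) if params_json else {}
    if not isinstance(kwargs, dict):
        raise ValueError("params must decode to a JSON object")
    result = resolve_callable(tool)(**kwargs)
    return json.dumps(result, indent=2, sort_keys=True)
```

Keeping the call synchronous matches the rest of the project: the result is plain Python data, so it can be passed directly to json.dumps.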
Tests
The tests are written with pytest and exercise success and error paths without requiring a cluster.
Key points:
- subprocess-based: Tests monkeypatch subprocess.run to simulate kubectl output and errors. Neither the code nor the tests use asyncio.
- fastmcp-free: Tests inject a lightweight dummy FastMCP module so importing server.py does not require the real dependency.
- Coverage: Both tools are validated for JSON parsing, Ready condition handling, missing labels, and kubectl failures.
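The two techniques in the key points above (a stand-in fastmcp module and a replaced subprocess.run) can be sketched as follows. This is an illustrative reconstruction, not the project's test suite: the helper names are hypothetical, and it uses unittest.mock.patch so that it runs without pytest, whereas the real tests use pytest's monkeypatch fixture for the same purpose.

```python
import json
import subprocess
import sys
import types
from unittest import mock


def install_dummy_fastmcp():
    """Register a stand-in `fastmcp` module so imports work without the real package."""

    class FastMCP:
        def __init__(self, *args, **kwargs):
            pass

        def tool(self, *args, **kwargs):
            # Support both `@mcp.tool` and `@mcp.tool(...)`; return the function unchanged.
            if len(args) == 1 and callable(args[0]) and not kwargs:
                return args[0]
            return lambda fn: fn

        def run(self, *args, **kwargs):
            pass

    module = types.ModuleType("fastmcp")
    module.FastMCP = FastMCP
    sys.modules["fastmcp"] = module
    return module


def fake_kubectl(stdout="", returncode=0, stderr=""):
    """Build a replacement for subprocess.run that returns canned kubectl output."""

    def _run(cmd, *args, **kwargs):
        if returncode != 0 and kwargs.get("check"):
            raise subprocess.CalledProcessError(returncode, cmd, output=stdout, stderr=stderr)
        return subprocess.CompletedProcess(cmd, returncode, stdout=stdout, stderr=stderr)

    return _run


def test_ready_condition_is_visible():
    payload = json.dumps({
        "items": [{
            "metadata": {"name": "gpu-0", "labels": {}},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        }]
    })
    with mock.patch("subprocess.run", fake_kubectl(stdout=payload)):
        result = subprocess.run(
            ["kubectl", "get", "nodes", "-o", "json"],
            capture_output=True, text=True, check=True,
        )
    items = json.loads(result.stdout)["items"]
    assert items[0]["status"]["conditions"][0]["status"] == "True"


def test_kubectl_failure_raises():
    with mock.patch("subprocess.run", fake_kubectl(returncode=1, stderr="boom")):
        try:
            subprocess.run(["kubectl", "get", "nodes"], check=True)
        except subprocess.CalledProcessError as exc:
            assert exc.returncode == 1
            assert exc.stderr == "boom"
        else:
            raise AssertionError("expected CalledProcessError")
```

In a real suite, the dummy module would be installed before importing server.py, and the assertions would target the tool functions rather than subprocess.run itself.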
Run tests:
Troubleshooting
- kubectl not found: Ensure kubectl is installed and on PATH for real runs. Tests do not require it.
- No nodes returned: Confirm your label selectors match your cluster (tools currently expect GPU/IB labels used in Azure HPC/AI pools).
- fastmcp import error: Install fastmcp for runtime; tests provide a dummy stub so you can run pytest without it.
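For the "kubectl not found" case, a quick preflight check can confirm whether kubectl is visible on PATH before starting the server. The helper below is a hypothetical sketch and not part of the project; it relies only on shutil.which from the standard library.

```python
import shutil


def check_kubectl(which=shutil.which):
    """Return the resolved path of kubectl, or raise a clear error if it is missing."""
    path = which("kubectl")
    if path is None:
        raise RuntimeError("kubectl not found on PATH; install it or adjust PATH")
    return path
```

The `which` parameter exists only so the lookup can be swapped out in tests; normal callers would use `check_kubectl()` with no arguments.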