# MCP vLLM Benchmarking Tool

This is a proof of concept of how to use MCP to interactively benchmark vLLM.

We are not new to benchmarking; read our blog: [Benchmarking vLLM](https://eliovp.com/introducing-our-benchmarking-tool-powered-by-dstack/). This project is simply an exploration of what is possible with MCP.

## Usage

1. Clone the repository.
2. Add it to your MCP servers:

```
{
  "mcpServers": {
    "mcp-vllm": {
      "command": "uv",
      "args": [
        "run",
        "/Path/TO/mcp-vllm-benchmarking-tool/server.py"
      ]
    }
  }
}
```

Then you can prompt, for example, like this:

```
Do a vllm benchmark for this endpoint: http://10.0.101.39:8888,
benchmark the following model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B,
run the benchmark 3 times with 32 num prompts each, then compare the results, but ignore the first iteration as that is just a warmup.
```

## Todo

- Because vLLM occasionally emits stray output, the tool may report that it found invalid JSON. I have not looked into this yet; a possible workaround is sketched below.
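One possible workaround, shown here only as a sketch: instead of parsing the whole benchmark output with `json.loads`, scan it for the first span that parses as a JSON object. The `extract_json` helper below is hypothetical and not part of this repository; it assumes the benchmark result is a single JSON object embedded in noisy log text.

```python
import json

def extract_json(raw: str):
    """Best-effort: return the first parseable JSON object found in noisy text.

    Hypothetical helper, not part of this repo. Benchmark output can
    interleave log lines with the JSON result, so json.loads() on the
    full string fails; this tries a raw_decode at every '{' until one
    succeeds.
    """
    decoder = json.JSONDecoder()
    for i, ch in enumerate(raw):
        if ch != "{":
            continue
        try:
            obj, _ = decoder.raw_decode(raw[i:])
            return obj
        except json.JSONDecodeError:
            continue
    return None
```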

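For context, here is a minimal sketch of what a server like `server.py` could look like using the official Python MCP SDK. Everything in it (the tool name, its parameters, and the benchmark invocation) is an assumption for illustration, not the actual implementation in this repository:

```python
# Hypothetical sketch of an MCP server exposing a vLLM benchmark tool.
# Uses the official Python MCP SDK (`pip install mcp`); the tool name,
# parameters, and benchmark invocation are illustrative, not this repo's code.
import subprocess

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("mcp-vllm")

@mcp.tool()
def run_benchmark(endpoint: str, model: str, num_prompts: int = 32) -> str:
    """Run one vLLM serving benchmark against `endpoint` and return raw output."""
    # vLLM ships a serving benchmark script; exact path and flags vary by version.
    result = subprocess.run(
        [
            "python", "benchmarks/benchmark_serving.py",
            "--base-url", endpoint,
            "--model", model,
            "--num-prompts", str(num_prompts),
        ],
        capture_output=True,
        text=True,
    )
    return result.stdout

if __name__ == "__main__":
    mcp.run()  # serves over stdio, matching the `uv run server.py` config above
```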