
Provides tools for benchmarking and running LLM inference on Arm64 cloud instances, measuring performance metrics like tokens/sec and memory usage, and serving results via an MCP-compatible API.

How do I use ArmBench MCP Server?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@ArmBench MCP Server run a benchmark on Llama-3.2-3B with Q4_K_M".

The server will respond to your query, and you can continue using it as needed.

ArmBench MCP Server

by sirmos



⚡ Arm Pulse - Arm64 LLM Inference Benchmark Suite + MCP Server

KleidiAI-optimized LLM benchmarking and inference server for Arm64 cloud infrastructure. Built for the Arm AI Optimization Challenge 2026.


🎯 What is Arm Pulse?

Arm Pulse is a one-command benchmarking tool that:

Deploys LLMs (Llama 3.2) on Arm64 cloud instances using llama.cpp + KleidiAI
Measures real performance (tokens/sec, time to first token, memory usage) across quantization levels (Q4_K_M vs Q8_0)
Serves results via an MCP-compatible FastAPI server any agent framework can call
Visualizes everything in a clean real-time dashboard

🏗️ Architecture

arm-pulse/
├── benchmark/    # llama.cpp + KleidiAI inference engine + metrics
├── mcp_server/   # FastAPI MCP-compatible LLM endpoint
├── dashboard/    # Real-time results dashboard (HTML)
├── scripts/      # One-command setup + benchmark + server scripts
└── docker/       # Arm64-optimized Docker configuration

🚀 Quick Start (Arm64 Instance)

1. Clone and setup

git clone https://github.com/sirmos/arm-pulse.git
cd arm-pulse
bash scripts/setup.sh

2. Run benchmark

bash scripts/run_benchmark.sh

3. Start MCP server

bash scripts/start_mcp.sh

4. Open dashboard

Navigate to http://your-instance-ip:8000 in your browser.
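Before opening the dashboard it can help to confirm that the server is actually up. The sketch below polls the `/health` endpoint listed in the API table until it answers. It assumes the endpoint returns a JSON body; the helper names `fetch_health` and `wait_until_ready` are illustrative and are not part of Arm Pulse.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout=2.0):
    """GET <base_url>/health and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_ready(base_url, attempts=30, delay=1.0,
                     fetch=fetch_health, sleep=time.sleep):
    """Poll /health until it answers.

    Returns the decoded payload, or None if every attempt fails.
    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    """
    for _ in range(attempts):
        try:
            return fetch(base_url)
        except (OSError, ValueError):
            # OSError covers connection errors (URLError is a subclass);
            # ValueError covers a body that is not valid JSON.
            sleep(delay)
    return None
```

For example, `wait_until_ready("http://localhost:8000")` returns once the server responds, or `None` after roughly 30 seconds of failed attempts.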

☁️ Tested Arm64 Platforms

Platform	Instance	Arm CPU
Oracle Cloud	VM.Standard.A1.Flex	Ampere Altra
AWS	c7g.large	Graviton3
GCP	c4a-standard-4	Axion

📊 What We Benchmark

Metric	Description
Tokens/sec	Inference throughput
Time to First Token	Latency from prompt to first output token
Memory (MB)	RAM consumed during inference
Model size (GB)	Disk footprint per quantization level
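As a rough illustration of how the first two metrics can be derived, the sketch below times any iterable that yields tokens as they are generated. It is not the project's measurement code: `measure_stream` is a hypothetical helper, and it reports end-to-end throughput (prompt processing included), which may differ from how the benchmark suite defines tokens/sec.

```python
import time


def measure_stream(tokens, clock=time.perf_counter):
    """Consume a token stream and report throughput and time to first token.

    `tokens` is any iterable that yields one item per generated token.
    `clock` is injectable so the arithmetic can be checked without real timing.
    """
    start = clock()
    first_token_at = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # latency from prompt to first output token
        count += 1
    total = clock() - start
    return {
        "tokens": count,
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "tokens_per_sec": count / total if total > 0 else 0.0,
    }
```

Any streaming generator can be passed in, so the same helper works regardless of which inference backend produces the tokens.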

Models

Model	Quant	Size	Use case
Llama-3.2-3B-Instruct	Q4_K_M	1.9 GB	Speed-optimized
Llama-3.2-3B-Instruct	Q8_0	3.4 GB	Quality-optimized
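To see why the two files differ in size, it helps to convert the sizes above into an average number of bits stored per weight. The sketch assumes a parameter count of about 3.21 billion for Llama 3.2 3B and treats 1 GB as 10^9 bytes; both are assumptions that do not come from this README.

```python
# Assumption: approximate parameter count of Llama 3.2 3B (not stated in this README).
N_PARAMS = 3.21e9

# File sizes from the table above, in GB (taken as 10^9 bytes).
MODEL_SIZES_GB = {"Q4_K_M": 1.9, "Q8_0": 3.4}


def bits_per_weight(size_gb, n_params=N_PARAMS):
    """Average bits stored per parameter for a model file of `size_gb`."""
    return size_gb * 1e9 * 8 / n_params


for quant, size_gb in MODEL_SIZES_GB.items():
    print(f"{quant}: {bits_per_weight(size_gb):.2f} bits/weight")
```

Under these assumptions Q4_K_M works out to a little under 5 bits per weight and Q8_0 to roughly 8.5, which matches the speed-versus-quality trade-off in the table.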

🔌 MCP Server API

Endpoint	Method	Description
`/`	GET	Server info
`/health`	GET	Health + platform info
`/models`	GET	List available models
`/generate`	POST	Run inference
`/benchmark`	POST	Full benchmark suite
`/mcp/tools`	GET	MCP-compatible tools listing
`/docs`	GET	Interactive API docs

Example: Generate

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is KleidiAI?", "model": "Llama-3.2-3B-Q4_K_M"}'
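The same request can be sent from Python using only the standard library. This is a minimal sketch mirroring the curl call above: the `prompt` and `model` fields come from that example, while the helper names and the assumption that the response body is JSON are illustrative.

```python
import json
import urllib.request


def build_generate_request(prompt, model, base_url="http://localhost:8000"):
    """Build the POST /generate request shown in the curl example."""
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model, base_url="http://localhost:8000", timeout=120.0):
    """Send the request and decode the response (assumed to be JSON)."""
    req = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `generate("What is KleidiAI?", "Llama-3.2-3B-Q4_K_M")` against a running server returns whatever payload the endpoint produces; the response schema is not documented here, so the sketch does not assume any particular fields.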

⚙️ Arm-Specific Optimizations

KleidiAI: Arm's optimized kernel library for ML workloads
llama.cpp Arm SVE: Scalable Vector Extension support enabled at build time
Native CPU tuning: `-DLLAMA_NATIVE=ON` compiles for the host CPU's exact microarchitecture (recent llama.cpp releases name this option `GGML_NATIVE`)
Thread optimization: Automatically uses all available Arm cores
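Not every Arm64 CPU exposes the same vector extensions, so it can be worth checking what the host reports before reading too much into a benchmark. The sketch below parses the `Features` line that Linux prints in `/proc/cpuinfo` on Arm64 and reports a few flags relevant to optimized int8 kernels. The choice of flags and all helper names are assumptions for illustration, not part of Arm Pulse.

```python
import os

# Linux hwcap names: dot product, int8 matrix multiply, SVE, SVE2.
FEATURES_OF_INTEREST = ("asimddp", "i8mm", "sve", "sve2")


def parse_cpu_features(cpuinfo_text):
    """Return the set of flags from the first 'Features' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()  # e.g. on x86, where the line is called 'flags'


def summarize(cpuinfo_text):
    """Map each feature of interest to whether the CPU reports it."""
    feats = parse_cpu_features(cpuinfo_text)
    return {name: name in feats for name in FEATURES_OF_INTEREST}


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def default_threads():
    """Thread count when using every available core."""
    return os.cpu_count() or 1
```

On an Arm64 Linux instance, `summarize(read_cpuinfo())` shows which of these extensions are available, and `default_threads()` gives the core count that "all available cores" refers to.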

📄 License

MIT License - see LICENSE

Built for the Arm AI Optimization Challenge 2026


