CtrlTest MCP Server
The CtrlTest MCP Server provides automated control system evaluation and regression testing for PID and bio-inspired controllers in flapping-wing aircraft systems, exposing a standardized API (ctrltest.analyze_pid) for comprehensive performance analysis.
Core Capabilities:
PID Controller Analysis - Evaluate PID gains against plant dynamics (natural frequency, damping ratio) to measure overshoot, settling time, integral squared error (ISE), and Lyapunov stability margins
High-Fidelity Simulation Integration - Ingest data from diffSPH (differentiable smoothed-particle hydrodynamics) and Foam-Agent CFD simulations via extra_metrics for realistic performance assessment
Multi-Modal Scoring - Fuse analytic models with high-fidelity simulation data when both diffSPH and Foam-Agent metrics are available
Gust Rejection Analysis - Quantify disturbance handling through gust detection latency, bandwidth, and rejection percentage metrics
Energy Efficiency Assessment - Measure CPG (Central Pattern Generator) baseline vs. consumed energy and calculate energy reduction percentages
Mix-of-Experts Evaluation - Compute switching penalties, latency budgets, and energy consumption for adaptive control architectures
Automated Integration - Support continuous performance monitoring, automated gain tuning, and controller optimization through STDIO/HTTP API integration with ToolHive, RL algorithms, and evolutionary systems
Structured Output - Export JSON-formatted results with provenance metadata for dashboards, logging, and downstream analysis
Provides a REST API interface for the control system evaluation service, allowing HTTP-based access to controller benchmarking and scoring functionality.
Enables visualization of controller performance metrics and analytics data exported from PID evaluations and comparative controller assessments.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@CtrlTest MCP Server evaluate PID gains kp=2.0, ki=0.5, kd=0.12 for a second-order plant with natural frequency 3.2 Hz"
That's it! The server will respond to your query, and you can continue using it as needed.
ctrltest-mcp - Flight-control regression lab for MCP agents
TL;DR: Evaluate PID and bio-inspired controllers against analytic or diffSPH/Foam-Agent data through MCP, logging overshoot, energy, and gust metrics automatically.
What it provides
Scenario | Value |
Analytic PID benchmarking | Run closed-form plant models and produce overshoot/settling/energy metrics without manual scripting. |
High-fidelity scoring | Ingest logged data from Foam-Agent or diffSPH runs and fuse it into controller evaluations. |
MCP integration | Expose the scoring API via STDIO/HTTP so ToolHive or other clients can automate gain tuning and generate continuous performance scorecards. |
Quickstart
uv pip install "git+https://github.com/Three-Little-Birds/ctrltest-mcp.git"

Run a PID evaluation:
from ctrltest_mcp import (
    ControlAnalysisInput,
    ControlPlant,
    ControlSimulation,
    PIDGains,
    evaluate_control,
)

request = ControlAnalysisInput(
    plant=ControlPlant(natural_frequency_hz=3.2, damping_ratio=0.35),
    gains=PIDGains(kp=2.0, ki=0.5, kd=0.12),
    simulation=ControlSimulation(duration_s=3.0, sample_points=400),
    setpoint=0.2,
)
response = evaluate_control(request)
print(response.model_dump())

Typical outputs (analytic only):
{
  "overshoot": -0.034024863556091134,
  "ise": 0.008612387509182674,
  "settling_time": 3.0,
  "gust_detection_latency_ms": 0.8,
  "gust_detection_bandwidth_hz": 1200.0,
  "gust_rejection_pct": 0.396,
  "cpg_energy_baseline_j": 12.0,
  "cpg_energy_consumed_j": 7.8,
  "cpg_energy_reduction_pct": 0.35,
  "lyapunov_margin": 0.12,
  "moe_switch_penalty": 0.135,
  "moe_latency_ms": 12.72,
  "moe_energy_j": 3.9,
  "multi_modal_score": null,
  "extra_metrics": null,
  "metadata": {"solver": "analytic"}
}

The analytic plant example above clips settling_time at the requested simulation duration (duration_s=3.0). Increase the horizon if you need the loop to settle fully before computing that metric.
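To make the clipping behaviour concrete, here is a minimal, standard-library sketch of how overshoot, ISE, and settling time could be derived from a sampled step response. It is not the package's implementation: the function name `step_metrics`, the 2% tolerance band, and the trapezoidal integration are all assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def step_metrics(
    t: Sequence[float],
    y: Sequence[float],
    setpoint: float,
    tolerance: float = 0.02,  # assumed 2% band; the package's tolerance may differ
) -> Dict[str, float]:
    """Compute overshoot, ISE, and settling time from a sampled step response."""
    errors = [setpoint - value for value in y]

    # Overshoot as "peak response minus setpoint"; negative if the peak stays below it.
    overshoot = max(y) - setpoint

    # Integral squared error via the trapezoidal rule.
    ise = sum(
        0.5 * (errors[i] ** 2 + errors[i + 1] ** 2) * (t[i + 1] - t[i])
        for i in range(len(t) - 1)
    )

    # Walk backwards to find the earliest sample after which the error stays in band.
    band = tolerance * abs(setpoint)
    settling_time = t[-1]  # default: clipped to the end of the horizon
    for i in range(len(t) - 1, -1, -1):
        if abs(errors[i]) > band:
            break
        settling_time = t[i]

    return {"overshoot": overshoot, "ise": ise, "settling_time": settling_time}


# One response that settles inside the horizon, and one that does not.
settled = step_metrics([0.0, 1.0, 2.0, 3.0], [0.0, 0.25, 0.2, 0.2], setpoint=0.2)
unsettled = step_metrics([0.0, 1.0, 2.0, 3.0], [0.0, 0.1, 0.15, 0.18], setpoint=0.2)
print(settled["settling_time"], unsettled["settling_time"])
```

In the second call the error never enters the band, so the reported settling time equals the last sample time. A settling_time equal to duration_s is therefore best read as "did not settle within the horizon" rather than as a measured value.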
Run as a service
CLI (STDIO transport)
uvx ctrltest-mcp  # runs the MCP over stdio
# or just python -m ctrltest_mcp

Use python -m ctrltest_mcp --describe to print basic metadata without starting the server.
FastAPI (REST)
uv run uvicorn ctrltest_mcp.fastapi_app:create_app --factory --port 8005

python-sdk tool (STDIO / MCP)
from mcp.server.fastmcp import FastMCP

from ctrltest_mcp.tool import build_tool

mcp = FastMCP("ctrltest-mcp", "Flapping-wing control regression")
build_tool(mcp)

if __name__ == "__main__":
    mcp.run()

ToolHive smoke test
Run the integration script from your workspace root:
uvx --with 'mcp==1.20.0' python scripts/integration/run_ctrltest.py

The smoke test runs the analytic path by default. To exercise high-fidelity scoring, stage Foam-Agent archives under logs/foam_agent/ and diffSPH gradients under logs/diffsph/ before launching the script.
Agent playbook
Gust rejection - feed archived diffSPH gradients (diffsph_metrics) and Foam-Agent archives (paths returned by those services) to quantify adaptive CPG improvements.
Controller comparison - log analytics for multiple PID gains, export JSONL evidence, and visualise in Grafana.
Policy evaluation - integrate with RL or evolutionary algorithms; metrics are structured for automated scoring.
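The controller-comparison play can be sketched as a small JSONL exporter. Everything here is illustrative: `export_jsonl`, the record layout, and the stand-in `fake_evaluate` are hypothetical names, and the placeholder score is not a control metric. In real use the `evaluate` callable would presumably wrap evaluate_control and return response.model_dump(), as in the Quickstart.

```python
import io
import json
from typing import Callable, Dict, Iterable, TextIO

Gains = Dict[str, float]


def export_jsonl(
    gain_sets: Iterable[Gains],
    evaluate: Callable[[Gains], Dict[str, float]],
    sink: TextIO,
) -> int:
    """Write one JSON object per gain set to `sink`; return the number of lines written."""
    written = 0
    for gains in gain_sets:
        record = {"gains": gains, "metrics": evaluate(gains)}
        # sort_keys keeps lines stable across runs, which helps when diffing evidence files.
        sink.write(json.dumps(record, sort_keys=True) + "\n")
        written += 1
    return written


def fake_evaluate(gains: Gains) -> Dict[str, float]:
    # Hypothetical stand-in for the real evaluator; the value below is NOT a control metric.
    return {"placeholder_score": gains["kp"] + gains["ki"] + gains["kd"]}


buffer = io.StringIO()
count = export_jsonl(
    [{"kp": 2.0, "ki": 0.5, "kd": 0.12}, {"kp": 3.0, "ki": 0.5, "kd": 0.12}],
    fake_evaluate,
    buffer,
)
print(buffer.getvalue(), end="")
```

Passing the evaluator in as a callable keeps the exporter independent of the transport, so the same loop could sit behind a direct Python call, the STDIO tool, or the REST service.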
Stretch ideas
Extend the adapter for PteraControls (planned once upstream Python bindings are published).
Drive the MCP from scripts/fitness to populate nightly scorecards.
Combine with migration-mcp to explore route-specific disturbance budgets.
Accessibility & upkeep
Run uv run pytest (tests mock diffSPH/Foam-Agent inputs and assert deterministic analytic results).
Keep metric schema changes documented; downstream dashboards rely on them.
Metric schema at a glance
Field | Units | Notes |
overshoot | radians | peak response minus setpoint |
ise | rad²·s | integral squared error |
settling_time | seconds | first time error stays within tolerance |
gust_detection_latency_ms | milliseconds | detector latency |
gust_detection_bandwidth_hz | hertz | detector bandwidth |
gust_rejection_pct | 0–1 | fraction of disturbance rejected |
cpg_energy_baseline_j | joules | energy pre-adaptation |
cpg_energy_consumed_j | joules | energy post-adaptation |
cpg_energy_reduction_pct | 0–1 | energy reduction ratio |
lyapunov_margin | unitless | stability margin |
moe_switch_penalty | unitless | cost weight × switches |
moe_latency_ms | milliseconds | latency budget after switching |
moe_energy_j | joules | mix-of-experts energy draw |
multi_modal_score | unitless | only when both diffSPH & Foam metrics are present |
extra_metrics | varies | raw diffSPH/Foam metrics merged in |
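Because dashboards depend on this schema, a lightweight consistency check can catch drift early. The sketch below is an assumption-laden example rather than part of the package: `check_metrics` is a hypothetical helper, and the relation reduction = (baseline − consumed) / baseline is inferred from the sample output (12.0 J and 7.8 J giving 0.35), not from documented behaviour. Note that the `_pct` fields hold 0–1 fractions, not percentages.

```python
import math
from typing import Dict, List

# Fields the schema table lists with a 0-1 range.
FRACTION_FIELDS = ("gust_rejection_pct", "cpg_energy_reduction_pct")


def check_metrics(result: Dict[str, float]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []

    for name in FRACTION_FIELDS:
        if not 0.0 <= result[name] <= 1.0:
            problems.append(f"{name} outside 0-1: {result[name]}")

    # Assumed relation, inferred from the sample output in this README.
    baseline = result["cpg_energy_baseline_j"]
    consumed = result["cpg_energy_consumed_j"]
    expected = (baseline - consumed) / baseline
    if not math.isclose(expected, result["cpg_energy_reduction_pct"], abs_tol=1e-6):
        problems.append(
            f"cpg_energy_reduction_pct {result['cpg_energy_reduction_pct']} "
            f"does not match (baseline - consumed) / baseline = {expected:.6f}"
        )

    return problems


sample = {
    "gust_rejection_pct": 0.396,
    "cpg_energy_baseline_j": 12.0,
    "cpg_energy_consumed_j": 7.8,
    "cpg_energy_reduction_pct": 0.35,
}
print(check_metrics(sample))
```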
Example of fused high-fidelity metrics:
{
  "extra_metrics": {
    "force_gradient_norm": 0.87,
    "lift_drag_ratio": 18.4
  },
  "multi_modal_score": 0.047,
  "metadata": {"solver": "analytic"}
}

Contributing
uv pip install --system -e .[dev]
uv run ruff check . and uv run pytest
Share sample metrics in PRs so reviewers can sanity-check improvements quickly.
MIT license - see LICENSE.