- Enables routing and execution of hardware-related tasks on nodes equipped with Arduino connectivity and capabilities.
- Automatically offloads and executes Docker container commands, builds, and management tasks across distributed cluster nodes.
- Distributes Jest test suites across the cluster for parallel execution and improved testing throughput.
- Provides distributed command execution and task routing specifically targeting x86_64 Linux nodes within the cluster.
- Routes and executes commands across macOS-based nodes, supporting both Intel and ARM64 architectures for distributed tasks.
- Automatically routes and executes Mocha-based test suites across optimal cluster nodes.
- Offloads Node.js package management and build tasks to the most suitable nodes in the distributed network.
- Routes LLM inference and model-serving workloads to dedicated inference nodes running Ollama.
- Automatically routes pnpm build and installation commands to optimal nodes based on current cluster load.
- Supports distributed execution of Podman container operations and builds across the node network.
- Enables parallel and distributed execution of Python test suites via pytest across the agentic cluster.
- Routes Swift compilation and build tasks to macOS nodes within the cluster.
- Supports routing orchestration and coordination tasks to cluster nodes configured with Temporal.
- Supports execution of commands and containerized workloads specifically targeting Ubuntu environments on Linux nodes.
- Automatically offloads Yarn-based build processes and package management tasks to available cluster resources.
Cluster Execution MCP Server
Cluster-aware command execution for distributed task routing across the AGI agentic cluster.
Version: 0.2.0
Features
Automatic task routing: Commands routed to optimal nodes based on load, capabilities, and requirements
Multi-node support: macpro51 (Linux x86_64), mac-studio (macOS ARM64), macbook-air (macOS ARM64), inference node
Dynamic IP resolution: mDNS, DNS, and fallback methods with caching
Security hardened: No shell injection, environment-based configuration, command validation
SSH connectivity verification: Retry logic with configurable timeouts
Parallel execution: Distribute commands across cluster for maximum throughput
Installation
cd /mnt/agentic-system/mcp-servers/cluster-execution-mcp
pip install -e .
# For development:
pip install -e ".[dev]"

Configuration
Claude Code Configuration
Add to ~/.claude.json:
{
"mcpServers": {
"cluster-execution": {
"command": "/mnt/agentic-system/.venv/bin/python3",
"args": ["-m", "cluster_execution_mcp.server"]
}
}
}

Environment Variables
All configuration is externalized via environment variables:
| Variable | Default | Description |
|---|---|---|
|  |  | SSH username for remote execution |
| CLUSTER_SSH_TIMEOUT |  | SSH connection timeout (seconds) |
|  |  | Initial SSH connect timeout (seconds) |
|  |  | Number of SSH retry attempts |
|  |  | CPU usage % threshold for offloading |
|  |  | Load average threshold for offloading |
|  |  | Memory usage % threshold for offloading |
| CLUSTER_CMD_TIMEOUT |  | Command execution timeout (seconds) |
|  |  | Status check timeout (seconds) |
|  |  | IP resolution cache TTL (seconds) |
|  |  | Gateway IP for route detection |
|  |  | DNS server for IP detection |
|  |  | Base path for databases |
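Reading this kind of externalized configuration typically looks like the sketch below. Only CLUSTER_CMD_TIMEOUT and CLUSTER_SSH_TIMEOUT are confirmed by the Troubleshooting section of this document; the helper name and the default values are illustrative assumptions:

```python
import os

def env_float(name: str, default: float) -> float:
    """Read a numeric setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        return default  # a malformed value falls back rather than crashing

# These two variable names appear in the Troubleshooting section;
# the defaults shown here are illustrative only.
CMD_TIMEOUT = env_float("CLUSTER_CMD_TIMEOUT", 300.0)
SSH_TIMEOUT = env_float("CLUSTER_SSH_TIMEOUT", 5.0)
```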
Node Configuration
Node hostnames and IPs can be customized:
| Variable | Default | Description |
|---|---|---|
|  | macpro51.local | Mac Pro hostname |
|  |  | Mac Pro fallback IP |
|  | Marcs-Mac-Studio.local | Mac Studio hostname |
|  |  | Mac Studio fallback IP |
|  |  | MacBook Air hostname |
|  |  | MacBook Air fallback IP |
|  |  | Inference node hostname |
|  |  | Inference node fallback IP |
MCP Tools
| Tool | Description |
|---|---|
| cluster_bash | Execute bash commands with automatic cluster routing |
| cluster_status | Get current cluster state and load distribution |
| offload_to | Explicitly route command to specific node |
| parallel_execute | Run multiple commands in parallel across nodes |
Usage Examples
Automatic Routing
# Heavy commands auto-route to least loaded node
result = await cluster_bash("make -j8 all")
# Simple commands run locally
result = await cluster_bash("ls -la")

Force Specific Requirements
# Force Linux execution
result = await cluster_bash("docker build .", requires_os="linux")
# Force x86_64 architecture
result = await cluster_bash("cargo build", requires_arch="x86_64")

Explicit Node Routing
# Run on Linux builder
result = await offload_to("podman run -it ubuntu:22.04", node_id="macpro51")
# Run on Mac Studio
result = await offload_to("swift build", node_id="mac-studio")

Parallel Execution
# Run tests across cluster
results = await parallel_execute([
"pytest tests/unit/",
"pytest tests/integration/",
"pytest tests/e2e/"
])

Cluster Status
# Get cluster health before heavy operations
status = await cluster_status()
# Returns:
# {
# "local_node": "macpro51",
# "nodes": {
# "macpro51": {"cpu_percent": 15.2, "memory_percent": 45.3, ...},
# "mac-studio": {"cpu_percent": 8.1, "memory_percent": 32.1, ...},
# ...
# }
# }

Cluster Nodes
| Node | OS | Arch | Capabilities | Specialties |
|---|---|---|---|---|
| macpro51 | Linux | x86_64 | docker, podman, raid, nvme, compilation, testing, tpu | compilation, testing, containerization, benchmarking |
| mac-studio | macOS | ARM64 | orchestration, coordination, temporal, mlx-gpu, arduino | orchestration, coordination, monitoring |
| macbook-air | macOS | ARM64 | research, documentation, analysis | research, documentation, mobile |
| inference node | macOS | ARM64 | ollama, inference, model-serving, llm-api | ollama-inference, model-serving |
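Combined with the requires_os and requires_arch constraints from the usage examples, node selection could be sketched as follows. This is illustrative only; the real router also weighs load averages, memory, and capability tags, and the CPU figures below are sample values:

```python
# Sample node inventory; OS/arch values mirror the cluster table above.
NODES = {
    "macpro51":    {"os": "linux", "arch": "x86_64", "cpu_percent": 15.2},
    "mac-studio":  {"os": "macos", "arch": "arm64",  "cpu_percent": 8.1},
    "macbook-air": {"os": "macos", "arch": "arm64",  "cpu_percent": 12.8},
}

def pick_node(requires_os=None, requires_arch=None):
    """Pick the least-loaded node satisfying the OS/arch constraints."""
    candidates = [
        (info["cpu_percent"], name)
        for name, info in NODES.items()
        if (requires_os is None or info["os"] == requires_os)
        and (requires_arch is None or info["arch"] == requires_arch)
    ]
    if not candidates:
        return None  # no node satisfies the constraints
    return min(candidates)[1]  # lowest CPU usage wins
```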
Offload Patterns
Commands matching these patterns are automatically offloaded:
Build: make, cargo, npm, yarn, pnpm
Test: pytest, jest, mocha, test
Compile: gcc, g++, clang
Container: docker, podman, kubectl
File ops: rsync, scp, tar, zip, find, grep -r
Commands that stay local:
Simple: ls, pwd, cd, echo, cat, head, tail, which, type
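A pattern-based offload decision consistent with the lists above might look like this sketch; the function name and exact prefix sets are illustrative, not the server's actual rules:

```python
# Command prefixes that suggest heavy work worth offloading (illustrative).
OFFLOAD_PREFIXES = (
    "make", "cargo", "npm", "yarn", "pnpm",   # build
    "pytest", "jest", "mocha",                # test
    "gcc", "g++", "clang",                    # compile
    "docker", "podman", "kubectl",            # containers
    "rsync", "scp", "tar", "zip", "find",     # file ops
)
# Lightweight commands that always run locally.
LOCAL_COMMANDS = {"ls", "pwd", "cd", "echo", "cat", "head", "tail", "which", "type"}

def should_offload(command: str) -> bool:
    """Decide whether a command is heavy enough to offload to the cluster."""
    parts = command.strip().split()
    first = parts[0] if parts else ""
    if first in LOCAL_COMMANDS:
        return False
    if first == "grep":
        return "-r" in parts  # "grep -r" is offloaded, plain grep stays local
    return first in OFFLOAD_PREFIXES
```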
Security
Shell Injection Prevention
All commands use subprocess.run() with list arguments where possible:
# SAFE: List arguments
subprocess.run(["ssh", "-o", "ConnectTimeout=5", f"{user}@{ip}", command])
# Complex shell commands are validated before execution

Command Validation
Commands are validated for dangerous patterns:
rm -rf /
rm -rf /*
> /dev/sda
Fork bombs
And more...
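A validator for such patterns could be sketched as follows; the regexes here are illustrative and certainly narrower than the server's real rules:

```python
import re

# Illustrative deny-list covering the examples above; not the server's actual rules.
DANGEROUS_PATTERNS = [
    re.compile(r"rm\s+-rf\s+/(\*|\s|$)"),  # rm -rf / and rm -rf /*
    re.compile(r">\s*/dev/sd[a-z]"),       # overwriting raw block devices
    re.compile(r":\(\)\s*\{.*:\|:"),       # classic bash fork bomb
]

def validate_command(command: str) -> bool:
    """Return True if the command passes the dangerous-pattern checks."""
    return not any(p.search(command) for p in DANGEROUS_PATTERNS)
```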
SSH Configuration
StrictHostKeyChecking=accept-new - Accept new hosts but verify returning hosts
BatchMode=yes - Non-interactive mode for scripting
Configurable timeouts and retries
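Put together, these SSH options map onto an argument list like the following sketch, consistent with the subprocess.run example in the Security section (the function name is hypothetical):

```python
def build_ssh_argv(user: str, host: str, command: str,
                   connect_timeout: int = 5) -> list[str]:
    """Build an ssh argument list using the options described above.

    Passing a list to subprocess.run avoids shell injection on the local
    side; the remote command string is still validated separately.
    """
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=accept-new",  # accept new hosts, verify returning ones
        "-o", "BatchMode=yes",                     # never prompt interactively
        "-o", f"ConnectTimeout={connect_timeout}",
        f"{user}@{host}",
        command,
    ]
```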
Development
Running Tests
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# With coverage
pytest tests/ --cov=cluster_execution_mcp --cov-report=html

Project Structure
cluster-execution-mcp/
├── src/cluster_execution_mcp/
│ ├── __init__.py # Package exports
│ ├── config.py # Configuration, validation, node definitions
│ ├── router.py # Task routing and IP resolution
│ └── server.py # FastMCP server and tools
├── tests/
│ ├── conftest.py # Pytest fixtures
│ ├── test_config.py # Config module tests (29 tests)
│ ├── test_router.py # Router module tests (21 tests)
│ └── test_server.py # Server and tool tests (21 tests)
└── pyproject.toml       # Package configuration

CLI Interface
# Submit a command
cluster-router submit "make -j8 all"
# Check task status
cluster-router status <task_id>
# Show cluster status
cluster-router cluster-status

Monitoring
Check cluster health before operations:
User: "Show me cluster status"
Claude Code: cluster_status tool
Output:
macpro51:
CPU: 45.2%
Memory: 18.3%
Load: 3.21
Status: healthy
mac-studio:
CPU: 22.1%
Memory: 54.7%
Load: 2.15
Status: healthy
macbook-air:
CPU: 12.8%
Memory: 38.2%
Load: 1.03
Status: healthy

Troubleshooting
MCP server not loading:
# Check config
cat ~/.claude.json | jq '.mcpServers["cluster-execution"]'
# Test server import
python3 -c "from cluster_execution_mcp.server import main; print('OK')"

Node unreachable:
# Test SSH connectivity
ssh marc@macpro51.local hostname
ssh marc@Marcs-Mac-Studio.local hostname
# Check with fallback IP
ssh marc@192.168.1.183 hostname

Commands timing out:
# Increase timeout via environment
export CLUSTER_CMD_TIMEOUT=600 # 10 minutes
export CLUSTER_SSH_TIMEOUT=10  # 10 seconds

Changelog
v0.2.0
New Features:
Proper package structure with pyproject.toml
Environment-based configuration (no hardcoded credentials)
Shared config module with validation functions
Retry logic for SSH connectivity
IP resolution caching with TTL
Inference node support
Security Improvements:
Eliminated shell injection vulnerabilities
Command validation for dangerous patterns
IP validation rejecting loopback/Docker/link-local
SSH host key handling (accept-new)
Code Quality:
Full type hints throughout codebase
Replaced bare except clauses with specific exceptions
Added comprehensive logging
71 unit tests with mocking
Bug Fixes:
Fixed darwin/macos OS alias handling
Proper timeout handling in SSH operations
Better error messages for failed operations
v0.1.0
Initial release with basic cluster execution
License
MIT
Part of the AGI Agentic System
See also:
Node Chat MCP - Inter-node communication
Enhanced Memory MCP - Persistent memory with RAG
Agent Runtime MCP - Goals and task queue