Skip to main content
Glama
marc-shade
by marc-shade

Cluster Execution MCP Server

Cluster-aware command execution for distributed task routing across the AGI agentic cluster.

Version: 0.2.0

Features

  • Automatic task routing: Commands routed to optimal nodes based on load, capabilities, and requirements

  • Multi-node support: macpro51 (Linux x86_64), mac-studio (macOS ARM64), macbook-air (macOS ARM64), inference node

  • Dynamic IP resolution: mDNS, DNS, and fallback methods with caching

  • Security hardened: No shell injection, environment-based configuration, command validation

  • SSH connectivity verification: Retry logic with configurable timeouts

  • Parallel execution: Distribute commands across cluster for maximum throughput

Installation

cd /mnt/agentic-system/mcp-servers/cluster-execution-mcp pip install -e . # For development: pip install -e ".[dev]"

Configuration

Claude Code Configuration

Add to ~/.claude.json:

{ "mcpServers": { "cluster-execution": { "command": "/mnt/agentic-system/.venv/bin/python3", "args": ["-m", "cluster_execution_mcp.server"] } } }

Environment Variables

All configuration is externalized via environment variables:

Variable

Default

Description

CLUSTER_SSH_USER

marc

SSH username for remote execution

CLUSTER_SSH_TIMEOUT

5

SSH connection timeout (seconds)

CLUSTER_SSH_CONNECT_TIMEOUT

2

Initial SSH connect timeout (seconds)

CLUSTER_SSH_RETRIES

2

Number of SSH retry attempts

CLUSTER_CPU_THRESHOLD

40

CPU usage % threshold for offloading

CLUSTER_LOAD_THRESHOLD

4

Load average threshold for offloading

CLUSTER_MEMORY_THRESHOLD

80

Memory usage % threshold for offloading

CLUSTER_CMD_TIMEOUT

300

Command execution timeout (seconds)

CLUSTER_STATUS_TIMEOUT

5

Status check timeout (seconds)

CLUSTER_IP_CACHE_TTL

300

IP resolution cache TTL (seconds)

CLUSTER_GATEWAY

192.168.1.1

Gateway IP for route detection

CLUSTER_DNS

8.8.8.8

DNS server for IP detection

AGENTIC_SYSTEM_PATH

/mnt/agentic-system

Base path for databases

Node Configuration

Node hostnames and IPs can be customized:

Variable

Default

Description

CLUSTER_MACPRO51_HOST

macpro51.local

Mac Pro hostname

CLUSTER_MACPRO51_IP

192.168.1.183

Mac Pro fallback IP

CLUSTER_MACSTUDIO_HOST

Marcs-Mac-Studio.local

Mac Studio hostname

CLUSTER_MACSTUDIO_IP

192.168.1.16

Mac Studio fallback IP

CLUSTER_MACBOOKAIR_HOST

Marcs-MacBook-Air.local

MacBook Air hostname

CLUSTER_MACBOOKAIR_IP

192.168.1.172

MacBook Air fallback IP

CLUSTER_INFERENCE_HOST

completeu-server.local

Inference node hostname

CLUSTER_INFERENCE_IP

192.168.1.186

Inference node fallback IP

MCP Tools

Tool

Description

cluster_bash

Execute bash commands with automatic cluster routing

cluster_status

Get current cluster state and load distribution

offload_to

Explicitly route command to specific node

parallel_execute

Run multiple commands in parallel across nodes

Usage Examples

Automatic Routing

# Heavy commands auto-route to least loaded node result = await cluster_bash("make -j8 all") # Simple commands run locally result = await cluster_bash("ls -la")

Force Specific Requirements

# Force Linux execution result = await cluster_bash("docker build .", requires_os="linux") # Force x86_64 architecture result = await cluster_bash("cargo build", requires_arch="x86_64")

Explicit Node Routing

# Run on Linux builder result = await offload_to("podman run -it ubuntu:22.04", node_id="macpro51") # Run on Mac Studio result = await offload_to("swift build", node_id="mac-studio")

Parallel Execution

# Run tests across cluster results = await parallel_execute([ "pytest tests/unit/", "pytest tests/integration/", "pytest tests/e2e/" ])

Cluster Status

# Get cluster health before heavy operations status = await cluster_status() # Returns: # { # "local_node": "macpro51", # "nodes": { # "macpro51": {"cpu_percent": 15.2, "memory_percent": 45.3, ...}, # "mac-studio": {"cpu_percent": 8.1, "memory_percent": 32.1, ...}, # ... # } # }

Cluster Nodes

Node

OS

Arch

Capabilities

Specialties

macpro51

Linux

x86_64

docker, podman, raid, nvme, compilation, testing, tpu

compilation, testing, containerization, benchmarking

mac-studio

macOS

ARM64

orchestration, coordination, temporal, mlx-gpu, arduino

orchestration, coordination, monitoring

macbook-air

macOS

ARM64

research, documentation, analysis

research, documentation, mobile

inference

macOS

ARM64

ollama, inference, model-serving, llm-api

ollama-inference, model-serving

Offload Patterns

Commands matching these patterns are automatically offloaded:

  • Build: make, cargo, npm, yarn, pnpm

  • Test: pytest, jest, mocha, test

  • Compile: gcc, g++, clang

  • Container: docker, podman, kubectl

  • File ops: rsync, scp, tar, zip, find, grep -r

Commands that stay local:

  • Simple: ls, pwd, cd, echo, cat, head, tail, which, type

Security

Shell Injection Prevention

All commands use subprocess.run() with list arguments where possible:

# SAFE: List arguments subprocess.run(["ssh", "-o", "ConnectTimeout=5", f"{user}@{ip}", command]) # Complex shell commands are validated before execution

Command Validation

Commands are validated for dangerous patterns:

  • rm -rf /

  • rm -rf /*

  • > /dev/sda

  • Fork bombs

  • And more...

SSH Configuration

  • StrictHostKeyChecking=accept-new - Accept new hosts but verify returning hosts

  • BatchMode=yes - Non-interactive mode for scripting

  • Configurable timeouts and retries

Development

Running Tests

# Install dev dependencies pip install -e ".[dev]" # Run tests pytest tests/ -v # With coverage pytest tests/ --cov=cluster_execution_mcp --cov-report=html

Project Structure

cluster-execution-mcp/ ├── src/cluster_execution_mcp/ │ ├── __init__.py # Package exports │ ├── config.py # Configuration, validation, node definitions │ ├── router.py # Task routing and IP resolution │ └── server.py # FastMCP server and tools ├── tests/ │ ├── conftest.py # Pytest fixtures │ ├── test_config.py # Config module tests (29 tests) │ ├── test_router.py # Router module tests (21 tests) │ └── test_server.py # Server and tool tests (21 tests) └── pyproject.toml # Package configuration

CLI Interface

# Submit a command cluster-router submit "make -j8 all" # Check task status cluster-router status <task_id> # Show cluster status cluster-router cluster-status

Monitoring

Check cluster health before operations:

User: "Show me cluster status" Claude Code: cluster_status tool Output: macpro51: CPU: 45.2% Memory: 18.3% Load: 3.21 Status: healthy mac-studio: CPU: 22.1% Memory: 54.7% Load: 2.15 Status: healthy macbook-air: CPU: 12.8% Memory: 38.2% Load: 1.03 Status: healthy

Troubleshooting

MCP server not loading:

# Check config cat ~/.claude.json | jq '.mcpServers["cluster-execution"]' # Test server import python3 -c "from cluster_execution_mcp.server import main; print('OK')"

Node unreachable:

# Test SSH connectivity ssh marc@macpro51.local hostname ssh marc@Marcs-Mac-Studio.local hostname # Check with fallback IP ssh marc@192.168.1.183 hostname

Commands timing out:

# Increase timeout via environment export CLUSTER_CMD_TIMEOUT=600 # 10 minutes export CLUSTER_SSH_TIMEOUT=10 # 10 seconds

Changelog

v0.2.0

  • New Features:

    • Proper package structure with pyproject.toml

    • Environment-based configuration (no hardcoded credentials)

    • Shared config module with validation functions

    • Retry logic for SSH connectivity

    • IP resolution caching with TTL

    • Inference node support

  • Security Improvements:

    • Eliminated shell injection vulnerabilities

    • Command validation for dangerous patterns

    • IP validation rejecting loopback/Docker/link-local

    • SSH host key handling (accept-new)

  • Code Quality:

    • Full type hints throughout codebase

    • Replaced bare except clauses with specific exceptions

    • Added comprehensive logging

    • 71 unit tests with mocking

  • Bug Fixes:

    • Fixed darwin/macos OS alias handling

    • Proper timeout handling in SSH operations

    • Better error messages for failed operations

v0.1.0

  • Initial release with basic cluster execution

License

MIT


Part of the AGI Agentic System

See also:

  • Node Chat MCP - Inter-node communication

  • Enhanced Memory MCP - Persistent memory with RAG

  • Agent Runtime MCP - Goals and task queue

-
security - not tested
F
license - not found
-
quality - not tested

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marc-shade/cluster-execution-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server