
InfraPilot 🛰️

Agentic AI for infrastructure operations. A multi-agent system that provisions, monitors, validates compliance and auto-remediates cloud, network and security infrastructure, built on Python, MCP, CrewAI, Terraform and Ansible.


InfraPilot closes the full ops loop end-to-end: provision → configure → observe → audit → remediate → re-audit, coordinated by a crew of specialised AI agents. It runs out of the box with no cloud account, no API key and no Terraform/Ansible binaries required (it transparently simulates execution when a binary is absent).
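
The fallback to simulation can be pictured roughly as follows. This is a minimal sketch, not InfraPilot's actual code: the function name `run_or_simulate` and the shape of the returned dict are assumptions made for illustration. It only relies on the standard library's `shutil.which` to detect whether a binary is on PATH.

```python
import shutil
import subprocess


def run_or_simulate(binary: str, args: list[str]) -> dict:
    """Run `binary` with `args` if it is on PATH; otherwise report a simulated run.

    Hypothetical helper: the name and the result shape are illustrative only.
    """
    path = shutil.which(binary)
    if path is None:
        # Binary is absent: execute nothing, just record what would have run.
        return {"binary": binary, "args": args, "simulated": True, "returncode": 0}
    # Binary is present: execute it for real and capture its exit code.
    proc = subprocess.run([path, *args], capture_output=True, text=True, check=False)
    return {"binary": binary, "args": args, "simulated": False, "returncode": proc.returncode}
```

Calling `run_or_simulate("terraform", ["version"])` on a machine without Terraform would return a result with `simulated` set to `True` instead of raising.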


Why it exists

Most "AI for DevOps" demos stop at a chatbot that writes a Terraform snippet. InfraPilot models the operational loop an automation engineer actually owns: turning declarative intent into running infrastructure, watching it, proving it meets security/governance policy, and fixing drift automatically through code, with every action typed, reported and auditable.


Architecture

flowchart LR
    subgraph Crew["Agent crew"]
        P[Provisioner] --> C[Configurator] --> O[Observer] --> A[Compliance Auditor] --> R[Remediator]
    end
    R -- re-audit --> A

    subgraph Tools["Shared tools"]
        TF[Terraform tool]
        AN[Ansible tool]
        MON[Monitoring tool]
        POL[Policy-as-code engine]
        REM[Remediation strategies]
    end

    P --> TF
    C --> AN
    O --> MON
    A --> POL
    R --> REM

    Engines["Engines: native · CrewAI"] --- Crew
    MCP["MCP server"] --- Tools
    LLM["Anthropic Claude (optional)"] -.reasoning.- Crew

  • Two interchangeable engines. native (zero heavy deps, drives the loop deterministically, used in CI) and crewai (maps the same crew onto a CrewAI Crew with an LLM). Swap with --engine.

  • Tools are the source of truth. Terraform, Ansible, monitoring, policy and remediation logic live in infrapilot/tools/ and are shared by every engine and the MCP server, so there is one implementation and three ways to drive it.

  • MCP-native. infrapilot/mcp_server/ exposes the tools over the Model Context Protocol, so Claude Desktop / Claude Code / any MCP client can run infra operations through natural language.

  • LLM optional. With ANTHROPIC_API_KEY set, agents use Claude to triage anomalies and justify remediations. Without it, everything still runs.
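
The "LLM optional" behaviour could look something like the sketch below. It assumes a hypothetical `justify_remediation` helper and an injected `ask_llm` callable standing in for a Claude call; neither name comes from the InfraPilot codebase. The only detail taken from the text above is that the presence of `ANTHROPIC_API_KEY` decides whether the LLM path is used.

```python
import os
from typing import Callable, Optional


def justify_remediation(
    violation_id: str,
    fix: str,
    ask_llm: Optional[Callable[[str], str]] = None,  # hypothetical stand-in for a Claude call
) -> str:
    """Return a justification for a fix, using an LLM only when one is configured."""
    has_key = bool(os.environ.get("ANTHROPIC_API_KEY"))
    if has_key and ask_llm is not None:
        # LLM path: delegate the explanation to the injected callable.
        return ask_llm(f"Explain why '{fix}' resolves violation {violation_id}.")
    # Deterministic path: no key (or no client), so fall back to a fixed message.
    return f"{violation_id}: applying '{fix}' (rule-based, no LLM)"
```

Injecting the callable keeps the deterministic path free of any SDK import, which is one plausible way to keep a zero-dependency engine usable in CI.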

Quickstart

git clone https://github.com/Gsfrota/infra-pilot && cd infra-pilot
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

infrapilot demo          # fully simulated end-to-end run, no creds needed

Example output (abridged):

╭──────────────────────── InfraPilot run ────────────────────────╮
│ engine=native  llm=off  compliance score=100.0/100             │
╰────────────────────────────────────────────────────────────────╯
 provision   ok     4 resources provisioned (simulated)
 configure   ok     configuration applied (simulated)
 observe     warn   3 anomalies detected
 audit       error  3 violations, score 43.8
 remediate   ok     3 fixes applied, score 43.8 -> 100.0

Commands

| Command | What it does |
| --- | --- |
| infrapilot demo | Self-contained simulated run (no cloud/API key/binaries). |
| infrapilot run | Full loop; uses real terraform/ansible if installed. |
| infrapilot run --no-remediate | Audit + propose fixes without applying. |
| infrapilot run --engine crewai | Drive the crew with CrewAI + Claude. |
| infrapilot audit | Compliance gate: exits non-zero on any violation (great in CI). |
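
The compliance-gate behaviour of `infrapilot audit` comes down to mapping "any violation" to a non-zero exit status. The sketch below illustrates that idea only; `audit_gate` and the violation dict keys (`id`, `severity`, `resource`) are hypothetical and not taken from the project.

```python
import sys


def audit_gate(violations: list[dict]) -> int:
    """Return a process exit code: 0 when compliant, 1 on any violation.

    Hypothetical helper; the violation keys used here are assumptions.
    """
    for v in violations:
        # Report each violation on stderr so CI logs show what failed.
        print(f"[{v['severity']}] {v['id']}: {v['resource']}", file=sys.stderr)
    return 1 if violations else 0
```

A CLI entry point would then finish with `sys.exit(audit_gate(violations))`, which is what makes a CI job fail when the audit finds anything.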

Use it from Claude (MCP)

pip install -e ".[mcp]"
infrapilot-mcp            # serves the tools over MCP (stdio)

// claude_desktop_config.json
{
  "mcpServers": {
    "infrapilot": { "command": "infrapilot-mcp" }
  }
}

Then ask Claude: "Provision the infra, audit it for security issues, and remediate anything critical."

How the loop works

  1. Provision: TerraformTool applies infra/desired_state.yaml (real terraform apply against the local/null/random providers when the binary is present; simulated otherwise).

  2. Configure: AnsibleTool converges host configuration via a playbook.

  3. Observe: MonitoringTool ingests a Prometheus-style telemetry snapshot and triages anomalies against thresholds.

  4. Audit: the policy-as-code engine evaluates every resource against policies/policies.yaml; new governance rules are added in YAML, not code.

  5. Remediate: RemediationTool maps each violation to a least-privilege fix and applies it through the right IaC backend (Terraform or Ansible).

  6. Re-audit: the loop re-scores compliance to prove the drift is closed.
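
The audit → remediate → re-audit tail of the loop above can be sketched as follows. This is an illustrative outline under stated assumptions, not the native engine's implementation: `StepResult`, `audit_remediate_reaudit` and the callable signatures are all hypothetical, and the `apply_fixes` flag merely mirrors the idea behind `--no-remediate`.

```python
from dataclasses import dataclass
from typing import Callable

Resource = dict   # assumed shape: a plain dict describing one resource
Violation = dict  # assumed shape: a plain dict describing one finding


@dataclass
class StepResult:
    step: str
    status: str  # "ok" or "error"
    detail: str


def audit_remediate_reaudit(
    resources: list[Resource],
    audit: Callable[[list[Resource]], list[Violation]],
    remediate: Callable[[Violation, list[Resource]], None],
    apply_fixes: bool = True,
) -> list[StepResult]:
    """Audit, optionally fix every violation, then audit again to confirm."""
    results: list[StepResult] = []
    violations = audit(resources)
    results.append(
        StepResult("audit", "error" if violations else "ok", f"{len(violations)} violations")
    )
    if violations and apply_fixes:
        for violation in violations:
            remediate(violation, resources)  # mutates resources in place
        results.append(StepResult("remediate", "ok", f"{len(violations)} fixes applied"))
        # Re-run the same audit to check that the fixes actually closed the drift.
        remaining = audit(resources)
        results.append(
            StepResult("re-audit", "error" if remaining else "ok", f"{len(remaining)} violations")
        )
    return results
```

Re-running the same `audit` callable after remediation, rather than trusting the remediation's own return value, is what lets the final step act as evidence that the violations are gone.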

Policy-as-code

- id: SEC-001
  name: "No SSH open to the internet"
  severity: critical
  resource_type: security_group
  rule: no_ingress_cidr
  params: { port: 22, forbidden_cidr: "0.0.0.0/0" }
  remediation: restrict_sg_ingress

Built-in rules: required_tag, no_ingress_cidr, attribute_equals, attribute_max. Built-in remediations: add_tag, restrict_sg_ingress, enable_encryption, restart_service.
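
To make the policy format concrete, here is a rough sketch of how a rule such as no_ingress_cidr might be evaluated against resources. The policy keys (`rule`, `resource_type`, `params`) come from the YAML above; everything else is an assumption for illustration, including the resource shape (`type`, `name`, `ingress` entries with `port` and `cidrs`) and the `RULES` and `evaluate` names.

```python
def no_ingress_cidr(resource: dict, params: dict) -> bool:
    """Return True when no ingress rule exposes params['port'] to the forbidden CIDR."""
    for ingress in resource.get("ingress", []):
        same_port = ingress.get("port") == params["port"]
        exposed = params["forbidden_cidr"] in ingress.get("cidrs", [])
        if same_port and exposed:
            return False
    return True


# Hypothetical registry mapping the YAML `rule` name to a checker function.
RULES = {"no_ingress_cidr": no_ingress_cidr}


def evaluate(policy: dict, resources: list[dict]) -> list[str]:
    """Return the names of resources of the policy's type that violate it."""
    check = RULES[policy["rule"]]
    params = policy.get("params", {})
    return [
        r["name"]
        for r in resources
        if r.get("type") == policy["resource_type"] and not check(r, params)
    ]
```

With a name-to-function registry like this, a new governance rule that reuses an existing checker only needs a new YAML entry, which fits the "added in YAML, not code" claim above.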

Project layout

infrapilot/
├── agents/        # role/goal/backstory crew (engine-agnostic)
├── engines/       # native + crewai orchestrators
├── tools/         # terraform · ansible · monitoring · compliance · remediation
├── mcp_server/    # MCP server exposing the tools
├── llm.py         # optional Anthropic reasoning layer
├── reporting.py   # rich console + JSON/Markdown artifacts
└── cli.py         # typer CLI
infra/             # terraform/, ansible/, observability/, desired_state.yaml
policies/          # policy-as-code
tests/             # pytest suite (engine, compliance, monitoring, remediation)

Development

pip install -e ".[dev]"
ruff check .          # lint
pytest                # tests
infrapilot demo       # smoke test the full loop

CI (GitHub Actions) runs ruff + pytest on Python 3.10/3.11/3.12 and additionally installs real Terraform and Ansible to validate/lint the IaC.

Roadmap

  • Real cloud providers behind a feature flag (AWS/GCP modules)

  • LangChain tool adapter alongside CrewAI

  • Drift detection on a schedule (cron / GitHub Actions)

  • OPA/Rego policy backend option

License

MIT; see LICENSE.


Built by Guilherme Frota Souza, automation & AI engineer.
