MCP Server - DevOps Automation Platform

A comprehensive Model Context Protocol (MCP) server for DevOps automation with real Docker, Kubernetes, and AWS integrations, API authentication, Prometheus/Grafana monitoring, and LLM-powered natural language automation.

Features

  • Real DevOps Integrations

    • Docker container and image management

    • Kubernetes cluster operations (deployments, pods, services)

    • AWS EC2, S3, ECS, Lambda, RDS management

  • Security

    • API key authentication

    • JWT token support

    • Rate limiting

  • Monitoring

    • Prometheus metrics

    • Grafana dashboards

    • Health checks

    • Alertmanager integration

  • LLM Integration

    • Ollama (local, free) - default

    • OpenAI API support

    • Natural language task automation

Quick Start

Option 1: Local Development

# Install dependencies
pip install -r requirements.txt

# Run the server
make dev
# or: uvicorn src.main_v2:app --reload --host 0.0.0.0 --port 8000

# Access the API docs (open is macOS-only; on Linux use xdg-open, or visit the URL in a browser)
open http://localhost:8000/docs

Option 2: Docker Compose (with Monitoring)

# Copy environment template
cp .env.example .env
# Edit .env with your credentials

# Start all services
make up
# or: docker-compose up -d

# Access services:
# - MCP Server: http://localhost:8000
# - Prometheus: http://localhost:9090
# - Grafana: http://localhost:3000 (admin/admin)
# - Alertmanager: http://localhost:9093

API Endpoints

Public Endpoints

| Endpoint | Method | Description |
|---|---|---|
| / | GET | Server info |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |
| /docs | GET | API documentation |

Authentication

| Endpoint | Method | Description |
|---|---|---|
| /auth/api-key | POST | Generate API key |
| /auth/token | POST | Generate JWT token |

Docker Operations

| Endpoint | Method | Description |
|---|---|---|
| /docker/containers | GET | List containers |
| /docker/containers/{action} | POST | Start/stop/restart container |
| /docker/containers/{id}/logs | GET | Get container logs |
| /docker/containers/{id}/stats | GET | Get container stats |
| /docker/images | GET | List images |
| /docker/images/pull | POST | Pull an image |
| /docker/system/info | GET | Docker system info |
| /docker/system/prune | POST | Cleanup unused resources |

Kubernetes Operations

| Endpoint | Method | Description |
|---|---|---|
| /k8s/cluster/info | GET | Cluster info |
| /k8s/nodes | GET | List nodes |
| /k8s/namespaces | GET | List namespaces |
| /k8s/pods | GET | List pods |
| /k8s/pods/{pod}/logs | GET | Get pod logs |
| /k8s/deployments | GET | List deployments |
| /k8s/deployments/scale | POST | Scale deployment |
| /k8s/deployments/restart | POST | Restart deployment |
| /k8s/deployments/rollback | POST | Rollback deployment |
| /k8s/deployments/image | POST | Update image |
| /k8s/services | GET | List services |

AWS Operations

| Endpoint | Method | Description |
|---|---|---|
| /aws/ec2/instances | GET | List EC2 instances |
| /aws/ec2/{action} | POST | Start/stop/reboot EC2 |
| /aws/s3/buckets | GET | List S3 buckets |
| /aws/s3/buckets/{bucket}/objects | GET | List objects |
| /aws/ecs/clusters | GET | List ECS clusters |
| /aws/lambda/functions | GET | List Lambda functions |
| /aws/lambda/invoke | POST | Invoke Lambda |
| /aws/rds/instances | GET | List RDS instances |

Task Automation

| Endpoint | Method | Description |
|---|---|---|
| /tasks/run | POST | Run automation task (orchestrated/queue-friendly path, including tailor_resume) |
| /resume/tailor | POST | Tailor a base resume to a specific mission/job opening (direct synchronous path) |
| /resume/tailor/upload | POST | Tailor from uploaded PDF/CSV/TXT/MD and export as JSON/PDF/CSV/Markdown/DOCX |
| /resume/exports/cleanup | POST | Delete generated resume export file from temporary storage |
| /deploy | POST | Deploy service |
| /cicd/trigger | POST | Trigger CI/CD pipeline |
| /logs | GET | Get logs |
| /alerts | POST | Send alert |
| /alerts/history | GET | Alert history |

Resume Tailoring Example

Use this endpoint to adapt a base resume to a specific mission or job opening.

Request Body

{
  "base_resume": "# John Doe\n## Experience\n- Built CI/CD pipelines with GitHub Actions and Jenkins\n- Managed Kubernetes workloads on AWS EKS\n- Improved API latency by 35% in Python services",
  "job_description": "We need a DevOps engineer with Kubernetes, AWS, CI/CD, Python, observability, and production reliability experience.",
  "target_role": "Senior DevOps Engineer"
}

cURL (ready to run)

curl -X POST "http://localhost:8000/resume/tailor" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "base_resume": "# John Doe\n## Experience\n- Built CI/CD pipelines with GitHub Actions and Jenkins\n- Managed Kubernetes workloads on AWS EKS\n- Improved API latency by 35% in Python services",
    "job_description": "We need a DevOps engineer with Kubernetes, AWS, CI/CD, Python, observability, and production reliability experience.",
    "target_role": "Senior DevOps Engineer"
  }'

cURL via Task Runner (ready to run)

curl -X POST "http://localhost:8000/tasks/run" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "task_name": "tailor_resume",
    "parameters": {
      "base_resume": "# John Doe\n## Experience\n- Built CI/CD pipelines with GitHub Actions and Jenkins\n- Managed Kubernetes workloads on AWS EKS\n- Improved API latency by 35% in Python services",
      "job_description": "We need a DevOps engineer with Kubernetes, AWS, CI/CD, Python, observability, and production reliability experience.",
      "target_role": "Senior DevOps Engineer"
    }
  }'

Task runner with file inputs (PDF/CSV/TXT/MD paths on server):

curl -X POST "http://localhost:8000/tasks/run" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "task_name": "tailor_resume",
    "parameters": {
      "base_resume_path": "/data/incoming/my-resume.pdf",
      "job_description_path": "/data/incoming/job-offer.csv",
      "target_role": "Senior Platform Engineer",
      "output_format": "docx",
      "docx_style": "resumeio_inspired",
      "use_v6_background": true
    }
  }'

When output_format is pdf, csv, markdown, or docx, /tasks/run returns an output_file path in the task result.

Supported docx_style values: minimal_executive, modern_creative, online_clean, visualcv_inspired, resumeio_inspired, resumebuilder_inspired.

The optional use_v6_background toggle (default false) applies the V6 soft-blue background and left accent band to any DOCX style.
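The option values above can be checked on the client before the request is sent. The following is a minimal Python sketch, not part of the server: the helper name tailor_task_body is hypothetical, and treating "json" as a valid output_format for /tasks/run (and as the default) is an assumption carried over from the upload endpoint.

```python
from typing import Any, Dict, Optional

# Values listed in this README; whether /tasks/run also accepts "json" is an assumption.
OUTPUT_FORMATS = {"json", "pdf", "csv", "markdown", "docx"}
DOCX_STYLES = {
    "minimal_executive", "modern_creative", "online_clean",
    "visualcv_inspired", "resumeio_inspired", "resumebuilder_inspired",
}


def tailor_task_body(
    base_resume_path: str,
    job_description_path: str,
    target_role: str,
    output_format: str = "json",
    docx_style: Optional[str] = None,
    use_v6_background: bool = False,
) -> Dict[str, Any]:
    """Hypothetical client-side helper: validate options, then build the /tasks/run body."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if docx_style is not None and docx_style not in DOCX_STYLES:
        raise ValueError(f"unsupported docx_style: {docx_style}")
    parameters: Dict[str, Any] = {
        "base_resume_path": base_resume_path,
        "job_description_path": job_description_path,
        "target_role": target_role,
        "output_format": output_format,
    }
    if output_format == "docx":
        # Styling options are only sent for DOCX output
        if docx_style is not None:
            parameters["docx_style"] = docx_style
        parameters["use_v6_background"] = use_v6_background
    return {"task_name": "tailor_resume", "parameters": parameters}
```

Rejecting an unknown style locally gives a clearer error than waiting for the task to fail on the server.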

Example Response

{
  "success": true,
  "tailored_resume_markdown": "## Tailored Summary\nEngineer aligned to Senior DevOps Engineer, with experience evidenced in kubernetes, aws, ci, cd, python.\n\n## Highlighted Experience\n- Managed Kubernetes workloads on AWS EKS\n- Built CI/CD pipelines with GitHub Actions and Jenkins\n- Improved API latency by 35% in Python services",
  "keyword_alignment": {
    "matched": ["kubernetes", "aws", "python", "ci", "cd"],
    "partial": ["observability"],
    "missing": ["reliability"]
  },
  "gap_report": [
    "Missing or weak evidence for: reliability"
  ]
}
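A client can summarize the keyword_alignment block of a response like the one above. This is a minimal Python sketch; the keyword_coverage function and the coverage ratio it computes are illustrative assumptions, not something the server returns.

```python
import json
from typing import Dict, List

# Trimmed copy of the example response above
response_text = """
{
  "success": true,
  "keyword_alignment": {
    "matched": ["kubernetes", "aws", "python", "ci", "cd"],
    "partial": ["observability"],
    "missing": ["reliability"]
  },
  "gap_report": ["Missing or weak evidence for: reliability"]
}
"""


def keyword_coverage(alignment: Dict[str, List[str]]) -> float:
    """Fraction of job keywords that were fully matched (hypothetical metric)."""
    matched = len(alignment.get("matched", []))
    total = matched + len(alignment.get("partial", [])) + len(alignment.get("missing", []))
    return matched / total if total else 0.0


result = json.loads(response_text)
if result["success"]:
    print(f"coverage: {keyword_coverage(result['keyword_alignment']):.0%}")
    for gap in result["gap_report"]:
        print("gap:", gap)
```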

Which path should I use?

  • Use /resume/tailor for a direct, synchronous API call when you only need resume tailoring.

  • Use /tasks/run with task_name: tailor_resume when you want to route through the task orchestration flow (queueing, unified task handling, and automation pipelines).
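The difference between the two paths is only in how the request is wrapped. A minimal Python sketch, assuming the request shapes shown in the cURL examples above; the helper name build_tailor_request is hypothetical:

```python
import json
from typing import Any, Dict, Tuple


def build_tailor_request(params: Dict[str, Any], use_task_runner: bool = False) -> Tuple[str, str]:
    """Return (path, JSON body) for the direct or the task-runner route."""
    if use_task_runner:
        # Task runner: parameters are nested under a named task
        body = {"task_name": "tailor_resume", "parameters": params}
        return "/tasks/run", json.dumps(body)
    # Direct path: parameters are the request body itself
    return "/resume/tailor", json.dumps(params)


params = {
    "base_resume": "# John Doe\n## Experience\n- Managed Kubernetes workloads on AWS EKS",
    "job_description": "We need a DevOps engineer with Kubernetes and AWS experience.",
    "target_role": "Senior DevOps Engineer",
}
print(build_tailor_request(params)[0])
print(build_tailor_request(params, use_task_runner=True)[0])
```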

File Upload + File Export

The /resume/tailor/upload endpoint accepts PDF, CSV, TXT, or MD files and can return output as json, pdf, csv, markdown, or docx.

curl -X POST "http://localhost:8000/resume/tailor/upload" \
  -H "X-API-Key: your-api-key" \
  -F "base_resume_file=@./my-resume.pdf" \
  -F "job_description_file=@./job-offer.csv" \
  -F "target_role=Senior Platform Engineer" \
  -F "output_format=docx" \
  -F "docx_style=resumebuilder_inspired" \
  -F "use_v6_background=true" \
  --output tailored_resume.docx

CSV export example:

curl -X POST "http://localhost:8000/resume/tailor/upload" \
  -H "X-API-Key: your-api-key" \
  -F "base_resume_file=@./my-resume.pdf" \
  -F "job_description_file=@./job-offer.txt" \
  -F "output_format=csv" \
  --output tailored_resume.csv

Clean up a generated export file, using the output_file path returned by the task or upload flows:

curl -X POST "http://localhost:8000/resume/exports/cleanup" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "output_file": "/tmp/tmpabcd1234.pdf"
  }'

Automatic TTL cleanup is enabled for generated exports:

  • Job name: resume_export_ttl_cleanup

  • Default schedule: every hour (0 * * * *)

  • Default TTL: 24 hours

  • Only generated files with prefix resume_tailor_ and extensions .pdf, .csv, .md are deleted

Configuration via environment variables:

MCP_RESUME_EXPORT_TTL_HOURS=24
MCP_RESUME_EXPORT_CLEANUP_CRON="0 * * * *"
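The selection rule described above (prefix, extension, and age) can be sketched as follows. This is an illustration under stated assumptions, not the server's actual cleanup job: the function name expired_exports is hypothetical, and using file modification time to measure age is an assumption.

```python
import os
import time
from pathlib import Path
from typing import List, Optional

PREFIX = "resume_tailor_"
EXTENSIONS = {".pdf", ".csv", ".md"}


def expired_exports(directory: Path, ttl_hours: float, now: Optional[float] = None) -> List[Path]:
    """List generated export files older than the TTL (age by mtime is an assumption)."""
    now = time.time() if now is None else now
    cutoff = now - ttl_hours * 3600
    return sorted(
        p for p in directory.iterdir()
        if p.is_file()
        and p.name.startswith(PREFIX)
        and p.suffix in EXTENSIONS
        and p.stat().st_mtime < cutoff
    )


# Read the TTL the same way the environment variable above is documented (default 24)
ttl_hours = float(os.environ.get("MCP_RESUME_EXPORT_TTL_HOURS", "24"))
```

Listing candidates separately from deleting them makes a dry run easy: print the result first, then call unlink() on each path once the selection looks right.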

Authentication

Using API Key

# Generate an API key
curl -X POST "http://localhost:8000/auth/api-key?name=mykey"

# Use the key in requests
curl -H "X-API-Key: your-api-key" http://localhost:8000/docker/containers

Using JWT Token

# Get a token
curl -X POST "http://localhost:8000/auth/token?user_id=admin"

# Use the token
curl -H "Authorization: Bearer your-token" http://localhost:8000/docker/containers
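The two header styles above can be wrapped in one small helper. A minimal Python sketch; the function name auth_headers and the choice to prefer the JWT when both credentials are given are assumptions, while the header names come from the cURL examples.

```python
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None, token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for either auth style (hypothetical helper)."""
    if token:
        # JWT goes in the standard Authorization header
        return {"Authorization": f"Bearer {token}"}
    if api_key:
        # API key goes in the custom X-API-Key header
        return {"X-API-Key": api_key}
    raise ValueError("provide either api_key or token")


print(auth_headers(api_key="your-api-key"))
print(auth_headers(token="your-token"))
```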

LLM Integration (Natural Language)

Ollama (Default - Local & Free)

# Start Ollama
ollama serve
ollama pull mistral

# Run LLM integration
python src/llm_integration.py

OpenAI

export OPENAI_API_KEY="sk-..."
python src/llm_integration.py --openai

Example Commands

👤 You: Check the health of our infrastructure
🤖 Assistant: The infrastructure is healthy. All systems operational.

👤 You: List all running Docker containers
🤖 Assistant: Found 3 running containers: nginx, redis, postgres

👤 You: Scale the web deployment to 5 replicas
🤖 Assistant: Scaled deployment "web" to 5 replicas successfully.

👤 You: Show me the last 50 lines of logs from the api pod
🤖 Assistant: Here are the recent logs...

Environment Variables

# Authentication
MCP_API_KEY=your-api-key
MCP_JWT_SECRET=your-jwt-secret

# AWS
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your-key
AWS_SECRET_ACCESS_KEY=your-secret

# OpenAI (optional)
OPENAI_API_KEY=sk-...

# Grafana
GRAFANA_USER=admin
GRAFANA_PASSWORD=admin

Project Structure

mcp/
├── src/
│   ├── main_v2.py           # Main server with all features
│   ├── main.py              # Basic server (legacy)
│   ├── auth.py              # Authentication module
│   ├── monitoring.py        # Prometheus metrics & health checks
│   ├── llm_integration.py   # Ollama/OpenAI integration
│   └── devops/
│       ├── docker_ops.py    # Docker operations
│       ├── kubernetes_ops.py # Kubernetes operations
│       └── aws_ops.py       # AWS operations
├── monitoring/
│   ├── prometheus.yml       # Prometheus config
│   ├── alertmanager.yml     # Alertmanager config
│   └── grafana/
│       ├── provisioning/    # Grafana provisioning
│       └── dashboards/      # Grafana dashboards
├── docs/
│   ├── llm_integration.md   # LLM guide
│   └── devops_endpoints.md  # Endpoint reference
├── .github/workflows/
│   └── ci.yml               # CI/CD pipeline
├── docker-compose.yml       # Full stack deployment
├── Dockerfile               # Container build
├── Makefile                 # Convenience commands
├── requirements.txt         # Python dependencies
└── .env.example             # Environment template

Make Commands

make help         # Show all commands
make install      # Install dependencies
make dev          # Run development server
make run          # Run production server
make up           # Start all services (docker-compose)
make down         # Stop all services
make logs         # View logs
make test         # Run health check
make api-key      # Generate API key
make llm          # Run LLM integration

  • API Docs: http://localhost:8000/docs

  • Health: http://localhost:8000/health

  • Metrics: http://localhost:8000/metrics

  • Prometheus: http://localhost:9090 (when using docker-compose)

  • Grafana: http://localhost:3000 (admin/admin)

With the server running at http://localhost:8000, generate an API key and start the LLM integration in one command:

export MCP_API_KEY=$(curl -s -X POST "http://localhost:8000/auth/api-key?name=llm" | python -c "import sys,json; print(json.load(sys.stdin)['api_key'])") && python src/llm_integration.py

Stopping the Server

# If running in foreground (started without &)
# Press Ctrl+C

# If running in background (started with &)
pkill -f "uvicorn.*8000"

# Or find the PID and kill it
lsof -i :8000
kill <PID>

# If using Docker Compose
make down
# or: docker-compose down

Troubleshooting

Common Issues and Solutions

1. Server Not Responding / Hanging

Symptoms:

  • curl http://localhost:8000/health hangs indefinitely

  • API requests don't return

  • Server appears to be running but unresponsive

Solution:

# Kill any stuck uvicorn processes
pkill -f "uvicorn.*8000"

# Restart the server
source /path/to/venv/bin/activate
uvicorn src.main_v2:app --host 0.0.0.0 --port 8000

2. AttributeError: 'ScheduledJobRepository' object has no attribute 'list_all'

Symptoms:

  • Server fails to start

  • Error in startup event when loading scheduled jobs

Solution: Ensure src/database.py has the list_all() method in ScheduledJobRepository:

from typing import Dict, List

class ScheduledJobRepository:
    # ... other methods ...

    def list_all(self) -> List[Dict]:
        """List all scheduled jobs (alias for list method)."""
        return self.list()

3. RuntimeWarning: coroutine 'Scheduler.start' was never awaited

Symptoms:

  • Warning on server startup

  • Scheduler not running properly

  • Scheduled jobs not executing

Solution: In src/main_v2.py, ensure async functions are awaited in startup/shutdown:

@app.on_event("startup")
async def startup_event():
    await global_scheduler.start()  # Must use await
    # ... rest of startup

@app.on_event("shutdown")
async def shutdown_event():
    await global_scheduler.stop()   # Must use await
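Why the missing await matters can be shown in isolation. In this minimal Python sketch the Scheduler class is a stand-in, not the project's real scheduler; it only demonstrates that calling an async method without await creates a coroutine object and runs none of its body.

```python
import asyncio


class Scheduler:
    """Stand-in for the real scheduler; only the async start() matters here."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True


async def demo() -> tuple:
    scheduler = Scheduler()
    pending = scheduler.start()   # no await: creates a coroutine object, runs nothing
    before = scheduler.running    # still False at this point
    pending.close()               # discard it explicitly so no RuntimeWarning is raised
    await scheduler.start()       # awaited: the body actually runs
    return before, scheduler.running


print(asyncio.run(demo()))  # prints (False, True)
```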

4. LLM Integration Timeout / Long Response Times

Symptoms:

  • LLM chat hangs or times out

  • Very slow responses from Ollama

  • httpx.ReadTimeout errors

Solution:

  1. Ollama Performance: Ollama on CPU can be slow. Consider:

    # Use a smaller model
    ollama pull phi
    # Or use GPU if available

  2. Timeout Configuration: The timeout is set to 300s in llm_integration.py. Adjust if needed:

    timeout=httpx.Timeout(300.0)  # Increase for slower systems

  3. Log Truncation: Large log outputs are automatically truncated to prevent token overflow.
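The README does not say how the truncation in step 3 is implemented. One plausible approach, sketched here in Python as an assumption, keeps the tail of the log (the most recent lines) and marks the cut; the function name truncate_log, the 4000-character default, and the marker text are all hypothetical.

```python
def truncate_log(text: str, max_chars: int = 4000) -> str:
    """Keep roughly the last max_chars characters of a log, starting on a line boundary."""
    if len(text) <= max_chars:
        return text
    tail = text[-max_chars:]
    # Drop the partial first line so the output starts with a complete line
    newline = tail.find("\n")
    if newline != -1:
        tail = tail[newline + 1:]
    return "[... truncated ...]\n" + tail


log = "\n".join(f"line {i}" for i in range(1000))
print(truncate_log(log, max_chars=60))
```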

5. Port Already in Use

Symptoms:

  • Address already in use error on startup

  • Can't bind to port 8000

Solution:

# Find process using port 8000
lsof -i :8000

# Kill it
kill -9 <PID>
# or
pkill -f "uvicorn.*8000"
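Where lsof is not available, the same check can be made from Python with the standard socket module. A minimal sketch; the function name port_in_use is hypothetical, and it only detects listeners that accept TCP connections on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if port_in_use(8000):
    print("Port 8000 is taken; stop the other process or choose another port.")
else:
    print("Port 8000 is free.")
```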

6. Database Errors

Symptoms:

  • sqlite3.OperationalError: database is locked

  • Data not persisting

Solution:

# Check if multiple processes are accessing the DB
lsof mcp_data.db

# If locked, stop all server instances first
pkill -f "uvicorn"

# Then restart
uvicorn src.main_v2:app --host 0.0.0.0 --port 8000

7. Missing MCP_API_KEY Environment Variable

Symptoms:

  • LLM integration can't connect to MCP server

  • 401 Unauthorized errors

Solution:

# Generate and export API key in one command
export MCP_API_KEY=$(curl -s -X POST "http://localhost:8000/auth/api-key?name=llm" | python -c "import sys,json; print(json.load(sys.stdin)['api_key'])")

# Verify it's set
echo $MCP_API_KEY

# Then run LLM integration
python src/llm_integration.py

Health Check Commands

# Check if server is running
curl http://localhost:8000/health

# Check server info
curl http://localhost:8000/

# Check metrics
curl http://localhost:8000/metrics

# Test authentication
API_KEY=$(curl -s -X POST "http://localhost:8000/auth/api-key?name=test" | python -c "import sys,json; print(json.load(sys.stdin)['api_key'])")
curl -H "X-API-Key: $API_KEY" http://localhost:8000/docker/containers
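In scripts it is often useful to wait until the server answers before sending other requests. A minimal Python sketch using only the standard library; the function name wait_for_health and its retry parameters are assumptions, while the /health URL comes from the commands above.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(
    url: str = "http://localhost:8000/health",
    attempts: int = 5,
    delay: float = 1.0,
) -> bool:
    """Poll the health endpoint until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet, or the request timed out; retry
        if attempt < attempts - 1:
            time.sleep(delay)
    return False

# Usage (not executed here): wait_for_health() returns True once the server is up
```

The per-request timeout also guards against the hanging case described in issue 1, where a plain curl call would block indefinitely.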

Logs Location

  • Server logs: Console output from uvicorn

  • Audit logs: SQLite database (mcp_data.db → audit_logs table)

  • Docker logs: docker-compose logs (when using compose)
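The audit log can be inspected directly with Python's sqlite3 module. This is a minimal sketch: the README does not document the audit_logs schema, so the query selects all columns and orders by SQLite's implicit rowid, which assumes an ordinary rowid table. The function name recent_audit_rows is hypothetical.

```python
import sqlite3
from typing import List, Tuple


def recent_audit_rows(db_path: str = "mcp_data.db", limit: int = 20) -> List[Tuple]:
    """Return the newest rows from audit_logs, newest first."""
    # timeout: wait up to 5 seconds if another process holds a lock
    conn = sqlite3.connect(db_path, timeout=5.0)
    try:
        cur = conn.execute(
            "SELECT * FROM audit_logs ORDER BY rowid DESC LIMIT ?", (limit,)
        )
        return cur.fetchall()
    finally:
        conn.close()

# Usage (not executed here): for row in recent_audit_rows(): print(row)
```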

License

MIT License
