MCP Server ELK

by bierbios

MCP Server ELK (Read-Only, Production-Oriented)

An MCP server built on Python 3.12 and FastAPI for safe, read-only analysis of the ELK Stack, compatible with OpenClaw.

1) Architecture (Mermaid)

flowchart TB
    OC[OpenClaw] -->|X-API-Key| API[FastAPI MCP Endpoint]
    API --> SEC[Security Layer\nAPI Key Auth + RBAC + Rate Limit]
    SEC --> REG[Tool Registry\nDiscovery + Execute + Schema Validation]
    REG --> TOOLS[MCP Tools\nELK Cluster/Logs/Kibana/Logstash/Filebeat/APM/Recommendation]
    TOOLS --> CTRL[Controllers]
    CTRL --> SRV[Services\nBusiness Logic]
    SRV --> REPO[Repositories\nRead-Only Data Access]
    REPO --> ES[(Elasticsearch)]
    REPO --> KB[(Kibana API)]
    REPO --> LS[(Logstash Monitoring API)]

    API --> AUDIT[Structured JSON Audit Log]
    API --> METRICS[Prometheus Metrics /metrics]

    classDef safe fill:#e7f7ef,stroke:#1f8f5f,stroke-width:1px;
    class SEC,REG,TOOLS,AUDIT,METRICS safe;

2) Folder Structure

mcpserver-elk/
├── app/
│   ├── main.py
│   ├── api/routes/
│   │   ├── health_controller.py
│   │   ├── mcp_controller.py
│   │   └── metrics_controller.py
│   ├── core/
│   │   ├── config.py
│   │   ├── exceptions.py
│   │   ├── logging.py
│   │   ├── masking.py
│   │   ├── metrics.py
│   │   ├── rate_limit.py
│   │   └── security.py
│   ├── mcp/
│   │   ├── registry.py
│   │   ├── schemas.py
│   │   ├── server.py
│   │   └── tool_base.py
│   ├── models/
│   │   ├── common_models.py
│   │   ├── elk_models.py
│   │   ├── mcp_models.py
│   │   └── schemas.py
│   ├── controllers/
│   │   ├── elk_controller.py
│   │   └── recommendation_controller.py
│   ├── services/
│   │   ├── apm_service.py
│   │   ├── elk_cluster_service.py
│   │   ├── elk_logs_service.py
│   │   ├── filebeat_service.py
│   │   ├── kibana_service.py
│   │   ├── logstash_service.py
│   │   └── recommendation_service.py
│   ├── repositories/
│   │   ├── apm_repository.py
│   │   ├── elasticsearch_repository.py
│   │   ├── kibana_repository.py
│   │   └── logstash_repository.py
│   ├── clients/
│   │   ├── elasticsearch_client.py
│   │   └── http_client.py
│   ├── tools/
│   │   ├── elk_apm.py
│   │   ├── elk_cluster.py
│   │   ├── elk_cluster_tools.py
│   │   ├── elk_filebeat.py
│   │   ├── elk_kibana.py
│   │   ├── elk_logs.py
│   │   ├── elk_logs_tools.py
│   │   ├── elk_logstash.py
│   │   ├── filebeat_tools.py
│   │   ├── kibana_tools.py
│   │   ├── logstash_tools.py
│   │   ├── recommendation.py
│   │   └── recommendation_tools.py
│   └── utils/
│       ├── query_builder.py
│       ├── response_limiter.py
│       └── time_range.py
├── tests/
│   ├── conftest.py
│   ├── integration/
│   └── unit/
├── k8s/
│   ├── configmap.yaml
│   ├── deployment.yaml
│   ├── hpa.yaml
│   ├── ingress.yaml
│   ├── namespace.yaml
│   ├── networkpolicy.yaml
│   ├── pdb.yaml
│   ├── secret.yaml
│   ├── service.yaml
│   └── serviceaccount.yaml
├── .env.example
├── Dockerfile
├── docker-compose.yml
├── pyproject.toml
├── requirements.txt
└── README.md

3) Security & Safety Features

  • Read-only by default: there are no write, delete, or restart endpoints.

  • API key authentication via the X-API-Key header.

  • Per-tool RBAC (elk_viewer, elk_operator, elk_admin_readonly).

  • Index pattern allowlist (ALLOWED_INDEX_PATTERNS).

  • Denylist for dangerous queries (script, painless, delete_by_query, etc.).

  • Timeouts and bounded retries for Elasticsearch and HTTP API calls.

  • Rate limiting per API key.

  • Audit log entries before and after each tool execution.

  • Structured JSON logging.

  • Secret masking (password/token/api_key/authorization/cookie).

  • Response size limiter (MAX_RESPONSE_BYTES).

  • TLS verification enabled by default.
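
The index allowlist and query denylist in the list above can be illustrated with a small Python sketch. This is not the project's implementation: the helper names, the example pattern values, and the rule that every comma-separated part must match are assumptions. Only ALLOWED_INDEX_PATTERNS and the denied keywords come from this README.

```python
from fnmatch import fnmatchcase
from typing import Any

# Example values only; the real allowlist is read from ALLOWED_INDEX_PATTERNS.
ALLOWED_INDEX_PATTERNS = ("logs-*", "filebeat-*")
# Keywords named in the denylist bullet; the project's real list may be longer.
DENIED_QUERY_KEYS = frozenset({"script", "painless", "delete_by_query"})


def is_index_allowed(requested: str, allowed: tuple[str, ...] = ALLOWED_INDEX_PATTERNS) -> bool:
    """Allow a request only if every comma-separated part matches an allowlisted glob."""
    parts = [part.strip() for part in requested.split(",")]
    return all(
        bool(part) and any(fnmatchcase(part, glob) for glob in allowed)
        for part in parts
    )


def find_denied_keys(query: Any) -> set[str]:
    """Recursively collect denylisted keys anywhere in a query body (keys only, not values)."""
    found: set[str] = set()
    if isinstance(query, dict):
        for key, value in query.items():
            if key in DENIED_QUERY_KEYS:
                found.add(key)
            found |= find_denied_keys(value)
    elif isinstance(query, list):
        for item in query:
            found |= find_denied_keys(item)
    return found
```

Splitting on commas matters because a glob such as logs-* would otherwise also match the string "logs-*,secret-*". A caller would reject the request whenever is_index_allowed returns False or find_denied_keys returns a non-empty set.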

4) Endpoints

  • GET /healthz

  • GET /readyz

  • GET /metrics

  • GET /metrics/json

  • GET /mcp/tools

  • POST /mcp/execute

  • POST /mcp (JSON-RPC)

5) Required MCP Tools

  • elk_cluster_health

  • elk_nodes_stats

  • elk_indices_summary

  • elk_search_logs

  • elk_detect_errors

  • elk_logstash_health

  • elk_filebeat_status

  • elk_kibana_status

  • elk_apm_summary

  • elk_recommend_fix
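
To show how a tool registry with per-tool RBAC might tie these tool names to the roles from section 3, here is a minimal sketch. The class names, the role ordering, and the minimum role assigned to each tool are assumptions for illustration; the README only states that elk_logstash_health is rejected for the viewer role. The returned envelope mirrors the ok/tool_name/data shape of the example responses.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumed ordering of the three roles named in section 3.
ROLE_RANK = {"elk_viewer": 0, "elk_operator": 1, "elk_admin_readonly": 2}


class PermissionDenied(Exception):
    """Raised when a role is not allowed to run a tool."""


@dataclass(frozen=True)
class Tool:
    name: str
    min_role: str
    handler: Callable[[dict[str, Any]], dict[str, Any]]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.min_role not in ROLE_RANK:
            raise ValueError(f"unknown role: {tool.min_role}")
        self._tools[tool.name] = tool

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def execute(self, name: str, role: str, payload: dict[str, Any]) -> dict[str, Any]:
        tool = self._tools.get(name)
        if tool is None:
            raise KeyError(f"unknown tool: {name}")
        # Unknown roles rank below every known role and are always denied.
        if ROLE_RANK.get(role, -1) < ROLE_RANK[tool.min_role]:
            raise PermissionDenied(f"{role} may not run {name}")
        return {"ok": True, "tool_name": name, "data": tool.handler(payload)}


registry = ToolRegistry()
# Stub handlers; real tools would call the service layer.
registry.register(Tool("elk_cluster_health", "elk_viewer", lambda payload: {"status": "green"}))
registry.register(Tool("elk_logstash_health", "elk_operator", lambda payload: {"status": "green"}))
```

With this setup, registry.execute("elk_logstash_health", "elk_viewer", {}) raises PermissionDenied, which an HTTP layer could translate into a 403 response.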

6) Environment Configuration

Start from the .env.example file:

cp .env.example .env

7) Run Locally

python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload

8) Docker

Build & run:

docker build -t mcpserver-elk:1.0.0 .
docker run --rm -p 8080:8080 --env-file .env mcpserver-elk:1.0.0

Docker Compose lab (with a sample ELK stack):

docker compose --profile lab up -d --build

9) Kubernetes Deploy

kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/secret.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/serviceaccount.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/ingress.yaml
kubectl apply -f k8s/networkpolicy.yaml
kubectl apply -f k8s/hpa.yaml
kubectl apply -f k8s/pdb.yaml

10) OpenClaw Integration

Example OpenClaw configuration (JSON):

{
  "mcpServers": [
    {
      "name": "elk-prod-readonly",
      "url": "https://mcp-elk.example.com/mcp",
      "headers": {
        "X-API-Key": "ops-key"
      },
      "timeoutSeconds": 30,
      "tools": [
        "elk_cluster_health",
        "elk_nodes_stats",
        "elk_indices_summary",
        "elk_search_logs",
        "elk_detect_errors",
        "elk_logstash_health",
        "elk_filebeat_status",
        "elk_kibana_status",
        "elk_apm_summary",
        "elk_recommend_fix"
      ]
    }
  ]
}

Example OpenClaw prompt:

Use MCP Server ELK to check Elasticsearch cluster health, search for error logs from payment-service in the last hour, group the most frequent errors, analyze the root cause, and recommend safe fixes.

11) Example MCP Requests/Responses

List tools:

curl -sS -H "X-API-Key: dev-key" http://localhost:8080/mcp/tools | jq

Execute elk_cluster_health:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_cluster_health","input":{"include_shards":true}}' | jq

Execute elk_nodes_stats:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_nodes_stats","input":{"include_thread_pool":false}}' | jq

Execute elk_indices_summary:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_indices_summary","input":{"index_pattern":"logs-*","sort_by":"size","limit":20}}' | jq

Execute elk_search_logs:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_search_logs","input":{"index_pattern":"logs-*","start_time":"now-1h","end_time":"now","service_name":"payment-service","log_level":"error","limit":20}}' | jq

Execute elk_detect_errors:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_detect_errors","input":{"index_pattern":"logs-*","start_time":"now-1h","end_time":"now","service_name":"payment-service","top_n":10}}' | jq

Execute elk_logstash_health:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ops-key" \
  -d '{"tool_name":"elk_logstash_health","input":{"pipeline_id":"main"}}' | jq

Execute elk_filebeat_status:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_filebeat_status","input":{"index_pattern":"filebeat-*","max_delay_minutes":5}}' | jq

Execute elk_kibana_status:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_kibana_status","input":{"include_plugins":true}}' | jq

Execute elk_apm_summary:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_apm_summary","input":{"service_name":"payment-service","start_time":"now-1h","end_time":"now"}}' | jq

Execute elk_recommend_fix:

curl -sS -X POST http://localhost:8080/mcp/execute \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"tool_name":"elk_recommend_fix","input":{"findings":{"cluster_health":{"status":"yellow","metrics":{"unassigned_shards":2}}}}}' | jq
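
The same POST /mcp/execute call can be made from Python using only the standard library. This is a sketch of a client, not part of the project: the function names are hypothetical, and the 30-second timeout simply mirrors timeoutSeconds in the OpenClaw example. The URL path, headers, and body shape follow the curl commands above.

```python
import json
import urllib.request
from typing import Any


def build_execute_request(
    base_url: str, api_key: str, tool_name: str, tool_input: dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST /mcp/execute request."""
    body = json.dumps({"tool_name": tool_name, "input": tool_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mcp/execute",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def execute_tool(
    base_url: str, api_key: str, tool_name: str, tool_input: dict[str, Any], timeout: float = 30.0
) -> dict[str, Any]:
    """Send the request and decode the JSON response; HTTP errors raise urllib.error.HTTPError."""
    request = build_execute_request(base_url, api_key, tool_name, tool_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, execute_tool("http://localhost:8080", "dev-key", "elk_cluster_health", {"include_shards": True}) corresponds to the first curl command in this section.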

JSON-RPC example:

curl -sS -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "X-API-Key: dev-key" \
  -d '{"jsonrpc":"2.0","id":"1","method":"mcp.list_tools","params":{}}' | jq

Example elk_cluster_health response:

{
  "ok": true,
  "tool_name": "elk_cluster_health",
  "data": {
    "status": "yellow",
    "summary": "Cluster prod-elk status=yellow, nodes=6, unassigned_shards=2",
    "metrics": {
      "cluster_name": "prod-elk",
      "number_of_nodes": 6,
      "active_shards": 1240,
      "relocating_shards": 0,
      "initializing_shards": 0,
      "unassigned_shards": 2
    },
    "recommendation": [
      "Periksa replica shard yang belum ter-assign.",
      "Jalankan analisa allocation explain untuk shard unassigned (read-only)."
    ]
  }
}
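
A caller might reduce a response like the one above to a one-line status. The helper below is a hypothetical sketch that relies only on the fields shown in the example (ok, data.status, and the metrics keys); the "needs attention" rule is an assumption, not behavior of the server.

```python
from typing import Any


def summarize_cluster_health(response: dict[str, Any]) -> str:
    """Turn an elk_cluster_health response into a short status line."""
    if not response.get("ok"):
        return "tool call failed"
    data = response["data"]
    metrics = data["metrics"]
    unassigned = metrics["unassigned_shards"]
    line = (
        f"{metrics['cluster_name']}: status={data['status']}, "
        f"nodes={metrics['number_of_nodes']}, unassigned_shards={unassigned}"
    )
    # Assumed rule: anything other than green, or any unassigned shard, is worth a look.
    if data["status"] != "green" or unassigned > 0:
        line += " (needs attention)"
    return line
```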

Example elk_detect_errors response:

{
  "ok": true,
  "tool_name": "elk_detect_errors",
  "data": {
    "total_errors": 182,
    "errors_by_service": [
      {"service": "payment-service", "count": 145}
    ],
    "top_error_messages": [
      {"message": "timeout to fraud-service", "count": 72}
    ],
    "samples": [
      {"timestamp": "2026-04-26T01:25:00Z", "service": "payment-service", "log_level": "error", "message": "timeout to fraud-service", "trace_id": "abc"}
    ],
    "recommendation": [
      "Validasi error paling sering dengan trace_id untuk korelasi lintas service."
    ]
  }
}
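
To rank the messages in a response like the one above by their share of all errors, a client could use a helper along these lines. The function name is hypothetical, and it assumes total_errors and top_error_messages are present as in the example.

```python
from typing import Any


def top_error_shares(response: dict[str, Any]) -> list[tuple[str, int, float]]:
    """Return (message, count, share_of_total) tuples, most frequent first."""
    data = response["data"]
    total = data["total_errors"]
    ranked = sorted(data["top_error_messages"], key=lambda item: item["count"], reverse=True)
    return [
        (item["message"], item["count"], (item["count"] / total) if total else 0.0)
        for item in ranked
    ]
```

In the example response, "timeout to fraud-service" accounts for 72 of 182 errors, so its share is 72/182.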

12) Testing

Run all tests:

pytest -q

Quick Trial Scripts

Seed simulated data:

chmod +x scripts/*.sh
./scripts/seed_data.sh

Smoke test end-to-end:

./scripts/smoke_test.sh

Example with a custom endpoint/key:

MCP_BASE_URL=http://localhost:8080 \
MCP_VIEWER_KEY=dev-key \
MCP_OPERATOR_KEY=ops-key \
./scripts/smoke_test.sh

Run lint and type checks:

ruff check .
mypy app

Provided tests

  • Unit test tool registry.

  • Unit test RBAC.

  • Unit test secret masking.

  • Unit test Elasticsearch query builder.

  • Integration test MCP + mock Elasticsearch.

  • Integration test MCP + mock Kibana.

  • Integration test MCP + mock Logstash.
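
As an illustration of what a secret-masking unit test from the list above could look like, here is a self-contained pytest-style sketch. The mask_secrets function and the "***" placeholder are assumptions, not the code in app/core/masking.py; the sensitive key names come from section 3.

```python
from typing import Any

# Key names taken from the masking bullet in section 3.
SENSITIVE_KEYS = frozenset({"password", "token", "api_key", "authorization", "cookie"})


def mask_secrets(value: Any, placeholder: str = "***") -> Any:
    """Return a copy of value with sensitive dict entries replaced by a placeholder."""
    if isinstance(value, dict):
        return {
            key: placeholder if str(key).lower() in SENSITIVE_KEYS else mask_secrets(item, placeholder)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [mask_secrets(item, placeholder) for item in value]
    return value


def test_mask_secrets_handles_nested_payloads() -> None:
    payload = {
        "user": "ops",
        "Authorization": "Bearer abc",
        "nested": [{"password": "p", "level": "error"}],
    }
    masked = mask_secrets(payload)
    assert masked == {
        "user": "ops",
        "Authorization": "***",
        "nested": [{"password": "***", "level": "error"}],
    }
    # The input is not mutated.
    assert payload["Authorization"] == "Bearer abc"
```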

Smoke Test Checklist

  • GET /healthz returns 200.

  • GET /readyz reports ready when Elasticsearch is up.

  • GET /mcp/tools returns the list of tools.

  • POST /mcp/execute with a valid key succeeds.

  • POST /mcp/execute with an invalid key returns 401.

  • The elk_logstash_health tool is rejected (403) for the viewer role.

  • Dangerous queries are rejected.

  • Oversized responses are rejected (413) when they exceed the limit.

  • /metrics can be scraped by Prometheus.

13) Troubleshooting Guide

| Problem | Symptom | Likely Cause | Check Command | Safe Fix |
| --- | --- | --- | --- | --- |
| OpenClaw cannot connect to the MCP server | Timeout / connection refused | Wrong DNS/Ingress/Service | kubectl get ingress -n mcpserver-elk | Fix the Ingress host/path and the Service port |
| 401 invalid API key | authentication_failed response | X-API-Key wrong or not sent | curl -i http://host/mcp/tools | Update the key in OpenClaw and sync the Secret |
| 403 RBAC denied | permission_denied response | Role has no access to the tool | curl ... /mcp/execute | Use an API key with the right role or adjust the policy |
| Elasticsearch TLS error | CERTIFICATE_VERIFY_FAILED | Wrong or expired CA certificate | openssl s_client -connect es:9200 -showcerts | Mount a valid CA and keep TLS verification enabled |
| Elasticsearch authentication failed | 401 from Elasticsearch | Wrong username/password | curl -u user:pass https://es:9200/_cluster/health | Rotate the read-only credential secret |
| Index pattern denied | 403 pattern not allowed | Pattern outside the allowlist | Check ALLOWED_INDEX_PATTERNS | Add a safe pattern to the allowlist |
| Query timeout | 504 timeout | Heavy query / busy cluster | GET /_tasks?detailed=true&actions=*search | Narrow the time range, lower the limit, optimize the index |
| Response too large | 413 response_too_large | Result too large | Check MAX_RESPONSE_BYTES | Reduce the limit or add filters; raise the limit only in measured steps |
| Kibana 401 | Kibana tool fails authentication | Wrong Kibana user | curl -u user:pass https://kibana/api/status -H 'kbn-xsrf:true' | Use a valid read-only Kibana account |
| Kibana status unavailable | Status degraded/down | Kibana/Elasticsearch backend issue | curl https://kibana/api/status | Check the Kibana -> Elasticsearch connection |
| Logstash monitoring API down | Logstash tool returns 502 | Port 9600 down / firewall | curl http://logstash:9600/_node/stats | Enable the monitoring API / fix the network |
| High Filebeat ingestion delay | delayed_hosts increasing | Agent disconnected / backpressure | GET filebeat-*/_search | Check the Beat output, the network, and the Logstash queue |
| Cluster yellow | status yellow | Replicas not yet allocated | GET /_cluster/health + GET /_cat/shards?v | Add nodes/disk, check allocation rules |
| Cluster red | status red | Primary shard unassigned | GET /_cluster/allocation/explain | Prioritize primary shard recovery |
| Unassigned shards | unassigned_shards > 0 | Disk watermark / node down / allocation filter | GET /_cluster/allocation/explain | Free disk space, repair the node, check awareness settings |
| Disk watermark exceeded | Shards cannot be allocated | Disk usage above the watermark | GET /_cat/allocation?v | Add capacity, clean up with ILM, rebalance |
| High JVM heap | heap > 80% | Heavy queries/aggregations, too many shards | GET /_nodes/stats/jvm | Optimize queries, reduce the shard count, tune the heap |
| Logstash pipeline stuck | events in rising, events out stagnant | Output blocked / queue full | GET http://logstash:9600/_node/stats | Check the output plugin; increase workers/queue carefully |
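
For the JVM heap row (heap > 80%), the output of GET /_nodes/stats/jvm can be checked programmatically. The sketch below is a hypothetical helper, not part of this project. It assumes the standard Elasticsearch node stats layout, where each entry under nodes carries a name and jvm.mem.heap_used_percent.

```python
from typing import Any


def nodes_over_heap_threshold(
    nodes_stats: dict[str, Any], threshold_percent: int = 80
) -> list[tuple[str, int]]:
    """Return (node_name, heap_used_percent) for nodes above the threshold, highest first."""
    over = []
    for node_id, node in nodes_stats.get("nodes", {}).items():
        used = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if used is not None and used > threshold_percent:
            # Fall back to the node ID when the name field is missing.
            over.append((node.get("name", node_id), used))
    return sorted(over, key=lambda pair: pair[1], reverse=True)
```

An empty result means no node exceeded the threshold in that snapshot; a single snapshot says nothing about sustained pressure, so repeated sampling is advisable before acting on it.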

14) Production Checklist

  • The Elasticsearch user is read-only.

  • TLS certificates are valid and verification is enabled.

  • API keys are rotated regularly.

  • Per-tool RBAC is enabled.

  • Audit logging (JSON) is enabled.

  • Prometheus scrapes /metrics.

  • A Grafana dashboard is available.

  • NetworkPolicy is enforced.

  • Resource requests/limits are set.

  • HPA is enabled.

  • Secrets do not appear in logs.

  • Dangerous queries are rejected.

  • Config/manifest backups are available.

  • CI/CD security scanning is enabled.

15) Example Bamboo Pipeline (CI/CD)

---
version: 2
plan:
  project-key: MCP
  key: ELK
  name: mcpserver-elk

stages:
  - Build & Test:
      jobs:
        - lint-test-build

jobs:
  - lint-test-build:
      docker:
        image: python:3.12-slim
      tasks:
        - script: |
            python -m pip install --upgrade pip
            pip install -r requirements.txt
            ruff check .
            mypy app
            pytest -q
        - script: |
            docker build -t new-nexus.bri.co.id/mcp/dev/mcpserver-elk:1.0.0 .
        - script: |
            trivy image --exit-code 1 new-nexus.bri.co.id/mcp/dev/mcpserver-elk:1.0.0
        - script: |
            printf '%s' "$NEXUS_PASS" | docker login new-nexus.bri.co.id -u "$NEXUS_USER" --password-stdin
            docker push new-nexus.bri.co.id/mcp/dev/mcpserver-elk:1.0.0
        - script: |
            kubectl apply -f k8s/
            kubectl rollout status deploy/mcpserver-elk -n mcpserver-elk
        - script: |
            curl -fsS https://mcp-elk.example.com/healthz
            curl -fsS https://mcp-elk.example.com/readyz

16) Enterprise Notes

  • Use HTTPS end to end (Ingress TLS + upstream TLS).

  • Store secrets in a secret manager (Vault/KMS/ExternalSecret), not as plaintext in the repo.

  • Use image signing and an SBOM for compliance.

  • Make sure the Elasticsearch user has a read-only role (monitor, read, with no write/manage privileges).
