Datadog MCP Server

by dreamiurg

Server Configuration

Environment variables used to configure the server. Only DD_API_KEY and DD_APP_KEY are required; the rest are optional.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| DD_SITE | No | Datadog site (e.g., datadoghq.com, datadoghq.eu). Defaults to datadoghq.com | datadoghq.com |
| LOG_LEVEL | No | Minimum log level: debug, info, warn, error. Defaults to info | info |
| DD_API_KEY | Yes | Your Datadog API key | |
| DD_APP_KEY | Yes | Your Datadog Application key | |
| LOG_FORMAT | No | Log output format: json or pretty. Defaults to json | json |
| DD_LOGS_SITE | No | Override for logs API site | |
| DD_METRICS_SITE | No | Override for metrics API site | |
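
As a rough illustration of how the variables listed here fit together, the following Python sketch validates the two required keys and applies the documented defaults. The `load_config` function and its constant names are hypothetical and are not part of the server; only the variable names and default values come from this page.

```python
import os

# Defaults and required names are taken from the configuration listing;
# everything else in this sketch is illustrative.
DEFAULTS = {"DD_SITE": "datadoghq.com", "LOG_LEVEL": "info", "LOG_FORMAT": "json"}
REQUIRED = ("DD_API_KEY", "DD_APP_KEY")
OPTIONAL_NO_DEFAULT = ("DD_LOGS_SITE", "DD_METRICS_SITE")


def load_config(env=None):
    """Return a settings dict, raising ValueError if a required variable is unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    # Fall back to the documented default when a variable is unset or empty.
    for name, default in DEFAULTS.items():
        config[name] = env.get(name) or default
    # Site overrides are only recorded when explicitly provided.
    for name in OPTIONAL_NO_DEFAULT:
        if env.get(name):
            config[name] = env[name]
    return config
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.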

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | { "listChanged": true } |
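
In the Model Context Protocol, a `tools` capability with `listChanged` set to true means the server may send a `notifications/tools/list_changed` notification, after which a client would normally request the tool list again. The sketch below shows that check in Python; the function name `should_refresh_tools` is hypothetical and the messages are assumed to be already-parsed JSON-RPC dicts.

```python
def should_refresh_tools(capabilities, message):
    """Return True when a notification means the tool list should be re-fetched."""
    tools_caps = capabilities.get("tools", {})
    # Only react if the server advertised listChanged support.
    if not tools_caps.get("listChanged"):
        return False
    return message.get("method") == "notifications/tools/list_changed"
```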

Tools

Functions exposed to the LLM to take actions

Name | Description
get-monitors

List Datadog monitors with filtering. Use for questions like 'show alerting monitors', 'what monitors are in warning state', or 'monitors tagged with team:platform'. Filter by groupStates: 'alert', 'warn', 'no data', 'ok'. Use get-monitor for a single monitor's full details.

get-monitor

Get full details for a specific monitor by ID. Use after get-monitors to dive deeper into a specific monitor's configuration, thresholds, query, and current state. Returns complete monitor definition.

get-dashboards

List all Datadog dashboards. Use to answer 'what dashboards exist', 'find dashboard for API metrics', or to get dashboard IDs for get-dashboard. Returns dashboard names, IDs, and URLs.

get-dashboard

Get full dashboard definition by ID. Returns all widgets, queries, and layout. Use after get-dashboards to explore a specific dashboard's contents and understand what metrics/data it displays.

get-metrics

Search for available Datadog metrics by name pattern. Use to discover metrics like 'what CPU metrics exist' or 'find metrics for service X'. Parameter q searches metric names (e.g., q='aws.ec2' finds all EC2 metrics).

get-metric-metadata

Get metadata for a specific metric name. Returns type (gauge/count/rate), unit, description, and integration. Use when you need to understand what a metric measures, e.g., 'what does system.cpu.user mean'.

get-events

Query Datadog events within a time range. Events include deployments, alerts, configuration changes, and comments. Use for 'what happened yesterday', 'show deployment events', or correlating incidents with changes. Requires start/end as Unix timestamps.
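
Since the tool expects `start` and `end` as Unix timestamps, a small helper can turn a relative window such as "the last 24 hours" into those values. This is a minimal Python sketch; the helper name `event_window` is hypothetical, and it assumes the timestamps are in seconds rather than milliseconds, which this page does not state.

```python
from datetime import datetime, timedelta, timezone


def event_window(now, hours=24):
    """Return start/end Unix timestamps (assumed to be seconds) for a trailing window."""
    end = int(now.timestamp())
    start = int((now - timedelta(hours=hours)).timestamp())
    return {"start": start, "end": end}


# Example: the 24 hours leading up to the current moment.
window = event_window(datetime.now(timezone.utc))
```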

get-incidents

List Datadog incidents for incident management. Use for 'show active incidents', 'what incidents happened this week', or 'find incidents related to payments'. Includes severity, status, commander, and timeline.

search-logs

Search and retrieve log entries from Datadog. Use for 'find errors in auth service', 'show logs from last hour', or investigating issues. Query syntax: 'service:web-app status:error', time range: 'now-15m' to 'now'. Returns actual log messages. Use aggregate-logs for counts/stats instead.
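
To make the query syntax concrete, here is a sketch of an MCP `tools/call` request that a client might send for this tool. The JSON-RPC envelope follows the MCP specification, but the argument names `query`, `from`, and `to` are assumptions for illustration; the tool's actual input schema is not shown on this page.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request body for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names below are hypothetical; check the tool's schema before use.
request = build_tool_call(
    1,
    "search-logs",
    {"query": "service:web-app status:error", "from": "now-15m", "to": "now"},
)
print(json.dumps(request, indent=2))
```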

aggregate-logs

Compute statistics and aggregations on logs. Use for 'how many errors per service', 'count logs by status', or 'average response time from logs'. Supports count, avg, sum, min, max, percentiles. Use search-logs to see actual log content instead.

get-hosts

List infrastructure hosts reporting to Datadog. Use for 'show production hosts', 'which hosts are muted', 'hosts running agent version X'. Returns host names, IPs, apps, agent info, and mute status. Essential for infrastructure visibility during incidents.

get-downtimes

List scheduled maintenance downtimes in Datadog. Use for 'are there any active downtimes', 'what's scheduled for maintenance', 'why is this monitor muted'. Shows scope, schedule, and duration. Critical for on-call to understand muted monitors.

get-slos

List Service Level Objectives (SLOs). Use for 'show all SLOs', 'SLOs for team platform', 'which SLOs are at risk'. Returns SLO names, targets, and current status. Use get-slo for detailed error budget and history of a specific SLO.

get-slo

Get detailed SLO information by ID. Returns error budget remaining, burn rate, target vs actual, thresholds, and configured alerts. Use after get-slos to understand a specific SLO's health and history.

search-spans

Search APM spans/traces. Use for 'find slow requests', 'show errors in payment service', or investigating latency. Query syntax: 'service:web status:error @duration:>1s'. Returns individual spans with trace IDs. Use get-trace for full trace context.

aggregate-spans

Compute statistics on APM spans. Use for 'p99 latency by service', 'error rate per endpoint', 'request count over time'. Supports count, avg, sum, min, max, percentiles (pc75/90/95/99). Use search-spans to see actual span details.

get-services

List all APM-instrumented services. Use to discover traced services, find service names for span queries, or get an overview of your distributed system. Returns service names and their environments.

get-trace

Get all spans for a specific trace ID. Use after search-spans to see the full request flow across services. Returns all spans in the trace with timing, service, resource, and error information.

search-security-findings

List or search Datadog security findings (Cloud Security Management). Use to retrieve findings with a query and optional pagination cursor. Requires security_monitoring_findings_read or appsec_vm_read (OAuth apps still require security_monitoring_findings_read).

get-security-finding

Get a legacy CSPM/CIEM finding by ID (posture_management). Note: this endpoint uses the legacy data model. Requires the security_monitoring_findings_read scope.

list-posture-findings

List legacy CSPM/CIEM posture management findings (misconfigurations and identity risks). Useful for compliance use-cases. Requires the security_monitoring_findings_read scope.

query-metrics

Query time-series metric data from Datadog. The backbone of observability — use for 'CPU usage over last hour', 'request rate for web service', or any metric query. Query syntax: 'avg:system.cpu.user{host:web-1}'. Returns data points with timestamps.
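
The query string in the example has the shape aggregator, metric name, then a tag scope in braces. The following Python sketch assembles such strings; the helper name `metric_query` is hypothetical, and it falls back to `{*}` (all sources in Datadog query syntax) when no tags are given.

```python
def metric_query(aggregator, metric, tags=None):
    """Build a query such as 'avg:system.cpu.user{host:web-1}'."""
    # Join tag filters with commas; '*' selects every source when no tags are given.
    scope = ",".join(f"{key}:{value}" for key, value in (tags or {}).items()) or "*"
    return f"{aggregator}:{metric}{{{scope}}}"


query = metric_query("avg", "system.cpu.user", {"host": "web-1"})
```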

get-synthetic-tests

List Datadog Synthetic tests (API and browser). Use for 'show all synthetic tests', 'what API tests exist', or 'which tests are failing'. Returns test names, types, status, locations, and tags.

get-synthetic-results

Get execution results for a specific Synthetic test. Use after get-synthetic-tests to see pass/fail history, response times, and probe locations. Returns individual check results with timing data.

search-rum-events

Search Real User Monitoring (RUM) events. Use for 'frontend errors in production', 'slow page loads', 'user session analysis'. Query syntax similar to logs: '@type:error @application.id:abc'. Returns user sessions, views, actions, and errors.

list-rum-applications

List all RUM applications configured in Datadog. Use to discover which frontend apps are monitored, get application IDs for RUM queries, or see who created them. Companion to search-rum-events.

search-error-tracking-events

Search Error Tracking events across services. Use for 'what errors are happening in production', 'error groups for payment service', 'new errors this week'. Returns error groups with counts, first/last seen, and affected services.

get-containers

List containers monitored by Datadog. Use for 'show running containers', 'containers for web service', 'container status by image'. Returns container names, images, tags, state, and start time.

get-host-tags

Get all tags associated with hosts. Use for 'what tags are on my hosts', 'which hosts have team:platform tag', or to understand host groupings. Returns a map of tag names to host lists.

get-audit-events

Search Datadog organization audit events. Use for 'who changed this monitor', 'what config changes happened today', 'audit trail for user X'. Returns timestamped events with actor, action, and affected resource.

get-slo-history

Get historical SLO data over a time range. Use after get-slo to see 'SLO performance last 30 days', 'error budget consumption over time', or 'SLI trend for checkout service'. Returns SLI values, thresholds, and time range data.

get-notebooks

List Datadog notebooks. Use for 'show investigation notebooks', 'find notebooks by team', or 'recent notebooks about outage'. Notebooks are collaborative documents used during incidents and investigations.

get-usage

Get hourly usage data by product family. Use for 'how many infra hosts this month', 'log ingestion volume', 'APM usage trends'. Returns usage records with timestamps for billing and capacity planning.

get-log-pipelines

List all log processing pipelines. Use for 'how are logs being processed', 'which pipelines are active', 'what parsing rules exist'. Returns pipeline names, filters, processors, and enabled status. Essential for understanding log processing configuration.

get-log-indexes

List all log indexes and their configuration. Use for 'where are logs being stored', 'what retention is configured', 'which logs are being excluded'. Returns index names, filters, retention days, daily limits, and exclusion filters.

get-dbm-samples

Get Database Monitoring query samples. Use for 'slow database queries', 'what queries are running on postgres', 'DB performance issues'. Returns query samples with execution time, affected rows, and database context.

aggregate-rum-events

Aggregate RUM events with compute operations (count, avg, sum, min, max, percentile) and group-by facets. Use for 'RUM page load times by country', 'error count by browser', 'average session duration by app version'.

get-active-hosts-count

Get total number of active and up hosts. Use for 'how many hosts are running', 'infrastructure host count', 'active host summary'.

list-processes

List running processes with optional filtering by search term or tags. Use for 'what processes are running', 'find java processes', 'process list for host'.

list-service-definitions

List service definitions from the Datadog Service Catalog. Use for 'what services exist', 'service catalog', 'list all registered services'.

get-service-definition

Get a single service definition by name from the Service Catalog. Use for 'show service X details', 'what team owns service Y', 'service definition for Z'.

list-ci-pipelines

List CI pipeline events (pipeline runs/executions). Use for 'recent CI builds', 'failed pipelines', 'CI pipeline status', 'deployment history'.

get-ci-pipeline-events

Aggregate CI pipeline analytics with compute operations. Use for 'average pipeline duration', 'failure rate by pipeline', 'CI performance trends'.

list-teams

List teams in the Datadog organization. Use for 'what teams exist', 'team structure', 'find team by name'.

list-users

List users in the Datadog organization. Use for 'who has access', 'list all users', 'find user by email'.

search-security-signals

Search security monitoring signals (threat detections, security alerts). Use for 'recent security alerts', 'threat detections', 'security signal search'.

list-dashboard-lists

List all custom dashboard lists. Use for 'what dashboard lists exist', 'organized dashboards', 'dashboard collections'.

search-metric-volumes

Search metrics by name pattern with volume and ingestion data. Use for 'find metrics matching pattern', 'metric ingestion volume', 'what metrics are configured'.

list-roles

List RBAC roles in your Datadog organization

list-permissions

List all available permissions in Datadog

get-logs-metrics

Get all log-based metric configurations

get-spans-metrics

Get all span-based metric configurations from APM

get-logs-archives

Get log archive configurations showing where logs are stored

get-service-dependencies

Get service dependency graph for APM services in a given environment

list-scorecard-rules

List service scorecard rules for evaluating service quality

list-scorecard-outcomes

List scorecard rule evaluation outcomes for services

search-cases

Search Datadog cases for incident investigation

get-powerpacks

Get reusable dashboard widget templates (Powerpacks)

get-logs-indexes

Get log index configurations including retention and exclusion filters

get-logs-pipelines

Get log processing pipeline configurations

search_audit_logs

Search Datadog audit logs for configuration changes, user actions, and API calls

get_hourly_usage

Get Datadog hourly usage by product family for cost analysis

list_containers

List Datadog-monitored containers with their metadata and health status

search_error_tracking_issues

Search Datadog error tracking issues for user-facing errors and exceptions

get_error_tracking_issue

Get details of a Datadog error tracking issue (user-facing error/exception) by ID

list_notebooks

List Datadog notebooks (investigation documents, runbooks, postmortems)

get_notebook

Get a specific Datadog notebook by ID with all cells and content

list_security_rules

List Datadog security monitoring detection rules

list_api_keys

List Datadog API keys for key management and security audit

get_incident_todos

Get action items/todos for a specific Datadog incident

list_aws_accounts

List AWS accounts integrated with Datadog

list_network_devices

List network devices monitored by Datadog NDM with filtering and pagination

get_csm_coverage

Get Cloud Security Management coverage across cloud accounts

search_incidents

Search Datadog incidents with advanced filtering by severity, status, and time range

list_cost_budgets

List cloud cost management budgets for tracking team spending

list_vulnerabilities

List security vulnerability findings with filtering by tool, type, severity, and status

aggregate_network_connections

Aggregate network connection analytics with grouping and filtering

list_workflows

List Datadog workflow automations for incident response and remediation

list_monitor_notification_rules

List monitor notification routing rules showing who gets alerted

get_top_avg_metrics

Get top custom metrics by average hourly count for cost and cardinality analysis

list_csm_threats_agent_rules

List CSM Threats agent rules for workload security monitoring

list_dora_deployments

List DORA metric deployments for tracking deployment frequency and lead time

list_fleet_agents

List Datadog fleet agents with version, OS, and status information

get_slo_corrections

List all SLO corrections (status adjustments) across your organization. Shows maintenance windows and planned downtime exclusions that affect SLO calculations.

get_ip_ranges

Get Datadog IP ranges used by agents, APIs, APM, logs, process collection, synthetics, and webhooks. Useful for firewall/allowlist configuration.

get_logs_pipeline_order

Get the ordered list of log pipeline IDs, showing the processing order for log pipelines.

get_logs_archive_order

Get the ordered list of log archive IDs, showing the priority order for log archiving.

search_slos

Search and filter SLOs by query string. Supports pagination and faceted search for finding specific SLOs by name, tags, or other attributes.

get_estimated_cost

Get estimated cost data for your Datadog usage. Filter by date range and view type (sub_org, summary). Useful for cost monitoring and budget planning.

list_synthetics_global_variables

List all Synthetics global variables used across synthetic tests for shared configuration like URLs, credentials, and test data.

list_downtime_schedules

List scheduled downtimes (v2 API). Filter by current/upcoming schedules. Shows muted monitors, scopes, and schedule details.

list_spans_metrics

List all span-based metrics (APM custom metrics) configured for generating metrics from APM spans.

get_monitor_config_policies

Get monitor configuration policies that enforce tag and setting requirements on monitors across your organization.

list_synthetics_locations

List available Synthetics testing locations (both managed by Datadog and private). Useful for configuring where synthetic tests run.

list_logs_metrics

List all log-based metrics configured for generating custom metrics from log data.

get_organization

Get your Datadog organization info including name, plan, public ID, and settings. Essential for understanding account configuration.

list_host_totals

Get the total number of active and up hosts in your Datadog account. Quick health check for infrastructure scale.

list_webhooks

List all configured webhook integrations. Useful for auditing alert routing and notification channels.

list_synthetics_private_locations

List Synthetics private locations for internal testing. Shows private location IDs, names, and tags.

list_security_monitoring_rules

List security monitoring detection rules with pagination. Shows enabled/disabled rules, names, and types.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dreamiurg/datadog-mcp'
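
For scripted use, the same GET request can be made from Python's standard library. This is a minimal sketch: the function names `server_url` and `fetch_server` are hypothetical, and it assumes the endpoint returns a JSON body, which this page does not confirm.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner, name, timeout=10):
    """Fetch a server entry and decode it, assuming the response is JSON."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("dreamiurg", "datadog-mcp")` would issue the same request as the curl command.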

If you have feedback or need assistance with the MCP directory API, please join our Discord server.