Datadog MCP Server
Server Configuration
Environment variables used to configure the server. Only DD_API_KEY and DD_APP_KEY are required; the rest are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| DD_SITE | No | Datadog site (e.g., datadoghq.com, datadoghq.eu). Defaults to datadoghq.com | datadoghq.com |
| LOG_LEVEL | No | Minimum log level: debug, info, warn, error. Defaults to info | info |
| DD_API_KEY | Yes | Your Datadog API key | |
| DD_APP_KEY | Yes | Your Datadog Application key | |
| LOG_FORMAT | No | Log output format: json or pretty. Defaults to json | json |
| DD_LOGS_SITE | No | Override for logs API site | |
| DD_METRICS_SITE | No | Override for metrics API site | |
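As a rough illustration of how the table above might be consumed, the sketch below reads the variables from a mapping, applies the documented defaults, and fails early when a required key is absent. `ServerConfig` and `load_config` are hypothetical names introduced for this example; they are not taken from the server's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container mirroring the environment variable table."""
    api_key: str
    app_key: str
    site: str = "datadoghq.com"
    log_level: str = "info"
    log_format: str = "json"
    logs_site: Optional[str] = None
    metrics_site: Optional[str] = None


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build a config from environment variables, applying the documented defaults."""
    # DD_API_KEY and DD_APP_KEY are the only variables marked as required.
    missing = [name for name in ("DD_API_KEY", "DD_APP_KEY") if not env.get(name)]
    if missing:
        raise ValueError(f"missing required variables: {', '.join(missing)}")
    return ServerConfig(
        api_key=env["DD_API_KEY"],
        app_key=env["DD_APP_KEY"],
        site=env.get("DD_SITE") or "datadoghq.com",
        log_level=env.get("LOG_LEVEL") or "info",
        log_format=env.get("LOG_FORMAT") or "json",
        # The site overrides have no documented default, so they stay unset.
        logs_site=env.get("DD_LOGS_SITE") or None,
        metrics_site=env.get("DD_METRICS_SITE") or None,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise without touching the real environment.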
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
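In the Model Context Protocol, a server that declares `listChanged` for tools may send a `notifications/tools/list_changed` notification, after which a client typically re-requests `tools/list`. The sketch below shows one way a client could react; the helper names are hypothetical and transport handling is left out.

```python
import json

# Notification method defined by the MCP specification for tool list changes.
TOOLS_LIST_CHANGED = "notifications/tools/list_changed"


def needs_tool_refresh(raw_message: str) -> bool:
    """Return True when a JSON-RPC message is the tools list-changed notification."""
    msg = json.loads(raw_message)
    # Notifications carry a method but no id.
    return "id" not in msg and msg.get("method") == TOOLS_LIST_CHANGED


def tools_list_request(request_id: int) -> str:
    """Serialize a tools/list request to send after a list-changed notification."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})
```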
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| get-monitors | List Datadog monitors with filtering. Use for questions like 'show alerting monitors', 'what monitors are in warning state', or 'monitors tagged with team:platform'. Filter by groupStates: 'alert', 'warn', 'no data', 'ok'. Use get-monitor for a single monitor's full details. |
| get-monitor | Get full details for a specific monitor by ID. Use after get-monitors to dive deeper into a specific monitor's configuration, thresholds, query, and current state. Returns complete monitor definition. |
| get-dashboards | List all Datadog dashboards. Use to answer 'what dashboards exist', 'find dashboard for API metrics', or to get dashboard IDs for get-dashboard. Returns dashboard names, IDs, and URLs. |
| get-dashboard | Get full dashboard definition by ID. Returns all widgets, queries, and layout. Use after get-dashboards to explore a specific dashboard's contents and understand what metrics/data it displays. |
| get-metrics | Search for available Datadog metrics by name pattern. Use to discover metrics like 'what CPU metrics exist' or 'find metrics for service X'. Parameter q searches metric names (e.g., q='aws.ec2' finds all EC2 metrics). |
| get-metric-metadata | Get metadata for a specific metric name. Returns type (gauge/count/rate), unit, description, and integration. Use when you need to understand what a metric measures, e.g., 'what does system.cpu.user mean'. |
| get-events | Query Datadog events within a time range. Events include deployments, alerts, configuration changes, and comments. Use for 'what happened yesterday', 'show deployment events', or correlating incidents with changes. Requires start/end as Unix timestamps. |
| get-incidents | List Datadog incidents for incident management. Use for 'show active incidents', 'what incidents happened this week', or 'find incidents related to payments'. Includes severity, status, commander, and timeline. |
| search-logs | Search and retrieve log entries from Datadog. Use for 'find errors in auth service', 'show logs from last hour', or investigating issues. Query syntax: 'service:web-app status:error', time range: 'now-15m' to 'now'. Returns actual log messages. Use aggregate-logs for counts/stats instead. |
| aggregate-logs | Compute statistics and aggregations on logs. Use for 'how many errors per service', 'count logs by status', or 'average response time from logs'. Supports count, avg, sum, min, max, percentiles. Use search-logs to see actual log content instead. |
| get-hosts | List infrastructure hosts reporting to Datadog. Use for 'show production hosts', 'which hosts are muted', 'hosts running agent version X'. Returns host names, IPs, apps, agent info, and mute status. Essential for infrastructure visibility during incidents. |
| get-downtimes | List scheduled maintenance downtimes in Datadog. Use for 'are there any active downtimes', 'what's scheduled for maintenance', 'why is this monitor muted'. Shows scope, schedule, and duration. Critical for on-call to understand muted monitors. |
| get-slos | List Service Level Objectives (SLOs). Use for 'show all SLOs', 'SLOs for team platform', 'which SLOs are at risk'. Returns SLO names, targets, and current status. Use get-slo for detailed error budget and history of a specific SLO. |
| get-slo | Get detailed SLO information by ID. Returns error budget remaining, burn rate, target vs actual, thresholds, and configured alerts. Use after get-slos to understand a specific SLO's health and history. |
| search-spans | Search APM spans/traces. Use for 'find slow requests', 'show errors in payment service', or investigating latency. Query syntax: 'service:web status:error @duration:>1s'. Returns individual spans with trace IDs. Use get-trace for full trace context. |
| aggregate-spans | Compute statistics on APM spans. Use for 'p99 latency by service', 'error rate per endpoint', 'request count over time'. Supports count, avg, sum, min, max, percentiles (pc75/90/95/99). Use search-spans to see actual span details. |
| get-services | List all APM-instrumented services. Use to discover traced services, find service names for span queries, or get an overview of your distributed system. Returns service names and their environments. |
| get-trace | Get all spans for a specific trace ID. Use after search-spans to see the full request flow across services. Returns all spans in the trace with timing, service, resource, and error information. |
| search-security-findings | List or search Datadog security findings (Cloud Security Management). Use to retrieve findings with a query and optional pagination cursor. Requires security_monitoring_findings_read or appsec_vm_read (OAuth apps still require security_monitoring_findings_read). |
| get-security-finding | Get a legacy CSPM/CIEM finding by ID (posture_management). Note: this endpoint uses the legacy data model. Requires the security_monitoring_findings_read scope. |
| list-posture-findings | List legacy CSPM/CIEM posture management findings (misconfigurations and identity risks). Useful for compliance use-cases. Requires the security_monitoring_findings_read scope. |
| query-metrics | Query time-series metric data from Datadog. The backbone of observability: use for 'CPU usage over last hour', 'request rate for web service', or any metric query. Query syntax: 'avg:system.cpu.user{host:web-1}'. Returns data points with timestamps. |
| get-synthetic-tests | List Datadog Synthetic tests (API and browser). Use for 'show all synthetic tests', 'what API tests exist', or 'which tests are failing'. Returns test names, types, status, locations, and tags. |
| get-synthetic-results | Get execution results for a specific Synthetic test. Use after get-synthetic-tests to see pass/fail history, response times, and probe locations. Returns individual check results with timing data. |
| search-rum-events | Search Real User Monitoring (RUM) events. Use for 'frontend errors in production', 'slow page loads', 'user session analysis'. Query syntax similar to logs: '@type:error @application.id:abc'. Returns user sessions, views, actions, and errors. |
| list-rum-applications | List all RUM applications configured in Datadog. Use to discover which frontend apps are monitored, get application IDs for RUM queries, or see who created them. Companion to search-rum-events. |
| search-error-tracking-events | Search Error Tracking events across services. Use for 'what errors are happening in production', 'error groups for payment service', 'new errors this week'. Returns error groups with counts, first/last seen, and affected services. |
| get-containers | List containers monitored by Datadog. Use for 'show running containers', 'containers for web service', 'container status by image'. Returns container names, images, tags, state, and start time. |
| get-host-tags | Get all tags associated with hosts. Use for 'what tags are on my hosts', 'which hosts have team:platform tag', or to understand host groupings. Returns a map of tag names to host lists. |
| get-audit-events | Search Datadog organization audit events. Use for 'who changed this monitor', 'what config changes happened today', 'audit trail for user X'. Returns timestamped events with actor, action, and affected resource. |
| get-slo-history | Get historical SLO data over a time range. Use after get-slo to see 'SLO performance last 30 days', 'error budget consumption over time', or 'SLI trend for checkout service'. Returns SLI values, thresholds, and time range data. |
| get-notebooks | List Datadog notebooks. Use for 'show investigation notebooks', 'find notebooks by team', or 'recent notebooks about outage'. Notebooks are collaborative documents used during incidents and investigations. |
| get-usage | Get hourly usage data by product family. Use for 'how many infra hosts this month', 'log ingestion volume', 'APM usage trends'. Returns usage records with timestamps for billing and capacity planning. |
| get-log-pipelines | List all log processing pipelines. Use for 'how are logs being processed', 'which pipelines are active', 'what parsing rules exist'. Returns pipeline names, filters, processors, and enabled status. Essential for understanding log processing configuration. |
| get-log-indexes | List all log indexes and their configuration. Use for 'where are logs being stored', 'what retention is configured', 'which logs are being excluded'. Returns index names, filters, retention days, daily limits, and exclusion filters. |
| get-dbm-samples | Get Database Monitoring query samples. Use for 'slow database queries', 'what queries are running on postgres', 'DB performance issues'. Returns query samples with execution time, affected rows, and database context. |
| aggregate-rum-events | Aggregate RUM events with compute operations (count, avg, sum, min, max, percentile) and group-by facets. Use for 'RUM page load times by country', 'error count by browser', 'average session duration by app version'. |
| get-active-hosts-count | Get total number of active and up hosts. Use for 'how many hosts are running', 'infrastructure host count', 'active host summary'. |
| list-processes | List running processes with optional filtering by search term or tags. Use for 'what processes are running', 'find java processes', 'process list for host'. |
| list-service-definitions | List service definitions from the Datadog Service Catalog. Use for 'what services exist', 'service catalog', 'list all registered services'. |
| get-service-definition | Get a single service definition by name from the Service Catalog. Use for 'show service X details', 'what team owns service Y', 'service definition for Z'. |
| list-ci-pipelines | List CI pipeline events (pipeline runs/executions). Use for 'recent CI builds', 'failed pipelines', 'CI pipeline status', 'deployment history'. |
| get-ci-pipeline-events | Aggregate CI pipeline analytics with compute operations. Use for 'average pipeline duration', 'failure rate by pipeline', 'CI performance trends'. |
| list-teams | List teams in the Datadog organization. Use for 'what teams exist', 'team structure', 'find team by name'. |
| list-users | List users in the Datadog organization. Use for 'who has access', 'list all users', 'find user by email'. |
| search-security-signals | Search security monitoring signals (threat detections, security alerts). Use for 'recent security alerts', 'threat detections', 'security signal search'. |
| list-dashboard-lists | List all custom dashboard lists. Use for 'what dashboard lists exist', 'organized dashboards', 'dashboard collections'. |
| search-metric-volumes | Search metrics by name pattern with volume and ingestion data. Use for 'find metrics matching pattern', 'metric ingestion volume', 'what metrics are configured'. |
| list-roles | List RBAC roles in your Datadog organization |
| list-permissions | List all available permissions in Datadog |
| get-logs-metrics | Get all log-based metric configurations |
| get-spans-metrics | Get all span-based metric configurations from APM |
| get-logs-archives | Get log archive configurations showing where logs are stored |
| get-service-dependencies | Get service dependency graph for APM services in a given environment |
| list-scorecard-rules | List service scorecard rules for evaluating service quality |
| list-scorecard-outcomes | List scorecard rule evaluation outcomes for services |
| search-cases | Search Datadog cases for incident investigation |
| get-powerpacks | Get reusable dashboard widget templates (Powerpacks) |
| get-logs-indexes | Get log index configurations including retention and exclusion filters |
| get-logs-pipelines | Get log processing pipeline configurations |
| search_audit_logs | Search Datadog audit logs for configuration changes, user actions, and API calls |
| get_hourly_usage | Get Datadog hourly usage by product family for cost analysis |
| list_containers | List Datadog-monitored containers with their metadata and health status |
| search_error_tracking_issues | Search Datadog error tracking issues for user-facing errors and exceptions |
| get_error_tracking_issue | Get details of a Datadog error tracking issue (user-facing error/exception) by ID |
| list_notebooks | List Datadog notebooks (investigation documents, runbooks, postmortems) |
| get_notebook | Get a specific Datadog notebook by ID with all cells and content |
| list_security_rules | List Datadog security monitoring detection rules |
| list_api_keys | List Datadog API keys for key management and security audit |
| get_incident_todos | Get action items/todos for a specific Datadog incident |
| list_aws_accounts | List AWS accounts integrated with Datadog |
| list_network_devices | List network devices monitored by Datadog NDM with filtering and pagination |
| get_csm_coverage | Get Cloud Security Management coverage across cloud accounts |
| search_incidents | Search Datadog incidents with advanced filtering by severity, status, and time range |
| list_cost_budgets | List cloud cost management budgets for tracking team spending |
| list_vulnerabilities | List security vulnerability findings with filtering by tool, type, severity, and status |
| aggregate_network_connections | Aggregate network connection analytics with grouping and filtering |
| list_workflows | List Datadog workflow automations for incident response and remediation |
| list_monitor_notification_rules | List monitor notification routing rules showing who gets alerted |
| get_top_avg_metrics | Get top custom metrics by average hourly count for cost and cardinality analysis |
| list_csm_threats_agent_rules | List CSM Threats agent rules for workload security monitoring |
| list_dora_deployments | List DORA metric deployments for tracking deployment frequency and lead time |
| list_fleet_agents | List Datadog fleet agents with version, OS, and status information |
| get_slo_corrections | List all SLO corrections (status adjustments) across your organization. Shows maintenance windows and planned downtime exclusions that affect SLO calculations. |
| get_ip_ranges | Get Datadog IP ranges used by agents, APIs, APM, logs, process collection, synthetics, and webhooks. Useful for firewall/allowlist configuration. |
| get_logs_pipeline_order | Get the ordered list of log pipeline IDs, showing the processing order for log pipelines. |
| get_logs_archive_order | Get the ordered list of log archive IDs, showing the priority order for log archiving. |
| search_slos | Search and filter SLOs by query string. Supports pagination and faceted search for finding specific SLOs by name, tags, or other attributes. |
| get_estimated_cost | Get estimated cost data for your Datadog usage. Filter by date range and view type (sub_org, summary). Useful for cost monitoring and budget planning. |
| list_synthetics_global_variables | List all Synthetics global variables used across synthetic tests for shared configuration like URLs, credentials, and test data. |
| list_downtime_schedules | List scheduled downtimes (v2 API). Filter by current/upcoming schedules. Shows muted monitors, scopes, and schedule details. |
| list_spans_metrics | List all span-based metrics (APM custom metrics) configured for generating metrics from APM spans. |
| get_monitor_config_policies | Get monitor configuration policies that enforce tag and setting requirements on monitors across your organization. |
| list_synthetics_locations | List available Synthetics testing locations (both managed by Datadog and private). Useful for configuring where synthetic tests run. |
| list_logs_metrics | List all log-based metrics configured for generating custom metrics from log data. |
| get_organization | Get your Datadog organization info including name, plan, public ID, and settings. Essential for understanding account configuration. |
| list_host_totals | Get the total number of active and up hosts in your Datadog account. Quick health check for infrastructure scale. |
| list_webhooks | List all configured webhook integrations. Useful for auditing alert routing and notification channels. |
| list_synthetics_private_locations | List Synthetics private locations for internal testing. Shows private location IDs, names, and tags. |
| list_security_monitoring_rules | List security monitoring detection rules with pagination. Shows enabled/disabled rules, names, and types. |
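MCP clients invoke the tools above through JSON-RPC `tools/call` requests. The sketch below builds two such payloads: a metric search using the `q` parameter named in the table, and an event query over the previous 24 hours. The argument names `start` and `end`, and the use of seconds rather than milliseconds, are assumptions inferred from the description "Requires start/end as Unix timestamps"; the server's actual input schema should be checked via `tools/list`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Tuple


def tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call request as defined by the MCP specification."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def last_day_window(now: datetime) -> Tuple[int, int]:
    """Return (start, end) Unix timestamps in seconds covering the 24 hours before `now`."""
    end = int(now.timestamp())
    start = int((now - timedelta(days=1)).timestamp())
    return start, end


# Metric discovery: `q` is the parameter documented for get-metrics.
metrics_request = tool_call(1, "get-metrics", {"q": "aws.ec2"})

# Event query: `start`/`end` argument names are assumed, not confirmed.
start, end = last_day_window(datetime(2024, 1, 2, tzinfo=timezone.utc))
events_request = tool_call(2, "get-events", {"start": start, "end": end})
```

A fixed `now` is used here so the example is reproducible; a live client would pass `datetime.now(timezone.utc)` instead.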
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/dreamiurg/datadog-mcp'
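The same request can be issued from Python's standard library. This is a minimal sketch: `server_url` and `fetch_server` are hypothetical helper names, and treating the response body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, repo: str) -> str:
    """Build the directory API URL for a server, following the curl example."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_server(owner: str, repo: str, timeout: float = 10.0) -> Any:
    """Send a GET request for a server entry and decode the body as JSON (assumed format)."""
    request = urllib.request.Request(
        server_url(owner, repo),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("dreamiurg", "datadog-mcp")` performs the network request for this server's entry.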