# Telemetry Event Format Specification
**Document Version**: 1.0
**Last Updated**: September 20, 2025
**Related Tickets**: DXP-38, DXP-34, DXP-35, DXP-37
## Overview
This document defines the standardized event format for telemetry data generated by the Jaxon Digital Optimizely DXP MCP Server. All telemetry events are sent to analytics endpoints for aggregation and analysis.
## Design Principles
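1. **Privacy First**: All data is anonymized: no personal information, credentials, or project-specific data is collected
2. **Flat Structure**: AI client and location data use flat field names (for example `ai_client_version`, `location_country`) rather than nested objects, for consistent database storage; only the `event` and `summary` detail payloads are nested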
3. **Required Fields**: All events must include core identification and timestamp fields
4. **Backward Compatibility**: Legacy field names maintained where possible
5. **Validation**: All events validated before transmission
## Common Field Specifications
### Core Required Fields
All telemetry events **MUST** include these fields:
| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| `type` | string | ✅ | Event type identifier | `"session_start"`, `"tool_invocation"`, `"session_end"` |
| `timestamp` | string (ISO8601) | ✅ | Event timestamp in UTC | `"2025-09-20T12:34:56.789Z"` |
| `session_id` | string | ✅ | Anonymous session identifier (16 char hex) | `"36af2d17e6660256"` |
| `platform` | string | ✅ | Operating system platform | `"darwin"`, `"win32"`, `"linux"` |
### AI Client Fields
All events **MUST** include AI client identification:
| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| `ai_client` | string | ✅ | AI client name | `"claude_code"`, `"chatgpt"`, `"cursor"` |
| `ai_client_version` | string | ✅ | AI client version | `"1.0.0"`, `"4.0.2"` |
### Location Fields
All events **MUST** include geographic location (privacy-safe):
| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| `location_region` | string | ✅ | Geographic region | `"America"`, `"Europe"`, `"Asia"` |
| `location_timezone` | string | ✅ | System timezone | `"America/New_York"`, `"Europe/London"` |
| `location_country` | string | ✅ | Country code (ISO 3166-1 alpha-2) | `"US"`, `"GB"`, `"FR"` |
### Optional Common Fields
| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| `source` | string | ⚪ | Event source identifier | `"dxp-mcp"` |
| `version` | string | ⚪ | MCP server version | `"3.25.8"` |
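
The tables above can be read as a single type. The following is a minimal sketch, not the server's actual code: the field names come from the tables, while `EventType`, `CommonFields`, `SessionContext`, and `buildCommonFields` are hypothetical names, and treating every field except `type` and `timestamp` as constant for a session is an assumption.

```typescript
// Field names are from the tables above; all type and function names are hypothetical.
type EventType = "session_start" | "tool_invocation" | "tool_error" | "session_end";

interface CommonFields {
  type: EventType;
  timestamp: string;         // ISO8601, UTC
  session_id: string;        // 16-character hex
  platform: string;          // e.g. "darwin", "win32", "linux"
  ai_client: string;
  ai_client_version: string;
  location_region: string;
  location_timezone: string;
  location_country: string;  // ISO 3166-1 alpha-2
  source?: string;           // optional
  version?: string;          // optional
}

// Assumption: everything except `type` and `timestamp` stays fixed during a session.
type SessionContext = Omit<CommonFields, "type" | "timestamp">;

// Stamps the per-event fields onto the shared session context.
function buildCommonFields(
  type: EventType,
  context: SessionContext,
  now: Date = new Date()
): CommonFields {
  return { ...context, type, timestamp: now.toISOString() };
}
```

`Date.prototype.toISOString()` always emits UTC with a trailing `Z`, which matches the timestamp format shown in the table.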
## Event Types
### 1. Session Start Event (`session_start`)
Sent when an MCP session begins.
#### Additional Required Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `tool_name` | null | Always null for session events | `null` |
| `duration_ms` | number | Always 0 for session start | `0` |
#### Additional Optional Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `event` | object | System information | See example below |
#### Example
```json
{
  "type": "session_start",
  "timestamp": "2025-09-20T12:34:56.789Z",
  "session_id": "36af2d17e6660256",
  "platform": "darwin",
  "ai_client": "claude_code",
  "ai_client_version": "1.0.0",
  "location_region": "America",
  "location_timezone": "America/New_York",
  "location_country": "US",
  "source": "dxp-mcp",
  "version": "3.25.8",
  "tool_name": null,
  "duration_ms": 0,
  "event": {
    "node_version": "v18.20.0",
    "platform": "darwin",
    "arch": "x64"
  }
}
```
### 2. Tool Invocation Event (`tool_invocation`)
Sent when a tool invocation completes successfully; failed invocations are reported as `tool_error` events instead.
#### Additional Required Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `tool_name` | string | Name of the invoked tool | `"list_deployments"` |
| `duration_ms` | number | Tool execution time in milliseconds | `1250` |
#### Additional Optional Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `project_name` | string | Anonymous project identifier | `"project_abc123"` |
| `environment` | string | Target environment | `"Production"`, `"Integration"` |
| `event` | object | Tool execution details | See example below |
#### Example
```json
{
  "type": "tool_invocation",
  "timestamp": "2025-09-20T12:35:10.123Z",
  "session_id": "36af2d17e6660256",
  "platform": "darwin",
  "ai_client": "claude_code",
  "ai_client_version": "1.0.0",
  "location_region": "America",
  "location_timezone": "America/New_York",
  "location_country": "US",
  "source": "dxp-mcp",
  "version": "3.25.8",
  "tool_name": "list_deployments",
  "duration_ms": 1250,
  "project_name": "project_abc123",
  "environment": "Production",
  "event": {
    "success": true,
    "parameters": {
      "environment": "Production",
      "limit": 10
    },
    "tool": "list_deployments"
  }
}
```
### 3. Tool Error Event (`tool_error`)
Sent when a tool invocation fails.
#### Additional Required Fields
Same as `tool_invocation` plus error fields:
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `error_type` | string | Categorized error type | `"API_ERROR"`, `"VALIDATION_ERROR"` |
| `error_code` | string | Specific error code | `"RATE_LIMITED"`, `"UNKNOWN"` |
#### Additional Optional Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `event.error_message` | string | Sanitized error message | `"Rate limit exceeded"` |
#### Example
```json
{
  "type": "tool_error",
  "timestamp": "2025-09-20T12:35:15.456Z",
  "session_id": "36af2d17e6660256",
  "platform": "darwin",
  "ai_client": "claude_code",
  "ai_client_version": "1.0.0",
  "location_region": "America",
  "location_timezone": "America/New_York",
  "location_country": "US",
  "tool_name": "deploy_package",
  "duration_ms": 500,
  "error_type": "API_ERROR",
  "error_code": "RATE_LIMITED",
  "event": {
    "success": false,
    "parameters": {
      "packagePath": "[SANITIZED]",
      "environment": "Production"
    },
    "tool": "deploy_package",
    "error_message": "Rate limit exceeded - retry after 60 seconds"
  }
}
```
### 4. Session End Event (`session_end`)
Sent when an MCP session terminates.
#### Additional Required Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `duration` | number | Total session duration in milliseconds | `45000` |
#### Additional Optional Fields
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `sessionId` | string | Legacy field for backward compatibility | `"36af2d17e6660256"` |
| `summary` | object | Session summary statistics | See example below |
#### Example
```json
{
  "type": "session_end",
  "timestamp": "2025-09-20T12:36:00.789Z",
  "session_id": "36af2d17e6660256",
  "sessionId": "36af2d17e6660256",
  "platform": "darwin",
  "ai_client": "claude_code",
  "ai_client_version": "1.0.0",
  "location_region": "America",
  "location_timezone": "America/New_York",
  "location_country": "US",
  "duration": 45000,
  "summary": {
    "toolsUsed": 5,
    "totalOperations": 12,
    "errorCount": 1,
    "topTools": [
      {"name": "list_deployments", "count": 4},
      {"name": "deploy_package", "count": 3}
    ],
    "environment": {
      "platform": "darwin",
      "nodeVersion": "v18.20.0",
      "arch": "x64"
    }
  }
}
```
## Session ID Generation
Session IDs are generated using a deterministic algorithm (DXP-35) to ensure consistency across MCP server restarts:
1. **Machine Identifier**: Hardware UUID (macOS), machine-id (Linux), registry GUID (Windows)
2. **User Context**: Home directory path
3. **Daily Rotation**: Current date (YYYY-MM-DD)
4. **Salt**: Fixed string "optimizely-dxp-mcp"
5. **Hash**: SHA-256 hash truncated to 16 hexadecimal characters
This approach provides:
- **Stability**: Same session ID within a day for the same user/machine
- **Privacy**: No personally identifiable information
- **Rotation**: Daily changes prevent long-term tracking
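
The five steps above can be sketched as follows. This is an illustration under stated assumptions rather than the shipped algorithm: `generateSessionId` is a hypothetical name, the machine identifier and home directory are passed in by the caller, and the concatenation order, the `|` separator, and the use of the UTC date are all guesses. Only the salt, the SHA-256 hash, and the 16-character truncation come from this specification.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper. Input order, the "|" separator, and the UTC date
// are assumptions; the salt, SHA-256, and 16-char truncation are from the spec.
function generateSessionId(machineId: string, homeDir: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  const salt = "optimizely-dxp-mcp";
  return createHash("sha256")
    .update([machineId, homeDir, day, salt].join("|"))
    .digest("hex")   // lowercase hexadecimal
    .slice(0, 16);   // truncate to 16 hex characters
}
```

Because every input is stable within a calendar day, two calls on the same day return the same ID, and the ID changes once the date component rolls over.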
## AI Client Detection
The system automatically detects AI clients based on environment variables:
| AI Client | Detection Method | Environment Variables |
|-----------|------------------|----------------------|
| Claude Code | Primary detection | `CLAUDECODE`, `CLAUDE_CODE_ENTRYPOINT`, `CLAUDE_CODE_SSE_PORT` |
| Claude Desktop | Legacy detection | `CLAUDE_DESKTOP`, `CLAUDE_APP` |
| ChatGPT | OpenAI detection | `OPENAI_API_KEY`, `CHATGPT_AGENT`, `OPENAI_ORG_ID` |
| GitHub Copilot | Microsoft detection | `GITHUB_COPILOT`, `COPILOT_AGENT` |
| Cursor | Cursor IDE | `CURSOR`, `CURSOR_AGENT`, `CURSOR_IDE` |
| Windsurf | Windsurf IDE | `WINDSURF`, `WINDSURF_AGENT`, `WINDSURF_IDE` |
| Generic MCP | Fallback | `MCP_SERVER_NAME`, `MCP_CLIENT` |
| Unknown | Default | None of the above |
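
A table-driven lookup is one way to express the detection rules above. In this sketch the environment variable names are copied from the table, but `detectAiClient`, the check order (table order, first match wins), and the returned identifiers other than `claude_code`, `chatgpt`, and `cursor` are assumptions. Treating an empty string as "not set" is also an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Variable names are from the table; identifiers such as "claude_desktop",
// "github_copilot", "windsurf", "generic_mcp", and "unknown" are hypothetical.
const CLIENT_RULES: Array<[string, string[]]> = [
  ["claude_code", ["CLAUDECODE", "CLAUDE_CODE_ENTRYPOINT", "CLAUDE_CODE_SSE_PORT"]],
  ["claude_desktop", ["CLAUDE_DESKTOP", "CLAUDE_APP"]],
  ["chatgpt", ["OPENAI_API_KEY", "CHATGPT_AGENT", "OPENAI_ORG_ID"]],
  ["github_copilot", ["GITHUB_COPILOT", "COPILOT_AGENT"]],
  ["cursor", ["CURSOR", "CURSOR_AGENT", "CURSOR_IDE"]],
  ["windsurf", ["WINDSURF", "WINDSURF_AGENT", "WINDSURF_IDE"]],
  ["generic_mcp", ["MCP_SERVER_NAME", "MCP_CLIENT"]],
];

// Returns the first client whose rule has at least one non-empty variable set.
function detectAiClient(env: Env): string {
  for (const [client, vars] of CLIENT_RULES) {
    if (vars.some((varName) => Boolean(env[varName]))) return client;
  }
  return "unknown"; // none of the variables above were set
}
```

In a Node.js process the function would typically be called with `process.env`; taking the environment as a parameter keeps the lookup easy to test.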
## Geographic Location Detection
Location information is derived from system settings (privacy-safe):
1. **Timezone**: From `Intl.DateTimeFormat().resolvedOptions().timeZone`
2. **Locale**: From system locale settings
3. **Country Code**: Extracted from locale (e.g., "US" from "en-US")
4. **Region**: Continent/region from timezone (e.g., "America" from "America/New_York")
**Privacy Note**: No IP addresses, GPS coordinates, or precise locations are collected.
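
As a rough sketch of the derivation above, the region and country can be computed from the timezone and locale strings alone. `deriveLocation` and `detectLocation` are hypothetical names, the `"Unknown"` and `"XX"` fallbacks are assumptions, and reading the locale from `Intl.DateTimeFormat` is one possible source for "system locale settings", not necessarily the one the server uses.

```typescript
interface LocationFields {
  location_region: string;
  location_timezone: string;
  location_country: string;
}

// Hypothetical helper; the "Unknown" and "XX" fallbacks are assumptions.
function deriveLocation(timeZone: string, locale: string): LocationFields {
  // "America/New_York" -> "America"
  const slash = timeZone.indexOf("/");
  const region = slash > 0 ? timeZone.slice(0, slash) : "Unknown";

  // "en-US" -> "US": pick a two-letter subtag that follows "-" or "_"
  const match = /[-_]([A-Za-z]{2})(?:$|[-_.])/.exec(locale);
  const country = (match?.[1] ?? "XX").toUpperCase();

  return {
    location_region: region,
    location_timezone: timeZone,
    location_country: country,
  };
}

// Reads both inputs from the JavaScript runtime; no network access is involved.
function detectLocation(): LocationFields {
  const { timeZone, locale } = new Intl.DateTimeFormat().resolvedOptions();
  return deriveLocation(timeZone, locale);
}
```

Locales without a region subtag (for example `"en"`) fall through to the placeholder country code in this sketch.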
## Validation Rules
### Event Validation
All events are validated before transmission:
1. **Required Fields**: All required fields must be present and non-null (the exception is `tool_name`, which is `null` for session events)
2. **Type Validation**: Fields must match expected data types
3. **Format Validation**: Timestamps must be valid ISO8601, session_id must be 16-char hex
4. **Sanitization**: Parameter values and error messages are sanitized to remove sensitive data
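
The sanitization step might look like the sketch below. This specification does not list which parameters are redacted, so the key pattern here is purely an assumption chosen to reproduce the `tool_error` example, where `packagePath` becomes `"[SANITIZED]"` and `environment` is kept. `sanitizeParameters` is a hypothetical name.

```typescript
// Assumption: parameters whose names suggest paths or credentials are redacted.
// The real rule set is not documented in this specification.
const SENSITIVE_KEY = /(path|key|secret|token|password)/i;

// Returns a copy of `params` with sensitive values replaced by a marker.
function sanitizeParameters(params: Record<string, unknown>): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};
  for (const key of Object.keys(params)) {
    sanitized[key] = SENSITIVE_KEY.test(key) ? "[SANITIZED]" : params[key];
  }
  return sanitized;
}
```

Returning a copy rather than mutating the input keeps the original tool parameters intact for the tool call itself.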
### Field Validation Rules
| Field | Validation Rule |
|-------|----------------|
| `type` | Must be one of: `session_start`, `tool_invocation`, `tool_error`, `session_end` |
| `timestamp` | Valid ISO8601 UTC timestamp |
| `session_id` | 16-character lowercase hexadecimal string |
| `platform` | Valid OS platform string (`darwin`, `win32`, `linux`, etc.) |
| `ai_client` | Non-empty string |
| `tool_name` | Non-empty string (null only for session events) |
| `duration_ms` | Non-negative integer |
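
The rules in the table can be checked with a small function such as the one below. It is a sketch, not the server's validation code: `validateEvent` is a hypothetical name, the timestamp pattern accepts only the `Z`-suffixed form used in the examples, `platform` is checked only for being a non-empty string, and `duration_ms` is treated as optional on session events because the `session_end` example omits it.

```typescript
const EVENT_TYPES = ["session_start", "tool_invocation", "tool_error", "session_end"];
const ISO_UTC = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;
const SESSION_ID = /^[0-9a-f]{16}$/;

// Returns the names of fields that break a rule; an empty list means all checks passed.
function validateEvent(ev: Record<string, unknown>): string[] {
  const errors: string[] = [];

  const type = ev["type"];
  if (typeof type !== "string" || EVENT_TYPES.indexOf(type) === -1) errors.push("type");

  const ts = ev["timestamp"];
  if (typeof ts !== "string" || !ISO_UTC.test(ts) || Number.isNaN(Date.parse(ts))) {
    errors.push("timestamp");
  }

  const sid = ev["session_id"];
  if (typeof sid !== "string" || !SESSION_ID.test(sid)) errors.push("session_id");

  for (const field of ["platform", "ai_client"]) {
    const value = ev[field];
    if (typeof value !== "string" || value.length === 0) errors.push(field);
  }

  // tool_name may be null (or absent) only on session events.
  const isSession = type === "session_start" || type === "session_end";
  const tool = ev["tool_name"];
  const toolOk =
    typeof tool === "string" ? tool.length > 0 : isSession && (tool === null || tool === undefined);
  if (!toolOk) errors.push("tool_name");

  // Assumption: duration_ms is mandatory for tool events, optional for session events.
  const dur = ev["duration_ms"];
  if (dur !== undefined || !isSession) {
    if (typeof dur !== "number" || !Number.isInteger(dur) || dur < 0) errors.push("duration_ms");
  }

  return errors;
}
```

Returning field names instead of throwing lets the caller decide whether to auto-correct, log, or drop the event.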
## Analytics Endpoints
### Primary Endpoint
- **URL**: `https://optimizely-mcp-analytics.vercel.app/api/telemetry/ingest`
- **Purpose**: Analytics dashboard with AI/geo visualization
- **Format**: Individual events as JSON
### Secondary Endpoint (Legacy)
- **URL**: `https://accelerator.jaxondigital.com/api/telemetry/mcp`
- **Purpose**: Aggregated statistics API
- **Format**: Batch events
## Error Handling
### Transmission Failures
When telemetry transmission fails:
1. **Retry Logic**: Automatic retry with exponential backoff (DXP-39 - Implemented)
   - Base delay: 1 second
   - Maximum delay: 60 seconds
   - Backoff multiplier: 2x
   - Jitter: ±25% to prevent thundering herd
   - Maximum retries: 3 per event
   - Retry interval: Every 30 seconds
2. **Local Storage**: Events stored locally in `os.tmpdir()/optimizely-mcp-telemetry/buffer.json`
   - Maximum buffer size: 1000 events
   - Persists across MCP server restarts
   - Includes retry tracking metadata
3. **Circuit Breaker**: Temporarily disables telemetry after repeated failures (DXP-39 - Implemented)
   - Opens after 5 consecutive failures
   - Reset timeout: 5 minutes
   - Events dropped while circuit is open
   - Automatic recovery after timeout
4. **Health Monitoring**: Enhanced monitoring and alerting (DXP-40 - Implemented)
   - Endpoint health checks every 5 minutes
   - System health checks every minute
   - Response time tracking and alerting
   - Memory and buffer usage monitoring
   - Health state persistence across restarts
5. **Graceful Degradation**: MCP server continues normal operation
   - Telemetry failures never impact core functionality
   - Silent failure with debug logging only
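
The retry and circuit breaker parameters above can be combined as in the following sketch. The numeric constants come from this section; `retryDelayMs`, `shouldRetry`, and `CircuitBreaker` are hypothetical names, and applying jitter before the 60-second cap (so the cap is never exceeded) is an assumption about the order of operations.

```typescript
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 60_000;
const MULTIPLIER = 2;
const JITTER = 0.25;
const MAX_RETRIES = 3;

// Delay before retry number `attempt` (1-based). `random` is injectable for tests.
function retryDelayMs(attempt: number, random: () => number = Math.random): number {
  const exponential = BASE_DELAY_MS * Math.pow(MULTIPLIER, attempt - 1);
  const factor = 1 + (random() * 2 - 1) * JITTER; // in [0.75, 1.25)
  return Math.min(Math.round(exponential * factor), MAX_DELAY_MS);
}

// An event is retried until it has used up its three attempts.
function shouldRetry(retryCount: number): boolean {
  return retryCount < MAX_RETRIES;
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly resetMs: number;

  constructor(threshold: number = 5, resetMs: number = 5 * 60_000) {
    this.threshold = threshold;
    this.resetMs = resetMs;
  }

  // True if an event may be sent at time `now` (milliseconds).
  canSend(now: number): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.resetMs) {
      this.openedAt = null; // reset timeout elapsed: recover automatically
      this.failures = 0;
      return true;
    }
    return false; // circuit open: the caller drops the event
  }

  recordSuccess(): void {
    this.failures = 0; // only consecutive failures count
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```

With no jitter, the first three retries wait 1, 2, and 4 seconds, so the 60-second cap only matters if the retry limit is raised.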
### Validation Failures
When event validation fails:
1. **Auto-correction**: Missing required fields added with defaults
2. **Logging**: Validation errors logged in debug mode
3. **Fallback Values**: Invalid fields replaced with safe defaults
4. **Continue Processing**: Invalid events don't block subsequent events
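
Auto-correction could be sketched as below. The only default this document names is `unknown_tool` (from the DXP-34 changelog entry); the `"unknown"` placeholder, the zero duration, and the function name `applyDefaults` are assumptions made for illustration.

```typescript
// Returns a corrected copy of the event. Only "unknown_tool" is documented;
// every other default here is an assumption.
function applyDefaults(
  ev: Record<string, unknown>,
  now: Date = new Date()
): Record<string, unknown> {
  const fixed: Record<string, unknown> = { ...ev };

  if (typeof fixed["timestamp"] !== "string") fixed["timestamp"] = now.toISOString();

  for (const field of ["platform", "ai_client", "ai_client_version"]) {
    const value = fixed[field];
    if (typeof value !== "string" || value.length === 0) fixed[field] = "unknown";
  }

  const isToolEvent = fixed["type"] === "tool_invocation" || fixed["type"] === "tool_error";
  if (isToolEvent) {
    const tool = fixed["tool_name"];
    if (typeof tool !== "string" || tool.length === 0) fixed["tool_name"] = "unknown_tool";

    const dur = fixed["duration_ms"];
    if (typeof dur !== "number" || !Number.isInteger(dur) || dur < 0) fixed["duration_ms"] = 0;
  }

  return fixed;
}
```

Because the function never throws and always returns an event, one malformed event cannot block the events queued behind it.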
## Recent Changes
### DXP-40 (v3.25.12)
- Implemented enhanced health monitoring and alerting system
- Added endpoint health checks with response time tracking
- Implemented system health monitoring (memory, buffer usage)
- Health state persistence across MCP server restarts
- Event-driven health alerts with configurable thresholds
### DXP-39 (v3.25.10)
- Implemented telemetry buffering with retry logic
- Added exponential backoff with jitter for failed events
- Implemented circuit breaker pattern to prevent cascading failures
- Events now persist to disk and retry automatically
- Maximum 3 retry attempts per event with increasing delays
### DXP-38 (v3.25.9)
- Created comprehensive telemetry event format specification
- Fixed missing platform field in session_end events
- Added validation test suite for event compliance
### DXP-37 (v3.25.7-3.25.8)
- Converted nested ai_client objects to flat fields
- Converted nested location objects to flat fields
- Ensured consistent structure across all event types
- Added project switching improvements
### DXP-35 (v3.25.6)
- Implemented stable session ID generation
- Session IDs now persist across MCP server restarts
- Daily rotation maintains privacy while enabling accurate user counting
### DXP-34 (v3.25.4-3.25.5)
- Fixed tool name validation blocking all telemetry
- Added fallback to 'unknown_tool' for missing tool names
- Improved debug logging for tool name tracking
## Implementation Notes
### For Developers
1. **Never log sensitive data**: All telemetry is anonymized by design
2. **Validate before sending**: Use provided validation functions
3. **Handle failures gracefully**: Telemetry failures should not impact core functionality
4. **Test event structure**: Verify events match this specification
### For Analytics Dashboard
1. **Expect flat structure**: No nested objects in ai_client or location fields
2. **Handle missing fields**: Provide defaults for optional fields
3. **Validate event types**: Only process known event types
4. **Support legacy fields**: Some fields maintained for backward compatibility
## Future Enhancements
Planned improvements (refer to DXP board for status):
- Enhanced error categorization
- Performance metrics collection
- Real-time analytics streaming
- Telemetry compression for large event batches
### Completed Enhancements
- ✅ **DXP-40**: Enhanced health monitoring and alerting (v3.25.12)
  - Endpoint health checks with response time tracking
  - System health monitoring (memory, buffer usage)
  - Event-driven health alerts
  - Health state persistence
- ✅ **DXP-39**: Telemetry buffering and retry logic (v3.25.10)
  - Exponential backoff with jitter
  - Circuit breaker pattern
  - Persistent event storage
  - Automatic retry mechanism
---
**Document Maintenance**: This specification should be updated whenever telemetry event structure changes. All changes must be backward compatible unless explicitly versioned.