# AWS Billing MCP Server Design Document
## Overview
The AWS Billing MCP Server is a Node.js-based Model Context Protocol server optimized for Claude Desktop personal use. It provides comprehensive tools to analyze AWS billing data through natural language queries. The system uses a simplified architecture with direct environment variable credential access and stdio-based MCP communication.
## Architecture
The system uses a simplified modular architecture optimized for Claude Desktop:
```
┌─────────────────────────────────────────┐
│ MCP Protocol Layer (stdio) │
├─────────────────────────────────────────┤
│ Business Logic Layer │
├─────────────────────────────────────────┤
│ Data Access Layer │
├─────────────────────────────────────────┤
│ AWS Integration Layer (env vars) │
└─────────────────────────────────────────┘
```
### Key Architectural Decisions
1. **MCP Protocol Implementation**: Using the official MCP TypeScript SDK (`@modelcontextprotocol/sdk`) with stdio transport for Claude Desktop
2. **Authentication**: Disabled by default for Claude Desktop personal use (optional Google OAuth2 for enterprise)
3. **Data Storage**: SQLite for billing data caching only (no credential storage)
4. **AWS Integration**: AWS SDK v3 with direct environment variable credential access
5. **Concurrency**: Async/await patterns with simplified connection handling
6. **Simplification**: No HTTP endpoints, no port conflicts, pure MCP stdio communication
## Components and Interfaces
### MCP Server Core (`MCPServer`)
- Handles MCP protocol communication
- Manages tool registration and execution
- Coordinates between authentication and business logic layers
### Authentication Manager (`AuthManager`)
- Optional Google OAuth2 flow (disabled by default for Claude Desktop)
- Manages JWT token lifecycle when authentication is enabled
- Validates user permissions and sessions for enterprise use
### AWS Billing Client (`AWSBillingClient`)
- Encapsulates AWS Cost Explorer API interactions
- Implements retry logic and rate limiting
- Uses environment variables directly (no credential storage/encryption)
- Falls back automatically to mock data when credentials are not provided
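The credential behavior described above (environment variables only, mock data as a fallback) could look roughly like the following. This is a minimal sketch, not the server's actual code: `resolveCredentialSource`, `CredentialSource`, and `AwsEnvCredentials` are hypothetical names, and the `us-east-1` default region is an assumption. The environment variable names are the standard ones read by the AWS SDK.

```typescript
// Hypothetical shapes for illustration; the design does not define these types.
interface AwsEnvCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string;
  region: string;
}

type CredentialSource =
  | { mode: "aws"; credentials: AwsEnvCredentials }
  | { mode: "mock"; reason: string };

// Reads the standard AWS environment variables from the supplied map.
// Nothing is persisted or encrypted; the values are used as given.
function resolveCredentialSource(
  env: Record<string, string | undefined>
): CredentialSource {
  const accessKeyId = env["AWS_ACCESS_KEY_ID"];
  const secretAccessKey = env["AWS_SECRET_ACCESS_KEY"];

  // Missing or empty credentials trigger the mock-data fallback.
  if (!accessKeyId || !secretAccessKey) {
    return {
      mode: "mock",
      reason: "AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY is not set",
    };
  }

  const credentials: AwsEnvCredentials = {
    accessKeyId,
    secretAccessKey,
    // Assumption: fall back to us-east-1 when AWS_REGION is unset.
    region: env["AWS_REGION"] ?? "us-east-1",
  };
  const sessionToken = env["AWS_SESSION_TOKEN"];
  if (sessionToken) {
    credentials.sessionToken = sessionToken;
  }
  return { mode: "aws", credentials };
}
```

At startup the server would presumably call `resolveCredentialSource(process.env)` once and pick the real or mock client based on `mode`. Taking the environment as a parameter keeps the function easy to test.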
### Data Repository (`BillingDataRepository`)
- Manages local SQLite database operations
- Implements caching strategies for billing data
- Provides query interface for cost analysis
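One possible caching strategy is a time-to-live (TTL) cache keyed by the query parameters. The sketch below uses an in-memory `Map` as a stand-in for the SQLite table so that it stays self-contained. `TtlCache`, `billingCacheKey`, and the key layout are hypothetical, and the TTL value is left to the caller.

```typescript
interface CacheEntry<T> {
  value: T;
  fetchedAt: number; // epoch milliseconds
}

// In-memory stand-in for the SQLite-backed cache (assumption for illustration).
class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  set(key: string, value: T, now: number): void {
    this.entries.set(key, { value, fetchedAt: now });
  }

  // Returns the cached value, or undefined when missing or older than the TTL.
  get(key: string, now: number): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (now - entry.fetchedAt > this.ttlMs) {
      this.entries.delete(key); // expired entries are evicted lazily
      return undefined;
    }
    return entry.value;
  }
}

// Hypothetical key layout: one cache row per account, period, and granularity.
function billingCacheKey(
  accountId: string,
  start: string,
  end: string,
  granularity: string
): string {
  return [accountId, start, end, granularity].join("|");
}
```

Passing `now` explicitly, rather than calling `Date.now()` inside the cache, makes expiry behavior deterministic in tests.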
### Billing Analyzer (`BillingAnalyzer`)
- Implements cost analysis algorithms
- Performs trend analysis and anomaly detection
- Calculates cost comparisons and aggregations
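To make the trend calculation concrete, the following sketch compares the cost of two periods and classifies the change. It is an illustration under stated assumptions: `computeTrend` is a hypothetical helper, the 5% "stable" band is an arbitrary default, and the `confidence` field of the analysis result is not modeled here.

```typescript
type TrendDirection = "increasing" | "decreasing" | "stable";

interface Trend {
  direction: TrendDirection;
  // null when the previous period cost is zero and a percentage is undefined.
  changePercent: number | null;
}

// Assumption: changes within +/- stableThresholdPercent count as "stable".
function computeTrend(
  previousCost: number,
  currentCost: number,
  stableThresholdPercent: number = 5
): Trend {
  if (previousCost === 0) {
    // A percent change relative to zero is undefined, so only the direction is reported.
    const direction: TrendDirection = currentCost > 0 ? "increasing" : "stable";
    return { direction, changePercent: null };
  }

  // Multiply before dividing to limit floating-point rounding error.
  const changePercent = ((currentCost - previousCost) * 100) / previousCost;

  let direction: TrendDirection = "stable";
  if (changePercent > stableThresholdPercent) {
    direction = "increasing";
  } else if (changePercent < -stableThresholdPercent) {
    direction = "decreasing";
  }
  return { direction, changePercent };
}
```

For example, `computeTrend(100, 120)` classifies a move from 100 to 120 as `increasing` with a 20% change, while `computeTrend(100, 98)` stays inside the default band and is reported as `stable`.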
### Tool Registry (`ToolRegistry`)
- Registers and manages MCP tools
- Handles tool parameter validation
- Routes tool calls to appropriate business logic
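The three responsibilities listed above (registration, parameter validation, routing) can be sketched without any SDK dependency. The class internals, the `ToolDefinition` shape, and the tool name used in the usage note are assumptions for illustration; they are not the MCP SDK's API, which in the real server would sit in front of a registry like this.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical tool description; a real server would likely use JSON Schema.
interface ToolDefinition {
  name: string;
  description: string;
  requiredParams: string[];
  handler: (args: ToolArgs) => unknown;
}

class ToolRegistry {
  private readonly tools = new Map<string, ToolDefinition>();

  // Registers a tool, rejecting duplicate names.
  register(tool: ToolDefinition): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Validates required parameters, then routes the call to the tool's handler.
  call(name: string, args: ToolArgs): unknown {
    const tool = this.tools.get(name);
    if (tool === undefined) {
      throw new Error(`Unknown tool: ${name}`);
    }
    const missing = tool.requiredParams.filter((p) => args[p] === undefined);
    if (missing.length > 0) {
      throw new Error(
        `Missing required parameters for ${name}: ${missing.join(", ")}`
      );
    }
    return tool.handler(args);
  }

  list(): string[] {
    return Array.from(this.tools.keys());
  }
}
```

A tool such as a hypothetical `get_service_cost` would be registered once at startup with `requiredParams: ["service"]`, and every incoming tool call would then go through `call`, so that validation errors are raised in one place.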
## Data Models
### Billing Record
```typescript
interface BillingRecord {
  id: string;
  accountId: string;
  service: string;
  region: string;
  usageType: string;
  cost: number;
  currency: string;
  startDate: Date;
  endDate: Date;
  tags: Record<string, string>;
  createdAt: Date;
  updatedAt: Date;
}
```
### Cost Analysis Result
```typescript
interface CostAnalysisResult {
  totalCost: number;
  currency: string;
  period: {
    start: Date;
    end: Date;
  };
  breakdown: {
    service: string;
    cost: number;
    percentage: number;
  }[];
  trends: {
    direction: 'increasing' | 'decreasing' | 'stable';
    changePercent: number;
    confidence: number;
  };
}
```
### Authentication Session
```typescript
interface AuthSession {
  userId: string;
  email: string;
  accessToken: string;
  refreshToken: string;
  expiresAt: Date;
  permissions: string[];
}
```
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system; essentially, it is a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
Several properties derived from the acceptance criteria overlap and can be consolidated to eliminate redundancy:
**Property Reflection:**
- Properties 1.1 and 1.3 can be combined into a comprehensive credential validation property
- Properties 2.2 and 2.5 both deal with data formatting and can be consolidated
- Properties 3.2, 5.1, 5.2 all deal with query filtering and can be combined
- Properties 4.2, 4.3, 4.4 all deal with authentication state management and can be consolidated
- Properties 6.1, 6.2, 6.3 all deal with error handling and can be combined
- Properties 7.1, 7.2, 7.3, 7.4 all deal with logging and can be consolidated
### Core Correctness Properties
**Property 1: Credential validation consistency**
*For any* AWS credential input (valid or invalid), the validation process should return consistent results based on the credential format and AWS API response, with proper error messages for invalid credentials and successful validation for valid ones
**Validates: Requirements 1.1, 1.3**
**Property 2: Credential encryption at rest**
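*For any* AWS credentials stored by the system, the stored data should be encrypted and not readable in plaintext. In the default Claude Desktop configuration, credentials are read from environment variables and never stored, so this property applies only to deployments that persist credentials.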
**Validates: Requirements 1.4**
**Property 3: Multi-account unique identification**
*For any* set of AWS account configurations, each account should have a unique identifier and credentials should not conflict
**Validates: Requirements 1.5**
**Property 4: Data structure consistency**
*For any* billing data retrieved from AWS, the parsed and cached data should maintain consistent structure and be queryable through the analysis interface
**Validates: Requirements 2.2, 2.5**
**Property 5: Retry logic consistency**
*For any* AWS API failure scenario, the system should implement exponential backoff and retry logic consistently
**Validates: Requirements 2.3, 6.1**
**Property 6: Query filtering accuracy**
*For any* billing query with filters (time period, service, region), the returned results should contain only data matching all specified filter criteria
**Validates: Requirements 3.2, 5.1, 5.2**
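Property 6 can be read as a contract on a filtering function: every returned record satisfies all supplied criteria, and omitted criteria do not restrict the result. The sketch below is one way to express that. `BillingFilter` and `filterRecords` are hypothetical names, and treating a record as "in the period" only when it lies entirely within it is an assumption.

```typescript
// Subset of the BillingRecord fields needed for filtering.
interface FilterableRecord {
  service: string;
  region: string;
  cost: number;
  startDate: Date;
  endDate: Date;
}

// Hypothetical filter shape; every field is optional.
interface BillingFilter {
  service?: string;
  region?: string;
  periodStart?: Date;
  periodEnd?: Date;
}

// Returns only the records that match ALL criteria present in the filter.
function filterRecords<T extends FilterableRecord>(
  records: T[],
  filter: BillingFilter
): T[] {
  const { service, region, periodStart, periodEnd } = filter;
  return records.filter(
    (r) =>
      (service === undefined || r.service === service) &&
      (region === undefined || r.region === region) &&
      // Assumption: a record must fall entirely inside the requested period.
      (periodStart === undefined ||
        r.startDate.getTime() >= periodStart.getTime()) &&
      (periodEnd === undefined || r.endDate.getTime() <= periodEnd.getTime())
  );
}
```

An empty filter returns every record, which is a useful base case for a property test: adding a criterion should never increase the number of results.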
**Property 7: Cost calculation accuracy**
*For any* cost comparison or trend analysis request, the calculated percentages and differences should be mathematically correct based on the input data
**Validates: Requirements 3.3, 5.3**
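As an illustration of what "mathematically correct" means for the breakdown, the sketch below aggregates costs by service and derives each service's share of the total. `buildBreakdown` and `ServiceCost` are hypothetical names. Reporting 0% for every service when the total is zero is an assumption made to avoid a division by zero.

```typescript
interface ServiceCost {
  service: string;
  cost: number;
}

interface BreakdownEntry {
  service: string;
  cost: number;
  percentage: number;
}

// Aggregates line items by service and computes each service's share of the total.
function buildBreakdown(items: ServiceCost[]): {
  totalCost: number;
  breakdown: BreakdownEntry[];
} {
  const totals = new Map<string, number>();
  for (const item of items) {
    totals.set(item.service, (totals.get(item.service) ?? 0) + item.cost);
  }

  let totalCost = 0;
  totals.forEach((cost) => {
    totalCost += cost;
  });

  const breakdown: BreakdownEntry[] = [];
  totals.forEach((cost, service) => {
    breakdown.push({
      service,
      cost,
      // Assumption: with a zero total, every share is reported as 0%.
      percentage: totalCost === 0 ? 0 : (cost * 100) / totalCost,
    });
  });

  // Largest cost first, matching how a summary would usually be presented.
  breakdown.sort((a, b) => b.cost - a.cost);
  return { totalCost, breakdown };
}
```

Two invariants follow directly and are natural property-test targets: the percentages sum to 100 (within floating-point tolerance) whenever the total is non-zero, and the entry costs sum to `totalCost`.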
**Property 8: Anomaly detection consistency**
*For any* billing data set with defined baselines, anomaly detection should consistently identify deviations beyond the configured threshold
**Validates: Requirements 3.5**
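The document does not specify the detection algorithm, so the following is only one plausible reading of "deviations beyond the configured threshold": a z-score test against the baseline mean and population standard deviation. `isAnomaly` and the default threshold of 3 are assumptions.

```typescript
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Population standard deviation of the baseline values.
function standardDeviation(values: number[]): number {
  const m = mean(values);
  const variance = mean(values.map((v) => (v - m) * (v - m)));
  return Math.sqrt(variance);
}

// Flags `value` when it lies more than `zThreshold` standard deviations
// from the baseline mean. Assumption: z-score detection with a default of 3.
function isAnomaly(
  baseline: number[],
  value: number,
  zThreshold: number = 3
): boolean {
  if (baseline.length < 2) {
    return false; // not enough history to define a baseline
  }
  const m = mean(baseline);
  const sd = standardDeviation(baseline);
  if (sd === 0) {
    return value !== m; // a perfectly flat baseline: any change is a deviation
  }
  return Math.abs(value - m) / sd > zThreshold;
}
```

Because the function is pure, the consistency requirement in Property 8 reduces to determinism: the same baseline, value, and threshold always produce the same verdict.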
**Property 9: Authentication state management**
*For any* authentication event (success, failure, expiration), the system should maintain consistent session state and require appropriate authentication actions
**Validates: Requirements 4.2, 4.3, 4.4**
**Property 10: Authorization validation**
*For any* user permission scenario, access control should be consistently enforced based on the configured permission rules
**Validates: Requirements 4.5**
**Property 11: Response format consistency**
*For any* billing data returned to LLM agents, the response should follow the structured JSON format specification
**Validates: Requirements 5.5**
**Property 12: Error handling consistency**
*For any* error condition (rate limits, network failures, invalid parameters), the system should handle errors gracefully with appropriate logging and user-friendly messages
**Validates: Requirements 6.1, 6.2, 6.3**
**Property 13: Comprehensive logging**
*For any* system operation (requests, errors, API calls, authentication events), appropriate log entries should be generated with required metadata
**Validates: Requirements 7.1, 7.2, 7.3, 7.4**
## Error Handling
The system implements comprehensive error handling across all layers:
### AWS Integration Errors
- **Rate Limiting**: Exponential backoff with jitter (base delay: 1s, max delay: 60s)
- **Network Failures**: Circuit breaker pattern with fallback to cached data
- **Authentication Errors**: Secure credential rotation and re-authentication flows
- **API Errors**: Structured error responses with correlation IDs for debugging
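The rate-limiting policy above (exponential backoff with jitter, 1s base delay, 60s cap) can be sketched as follows. The document does not say which jitter variant is used. This example assumes "full jitter", where the delay is drawn uniformly between zero and the capped exponential ceiling. `computeBackoffDelay` and `withRetry` are hypothetical names.

```typescript
const BASE_DELAY_MS = 1000; // base delay: 1s
const MAX_DELAY_MS = 60000; // max delay: 60s

// Delay before retry number `attempt` (0-based), using full jitter:
// a uniform draw from [0, min(MAX, BASE * 2^attempt)).
function computeBackoffDelay(
  attempt: number,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Retries `operation` up to `maxAttempts` times, sleeping between failures.
// `sleep` is injected so tests can run without real delays.
// Note: a real client would retry only throttling and transient errors.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  let lastError: unknown = new Error("withRetry called with maxAttempts < 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await sleep(computeBackoffDelay(attempt));
      }
    }
  }
  throw lastError;
}
```

Injecting `random` and `sleep` keeps the schedule deterministic under test, which matters for Property 5: a property test can then check that no computed delay ever exceeds the 60s cap.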
### Authentication Errors
- **OAuth Failures**: Graceful redirect to authentication flow with error context
- **Token Expiration**: Automatic refresh with fallback to re-authentication
- **Permission Denied**: Clear error messages without exposing system internals
- **Session Management**: Secure cleanup of expired sessions
### Data Processing Errors
- **Parsing Failures**: Validation with detailed error reporting for malformed data
- **Cache Errors**: Fallback to direct AWS API calls with performance logging
- **Query Errors**: Parameter validation with helpful error messages
- **Analysis Errors**: Graceful degradation with partial results when possible
## Testing Strategy
The testing approach combines unit testing and property-based testing to ensure comprehensive coverage:
### Unit Testing Framework
- **Framework**: Jest with TypeScript support
- **Coverage**: Minimum 80% code coverage for all modules
- **Mocking**: AWS SDK mocking using aws-sdk-client-mock
- **Integration**: Testcontainers for database integration tests
### Property-Based Testing Framework
- **Framework**: fast-check for TypeScript property-based testing
- **Configuration**: Minimum 100 iterations per property test
- **Generators**: Custom generators for AWS billing data, credentials, and query parameters
- **Shrinking**: Automatic test case minimization for failure analysis
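To show the shape of a property test without pulling in a dependency, the sketch below hand-rolls the loop that fast-check would normally provide: a seeded generator, 100 runs, and a returned counterexample on failure. In the real suite this loop would be replaced by `fc.assert(fc.property(...), { numRuns: 100 })`, which also handles shrinking. The `percentages` function, the generator ranges, and the seed are assumptions made for illustration.

```typescript
// Small deterministic PRNG (mulberry32) so failures can be reproduced from the seed.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// System under test: each cost's share of the total, in percent.
function percentages(costs: number[]): number[] {
  const total = costs.reduce((sum, c) => sum + c, 0);
  return costs.map((c) => (total === 0 ? 0 : (c * 100) / total));
}

interface PropertyResult {
  passed: boolean;
  runs: number;
  counterexample?: number[];
}

// Property: for any non-empty list of positive costs, the shares sum to 100
// and each share lies in [0, 100], within floating-point tolerance.
function checkPercentagesProperty(
  seed: number,
  numRuns: number = 100
): PropertyResult {
  const random = mulberry32(seed);
  for (let run = 0; run < numRuns; run++) {
    const length = 1 + Math.floor(random() * 20);
    const costs: number[] = [];
    for (let i = 0; i < length; i++) {
      costs.push(0.01 + random() * 10000);
    }
    const shares = percentages(costs);
    const sum = shares.reduce((s, p) => s + p, 0);
    const ok =
      Math.abs(sum - 100) < 1e-6 &&
      shares.every((p) => p >= 0 && p <= 100 + 1e-9);
    if (!ok) {
      return { passed: false, runs: run + 1, counterexample: costs };
    }
  }
  return { passed: true, runs: numRuns };
}
```

The tolerance in the assertions is deliberate: floating-point rounding can push a single share a hair above 100 or leave the sum slightly off 100, so an exact comparison would report false failures.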
### Testing Requirements
- Each correctness property must be implemented by a single property-based test
- Property-based tests must be tagged with the format: `**Feature: aws-billing-mcp-server, Property {number}: {property_text}**`
- Unit tests focus on specific examples, edge cases, and integration points
- Property tests verify universal properties across all valid inputs
- Both test types are complementary and required for comprehensive validation
### Test Data Management
- **Synthetic Data**: Generated test data matching AWS billing API schemas
- **Fixtures**: Predefined test scenarios for edge cases and error conditions
- **Isolation**: Each test runs with isolated database and AWS mock state
- **Cleanup**: Automatic cleanup of test resources and temporary data