# Architecture
This document describes the architecture of the GitHub MCP Control Plane.
## Overview
The GitHub MCP Control Plane is a stateless, serverless application built on Cloudflare Workers. It implements the Model Context Protocol (MCP) to provide secure, controlled access to GitHub operations.
## Design Principles
1. **Stateless**: Each request is independent; no in-memory state is maintained between requests
2. **Security-First**: Multi-layer security validation at every stage
3. **Idempotent**: All operations can be safely retried
4. **Observable**: Complete audit trail with correlation IDs
5. **Scalable**: Distributed rate limiting and zero-idle cost
## System Architecture
```
┌───────────────────────────────────────────────────────────────┐
│                         Client Layer                          │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   │
│   │   CLI    │   │  Web UI  │   │   Bot    │   │  Custom  │   │
│   └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘   │
└────────┼──────────────┼──────────────┼──────────────┼─────────┘
         │              │              │              │
         └──────────────┴───────┬──────┴──────────────┘
                                │
                                ▼
┌───────────────────────────────────────────────────────────────┐
│                   Cloudflare Workers Layer                    │
│   ┌───────────────────────────────────────────────────────┐   │
│   │  Entry Point (entrypoint.js)                          │   │
│   │  - Request routing                                    │   │
│   │  - Health checks                                      │   │
│   │  - Error handling                                     │   │
│   └───────────────────────────┬───────────────────────────┘   │
│                               │                               │
│                               ▼                               │
│   ┌───────────────────────────────────────────────────────┐   │
│   │  Request Handler (request-handler.js)                 │   │
│   │  - Authentication                                     │   │
│   │  - Rate limiting                                      │   │
│   │  - Validation pipeline                                │   │
│   │  - Tool orchestration                                 │   │
│   │  - Response formatting                                │   │
│   └───────────────────────────┬───────────────────────────┘   │
│                               │                               │
│              ┌────────────────┼────────────────┐              │
│              │                │                │              │
│              ▼                ▼                ▼              │
│         ┌──────────┐     ┌──────────┐     ┌──────────┐        │
│         │   Auth   │     │ Validate │     │  Tools   │        │
│         │  Layer   │     │  Layer   │     │  Layer   │        │
│         └────┬─────┘     └────┬─────┘     └────┬─────┘        │
│              │                │                │              │
└──────────────┼────────────────┼────────────────┼──────────────┘
               │                │                │
               ▼                ▼                ▼
        ┌────────────┐   ┌────────────┐   ┌────────────┐
        │ Cloudflare │   │ Cloudflare │   │ Cloudflare │
        │     KV     │   │ Analytics  │   │  Secrets   │
        └────────────┘   └────────────┘   └────────────┘
               │                                 │
               └────────────────┬────────────────┘
                                │
               ┌────────────────┼────────────────┐
               │                │                │
               ▼                ▼                ▼
         ┌───────────┐    ┌───────────┐    ┌───────────┐
         │  GitHub   │    │  GitHub   │    │  GitHub   │
         │    API    │    │  Actions  │    │    API    │
         │  (Read)   │    │ (Execute) │    │  (Write)  │
         └───────────┘    └───────────┘    └───────────┘
```
## Component Architecture
### 1. Entry Point (`entrypoint.js`)
**Responsibilities:**
- Receive incoming HTTP requests
- Parse request bodies
- Route to appropriate handlers
- Handle health check endpoints
- Global error handling
**Key Features:**
- Correlation ID generation for all requests
- Request logging with timing
- Structured error responses
### 2. Request Handler (`request-handler.js`)
**Responsibilities:**
- Orchestrate complete request flow
- Coordinate all validation and execution layers
- Format responses according to MCP protocol
**Request Flow:**
1. Rate limiting check
2. MCP request parsing
3. Authentication
4. Schema validation
5. Security scanning
6. Permission checking
7. Policy enforcement
8. Dependency checking
9. Tool execution
10. Audit logging
11. Response formatting
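The flow above can be read as an ordered pipeline in which any stage may stop the request before tool execution. The following is a minimal sketch of that idea, not the project's `request-handler.js`: `runPipeline`, the stage names, and the context fields are hypothetical.

```javascript
// Hypothetical sketch: run ordered stages and stop at the first failure.
// Each stage is a function (ctx) => ({ ok: true }) or ({ ok: false, error }).
// Stages are synchronous here for brevity; real stages that call KV or the
// GitHub API would be async and awaited.
function runPipeline(stages, ctx) {
  const completed = [];
  for (const [name, stage] of stages) {
    const result = stage(ctx);
    if (!result.ok) {
      // Short-circuit: later stages (such as tool execution) never run.
      return { ok: false, failedAt: name, error: result.error, completed };
    }
    completed.push(name);
  }
  return { ok: true, completed };
}

// Illustrative stages mirroring a few steps of the flow above.
const exampleStages = [
  ["rateLimit", (ctx) => ({ ok: ctx.requestsThisWindow < 100 })],
  ["authenticate", (ctx) =>
    ctx.token ? { ok: true } : { ok: false, error: "missing token" }],
  ["executeTool", (ctx) => {
    ctx.executed = true;
    return { ok: true };
  }],
];
```

Calling `runPipeline(exampleStages, { requestsThisWindow: 1, token: null })` stops at `authenticate`, so the tool stage is never reached.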
### 3. Authentication Layer
#### Token Handler (`auth/token-handler.js`)
- JWT token validation
- GitHub token extraction
- Token expiration checking
- Token caching
#### Permission Checker (`auth/permission-checker.js`)
- Repository access verification
- Branch permission checking
- Organization permission verification
- Permission caching
### 4. Validation Layer
#### Schema Validator (`validation/schema-validator.js`)
- JSON schema validation using AJV
- Schema compilation and caching
- Detailed error reporting
- Schema reloading support
#### Security Scanner (`validation/security-scanner.js`)
- Secret detection (regex + entropy)
- Suspicious code pattern detection
- Vulnerability signature matching
- Multi-layer security checks
#### Policy Enforcer (`validation/policy-enforcer.js`)
- Repository-level access control
- Branch protection rules
- File path restrictions
- User-based permissions
- Time-based restrictions
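As an illustration of how allow/block rules like these might be evaluated, the sketch below matches a repository, branch, or path against glob-style patterns. The `*` wildcard syntax, the `isAllowed` helper, and the rule that block patterns win over allow patterns are assumptions, not documented behavior of `policy-enforcer.js`.

```javascript
// Hypothetical sketch: glob-style allow/block matching.
// "*" matches any run of characters, including "/".
function globToRegExp(glob) {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function isAllowed(value, { allow = ["*"], block = [] } = {}) {
  const matchesAny = (patterns) =>
    patterns.some((pattern) => globToRegExp(pattern).test(value));
  // Assumption: a block match always overrides an allow match.
  if (matchesAny(block)) return false;
  return matchesAny(allow);
}
```

For example, with `allow: ["acme/*"]` and `block: ["acme/secret-*"]`, the repository `acme/api` passes while `acme/secret-keys` is rejected.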
#### Dependency Checker (`validation/dependency-checker.js`)
- OSV.dev integration for vulnerability checking
- Multi-package manager support (npm, pip, maven)
- Caching for performance
- Detailed vulnerability reports
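OSV.dev exposes a `POST /v1/query` endpoint that takes a package name, ecosystem, and version. A minimal sketch of building and sending that query follows; the helper names and the mapping from package manager to OSV ecosystem are assumptions for illustration, and caching is omitted.

```javascript
// Assumed mapping from the package managers listed above to OSV ecosystems.
const OSV_ECOSYSTEMS = { npm: "npm", pip: "PyPI", maven: "Maven" };

// Build the JSON body for an OSV query.
// Note: OSV identifies Maven packages as "groupId:artifactId".
function buildOsvQuery(manager, name, version) {
  const ecosystem = OSV_ECOSYSTEMS[manager];
  if (!ecosystem) throw new Error(`Unsupported package manager: ${manager}`);
  return { version, package: { name, ecosystem } };
}

// Send the query. `fetchImpl` is injectable so the function can be tested
// without network access; it defaults to the global fetch.
async function queryOsv(manager, name, version, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.osv.dev/v1/query", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildOsvQuery(manager, name, version)),
  });
  if (!response.ok) throw new Error(`OSV query failed: ${response.status}`);
  const data = await response.json();
  // OSV omits the `vulns` field when nothing is found.
  return data.vulns ?? [];
}
```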
### 5. Tools Layer
#### Read-Only Tools (`tools/read-only.js`)
- `list_repositories`: List accessible repositories
- `fetch_file`: Fetch file contents
- `list_files`: List files in directory
- `get_repository_info`: Get repository metadata
#### Write-Controlled Tools (`tools/write-controlled.js`)
- `create_branch`: Create new branches
- `create_commit`: Create commits with file changes
- `batch_create_commits`: Commit large change sets (100+ files) by splitting them into batches
#### Workflow Tools (`tools/workflow.js`)
- `trigger_workflow`: Trigger GitHub Actions workflows
- `get_workflow_status`: Check workflow execution status
- `get_workflow_logs`: Fetch workflow execution logs
### 6. Execution Layer
#### Git Operations (`execution/git-operations.js`)
- Low-level Git operations via GitHub API
- Reference operations (branches, tags)
- Tree operations (directories)
- Commit operations
- Blob operations (files)
#### Batch Processor (`execution/batch-processor.js`)
- Intelligent file batching
- Progress tracking
- Error handling with rollback
- Optimal batch size calculation
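One simple way to batch files is a greedy pass that closes a batch when either a file-count or a byte-size limit would be exceeded. The sketch below shows that approach; `createBatches`, the `{ path, size }` file shape, and both default limits are assumptions, and the project's actual batch-size calculation may differ.

```javascript
// Hypothetical sketch: greedy batching by file count and total bytes.
// Each file is assumed to look like { path: string, size: number }.
function createBatches(files, { maxFiles = 100, maxBytes = 1_000_000 } = {}) {
  const batches = [];
  let current = [];
  let currentBytes = 0;

  for (const file of files) {
    const wouldOverflow =
      current.length >= maxFiles || currentBytes + file.size > maxBytes;
    // Close the current batch before adding a file that would not fit.
    if (current.length > 0 && wouldOverflow) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    // A single file larger than maxBytes still gets a batch of its own.
    current.push(file);
    currentBytes += file.size;
  }

  if (current.length > 0) batches.push(current);
  return batches;
}
```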
#### Rollback Handler (`execution/rollback-handler.js`)
- Commit rollback capabilities
- Rollback branch creation
- Rollback verification
- Rollback history tracking
### 7. Utilities
#### Logger (`utils/logger.js`)
- Structured JSON logging
- Log level filtering
- Consistent log formatting
- Environment-aware configuration
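A minimal sketch of structured JSON logging with level filtering is shown below. The `createLogger` factory, its options, and the field names in each log line are assumptions, not the actual interface of `utils/logger.js`.

```javascript
// Hypothetical sketch: JSON logger that drops entries below a minimum level.
const LOG_LEVELS = { debug: 10, info: 20, warn: 30, error: 40 };

function createLogger({ level = "info", sink = console.log } = {}) {
  const threshold = LOG_LEVELS[level];

  const emit = (entryLevel, message, fields = {}) => {
    if (LOG_LEVELS[entryLevel] < threshold) return;
    // One JSON object per line keeps logs machine-parseable.
    sink(JSON.stringify({
      timestamp: new Date().toISOString(),
      level: entryLevel,
      message,
      ...fields,
    }));
  };

  return {
    debug: (message, fields) => emit("debug", message, fields),
    info: (message, fields) => emit("info", message, fields),
    warn: (message, fields) => emit("warn", message, fields),
    error: (message, fields) => emit("error", message, fields),
  };
}
```

Passing the correlation ID as a field, for example `logger.error("failed", { correlationId })`, lets every line for one request be found together.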
#### Audit Trail (`utils/audit-trail.js`)
- Comprehensive audit logging
- Correlation ID tracking
- Audit query capabilities
- Buffer management with auto-flush
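Buffering with auto-flush can be sketched as a small class that collects entries and hands them to a sink once a size threshold is reached. `AuditBuffer`, `maxEntries`, and `onFlush` are hypothetical names, and where flushed entries are stored is left open.

```javascript
// Hypothetical sketch: buffer audit entries and flush them in batches.
class AuditBuffer {
  constructor({ maxEntries = 50, onFlush }) {
    this.maxEntries = maxEntries;
    this.onFlush = onFlush; // called with an array of buffered entries
    this.entries = [];
  }

  record(entry) {
    this.entries.push({ timestamp: new Date().toISOString(), ...entry });
    // Auto-flush once the buffer is full.
    if (this.entries.length >= this.maxEntries) this.flush();
  }

  flush() {
    if (this.entries.length === 0) return;
    const batch = this.entries;
    this.entries = [];
    this.onFlush(batch);
  }
}
```

Because no state survives between requests, a Worker would likely also call `flush()` at the end of each request, for example inside `ctx.waitUntil(...)`.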
#### Correlation ID (`utils/correlation-id.js`)
- Unique ID generation
- Timestamp extraction
- Validation and verification
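The three capabilities above fit together if the ID embeds its creation time. The sketch below assumes a `<base36 timestamp>-<8 hex chars>` format, which is an illustrative choice and not necessarily the format used by `utils/correlation-id.js`; `Math.random` is used only because the ID is not a secret.

```javascript
// Hypothetical ID format: "<base36 millisecond timestamp>-<8 hex chars>".
const CORRELATION_ID_PATTERN = /^[0-9a-z]+-[0-9a-f]{8}$/;

function generateCorrelationId(now = Date.now(), random = Math.random) {
  const timePart = now.toString(36);
  // Take up to 8 hex digits after "0." and pad in case the value is short.
  const randomPart = random().toString(16).slice(2, 10).padEnd(8, "0");
  return `${timePart}-${randomPart}`;
}

function isValidCorrelationId(id) {
  return typeof id === "string" && CORRELATION_ID_PATTERN.test(id);
}

// Returns the embedded timestamp in milliseconds, or null for invalid IDs.
function extractTimestamp(id) {
  if (!isValidCorrelationId(id)) return null;
  return parseInt(id.split("-")[0], 36);
}
```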
#### Error Handler (`utils/error-handler.js`)
- Centralized error creation
- Error code management
- Error formatting
- Retry determination
#### Rate Limiter (`utils/rate-limiter.js`)
- Distributed rate limiting with Cloudflare KV
- Per-client rate tracking
- Configurable windows and thresholds
- Graceful degradation
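A fixed-window counter stored in KV is one way to implement the behavior listed above. In the sketch below, `evaluateWindow`, `checkRateLimit`, the key format, and the config shape are assumptions, and "graceful degradation" is interpreted as failing open when KV is unavailable. Because Workers KV is eventually consistent, counts kept this way are approximate.

```javascript
// Hypothetical sketch: decide whether a request fits in the current window.
// `record` is the previously stored { windowStart, count } or null.
function evaluateWindow(record, now, { limit, windowMs }) {
  const windowStart = Math.floor(now / windowMs) * windowMs;
  const count = record && record.windowStart === windowStart ? record.count : 0;
  const allowed = count < limit;
  return {
    allowed,
    remaining: Math.max(0, limit - count - (allowed ? 1 : 0)),
    next: { windowStart, count: allowed ? count + 1 : count },
  };
}

// `kv` is a Workers KV namespace binding (or any object with the same
// get/put methods).
async function checkRateLimit(kv, clientId, config, now = Date.now()) {
  const key = `ratelimit:${clientId}`;
  try {
    const record = await kv.get(key, "json");
    const decision = evaluateWindow(record, now, config);
    await kv.put(key, JSON.stringify(decision.next), {
      // KV requires expirationTtl to be at least 60 seconds.
      expirationTtl: Math.max(60, Math.ceil(config.windowMs / 1000)),
    });
    return decision;
  } catch (err) {
    // Assumed degradation policy: allow the request if KV cannot be reached.
    return { allowed: true, remaining: null, degraded: true };
  }
}
```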
#### Retry Handler (`utils/retry-handler.js`)
- Exponential backoff retry logic
- Configurable retry policies
- Jitter to prevent thundering herd
- Retryable error detection
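Exponential backoff with jitter can be sketched as follows. The function names, the default delays, and the rule that HTTP 429 and 5xx responses are retryable are assumptions; the variant shown is "full jitter", where each delay is drawn uniformly between zero and the exponential ceiling.

```javascript
// Hypothetical sketch: full-jitter exponential backoff.
// attempt 0 -> up to baseMs, attempt 1 -> up to 2 * baseMs, capped at maxMs.
function backoffDelay(attempt, { baseMs = 100, maxMs = 5000 } = {}, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Assumption: errors carry an HTTP-like `status`; 429 and 5xx are retried.
function isRetryable(error) {
  return error.status === 429 || (error.status >= 500 && error.status < 600);
}

async function withRetry(operation, options = {}) {
  const {
    retries = 3,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    ...backoff
  } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-retryable errors or when attempts are exhausted.
      if (attempt >= retries || !isRetryable(error)) throw error;
      await sleep(backoffDelay(attempt, backoff));
    }
  }
}
```

Randomizing the delay spreads retries from many clients over time, which is what prevents the thundering herd effect mentioned above.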
## Data Flow
### Typical Read Request Flow
```
1. Client Request
↓
2. Entry Point (parse request, generate correlation ID)
↓
3. Rate Limiter (check rate limits)
↓
4. Token Handler (validate token)
↓
5. Permission Checker (verify read access)
↓
6. Schema Validator (validate params)
↓
7. Tool Executor (execute read operation)
↓
8. Audit Trail (log operation)
↓
9. Response Formatting (send response)
```
### Typical Write Request Flow
```
1. Client Request
↓
2. Entry Point (parse request, generate correlation ID)
↓
3. Rate Limiter (check rate limits)
↓
4. Token Handler (validate token)
↓
5. Permission Checker (verify write access)
↓
6. Schema Validator (validate params)
↓
7. Security Scanner (scan for secrets/vulnerabilities)
↓
8. Policy Enforcer (check authorization policies)
↓
9. Dependency Checker (check for known vulnerabilities)
↓
10. Tool Executor (execute write operation)
↓
11. Audit Trail (log operation)
↓
12. Response Formatting (send response)
```
## Security Architecture
### Multi-Layer Security
1. **Authentication Layer**
- JWT token validation with signature verification
- GitHub token validation via GitHub API
- Token expiration handling
2. **Authorization Layer**
- Repository-level access control
- Branch protection rules
- User-based permissions
- Organization permissions
3. **Validation Layer**
- Schema validation for all inputs
- Type checking and constraint enforcement
- Detailed error reporting
4. **Security Scanning**
- Secret detection (30+ patterns)
- High-entropy string detection
- Suspicious code pattern detection
- Known vulnerability signatures
5. **Policy Enforcement**
- Repository pattern matching (allow/block)
- Branch pattern matching (allow/block)
- File path restrictions
- File extension allowlisting
- Time-based restrictions
6. **Rate Limiting**
- Per-client rate limiting
- Distributed KV store
- Configurable thresholds
- Graceful degradation
### Secret Detection
The security scanner implements multi-layered secret detection:
1. **Regex-Based Detection**
- 30+ predefined secret patterns
- Covers major service providers
- Extensible via configuration
2. **Entropy Analysis**
- Shannon entropy calculation
- Threshold-based detection
- Reduces false positives
3. **Contextual Analysis**
- Common pattern filtering
- URL and path detection
- Email and UUID detection
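The three layers above can be combined in a single scan: match known patterns, flag long high-entropy tokens, and skip tokens that are recognizably benign. The sketch below is illustrative only. It uses two well-known token formats rather than the scanner's full pattern set, and `scanLine`, the entropy threshold of 4.0 bits per character, and the minimum token length of 20 are assumptions.

```javascript
// Shannon entropy in bits per character: H = -sum(p * log2(p)).
function shannonEntropy(text) {
  const chars = Array.from(text);
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Two illustrative patterns; a real scanner would carry many more.
const SECRET_PATTERNS = [
  { name: "aws-access-key-id", regex: /\bAKIA[0-9A-Z]{16}\b/ },
  { name: "github-pat", regex: /\bghp_[A-Za-z0-9]{36}\b/ },
];
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function scanLine(line, { entropyThreshold = 4.0, minLength = 20 } = {}) {
  const findings = [];
  // Layer 1: known secret formats.
  for (const { name, regex } of SECRET_PATTERNS) {
    if (regex.test(line)) findings.push({ type: "pattern", name });
  }
  for (const token of line.split(/[\s"'=,;]+/)) {
    if (token.length < minLength) continue;
    // Layer 3: contextual filtering of benign high-entropy shapes.
    if (UUID_PATTERN.test(token) || /^https?:\/\//.test(token)) continue;
    // Layer 2: entropy threshold.
    if (shannonEntropy(token) >= entropyThreshold) {
      findings.push({ type: "entropy", token });
    }
  }
  return findings;
}
```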
## Scalability
### Stateless Design
- No in-memory state
- Each request is independent
- Horizontal scaling via Cloudflare Workers
- Automatic load balancing
### Distributed Components
- **KV Store**: Distributed rate limiting
- **Cache Layers**: Token, permission, and validation caching
- **CDN**: Global edge caching for static resources
### Performance Optimization
- Connection reuse (handled automatically by the runtime's `fetch`)
- Response caching where appropriate
- Lazy loading of resources
- Efficient batch processing
## Observability
### Logging
- Structured JSON logging
- Correlation ID for request tracing
- Log level filtering (debug, info, warn, error)
- Environment-aware configuration
### Audit Trail
- Complete operation audit
- User and operation tracking
- Success/failure status
- Queryable by correlation ID, user, action, or time
### Metrics
- Request timing
- Error rates
- Rate limit enforcement
- Cache hit rates
## Error Handling
### Error Categories
1. **Validation Errors**: Invalid input parameters
2. **Authentication Errors**: Token validation failures
3. **Authorization Errors**: Permission denials
4. **Security Errors**: Secret/vulnerability detection
5. **Policy Errors**: Policy violations
6. **Rate Limit Errors**: Exceeding rate limits
7. **Internal Errors**: Unexpected system errors
### Error Response Format
The `data` object carries additional, error-specific details and may be empty.
```json
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message",
    "data": {},
    "correlationId": "request-correlation-id"
  }
}
```
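A response in the format above might be produced by a small helper like the one sketched here. The error code names and their HTTP status mapping are hypothetical, one per category listed earlier; the actual codes in `utils/error-handler.js` may differ.

```javascript
// Hypothetical error codes mapped to HTTP statuses (assumed values).
const HTTP_STATUS_BY_CODE = {
  VALIDATION_ERROR: 400,
  AUTHENTICATION_ERROR: 401,
  AUTHORIZATION_ERROR: 403,
  SECURITY_ERROR: 422,
  POLICY_ERROR: 403,
  RATE_LIMIT_ERROR: 429,
  INTERNAL_ERROR: 500,
};

// Build the status and JSON body for an error response.
function buildErrorResponse(code, message, correlationId, data = {}) {
  // Unknown codes are treated as internal errors.
  const status = HTTP_STATUS_BY_CODE[code] ?? 500;
  return {
    status,
    body: { error: { code, message, data, correlationId } },
  };
}
```

For example, `buildErrorResponse("RATE_LIMIT_ERROR", "Too many requests", id, { retryAfterSeconds: 30 })` yields status 429 with the retry hint under `error.data`.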
## Deployment Architecture
### Cloudflare Workers
- **Runtime**: V8 isolates
- **Execution Limit**: 10 ms CPU time per request (free tier), 128 MB memory per isolate
- **Request Limit**: 100,000 per day (free tier)
- **Zero Idle Cost**: Pay only for requests
### GitHub Actions Integration
- **Delegation Pattern**: Heavy tasks delegated to GitHub Actions
- **Workflow Dispatch**: Triggered via GitHub API
- **Status Updates**: Real-time status tracking
- **Log Access**: Fetch execution logs
### CI/CD Pipeline
- **GitHub Actions**: Automated testing and deployment
- **Multi-Stage**: Lint → Test → Build → Deploy
- **Environment Separation**: Staging and Production
- **Rollback Support**: Quick rollback capability
## Future Considerations
### Potential Enhancements
1. **WebSocket Support**: Real-time status updates
2. **Webhook Handling**: GitHub webhook integration
3. **Advanced Caching**: Multi-tier caching strategy
4. **GraphQL Support**: GitHub GraphQL API integration
5. **Multi-Provider**: Support for other Git providers
6. **Advanced Policies**: More sophisticated policy engine
7. **ML-Based Detection**: AI-powered secret detection
8. **Real-time Monitoring**: Dashboard and alerting
### Scalability Improvements
1. **Request Queuing**: Queue-based processing for heavy operations
2. **Worker Pools**: Specialized workers for specific tasks
3. **Edge Computing**: More processing at the edge
4. **Database Integration**: Persistent storage for audit logs