# Software Architecture Design: [PRODUCT/PROJECT NAME]

**Version**: 1.0 | **Date**: [DATE] | **Status**: Draft
**Prepared by**: [ARCHITECT/TEAM] | **Reviewed by**: [STAKEHOLDER]

**Note**: This template is used to create the architecture design document for the entire product/system. This document should be created after completing one or more feature specifications (`/rainbow.specify`) and ground rules review (`/rainbow.regulate`), but before creating implementation plans (`/rainbow.design`) for individual features.

---

## Document Control

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | [DATE] | [NAME] | Initial architecture design |
| | | | |

**Related Documents**:

- Ground Rules: `memory/ground-rules.md`
- Feature Specifications: `specs/[###-feature]/spec.md`
- Implementation Plans: `specs/[###-feature]/design.md` (created after this document)

---

## Executive Summary

<!--
  ACTION REQUIRED: Provide a concise overview (2-3 paragraphs) of the system architecture.
  This should be understandable by non-technical stakeholders.
-->

**Purpose**: [Brief description of what the system does and why it exists]

**Scope**: [What is included/excluded from this architecture]

**Key Architectural Decisions**:

- [Decision 1: e.g., "Microservices architecture chosen for scalability"]
- [Decision 2: e.g., "Event-driven communication for loose coupling"]
- [Decision 3: e.g., "Cloud-native deployment on AWS/Azure/GCP"]

**Target Audience**: Software architects, developers, DevOps engineers, technical project managers, and stakeholders involved in system planning and implementation.

---

## Table of Contents

1. [Introduction](#1-introduction)
2. [Architectural Drivers](#2-architectural-drivers)
3. [System Context View](#3-system-context-view)
4. [Container View](#4-container-view)
5. [Component View](#5-component-view)
6. [Code View](#6-code-view)
7. [Deployment View](#7-deployment-view)
8. [Architecture Decisions](#8-architecture-decisions)
9. [Quality Attributes](#9-quality-attributes)
10. [Risks & Technical Debt](#10-risks--technical-debt)
11. [Appendices](#11-appendices)

---

## 1. Introduction

### 1.1 Purpose

<!--
  Describe the purpose of this architecture document and what it aims to achieve.
-->

This document describes the software architecture for [PRODUCT/PROJECT NAME]. It provides a comprehensive view of the system's structure, components, and design decisions to guide development teams and stakeholders.

### 1.2 Scope

<!--
  Define what is covered by this architecture and what is explicitly out of scope.
-->

**In Scope**:

- [System boundary 1: e.g., "Web application and API backend"]
- [System boundary 2: e.g., "Data persistence and caching layers"]
- [System boundary 3: e.g., "Integration with third-party services"]

**Out of Scope**:

- [Exclusion 1: e.g., "Legacy system migration details"]
- [Exclusion 2: e.g., "Third-party service internal architectures"]
- [Exclusion 3: e.g., "Network infrastructure below application layer"]

### 1.3 Architectural Approach

<!--
  Describe the overall architectural style/pattern being used.
-->

This architecture follows [ARCHITECTURAL STYLE: e.g., "a layered, microservices-based approach" or "a modular monolith pattern" or "an event-driven architecture"].

**Key Principles**:

1. [Principle 1: e.g., "Separation of concerns"]
2. [Principle 2: e.g., "Loose coupling, high cohesion"]
3. [Principle 3: e.g., "Design for failure and resilience"]
4. [Principle 4: e.g., "Security by design"]
5. [Principle 5: e.g., "Optimize for change"]

### 1.4 Definitions, Acronyms, and Abbreviations

| Term | Definition |
|------|------------|
| API | Application Programming Interface |
| SLA | Service Level Agreement |
| RBAC | Role-Based Access Control |
| [ADD MORE] | [Specific to your system] |

---

## 2. Architectural Drivers

### 2.1 Stakeholders and Concerns

<!--
  Identify key stakeholders and their primary concerns that influence the architecture.
-->

| Stakeholder | Role | Key Concerns |
|-------------|------|--------------|
| [End Users] | System users | Performance, usability, reliability |
| [Product Owner] | Business representative | Time to market, cost, feature completeness |
| [Development Team] | Implementation | Maintainability, testability, clear interfaces |
| [Operations Team] | Deployment & monitoring | Deployability, observability, scalability |
| [Security Team] | Security compliance | Data protection, authentication, authorization |
| [Legal/Compliance] | Regulatory compliance | Data privacy (GDPR, CCPA), audit trails |

### 2.2 Business Goals

<!--
  List the key business objectives that drive architectural decisions.
-->

1. **BG-001**: [Business goal 1, e.g., "Reduce customer onboarding time from 10 minutes to 2 minutes"]
2. **BG-002**: [Business goal 2, e.g., "Support 100k concurrent users by Q4"]
3. **BG-003**: [Business goal 3, e.g., "Enable multi-region deployment for compliance"]
4. **BG-004**: [Business goal 4, e.g., "Reduce operational costs by 30% through automation"]

### 2.3 Quality Attribute Requirements

<!--
  Define the quality attributes (non-functional requirements) that shape the architecture.
  Use measurable criteria where possible.
-->

#### 2.3.1 Performance

- **PE-001**: System MUST respond to 95% of user requests within [X]ms under normal load
- **PE-002**: System MUST support [X] transactions per second (TPS)
- **PE-003**: Database queries MUST complete within [X]ms for 99th percentile

#### 2.3.2 Scalability

- **SC-001**: System MUST scale horizontally to support [X] concurrent users
- **SC-002**: System MUST handle [X]% increase in load without architecture changes
- **SC-003**: Storage MUST scale to [X]TB of data

#### 2.3.3 Availability & Reliability

- **AV-001**: System MUST maintain [X]% uptime (e.g., 99.9% = ~8.76 hours downtime/year)
- **AV-002**: System MUST recover from failures within [X] minutes (RTO)
- **AV-003**: System MUST tolerate the failure of any single instance of [critical components] (no single point of failure)

#### 2.3.4 Security

- **SE-001**: System MUST authenticate users using [method: e.g., OAuth 2.0, SAML]
- **SE-002**: System MUST encrypt data in transit (TLS 1.3+) and at rest (AES-256)
- **SE-003**: System MUST implement RBAC with [roles: admin, user, guest]
- **SE-004**: System MUST comply with [standards: OWASP Top 10, SOC 2, ISO 27001]

#### 2.3.5 Maintainability

- **MA-001**: Code coverage MUST exceed [X]% for critical components
- **MA-002**: System MUST support zero-downtime deployments
- **MA-003**: New features MUST be developable within [timeframe] by [team size]

#### 2.3.6 Usability

- **US-001**: User interface MUST support [accessibility standards: WCAG 2.1 AA]
- **US-002**: System MUST support [languages/locales]
- **US-003**: Mobile responsive design MUST support [screen sizes]

#### 2.3.7 Observability

- **OB-001**: System MUST provide centralized logging with [retention period]
- **OB-002**: System MUST expose metrics for monitoring (latency, throughput, errors)
- **OB-003**: System MUST support distributed tracing for troubleshooting

### 2.4 Constraints

<!--
  Document technical, organizational, and business constraints that limit design choices.
-->

**Technical Constraints**:

- **TC-001**: [e.g., "Must use existing PostgreSQL database infrastructure"]
- **TC-002**: [e.g., "Must integrate with legacy SOAP-based authentication system"]
- **TC-003**: [e.g., "Must run on Kubernetes 1.25+"]

**Organizational Constraints**:

- **OC-001**: [e.g., "Development team limited to 5 engineers"]
- **OC-002**: [e.g., "Must use company-approved AWS services only"]
- **OC-003**: [e.g., "Must follow internal security review process"]

**Business Constraints**:

- **BC-001**: [e.g., "Budget limited to $X for infrastructure"]
- **BC-002**: [e.g., "Must launch MVP within 6 months"]
- **BC-003**: [e.g., "Cannot require users to install native apps"]

### 2.5 Assumptions and Dependencies

**Assumptions**:

- **AS-001**: [e.g., "Users have broadband internet connectivity (5+ Mbps)"]
- **AS-002**: [e.g., "Third-party payment API has 99.9% availability"]
- **AS-003**: [e.g., "Database can handle projected growth for 2 years"]

**Dependencies**:

- **DE-001**: [e.g., "Depends on Stripe API for payment processing"]
- **DE-002**: [e.g., "Depends on SendGrid for email delivery"]
- **DE-003**: [e.g., "Depends on Auth0 for identity management"]

---

## 3. System Context View

<!--
  C4 Model Level 1: System Context Diagram
  Shows the system as a box in the center, surrounded by users and other systems.
  This is the "zoom out" view - the big picture.
-->

### 3.1 Context Diagram

```text
┌─────────────────────────────────────────────────────────────────┐
│                         System Context                          │
│                                                                 │
│  ┌──────────┐                                                   │
│  │   End    │──────────────┐                                    │
│  │  Users   │              │                                    │
│  └──────────┘              │                                    │
│                            │                                    │
│  ┌──────────┐              │     ┌─────────────────────┐        │
│  │  Admin   │──────────────┼────▶│                     │        │
│  │  Users   │              │     │    [SYSTEM NAME]    │        │
│  └──────────┘              │     │                     │        │
│                            └────▶│  [Brief description]│        │
│  ┌──────────┐                    │                     │        │
│  │ External │                    └──────────┬──────────┘        │
│  │ System 1 │◀──────────────────────────────┘                   │
│  └──────────┘                                                   │
│                                                                 │
│  ┌──────────┐              ┌─────────────────┐                  │
│  │ External │◀─────────────│  [SYSTEM NAME]  │                  │
│  │ System 2 │              └─────────────────┘                  │
│  └──────────┘                                                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

<!--
  ACTION REQUIRED: Replace the diagram above with an actual diagram.

  RECOMMENDED: Use Mermaid diagram for version control and maintainability:

  ```mermaid
  graph TB
      Users[End Users] --> System[SYSTEM NAME]
      Admins[Admin Users] --> System
      System --> ExtSystem1[External System 1]
      System --> ExtSystem2[External System 2]
  ```

  Alternatives: ASCII art (shown above), C4-PlantUML, or Draw.io/Lucidchart (link to image)
-->

### 3.2 Context Description

**System**: [SYSTEM NAME]

[1-2 paragraph description of what the system does from a high level]

**Users and External Systems**:

| Actor/System | Type | Description | Interaction |
|--------------|------|-------------|-------------|
| [End Users] | User | [Who they are] | [How they use the system] |
| [Admin Users] | User | [Who they are] | [How they use the system] |
| [External System 1] | External System | [What it is] | [What data/services exchanged] |
| [External System 2] | External System | [What it is] | [What data/services exchanged] |

### 3.3 System Responsibilities

**What the system does**:

- [Responsibility 1: e.g., "Manages user authentication and authorization"]
- [Responsibility 2: e.g., "Processes and stores customer orders"]
- [Responsibility 3: e.g., "Generates reports and analytics"]

**What the system does NOT do** (delegated to external systems):

- [External responsibility 1: e.g., "Payment processing (handled by Stripe)"]
- [External responsibility 2: e.g., "Email delivery (handled by SendGrid)"]
- [External responsibility 3: e.g., "File storage (handled by AWS S3)"]

---

## 4. Container View

<!--
  C4 Model Level 2: Container Diagram
  Shows the high-level technical building blocks (containers) of the system.
  Containers = applications, data stores, microservices, etc.
-->

### 4.1 Container Diagram

```text
┌─────────────────────────────────────────────────────────────────┐
│                         Container View                          │
│                                                                 │
│        ┌──────────┐                                             │
│        │  Users   │                                             │
│        └────┬─────┘                                             │
│             │                                                   │
│             │ HTTPS                                             │
│             ▼                                                   │
│  ┌─────────────────────┐                                        │
│  │   Web Application   │                                        │
│  │  [React/Vue/Angular]│                                        │
│  │      Port: 443      │                                        │
│  └──────────┬──────────┘                                        │
│             │                                                   │
│             │ REST/GraphQL                                      │
│             ▼                                                   │
│  ┌─────────────────────┐         ┌──────────────────┐           │
│  │   API Application   │────────▶│     Database     │           │
│  │  [FastAPI/Express]  │         │   [PostgreSQL]   │           │
│  │     Port: 8000      │         │    Port: 5432    │           │
│  └──────────┬──────────┘         └──────────────────┘           │
│             │                                                   │
│             │                    ┌──────────────────┐           │
│             └───────────────────▶│      Cache       │           │
│                                  │     [Redis]      │           │
│                                  │    Port: 6379    │           │
│                                  └──────────────────┘           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

<!--
  ACTION REQUIRED: Replace the diagram above with your actual container architecture.

  RECOMMENDED: Use Mermaid diagram for clarity and version control:

  ```mermaid
  graph TB
      Users[Users] -->|HTTPS| WebApp[Web Application<br/>React/Vue/Angular<br/>Port: 443]
      WebApp -->|REST/GraphQL| API[API Application<br/>FastAPI/Express<br/>Port: 8000]
      API -->|SQL| DB[(Database<br/>PostgreSQL<br/>Port: 5432)]
      API -->|Redis Protocol| Cache[(Cache<br/>Redis<br/>Port: 6379)]
  ```

  Include all applications, databases, message queues, caches, etc.
-->

### 4.2 Container Descriptions

#### Container 1: [Web Application]

- **Technology**: [e.g., React 18, TypeScript, Vite]
- **Purpose**: [e.g., "Provides the user interface for end users"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Renders UI components"]
  - [Responsibility 2: e.g., "Handles client-side routing"]
  - [Responsibility 3: e.g., "Manages user session state"]
- **Communication**:
  - Calls [API Application] via [HTTPS REST API]
  - Authenticates with [Auth Service] via [OAuth 2.0]

#### Container 2: [API Application]

- **Technology**: [e.g., FastAPI 0.104, Python 3.11, Uvicorn]
- **Purpose**: [e.g., "Provides REST API for business logic"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Handles business logic and validation"]
  - [Responsibility 2: e.g., "Orchestrates data access"]
  - [Responsibility 3: e.g., "Enforces authorization rules"]
- **Communication**:
  - Reads/writes to [Database] via [PostgreSQL driver]
  - Publishes events to [Message Queue] via [RabbitMQ client]
  - Caches data in [Redis] via [Redis client]

#### Container 3: [Database]

- **Technology**: [e.g., PostgreSQL 15]
- **Purpose**: [e.g., "Stores persistent application data"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Stores user accounts and profiles"]
  - [Responsibility 2: e.g., "Stores transactional data"]
  - [Responsibility 3: e.g., "Maintains referential integrity"]
- **Schema Overview**: [Link to schema diagram or list key tables]
  - Tables: [users, orders, products, etc.]
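
When filling in the communication entries above, it can help to sketch how the API Application is expected to combine the Database with the Cache container described next. The following is only an illustrative Python sketch of a cache-aside read path with a 5-minute TTL, not a prescribed implementation: `InMemoryCache`, `FakeUserTable`, and `get_user` are hypothetical names, and the two in-memory classes are stand-ins for Redis and PostgreSQL so the example runs without external services.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Hypothetical stand-in for a Redis client with a per-key TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


class FakeUserTable:
    """Hypothetical stand-in for the `users` table; counts queries issued."""

    def __init__(self, rows: Dict[int, dict]) -> None:
        self._rows = rows
        self.query_count = 0

    def find_by_id(self, user_id: int) -> Optional[dict]:
        self.query_count += 1
        return self._rows.get(user_id)


def get_user(
    user_id: int,
    cache: InMemoryCache,
    table: FakeUserTable,
    ttl_seconds: float = 300.0,  # assumed 5-minute TTL
) -> Optional[dict]:
    """Cache-aside read: try the cache, fall back to the table, then fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = table.find_by_id(user_id)
    if row is not None:
        cache.set(key, row, ttl_seconds)
    return row


if __name__ == "__main__":
    table = FakeUserTable({1: {"id": 1, "email": "a@example.com"}})
    cache = InMemoryCache()
    get_user(1, cache, table)  # miss: reads the table and fills the cache
    get_user(1, cache, table)  # hit: served from the cache
    print("database queries issued:", table.query_count)
```

In a real system the stand-ins would be replaced by the actual cache and database clients; the point of the sketch is only that the cache sits beside the database rather than in front of it, so a cache outage degrades latency without losing data.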
#### Container 4: [Cache]

- **Technology**: [e.g., Redis 7]
- **Purpose**: [e.g., "Provides fast access to frequently accessed data"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Caches user sessions"]
  - [Responsibility 2: e.g., "Caches API responses for 5 minutes"]
  - [Responsibility 3: e.g., "Implements rate limiting counters"]

[Add more containers as needed]

### 4.3 Inter-Container Communication

| From Container | To Container | Protocol | Purpose | Data Format |
|----------------|--------------|----------|---------|-------------|
| [Web App] | [API App] | HTTPS | API calls | JSON |
| [API App] | [Database] | TCP | Data persistence | SQL |
| [API App] | [Cache] | TCP | Caching | Redis protocol |
| [API App] | [Message Queue] | AMQP | Event publishing | JSON |

---

## 5. Component View

<!--
  C4 Model Level 3: Component Diagram
  Shows the internal structure of a container - the main components/modules.
  Create one component diagram for each critical container.
-->

### 5.1 [Container Name] Component Diagram

```text
┌─────────────────────────────────────────────────────────────────┐
│                [Container Name] - Component View                │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                   Presentation Layer                    │   │
│   │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐ │   │
│   │  │  Controller  │   │  Controller  │   │  Controller  │ │   │
│   │  │      1       │   │      2       │   │      3       │ │   │
│   │  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘ │   │
│   └─────────┼──────────────────┼──────────────────┼─────────┘   │
│             │                  │                  │             │
│   ┌─────────┼──────────────────┼──────────────────┼─────────┐   │
│   │         ▼                  ▼                  ▼         │   │
│   │                  Business Logic Layer                   │   │
│   │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐ │   │
│   │  │   Service    │   │   Service    │   │   Service    │ │   │
│   │  │      1       │   │      2       │   │      3       │ │   │
│   │  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘ │   │
│   └─────────┼──────────────────┼──────────────────┼─────────┘   │
│             │                  │                  │             │
│   ┌─────────┼──────────────────┼──────────────────┼─────────┐   │
│   │         ▼                  ▼                  ▼         │   │
│   │                    Data Access Layer                    │   │
│   │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐ │   │
│   │  │  Repository  │   │  Repository  │   │  Repository  │ │   │
│   │  │      1       │   │      2       │   │      3       │ │   │
│   │  └──────────────┘   └──────────────┘   └──────────────┘ │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

<!--
  ACTION REQUIRED: Replace with your actual component structure.

  RECOMMENDED: Use Mermaid diagram to show component layers and dependencies:

  ```mermaid
  graph TB
      subgraph "Presentation Layer"
          C1[Controller 1]
          C2[Controller 2]
          C3[Controller 3]
      end
      subgraph "Business Logic Layer"
          S1[Service 1]
          S2[Service 2]
          S3[Service 3]
      end
      subgraph "Data Access Layer"
          R1[Repository 1]
          R2[Repository 2]
          R3[Repository 3]
      end
      C1 --> S1
      C2 --> S2
      C3 --> S3
      S1 --> R1
      S2 --> R2
      S3 --> R3
  ```

  Show the major components and their relationships.
-->

### 5.2 Component Descriptions

#### Component Group: [Presentation Layer / Controllers]

##### Component: [UserController]

- **Purpose**: [e.g., "Handles HTTP requests related to user management"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Validates incoming request data"]
  - [Responsibility 2: e.g., "Maps HTTP requests to service calls"]
  - [Responsibility 3: e.g., "Returns HTTP responses with proper status codes"]
- **Dependencies**: [UserService, AuthService]
- **Exposed Interfaces**:
  - `POST /api/users` - Create user
  - `GET /api/users/{id}` - Get user by ID
  - `PUT /api/users/{id}` - Update user
  - `DELETE /api/users/{id}` - Delete user

[Add more components as needed]

#### Component Group: [Business Logic Layer / Services]

##### Component: [UserService]

- **Purpose**: [e.g., "Encapsulates business logic for user operations"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Enforces business rules for user creation"]
  - [Responsibility 2: e.g., "Validates email uniqueness"]
  - [Responsibility 3: e.g., "Publishes user events to message queue"]
- **Dependencies**: [UserRepository, EmailService, EventBus]
- **Key Operations**:
  - `createUser(userData): User`
  - `getUserById(userId): User`
  - `updateUser(userId, userData): User`
  - `deleteUser(userId): void`

[Add more components as needed]

#### Component Group: [Data Access Layer / Repositories]

##### Component: [UserRepository]

- **Purpose**: [e.g., "Provides data access abstraction for user data"]
- **Responsibilities**:
  - [Responsibility 1: e.g., "Executes SQL queries for user data"]
  - [Responsibility 2: e.g., "Maps database rows to domain objects"]
  - [Responsibility 3: e.g., "Handles database transactions"]
- **Dependencies**: [Database Connection Pool]
- **Key Operations**:
  - `save(user): void`
  - `findById(userId): User | null`
  - `findByEmail(email): User | null`
  - `delete(userId): void`

[Add more components as needed]

### 5.3 Component Interaction Patterns

**Pattern 1: [Request-Response]**

- [Description: e.g., "Synchronous HTTP request/response for API calls"]
- [When used: e.g., "User-initiated actions requiring immediate response"]

**Pattern 2: [Event-Driven]**

- [Description: e.g., "Asynchronous event publishing for decoupled operations"]
- [When used: e.g., "Background processing, notifications, audit logging"]

**Pattern 3: [Repository Pattern]**

- [Description: e.g., "Abstraction over data access logic"]
- [When used: e.g., "All database operations"]

---

## 6. Code View

<!--
  C4 Model Level 4: Code Diagram (optional, usually in IDE)
  Shows the internal structure of a component at the class/function level.
  This section is usually omitted from architecture docs as it's visible in code.
  Include only for critical/complex components that need explanation.
-->

### 6.1 Code Organization

**Directory Structure**:

```text
project-root/
├── src/
│   ├── api/                    # API/presentation layer
│   │   ├── controllers/        # HTTP request handlers
│   │   ├── middleware/         # Request/response middleware
│   │   └── routes/             # Route definitions
│   ├── business/               # Business logic layer
│   │   ├── services/           # Business services
│   │   ├── models/             # Domain models
│   │   └── validators/         # Business rule validators
│   ├── data/                   # Data access layer
│   │   ├── repositories/       # Data access abstractions
│   │   ├── entities/           # Database entities/ORM models
│   │   └── migrations/         # Database migration scripts
│   ├── infrastructure/         # Infrastructure concerns
│   │   ├── config/             # Configuration management
│   │   ├── logging/            # Logging utilities
│   │   ├── messaging/          # Message queue clients
│   │   └── cache/              # Caching utilities
│   └── shared/                 # Shared utilities
│       ├── types/              # Shared type definitions
│       ├── utils/              # Utility functions
│       └── constants/          # Application constants
├── tests/
│   ├── unit/                   # Unit tests
│   ├── integration/            # Integration tests
│   └── e2e/                    # End-to-end tests
├── docs/                       # Documentation
└── scripts/                    # Build/deployment scripts
```

### 6.2 Naming Conventions

**Files**:

- [Convention 1: e.g., "PascalCase for classes: `UserService.ts`"]
- [Convention 2: e.g., "kebab-case for components: `user-profile.tsx`"]
- [Convention 3: e.g., "camelCase for utilities: `formatDate.ts`"]

**Code**:

- [Convention 1: e.g., "PascalCase for classes/interfaces: `class UserService`"]
- [Convention 2: e.g., "camelCase for functions/variables: `function getUserById()`"]
- [Convention 3: e.g., "UPPER_SNAKE_CASE for constants: `const MAX_RETRY_COUNT`"]

### 6.3 Key Design Patterns

| Pattern | Purpose | Usage Example |
|---------|---------|---------------|
| [Repository] | Data access abstraction | `UserRepository`, `OrderRepository` |
| [Service Layer] | Business logic encapsulation | `UserService`, `OrderService` |
| [Factory] | Object creation | `DatabaseConnectionFactory` |
| [Singleton] | Single instance | `ConfigurationManager` |
| [Strategy] | Runtime algorithm selection | `PaymentStrategy` (Credit, Debit, PayPal) |
| [Observer] | Event notification | `EventEmitter` for domain events |

---

## 7. Deployment View

<!--
  Shows how containers are deployed to infrastructure.
  Include production and key non-production environments.
-->

### 7.1 Deployment Architecture

#### 7.1.1 Production Environment

```text
┌─────────────────────────────────────────────────────────────────┐
│                      Production Deployment                      │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                    Region: [us-east-1]                    │  │
│  │                                                           │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │                 Availability Zone 1                 │  │  │
│  │  │   ┌───────────┐     ┌───────────┐     ┌──────────┐  │  │  │
│  │  │   │    Web    │     │    API    │     │    DB    │  │  │  │
│  │  │   │  Server   │     │  Server   │     │ Primary  │  │  │  │
│  │  │   └───────────┘     └───────────┘     └──────────┘  │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │                                                           │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │                 Availability Zone 2                 │  │  │
│  │  │   ┌───────────┐     ┌───────────┐     ┌──────────┐  │  │  │
│  │  │   │    Web    │     │    API    │     │    DB    │  │  │  │
│  │  │   │  Server   │     │  Server   │     │ Replica  │  │  │  │
│  │  │   └───────────┘     └───────────┘     └──────────┘  │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │                                                           │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │                   Shared Services                   │  │  │
│  │  │   ┌───────────┐     ┌───────────┐     ┌──────────┐  │  │  │
│  │  │   │   Redis   │     │  Message  │     │    S3    │  │  │  │
│  │  │   │   Cache   │     │   Queue   │     │ Storage  │  │  │  │
│  │  │   └───────────┘     └───────────┘     └──────────┘  │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

<!--
  ACTION REQUIRED: Replace with your actual deployment architecture.
  RECOMMENDED: Use Mermaid diagram for infrastructure visualization:

  ```mermaid
  graph TB
      subgraph "Production Region: us-east-1"
          subgraph "Availability Zone 1"
              Web1[Web Server]
              API1[API Server]
              DB1[(DB Primary)]
          end
          subgraph "Availability Zone 2"
              Web2[Web Server]
              API2[API Server]
              DB2[(DB Replica)]
          end
          subgraph "Shared Services"
              Redis[(Redis Cache)]
              MQ[Message Queue]
              S3[(S3 Storage)]
          end
      end
  ```

  Include load balancers, CDNs, multi-region setup if applicable.
-->

#### 7.1.2 Environment Specifications

| Environment | Purpose | Infrastructure | Scaling |
|-------------|---------|----------------|---------|
| **Production** | Live system | [e.g., AWS EKS, 3 nodes, t3.xlarge] | Auto-scale 3-10 pods |
| **Staging** | Pre-production testing | [e.g., AWS EKS, 2 nodes, t3.large] | Fixed 2 pods |
| **Development** | Developer testing | [e.g., Docker Compose on local] | Single instance |
| **QA** | Quality assurance | [e.g., AWS EKS, 1 node, t3.medium] | Fixed 1 pod |

### 7.2 Infrastructure Components

#### 7.2.1 Compute

| Component | Technology | Configuration | Purpose |
|-----------|------------|---------------|---------|
| [Web Server] | [e.g., Nginx on Kubernetes] | [e.g., 2 vCPU, 4GB RAM] | Serve static content, reverse proxy |
| [API Server] | [e.g., Uvicorn on Kubernetes] | [e.g., 4 vCPU, 8GB RAM] | Run API application |
| [Worker] | [e.g., Celery on Kubernetes] | [e.g., 2 vCPU, 4GB RAM] | Background job processing |

#### 7.2.2 Data Storage

| Component | Technology | Configuration | Purpose |
|-----------|------------|---------------|---------|
| [Primary DB] | [e.g., RDS PostgreSQL 15] | [e.g., db.r6g.xlarge, 500GB SSD] | Transactional data |
| [Cache] | [e.g., ElastiCache Redis 7] | [e.g., cache.r6g.large, 6GB] | Session/data caching |
| [Object Storage] | [e.g., S3 Standard] | [e.g., 1TB, lifecycle to Glacier] | File storage |
| [Search] | [e.g., OpenSearch] | [e.g., 3 data nodes, 4GB each] | Full-text search |

#### 7.2.3 Networking

| Component | Technology | Configuration | Purpose |
|-----------|------------|---------------|---------|
| [Load Balancer] | [e.g., AWS ALB] | [e.g., Multi-AZ, SSL termination] | Distribute traffic |
| [CDN] | [e.g., CloudFront] | [e.g., Global edge locations] | Static content delivery |
| [DNS] | [e.g., Route 53] | [e.g., Latency-based routing] | Domain resolution |
| [API Gateway] | [e.g., Kong/AWS API Gateway] | [e.g., Rate limiting, auth] | API management |

### 7.3 Deployment Pipeline

**CI/CD Workflow**:

```mermaid
graph TB
    A[Code Push] --> B[Build & Test]
    B --> C[Unit Tests]
    C --> D[Integration Tests]
    D --> E[Security Scan]
    E --> F[SAST]
    E --> G[Dependency Scan]
    F --> H[Build Image]
    G --> H
    H --> I[Docker Build]
    I --> J[Push to ECR]
    J --> K[Deploy to Staging]
    K --> L[Helm Upgrade]
    L --> M[Smoke Tests]
    M --> N[Manual Approval]
    N --> O[Product Owner Review]
    O --> P[Deploy to Prod]
    P --> Q[Blue/Green Deployment]
    Q --> R[Post-Deploy]
    R --> S[Health Checks]
    R --> T[Monitoring]
```

**Deployment Strategy**: [e.g., "Blue/Green deployment with automatic rollback on health check failure"]

### 7.4 Infrastructure as Code

- **Tool**: [e.g., Terraform 1.5+]
- **Repository**: [e.g., `infrastructure/terraform/`]
- **State Management**: [e.g., "S3 backend with DynamoDB state locking"]
- **Key Modules**:
  - [Module 1: e.g., "VPC and networking"]
  - [Module 2: e.g., "EKS cluster"]
  - [Module 3: e.g., "RDS database"]
  - [Module 4: e.g., "IAM roles and policies"]

### 7.5 Disaster Recovery

**Backup Strategy**:

- **Database**: [e.g., "Automated daily snapshots, retained for 30 days, backed up to S3"]
- **Files**: [e.g., "S3 with versioning enabled, cross-region replication to us-west-2"]
- **Configuration**: [e.g., "Infrastructure code in Git, secrets in AWS Secrets Manager"]

**Recovery Objectives**:

- **RTO (Recovery Time Objective)**: [e.g., "4 hours for full system restoration"]
- **RPO (Recovery Point Objective)**: [e.g., "15 minutes of data loss maximum"]

**Disaster Recovery Plan**:

1. [Step 1: e.g., "Detect failure via monitoring alerts"]
2. [Step 2: e.g., "Initiate incident response process"]
3. [Step 3: e.g., "Restore database from latest snapshot"]
4. [Step 4: e.g., "Redeploy application to alternate region"]
5. [Step 5: e.g., "Update DNS to point to new region"]
6. [Step 6: e.g., "Validate system functionality"]

---

## 8. Architecture Decisions

<!--
  Document key architectural decisions using Architecture Decision Records (ADR) format.
  Each decision should explain: Context, Decision, Consequences, and Alternatives.
-->

### 8.1 Decision Log

| ID | Date | Title | Status |
|----|------|-------|--------|
| ADR-001 | [DATE] | [Decision title] | [Accepted/Superseded/Deprecated] |
| ADR-002 | [DATE] | [Decision title] | [Accepted] |
| ADR-003 | [DATE] | [Decision title] | [Proposed] |

### 8.2 Architecture Decision Records

#### ADR-001: [Decision Title]

**Date**: [DATE]
**Status**: [Proposed | Accepted | Deprecated | Superseded by ADR-XXX]
**Deciders**: [List of people involved]

**Context**:

[Describe the issue or situation that prompted this decision. What forces are at play?]

**Decision**:

[Describe the decision and its implications. Be specific and actionable.]

**Rationale**:

[Explain why this decision was made. What factors influenced it?]

**Consequences**:

*Positive*:

- [Benefit 1]
- [Benefit 2]
- [Benefit 3]

*Negative*:

- [Trade-off 1]
- [Trade-off 2]
- [Trade-off 3]

*Neutral*:

- [Impact 1]
- [Impact 2]

**Alternatives Considered**:

1. **Alternative 1**: [Description]
   - Pros: [List pros]
   - Cons: [List cons]
   - Why rejected: [Reason]
2. **Alternative 2**: [Description]
   - Pros: [List pros]
   - Cons: [List cons]
   - Why rejected: [Reason]

**Related Decisions**:

- [ADR-XXX]: [Related decision]

---

#### ADR-002: [Example: Use Microservices Architecture]

**Date**: 2024-01-15
**Status**: Accepted
**Deciders**: Architecture Team, CTO

**Context**:

The system needs to scale independently by function (user management, order processing, payments) and be developed by multiple teams concurrently. The existing monolith has become difficult to deploy and scale.

**Decision**:

Adopt a microservices architecture with independent services for user management, orders, payments, and notifications. Services communicate via REST APIs and asynchronous message queues.

**Rationale**:

- Teams can work independently with minimal coordination
- Services can scale based on individual load patterns
- Technology stack can be optimized per service
- Failures are isolated to individual services

**Consequences**:

*Positive*:

- Independent deployment and scaling per service
- Team autonomy and faster development cycles
- Technology flexibility per service
- Better fault isolation

*Negative*:

- Increased operational complexity (more services to monitor)
- Distributed system challenges (network latency, partial failures)
- Data consistency across services requires careful design
- Increased infrastructure costs

*Neutral*:

- Need for robust DevOps practices and tooling
- API versioning and backward compatibility required

**Alternatives Considered**:

1. **Modular Monolith**: Well-structured monolith with clear module boundaries
   - Pros: Simpler deployment, easier transactions, lower infrastructure cost
   - Cons: All modules share same technology stack, scaling requires scaling entire monolith
   - Why rejected: Doesn't solve team independence and scaling issues
2. **Serverless Functions**: AWS Lambda for each feature
   - Pros: Extreme scalability, pay-per-use pricing, no server management
   - Cons: Cold start latency, vendor lock-in, complex state management
   - Why rejected: Not suitable for long-running operations and complex workflows

**Related Decisions**:

- ADR-003: API Gateway for Service Communication
- ADR-005: Event-Driven Architecture for Async Communication

---

[Add more ADRs as needed for critical decisions]

---

## 9. Quality Attributes

<!--
  Detail how the architecture achieves the quality attribute requirements defined in Section 2.3.
  Provide specific implementation strategies.
-->

### 9.1 Performance Strategies

**Strategy 1: Caching**

- **Implementation**: Redis cache for frequently accessed data (user sessions, product catalogs)
- **Cache Invalidation**: Time-based expiration (5 minutes) + event-based invalidation
- **Expected Impact**: Reduce database load by 70%, API response time from 200ms to 50ms

**Strategy 2: Database Optimization**

- **Implementation**: Read replicas for analytics queries, connection pooling (100 connections)
- **Indexes**: On `user_id`, `email`, `order_date`, `product_id`
- **Expected Impact**: Query performance <50ms for 99th percentile

**Strategy 3: Asynchronous Processing**

- **Implementation**: Message queue (RabbitMQ) for email notifications, report generation
- **Expected Impact**: Immediate API responses, background processing up to 10k messages/sec

### 9.2 Scalability Strategies

**Strategy 1: Horizontal Scaling**

- **Implementation**: Stateless API servers with Kubernetes auto-scaling (3-10 pods)
- **Trigger**: CPU >70% or requests >1000/sec
- **Expected Impact**: Support 100k concurrent users

**Strategy 2: Database Sharding**

- **Implementation**: Shard user data by user_id (hash-based sharding) across 4 shards
- **Expected Impact**: Linear scaling of database write capacity

**Strategy 3: Content Delivery Network**

- **Implementation**: CloudFront CDN for static assets (images, CSS, JS)
- **Expected Impact**: Reduce origin load by 90%, improve global latency

### 9.3 Availability & Reliability Strategies

**Strategy 1: Redundancy**

- **Implementation**: Multi-AZ deployment, 3 API replicas, database with read replicas
- **Expected Impact**: 99.9% availability (8.76 hours downtime/year max)

**Strategy 2: Health Checks & Auto-Recovery**

- **Implementation**: Kubernetes liveness/readiness probes, automatic pod restart
- **Interval**: Health check every 10 seconds
- **Expected Impact**: Failed containers restarted within 30 seconds

**Strategy 3: Circuit Breakers**

- **Implementation**: Resilience4j circuit breaker for external API calls
- **Threshold**: Open after 50% failure rate over 10 requests
- **Expected Impact**: Prevent cascading failures, graceful degradation

### 9.4 Security Strategies

**Strategy 1: Authentication & Authorization**

- **Implementation**: OAuth 2.0 with JWT tokens, RBAC with roles (admin, user, guest)
- **Token Expiry**: Access token 15 minutes, refresh token 7 days
- **Expected Impact**: Secure user authentication, fine-grained access control

**Strategy 2: Encryption**

- **Implementation**: TLS 1.3 for data in transit, AES-256 for data at rest
- **Certificate Management**: AWS Certificate Manager, auto-renewal
- **Expected Impact**: Protection against eavesdropping and data breaches

**Strategy 3: Input Validation & Output Encoding**

- **Implementation**: Pydantic models for input validation, parameterized queries, output encoding when rendering user-supplied data
- **Expected Impact**: Protection against SQL injection, XSS attacks

**Strategy 4: Security Scanning**

- **Implementation**: SAST (SonarQube), DAST (OWASP ZAP), dependency scanning (Snyk)
- **Frequency**: On every commit (SAST), weekly (DAST)
- **Expected Impact**: Early detection of security vulnerabilities

### 9.5 Maintainability Strategies

**Strategy 1: Automated Testing**

- **Implementation**: Unit tests (pytest), integration tests, E2E tests (Playwright)
- **Coverage Target**: >80% for critical paths
- **Expected Impact**: Catch regressions early, enable confident refactoring

**Strategy 2: Observability**

- **Implementation**: Structured logging (JSON), metrics (Prometheus), tracing (Jaeger)
- **Log Retention**: 30 days in Elasticsearch
- **Expected Impact**: Faster troubleshooting, proactive issue detection

**Strategy 3: Documentation**

- **Implementation**: OpenAPI specs for APIs, ADRs for decisions, inline code comments
- **Automation**: Auto-generate API docs from code annotations
- **Expected Impact**: Reduce onboarding time, improve knowledge sharing

---

## 10. Risks & Technical Debt

### 10.1 Architecture Risks

| Risk ID | Description | Impact | Probability | Mitigation Strategy |
|---------|-------------|--------|-------------|---------------------|
| AR-001 | [Risk: e.g., "Database becomes bottleneck under high load"] | High | Medium | [Strategy: e.g., "Implement read replicas and caching"] |
| AR-002 | [Risk: e.g., "Third-party API downtime affects core features"] | High | Low | [Strategy: e.g., "Circuit breakers + fallback mechanisms"] |
| AR-003 | [Risk: e.g., "Vendor lock-in with AWS services"] | Medium | High | [Strategy: e.g., "Use abstraction layers, avoid proprietary APIs"] |
| AR-004 | [Risk: e.g., "Security breach due to misconfigured access"] | Critical | Low | [Strategy: e.g., "Automated security audits, principle of least privilege"] |

### 10.2 Known Technical Debt

| Debt ID | Description | Impact | Effort | Priority | Plan |
|---------|-------------|--------|--------|----------|------|
| TD-001 | [Debt: e.g., "Legacy authentication module not using modern standards"] | Medium | High | P2 | [Plan: e.g., "Refactor to OAuth 2.0 in Q3"] |
| TD-002 | [Debt: e.g., "No automated backup testing"] | High | Medium | P1 | [Plan: e.g., "Implement quarterly DR drills"] |
| TD-003 | [Debt: e.g., "Insufficient monitoring coverage for background jobs"] | Low | Low | P3 | [Plan: e.g., "Add metrics in next sprint"] |

### 10.3 Open Questions & Future
Considerations **Open Questions**: 1. [Question 1: e.g., "How to handle eventual consistency in distributed transactions?"] - Status: Under investigation - Decision deadline: [DATE] 2. [Question 2: e.g., "Should we adopt GraphQL for more flexible client queries?"] - Status: Pending team discussion - Decision deadline: [DATE] **Future Enhancements**: - [Enhancement 1: e.g., "Multi-region active-active deployment for global users"] - Timeframe: Year 2 - Depends on: Business expansion to Asia/Europe - [Enhancement 2: e.g., "Machine learning-based recommendation engine"] - Timeframe: Year 2 - Depends on: Sufficient user data (1M+ users) - [Enhancement 3: e.g., "Mobile native apps for iOS and Android"] - Timeframe: Q4 2024 - Depends on: Budget approval --- ## 11. Appendices ### 11.1 Glossary | Term | Definition | |------|------------| | [Term 1] | [Definition] | | [Term 2] | [Definition] | | [Term 3] | [Definition] | ### 11.2 References 1. **Standards**: - IEEE 42010:2011 - Systems and software engineering — Architecture description - ISO/IEC 25010:2011 - Systems and software Quality Requirements and Evaluation (SQuaRE) 2. **Methodologies**: - C4 Model for visualizing software architecture: <https://c4model.com/> - Arc42 architecture template: <https://arc42.org/> - Architecture Decision Records (ADR): <https://adr.github.io/> 3. 
**Project References**: - [Reference 1: e.g., "Company Security Standards v2.0"] - [Reference 2: e.g., "API Design Guidelines"] - [Reference 3: e.g., "Infrastructure Best Practices"] ### 11.3 Related Documents - **Ground Rules**: `memory/ground-rules.md` - Project constraints and principles - **Feature Specifications**: `specs/*/spec.md` - Individual feature requirements - **Implementation Plans**: `specs/*/design.md` - Feature implementation strategies - **API Documentation**: [Link to OpenAPI/Swagger docs] - **Runbooks**: [Link to operational procedures] ### 11.4 Diagrams & Models <!-- List of all architecture diagrams and models with links/references. RECOMMENDED: Use Mermaid diagrams embedded in this document for maintainability. --> - **System Context Diagram**: [Section 3.1 or external tool link] - **Container Diagram**: [Section 4.1 or external tool link] - **Component Diagrams**: [Section 5.x or external tool link] - **Deployment Diagram**: [Section 7.1 or external tool link] - **Entity-Relationship Diagram**: [Link to database schema documentation or Mermaid ER diagram] - **Sequence Diagrams**: [Link to detailed interaction flows or Mermaid sequence diagrams] **Mermaid Diagram Examples**: For sequence diagrams showing interactions between components: ```mermaid sequenceDiagram participant User participant WebApp participant API participant Database User->>WebApp: Request data WebApp->>API: GET /api/data API->>Database: Query Database-->>API: Results API-->>WebApp: JSON response WebApp-->>User: Display data ``` For entity-relationship diagrams: ```mermaid erDiagram USER ||--o{ ORDER : places USER { int id PK string email string name } ORDER { int id PK int user_id FK datetime created_at } ``` ### 11.5 Approval & Review | Role | Name | Signature | Date | |------|------|-----------|------| | System Architect | [NAME] | | | | Lead Developer | [NAME] | | | | DevOps Lead | [NAME] | | | | Product Owner | [NAME] | | | | CTO/Technical Director | [NAME] | | | ### 
11.6 Change History | Version | Date | Author | Summary of Changes | |---------|------|--------|--------------------| | 0.1 | [DATE] | [NAME] | Initial draft | | 0.2 | [DATE] | [NAME] | Added deployment view | | 1.0 | [DATE] | [NAME] | Final version for approval | --- **END OF ARCHITECTURE DESIGN DOCUMENT** --- ## Notes for Template Users ### When to Use This Template - **After** completing one or more feature specifications using `/rainbow.specify` - **After** reviewing project constraints using `/rainbow.regulate` - **Before** creating implementation plans using `/rainbow.design` - When you need to design the **overall system architecture**, not just a single feature ### How to Fill Out This Template 1. **Start with Section 2 (Architectural Drivers)**: Understand stakeholders, business goals, and quality requirements 2. **Work through C4 levels (Sections 3-6)**: Start high-level (context) and drill down (containers, components) 3. **Document deployment (Section 7)**: Define how the system runs in production 4. **Capture decisions (Section 8)**: Use ADRs for all significant architectural choices 5. **Map quality strategies (Section 9)**: Show how architecture achieves quality requirements 6. 
**Identify risks (Section 10)**: Be honest about limitations and technical debt ### Relationship to Other Documents - **Ground Rules** (`memory/ground-rules.md`): Defines project-wide constraints that this architecture must respect - **Feature Specs** (`specs/*/spec.md`): Define individual features; this architecture shows how they fit together - **Implementation Plans** (`specs/*/design.md`): Tactical plans for building features; guided by this architecture - **Tasks** (`specs/*/tasks.md`): Individual work items; derived from plans that follow this architecture ### Tips for Success - **Keep it visual**: Use diagrams extensively (C4, sequence diagrams, deployment diagrams) - **prefer Mermaid** for version control and easy updates - **Be specific**: Avoid generic statements; use actual technologies, numbers, and examples - **Document decisions**: Every significant choice should have an ADR explaining the rationale - **Think about stakeholders**: Different sections serve different audiences (exec summary for business, deployment for ops) - **Keep it updated**: Architecture evolves; treat this as a living document - **Link to code**: Reference actual implementations to keep architecture grounded in reality - **Use Mermaid diagrams**: Embed diagrams directly in markdown using Mermaid syntax for better maintainability, version control, and collaboration
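
As a small illustration of the "Be specific" tip, an availability target such as the 99.9% example in Section 9.3 can be turned into a concrete downtime budget with a short calculation. The following is a minimal sketch, not part of the template itself: the function name `downtime_budget_hours` and the 365-day year are assumptions made for illustration.

```python
# Hypothetical helper: converts an availability target into a downtime budget.
# Assumes a 365-day year (8760 hours); leap years are ignored.
HOURS_PER_YEAR = 365 * 24


def downtime_budget_hours(availability_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the maximum downtime, in hours, allowed by an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Print the yearly downtime budget for a few common targets.
    for target in (99.0, 99.9, 99.99):
        print(f"{target}% -> {downtime_budget_hours(target):.2f} h/year")
```

For a 99.9% target this gives about 8.76 hours per year, which matches the figure quoted in Section 9.3. Stating the budget in hours alongside the percentage makes the requirement easier to compare against planned maintenance windows and incident history.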
