# New Deployment Architecture
**Created**: 2025-11-24
**Status**: Design Phase - Implementation Pending
## Executive Summary
This document defines the redesigned deployment architecture using Docker Compose for all environments (local, staging, production). The architecture addresses all issues identified in `legacy/ARCHITECTURE_ANALYSIS.md` and provides a clear, maintainable deployment strategy.
## Architecture Principles
### 1. Docker Compose for All Environments
- **No Docker Swarm**: Use Docker Compose exclusively for simplicity
- **Environment-Specific Files**: One `docker-compose.{env}.yml` per environment
- **Shared Base**: Common configuration in `docker-compose.base.yml`
- **Override Pattern**: Environment files override base configuration
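
As a rough illustration of the override pattern, a base file might declare only what every environment shares and leave ports, image tags, and limits to the environment files. The service names mirror those used in this document; the `image` values are placeholders and the whole fragment is an assumption, not the project's actual configuration.

```yaml
# docker-compose.base.yml (hypothetical sketch)
services:
  web:
    image: nginx:stable-alpine   # assumption: replaced by the registry image in staging/prod
    networks:
      - backend
    depends_on:
      - php
  php:
    image: php:fpm-alpine        # assumption: placeholder image
    networks:
      - backend

networks:
  backend:
    internal: true
```

When several `-f` files are passed, Compose merges them in order: single-value keys such as `image` are replaced by the later file, while multi-value keys such as `ports` and `environment` are combined.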
### 2. Clear Separation of Concerns
- **Ansible**: Server provisioning ONLY (install Docker, setup users, configure firewall)
- **Gitea Actions**: Application deployment via CI/CD pipelines
- **Docker Compose**: Runtime orchestration and service management
### 3. Explicit Configuration
- **Absolute Paths**: No relative paths in volume mounts
- **Named Volumes**: For persistent data (databases, caches)
- **Environment Variables**: Clear `.env.{environment}` files
- **Docker Secrets**: File-based secrets via `*_FILE` pattern
### 4. Network Isolation
- **traefik-public**: External network for Traefik ingress
- **backend**: Internal network for application services
- **cache**: Isolated network for Redis
- **app-internal**: External network for shared PostgreSQL
## Service Architecture
### Core Services
```
┌─────────────────────────────────────────────────────────────┐
│                          Internet                           │
└───────────────────────────┬─────────────────────────────────┘
                    ┌───────▼────────┐
                    │    Traefik     │ (traefik-public)
                    │ Reverse Proxy  │
                    └───────┬────────┘
        ┌───────────────────┼───────────────────┐
        │                   │                   │
    ┌───▼────┐       ┌──────▼──────┐     ┌──────▼──────┐
    │  Web   │       │     PHP     │     │    Queue    │
    │ Nginx  │◄──────│   PHP-FPM   │     │   Worker    │
    └────────┘       └──────┬──────┘     └──────┬──────┘
                            │                   │
  (backend network)         │                   │
                            │                   │
         ┌──────────────────┼───────────────────┤
         │                  │                   │
     ┌───▼──────┐    ┌──────▼──────┐     ┌──────▼──────┐
     │  Redis   │    │ PostgreSQL  │     │    MinIO    │
     │  Cache   │    │ (External)  │     │   Storage   │
     └──────────┘    └─────────────┘     └─────────────┘
```
### Service Responsibilities
**web** (Nginx):
- Static file serving
- PHP-FPM proxy
- HTTPS termination (via Traefik)
- Security headers
**php** (PHP-FPM):
- Application runtime
- Framework code execution
- Database connections
- Queue job dispatching
**postgres** (PostgreSQL):
- Primary database
- **External Stack**: Shared across environments via `app-internal` network
- Backup automation via separate container
**redis** (Redis):
- Session storage
- Cache layer
- Queue backend
**queue-worker** (PHP CLI):
- Background job processing
- Scheduled task execution
- Async operations
**minio** (S3-compatible storage):
- File uploads
- Asset storage
- Backup storage
**traefik** (Reverse Proxy):
- Dynamic routing
- SSL/TLS termination
- Let's Encrypt automation
- Load balancing
## Environment Specifications
### docker-compose.local.yml (Development)
**Purpose**: Fast local development with debugging enabled
**Key Features**:
- Development ports: 8888:80, 443:443, 5433:5432
- Host volume mounts for live code editing: `./ → /var/www/html`
- Xdebug enabled: `XDEBUG_MODE=debug`
- Debug flags: `APP_DEBUG=true`
- Docker socket access: `/var/run/docker.sock` (for Docker management)
- Relaxed resource limits
**Services**:
```yaml
services:
  web:
    ports:
      - "8888:80"
      - "443:443"
    environment:
      - APP_ENV=development
    volumes:
      - ./:/var/www/html:cached
    restart: unless-stopped
  php:
    volumes:
      - ./:/var/www/html:cached
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - APP_DEBUG=true
      - XDEBUG_MODE=debug
      - DB_HOST=postgres # External PostgreSQL Stack
      - DB_PASSWORD_FILE=/run/secrets/db_user_password
    secrets:
      - db_user_password
      - redis_password
      - app_key
    networks:
      - backend
      - app-internal # External PostgreSQL Stack
  redis:
    # Compose does not run `command` through a shell, so `$(cat ...)` needs `sh -c`;
    # `$$` escapes `$` from Compose variable interpolation.
    command: ["sh", "-c", "exec redis-server --requirepass \"$$(cat /run/secrets/redis_password)\""]
    secrets:
      - redis_password
```
**Networks**:
- `backend`: Internal communication (web ↔ php)
- `cache`: Redis isolation
- `app-internal`: **External** - connects to PostgreSQL Stack
**Secrets**: File-based in `./secrets/` directory (gitignored)
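
Each secret a service lists must also be declared at the top level of the Compose file. The fragment below sketches what that declaration could look like for the local environment; the file paths are assumptions based on the `secrets/local/` layout described under Secret Management, and the mapping of secret names to file names is a guess.

```yaml
# Top-level secrets for docker-compose.local.yml (hypothetical sketch)
secrets:
  db_user_password:
    file: ./secrets/local/db_password.txt     # assumption: path and file name
  redis_password:
    file: ./secrets/local/redis_password.txt  # assumption
  app_key:
    file: ./secrets/local/app_key.txt         # assumption
```

Compose mounts each declared secret into the container at `/run/secrets/<secret name>`, which is the path the `*_FILE` variables point to.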
### docker-compose.staging.yml (Staging)
**Purpose**: Production-like environment for testing deployments
**Key Features**:
- Traefik with Let's Encrypt **staging** certificates
- Production-like resource limits (moderate)
- External PostgreSQL via `app-internal` network
- No host mounts - code baked into Docker image
- Moderate logging (JSON format)
**Services**:
```yaml
services:
  web:
    image: registry.michaelschiemer.de/web:${GIT_COMMIT}
    networks:
      - traefik-public
      - backend
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.web-staging.rule=Host(`staging.michaelschiemer.de`)"
      - "traefik.http.routers.web-staging.entrypoints=websecure"
      - "traefik.http.routers.web-staging.tls.certresolver=letsencrypt-staging"
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: "0.5"
        reservations:
          memory: 128M
  php:
    image: registry.michaelschiemer.de/php:${GIT_COMMIT}
    environment:
      - APP_ENV=staging
      - APP_DEBUG=false
      - XDEBUG_MODE=off
      - DB_HOST=postgres
      - DB_PASSWORD_FILE=/run/secrets/db_user_password_staging
    secrets:
      - db_user_password_staging
      - redis_password_staging
      - app_key_staging
    networks:
      - backend
      - app-internal
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
  traefik:
    image: traefik:v3.0
    command:
      - "--certificatesresolvers.letsencrypt-staging.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
    networks:
      - traefik-public
```
**Networks**:
- `traefik-public`: **External** - shared Traefik network
- `backend`: Internal application network
- `app-internal`: **External** - shared PostgreSQL network
**Image Strategy**: Pre-built images from Gitea registry, tagged with Git commit SHA
### docker-compose.prod.yml (Production)
**Purpose**: Hardened production environment with full security
**Key Features**:
- Production SSL certificates (Let's Encrypt production CA)
- Strict security: `APP_DEBUG=false`, `XDEBUG_MODE=off`
- Resource limits: production-grade (higher than staging)
- Health checks for all services
- Read-only root filesystem where possible
- No-new-privileges security option
- Comprehensive logging
**Services**:
```yaml
services:
  web:
    image: registry.michaelschiemer.de/web:${GIT_TAG}
    read_only: true
    security_opt:
      - no-new-privileges:true
    networks:
      - traefik-public
      - backend
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.web-prod.rule=Host(`michaelschiemer.de`) || Host(`www.michaelschiemer.de`)"
      - "traefik.http.routers.web-prod.entrypoints=websecure"
      - "traefik.http.routers.web-prod.tls.certresolver=letsencrypt"
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
        reservations:
          memory: 256M
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
  php:
    image: registry.michaelschiemer.de/php:${GIT_TAG}
    security_opt:
      - no-new-privileges:true
    environment:
      - APP_ENV=production
      - APP_DEBUG=false
      - XDEBUG_MODE=off
      - DB_HOST=postgres
      - DB_PASSWORD_FILE=/run/secrets/db_user_password_prod
    secrets:
      - db_user_password_prod
      - redis_password_prod
      - app_key_prod
    networks:
      - backend
      - app-internal
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: "2.0"
        reservations:
          memory: 512M
    healthcheck:
      test: ["CMD", "php-fpm-healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
  traefik:
    image: traefik:v3.0
    command:
      - "--certificatesresolvers.letsencrypt.acme.caserver=https://acme-v02.api.letsencrypt.org/directory"
    networks:
      - traefik-public
```
**Image Strategy**: Release-tagged images from Gitea registry (semantic versioning)
**Security Hardening**:
- Read-only root filesystem
- No privilege escalation
- AppArmor/SELinux profiles
- Resource quotas enforced
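
A read-only root filesystem usually still needs a few writable scratch locations. The sketch below shows one way to provide them with `tmpfs` mounts; the paths are assumptions about where a stock Nginx image writes at runtime and would need to be checked against the actual `web` image.

```yaml
# Hypothetical hardening fragment for the web service
services:
  web:
    read_only: true
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /var/cache/nginx   # assumption: temp files for proxy/fastcgi buffering
      - /var/run           # assumption: location of nginx.pid
      - /tmp
```

Data written to `tmpfs` lives in memory and is discarded when the container stops, so this suits caches and PID files but not logs or uploads that must persist.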
## Volume Strategy
### Named Volumes (Persistent Data)
**Database Volumes**:
```yaml
volumes:
  postgres-data:
    driver: local
  redis-data:
    driver: local
  minio-data:
    driver: local
```
**Characteristics**:
- Managed by Docker
- Persisted across container restarts
- Backed up regularly
### Bind Mounts (Development Only)
**Local Development**:
```yaml
volumes:
  - /absolute/path/to/project:/var/www/html:cached
  - /absolute/path/to/storage/logs:/var/www/html/storage/logs:rw
```
**Rules**:
- **Absolute paths ONLY** - no relative paths
- Development environment only
- Not used in staging/production
### Volume Mount Patterns
**Application Code**:
- **Local**: Bind mount (`./:/var/www/html`) for live editing
- **Staging/Prod**: Baked into Docker image (no mount)
**Logs**:
- **All Environments**: Named volume or bind mount to host for persistence
**Uploads/Assets**:
- **All Environments**: MinIO for S3-compatible storage
## Secret Management
### Docker Secrets via File Pattern
**Framework Support**: Custom PHP Framework supports `*_FILE` environment variable pattern
**Example**:
```yaml
# Environment variable points to secret file
environment:
  - DB_PASSWORD_FILE=/run/secrets/db_password
# Secret definition
secrets:
  db_password:
    file: ./secrets/db_password.txt
```
### Secret Files Structure
```
deployment/
├── secrets/                  # Gitignored!
│   ├── local/
│   │   ├── db_password.txt
│   │   ├── redis_password.txt
│   │   └── app_key.txt
│   ├── staging/
│   │   ├── db_password.txt
│   │   ├── redis_password.txt
│   │   └── app_key.txt
│   └── production/
│       ├── db_password.txt
│       ├── redis_password.txt
│       └── app_key.txt
```
**Security**:
- **NEVER commit secrets** to version control
- Add `secrets/` to `.gitignore`
- Use Ansible Vault or external secret manager for production secrets
- Rotate secrets regularly
### Framework Integration
Framework automatically loads secrets via `EncryptedEnvLoader`:
```php
// Framework automatically resolves *_FILE variables
$dbPassword = $env->get('DB_PASSWORD'); // Reads from DB_PASSWORD_FILE
$redisPassword = $env->get('REDIS_PASSWORD'); // Reads from REDIS_PASSWORD_FILE
```
## Environment Variables Strategy
### .env Files per Environment
**Structure**:
```
deployment/
├── .env.local # Local development
├── .env.staging # Staging environment
├── .env.production # Production environment
└── .env.example # Template (committed to git)
```
**Composition Command**:
```bash
# Local
docker compose -f docker-compose.base.yml -f docker-compose.local.yml --env-file .env.local up
# Staging
docker compose -f docker-compose.base.yml -f docker-compose.staging.yml --env-file .env.staging up
# Production
docker compose -f docker-compose.base.yml -f docker-compose.prod.yml --env-file .env.production up
```
### Variable Categories
**Application**:
```bash
APP_ENV=production
APP_DEBUG=false
APP_NAME="Michael Schiemer"
APP_URL=https://michaelschiemer.de
```
**Database**:
```bash
DB_HOST=postgres
DB_PORT=5432
DB_DATABASE=michaelschiemer
DB_USERNAME=postgres
# DB_PASSWORD via secrets: DB_PASSWORD_FILE=/run/secrets/db_password
```
**Cache**:
```bash
REDIS_HOST=redis
REDIS_PORT=6379
# REDIS_PASSWORD via secrets: REDIS_PASSWORD_FILE=/run/secrets/redis_password
```
**Image Tags** (Staging/Production):
```bash
GIT_COMMIT=abc123def456 # Staging
GIT_TAG=v2.1.0 # Production
```
## Service Dependencies and Startup Order
### Dependency Graph
```
traefik (independent)
postgres (external stack)
redis (independent)
php (depends: postgres, redis)
web (depends: php)
queue-worker (depends: php, postgres, redis)
minio (independent)
```
### docker-compose.yml Dependency Specification
```yaml
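services:
  php:
    depends_on:
      # Note: depends_on only accepts services defined in the same Compose
      # project, so this entry requires a postgres service in this project.
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
  web:
    depends_on:
      php:
        condition: service_started
  queue-worker:
    depends_on:
      php:
        condition: service_started
      postgres:
        condition: service_healthy
      redis:
        condition: service_started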
```
**Health Checks**:
- PostgreSQL: `pg_isready` check
- Redis: `redis-cli PING` check
- PHP-FPM: Custom health check script
- Nginx: `curl http://localhost/health`
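
The PostgreSQL and Redis checks listed above could be expressed roughly as follows. The user and database names are taken from the example variables in this document; the intervals, timeouts, and retry counts are assumptions, and the `postgres` entry would belong in the external PostgreSQL stack rather than in the application's Compose files.

```yaml
# Hypothetical healthcheck fragment
services:
  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d michaelschiemer"]
      interval: 10s   # assumption
      timeout: 5s     # assumption
      retries: 5      # assumption
  redis:
    healthcheck:
      # Redis requires a password here, so an unauthenticated PING would be rejected.
      # `$$` escapes `$` so the container shell, not Compose, expands the command.
      test: ["CMD-SHELL", "redis-cli --no-auth-warning -a \"$$(cat /run/secrets/redis_password)\" ping | grep -q PONG"]
      interval: 10s   # assumption
      timeout: 5s     # assumption
      retries: 5      # assumption
```

Once a service defines a health check, dependents can wait on it with `condition: service_healthy` instead of `service_started`.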
## CI/CD Pipeline Design
### Gitea Actions Workflows
**Directory Structure**:
```
.gitea/
└── workflows/
    ├── build-app.yml           # Build & Test
    ├── deploy-staging.yml      # Deploy to Staging
    └── deploy-production.yml   # Deploy to Production
```
### Workflow 1: Build & Test (`build-app.yml`)
**Triggers**:
- Push to any branch
- Pull request to `develop` or `main`
**Steps**:
1. Checkout code
2. Setup PHP 8.5, Node.js
3. Install dependencies (`composer install`, `npm install`)
4. Run PHP tests (`./vendor/bin/pest`)
5. Run JS tests (`npm test`)
6. Build frontend assets (`npm run build`)
7. Build Docker images (`docker build -t registry.michaelschiemer.de/php:${COMMIT_SHA} .`)
8. Push to Gitea registry
9. Security scan (Trivy)
**Artifacts**: Docker images tagged with Git commit SHA
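
The steps above can be sketched as a Gitea Actions workflow. Gitea Actions uses a GitHub Actions compatible syntax, so this draft relies on the commonly used `actions/checkout`, `shivammathur/setup-php`, and `actions/setup-node` actions. The runner label, Node.js version, and secret names are assumptions, as is the presence of Docker and Trivy on the runner.

```yaml
# .gitea/workflows/build-app.yml (hypothetical sketch)
name: Build & Test
on:
  push:
    branches: ["**"]
  pull_request:
    branches: [develop, main]

jobs:
  build:
    runs-on: ubuntu-latest            # assumption: runner label
    steps:
      - uses: actions/checkout@v4
      - uses: shivammathur/setup-php@v2
        with:
          php-version: "8.5"
      - uses: actions/setup-node@v4
        with:
          node-version: "22"          # assumption: Node.js version is not stated in this document
      - run: composer install --no-interaction --prefer-dist
      - run: npm install
      - run: ./vendor/bin/pest
      - run: npm test
      - run: npm run build
      - name: Build and push image
        run: |
          # REGISTRY_USER / REGISTRY_PASSWORD are assumed secret names
          echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login registry.michaelschiemer.de -u "${{ secrets.REGISTRY_USER }}" --password-stdin
          docker build -t registry.michaelschiemer.de/php:${{ github.sha }} .
          docker push registry.michaelschiemer.de/php:${{ github.sha }}
      - name: Security scan
        run: trivy image registry.michaelschiemer.de/php:${{ github.sha }}
```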
### Workflow 2: Deploy to Staging (`deploy-staging.yml`)
**Triggers**:
- Merge to `develop` branch (automatic)
- Manual trigger via Gitea UI
**Steps**:
1. Checkout code
2. Pull Docker images from registry (`registry.michaelschiemer.de/php:${COMMIT_SHA}`)
3. SSH to staging server
4. Export environment variables (`GIT_COMMIT=${COMMIT_SHA}`)
5. Run docker compose: `docker compose -f docker-compose.base.yml -f docker-compose.staging.yml --env-file .env.staging up -d`
6. Wait for health checks
7. Run smoke tests
8. Notify via webhook (success/failure)
**Rollback**: Keep previous image tag, redeploy on failure
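
A staging deployment along these lines might look like the following sketch. The secret names, the `/opt/deployment` directory on the server, and the availability of `workflow_dispatch` in the installed Gitea version are all assumptions. The `--wait` flag makes `docker compose up` block until services are running or healthy, which covers the "wait for health checks" step.

```yaml
# .gitea/workflows/deploy-staging.yml (hypothetical sketch)
name: Deploy to Staging
on:
  push:
    branches: [develop]
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest                              # assumption: runner label
    steps:
      - name: Deploy over SSH
        env:
          SSH_KEY: ${{ secrets.STAGING_SSH_KEY }}       # assumption: secret name
          SSH_TARGET: ${{ secrets.STAGING_SSH_TARGET }} # assumption: e.g. user@host
        run: |
          mkdir -p ~/.ssh
          printf '%s\n' "$SSH_KEY" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          # /opt/deployment is an assumed location of the Compose files on the server
          ssh -i ~/.ssh/deploy_key -o StrictHostKeyChecking=accept-new "$SSH_TARGET" \
            "cd /opt/deployment && GIT_COMMIT=${{ github.sha }} docker compose -f docker-compose.base.yml -f docker-compose.staging.yml --env-file .env.staging up -d --wait"
```

Smoke tests and the webhook notification would follow as further steps; they are left out because this document does not specify them.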
### Workflow 3: Deploy to Production (`deploy-production.yml`)
**Triggers**:
- Git tag push (e.g., `v2.1.0`) - **manual approval required**
- Manual trigger via Gitea UI
**Steps**:
1. **Manual Approval Gate** - require approval from maintainer
2. Checkout code at tag
3. Pull Docker images from registry (`registry.michaelschiemer.de/php:${GIT_TAG}`)
4. SSH to production server
5. Create backup of current deployment
6. Export environment variables (`GIT_TAG=${TAG}`)
7. Run docker compose: `docker compose -f docker-compose.base.yml -f docker-compose.prod.yml --env-file .env.production up -d`
8. Wait for health checks (extended timeout)
9. Run smoke tests
10. Monitor metrics for 5 minutes
11. Notify via webhook (success/failure)
**Rollback Procedure**:
1. Detect deployment failure (health checks fail)
2. Automatically revert to previous Git tag
3. Run deployment with previous image
4. Notify team of rollback
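
One possible shape for the automatic rollback is a workflow step that runs only when an earlier step failed and redeploys the previous tag. The fragment below is a sketch under several assumptions: SSH access is already configured on the runner, the Compose files live in `/opt/deployment`, and the previous tag is supplied through a placeholder `PREVIOUS_TAG` value. It also omits the manual approval gate and backup step.

```yaml
# Fragment of .gitea/workflows/deploy-production.yml (hypothetical sketch)
jobs:
  deploy:
    runs-on: ubuntu-latest                            # assumption: runner label
    env:
      SSH_TARGET: ${{ secrets.PROD_SSH_TARGET }}      # assumption: secret name
      PREVIOUS_TAG: v2.0.0                            # placeholder: would be recorded by the last successful deploy
      COMPOSE: docker compose -f docker-compose.base.yml -f docker-compose.prod.yml --env-file .env.production
    steps:
      - name: Deploy new tag
        run: |
          # --wait fails the step if services do not become healthy within the timeout
          ssh "$SSH_TARGET" "cd /opt/deployment && GIT_TAG=${{ github.ref_name }} $COMPOSE up -d --wait --wait-timeout 300"
      - name: Roll back on failure
        if: failure()
        run: |
          ssh "$SSH_TARGET" "cd /opt/deployment && GIT_TAG=$PREVIOUS_TAG $COMPOSE up -d --wait"
```

Because the rollback reuses the same Compose command with a different `GIT_TAG`, it only works if the previous image is still available in the registry or on the server.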
### Deployment Safety
**Blue-Green Deployment** (Future Enhancement):
- Run new version alongside old version
- Switch traffic via Traefik routing
- Instant rollback by switching back
**Canary Deployment** (Future Enhancement):
- Route 10% traffic to new version
- Monitor error rates
- Gradually increase to 100%
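
Traefik's weighted round-robin service is one way such a traffic split could be configured. The sketch below uses Traefik's file provider for dynamic configuration; the router name and the two `@docker` service names are hypothetical and assume that both versions run as separate containers discovered by the Docker provider.

```yaml
# Traefik dynamic configuration via the file provider (hypothetical sketch)
http:
  routers:
    web-canary-router:
      rule: "Host(`michaelschiemer.de`)"
      entryPoints:
        - websecure
      service: web-weighted
      tls:
        certResolver: letsencrypt
  services:
    web-weighted:
      weighted:
        services:
          - name: web-stable@docker   # assumption: current version
            weight: 90
          - name: web-canary@docker   # assumption: new version
            weight: 10
```

Setting the weights to 100 and 0 turns the same configuration into a blue-green switch, and swapping them back gives the rollback.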
## Network Architecture
### Network Definitions
```yaml
networks:
  traefik-public:
    external: true
    name: traefik-public
  backend:
    internal: true
    driver: bridge
  cache:
    internal: true
    driver: bridge
  app-internal:
    external: true
    name: app-internal
```
### Network Isolation
**traefik-public** (External):
- Services: traefik, web
- Purpose: Ingress from internet
- Isolation: Public-facing only
**backend** (Internal):
- Services: web, php, queue-worker
- Purpose: Application communication
- Isolation: No external access
**cache** (Internal):
- Services: redis
- Purpose: Cache isolation
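- Isolation: Reachable only by services that are also attached to `cache`; Docker does not route traffic between networks, so `php` and `queue-worker` must join `cache` to reach Redis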
**app-internal** (External):
- Services: php, queue-worker, postgres (external stack)
- Purpose: Shared PostgreSQL access across environments
- Isolation: Multi-environment shared resource
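
Containers can only talk to each other when they share at least one network, so the isolation described above depends on how each service is attached. The following fragment is one possible attachment scheme; placing `php` and `queue-worker` on `cache` is an assumption made so that they can reach Redis, and it is not shown in the environment examples elsewhere in this document.

```yaml
# Hypothetical network attachments per service
services:
  web:
    networks: [traefik-public, backend]
  php:
    networks: [backend, cache, app-internal]          # assumption: cache added for Redis access
  queue-worker:
    networks: [backend, cache, app-internal]          # assumption
  redis:
    networks: [cache]
```

Networks marked `external: true` are not created by Compose and must exist before `docker compose up` runs, for example via a one-time `docker network create app-internal`.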
### Service Discovery
Docker DNS automatically resolves service names:
- `php` resolves to PHP-FPM container IP
- `redis` resolves to Redis container IP
- `postgres` resolves to external PostgreSQL stack IP
No manual IP configuration required.
## Migration from Legacy System
### Migration Steps
1. **COMPLETED** - Archive legacy deployment to `deployment/legacy/`
2. **COMPLETED** - Document legacy issues in `ARCHITECTURE_ANALYSIS.md`
3. **COMPLETED** - Design new architecture (this document)
4. **NEXT** - Implement `docker-compose.base.yml`
5. **NEXT** - Implement `docker-compose.local.yml`
6. **NEXT** - Test local environment
7. **PENDING** - Implement `docker-compose.staging.yml`
8. **PENDING** - Deploy to staging server
9. **PENDING** - Implement `docker-compose.prod.yml`
10. **PENDING** - Setup Gitea Actions workflows
11. **PENDING** - Deploy to production via CI/CD
### Data Migration
**Database**:
- Export from legacy PostgreSQL: `pg_dump -Fc` (custom format; `pg_restore` cannot read plain SQL dumps)
- Import to new PostgreSQL: `pg_restore`
- Verify data integrity
**Secrets**:
- Extract secrets from legacy Ansible Vault
- Create new secret files in `deployment/secrets/`
- Update environment variables
**SSL Certificates**:
- Reuse existing Let's Encrypt certificates (copy `acme.json`)
- Or regenerate via Traefik ACME
## Comparison: Legacy vs New
| Aspect | Legacy System | New Architecture |
|--------|---------------|------------------|
| **Orchestration** | Docker Swarm + Docker Compose (confused) | Docker Compose only |
| **Deployment** | Ansible playbooks (unclear responsibility) | Gitea Actions CI/CD |
| **Environment Files** | Scattered stack files (9+ directories) | 1 base file + 3 environment files (local/staging/prod) |
| **Volume Mounts** | Relative paths (causing failures) | Absolute paths + named volumes |
| **Secrets** | Docker Swarm secrets (not working) | File-based secrets via `*_FILE` |
| **Networks** | Unclear dependencies | Explicit network definitions |
| **SSL** | Let's Encrypt (working) | Let's Encrypt (preserved) |
| **PostgreSQL** | Embedded in each stack | External shared stack |
## Benefits of New Architecture
1. **Clarity**: Single source of truth per environment
2. **Maintainability**: Clear separation of concerns (Ansible vs CI/CD)
3. **Debuggability**: Explicit configuration, no hidden magic
4. **Scalability**: Easy to add new environments or services
5. **Security**: File-based secrets, network isolation
6. **CI/CD Integration**: Automated deployments via Gitea Actions
7. **Rollback Safety**: Git-tagged releases, health checks
## Next Steps
1. **Implement Base Configuration**: Create `docker-compose.base.yml`
2. **Test Local Environment**: Verify `docker-compose.local.yml` works
3. **Setup Staging**: Deploy to staging server, test deployment pipeline
4. **Production Deployment**: Manual approval, monitoring
5. **Documentation**: Update README with new deployment procedures
---
**References**:
- Legacy system analysis: `deployment/legacy/ARCHITECTURE_ANALYSIS.md`
- Docker Compose documentation: https://docs.docker.com/compose/
- Traefik v3 documentation: https://doc.traefik.io/traefik/
- Gitea Actions: https://docs.gitea.com/usage/actions/overview