# Production Docker Compose Configuration

Production Docker Compose configuration with security hardening, performance optimization, and monitoring for the Custom PHP Framework.

## Overview

The project uses the Docker Compose overlay pattern:
- **Base**: `docker-compose.yml` - development environment
- **Production**: `docker-compose.production.yml` - production-specific overrides

## Usage

```bash
# Start the production stack
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d

# With build (after changes)
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --build

# Stop the stack
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               down

# Show logs
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               logs -f [service]

# Service health check
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               ps
```

## Production Overrides

### 1. Web (Nginx) Service

**Restart Policy**:
```yaml
restart: always  # Restart automatically after failures
```

**SSL/TLS Configuration**:
```yaml
volumes:
  - certbot-conf:/etc/letsencrypt:ro
  - certbot-www:/var/www/certbot:ro
```
- Let's Encrypt certificates via Certbot
- Read-only mounts for security

**Health Checks**:
```yaml
healthcheck:
  # -k: the certificate is issued for the public domain, not for localhost,
  # so curl would otherwise reject it
  test: ["CMD", "curl", "-f", "-k", "https://localhost/health"]
  interval: 15s
  timeout: 5s
  retries: 5
  start_period: 30s
```
- HTTPS health check against the `/health` endpoint
- 15-second interval for fast failure detection
- 5 consecutive failures before the container is marked unhealthy

**Resource Limits**:
```yaml
deploy:
  resources:
    limits:
      memory: 512M
      cpus: '1.0'
    reservations:
      memory: 256M
      cpus: '0.5'
```
- Nginx is lightweight, so moderate limits suffice

**Logging**:
```yaml
logging:
  driver: json-file
  options:
    max-size: "10m"
    max-file: "5"
    compress: "true"
    labels: "service,environment"
```
- JSON format for log aggregation (ELK Stack compatible)
- 10MB per file, 5 files = 50MB total
- Rotated files are compressed

### 2. PHP Service

**Restart Policy**:
```yaml
restart: always
```

**Build Configuration**:
```yaml
build:
  args:
    - ENV=production
    - COMPOSER_INSTALL_FLAGS=--no-dev --optimize-autoloader --classmap-authoritative
```
- `--no-dev`: no development dependencies
- `--optimize-autoloader`: converts PSR-4/PSR-0 autoloading into a classmap
- `--classmap-authoritative`: no filesystem lookups for classes missing from the classmap (performance)

**Environment**:
```yaml
environment:
  - APP_ENV=production
  - APP_DEBUG=false  # Debug OFF in production!
  - PHP_MEMORY_LIMIT=512M
  - PHP_MAX_EXECUTION_TIME=30
  - XDEBUG_MODE=off  # Xdebug off for performance
```

**Health Checks**:
```yaml
healthcheck:
  test: ["CMD", "php-fpm-healthcheck"]
  interval: 15s
  timeout: 5s
  retries: 5
  start_period: 30s
```
- PHP-FPM health check via a custom script
- Fast failure detection

**Resource Limits**:
```yaml
deploy:
  resources:
    limits:
      memory: 1G
      cpus: '2.0'
    reservations:
      memory: 512M
      cpus: '1.0'
```
- PHP needs more memory than Nginx
- 2 CPUs for parallel request processing

**Volumes**:
```yaml
volumes:
  - storage-logs:/var/www/html/storage/logs:rw
  - storage-cache:/var/www/html/storage/cache:rw
  - storage-queue:/var/www/html/storage/queue:rw
  - storage-discovery:/var/www/html/storage/discovery:rw
  - storage-uploads:/var/www/html/storage/uploads:rw
```
- Only the required Docker volumes
- **NO host mounts** for security
- Application code lives in the image (not mounted)

### 3. Database (PostgreSQL 16) Service

**Restart Policy**:
```yaml
restart: always
```

**Production Configuration**:
```yaml
volumes:
  - db_data:/var/lib/postgresql/data
  - ./docker/postgres/postgresql.production.conf:/etc/postgresql/postgresql.conf:ro
  - ./docker/postgres/init:/docker-entrypoint-initdb.d:ro
```
- Production-tuned `postgresql.production.conf`
- Init scripts for schema setup

**Resource Limits**:
```yaml
deploy:
  resources:
    limits:
      memory: 2G
      cpus: '2.0'
    reservations:
      memory: 1G
      cpus: '1.0'
```
- PostgreSQL needs memory for `shared_buffers` (2GB in the config)
- 2 CPUs for parallel query processing

**Health Checks**:
```yaml
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U ${DB_USERNAME:-postgres} -d ${DB_DATABASE:-michaelschiemer}"]
  interval: 10s
  timeout: 3s
  retries: 5
  start_period: 30s
```
- `pg_isready` for a fast connection check
- 10-second interval (more frequent than the Nginx and PHP checks)

**Logging**:
```yaml
logging:
  driver: json-file
  options:
    max-size: "20m"  # Larger log files for PostgreSQL
    max-file: "10"
    compress: "true"
```
- PostgreSQL logs more (slow queries, checkpoints, etc.)
- 20MB per file, 10 files = 200MB total

### 4. Redis Service

**Restart Policy**:
```yaml
restart: always
```

**Resource Limits**:
```yaml
deploy:
  resources:
    limits:
      memory: 512M
      cpus: '1.0'
    reservations:
      memory: 256M
      cpus: '0.5'
```
- Redis is memory-based, moderate limits

**Health Checks**:
```yaml
healthcheck:
  test: ["CMD", "redis-cli", "--raw", "incr", "ping"]
  interval: 10s
  timeout: 3s
  retries: 5
  start_period: 10s
```
- `redis-cli incr ping` increments a counter key named `ping`, which checks that Redis accepts commands (a plain `redis-cli ping` only checks the connection)
- Fast start (10s `start_period`)

### 5. Queue Worker Service

**Restart Policy**:
```yaml
restart: always
```

**Environment**:
```yaml
environment:
  - APP_ENV=production
  - WORKER_DEBUG=false
  - WORKER_SLEEP_TIME=100000
  - WORKER_MAX_JOBS=10000
```
- Production mode without debug output
- 10,000 jobs per worker lifecycle

**Resource Limits**:
```yaml
deploy:
  resources:
    limits:
      memory: 2G
      cpus: '2.0'
    reservations:
      memory: 1G
      cpus: '1.0'
  replicas: 2  # 2 worker instances
```
- Workers need memory for job processing
- **2 replicas** for parallelism

**Graceful Shutdown**:
```yaml
stop_grace_period: 60s
```
- 60 seconds for running jobs to complete before shutdown
- Prevents aborted jobs

**Logging**:
```yaml
logging:
  driver: json-file
  options:
    max-size: "20m"
    max-file: "10"
    compress: "true"
```
- Workers log verbosely (job start, completion, errors)
- 200MB total log storage

### 6. Certbot Service

**Restart Policy**:
```yaml
restart: always
```

**Auto-Renewal**:
```yaml
entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew --webroot -w /var/www/certbot --quiet; sleep 12h & wait $${!}; done;'"
```
- Automatic renewal check every 12 hours
- Webroot challenge served by Nginx

**Volumes**:
```yaml
volumes:
  - certbot-conf:/etc/letsencrypt
  - certbot-www:/var/www/certbot
  - certbot-logs:/var/log/letsencrypt
```
- Certificates are shared with Nginx

## Network Configuration

**Security Isolation**:
```yaml
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true  # Backend network is internal (no internet access)
  cache:
    driver: bridge
    internal: true  # Cache network is internal
```

**Network Segmentation**:
- **Frontend**: Nginx, Certbot (internet access)
- **Backend**: PHP, PostgreSQL, Queue Worker (NO internet access)
- **Cache**: Redis (NO internet access)

**Security Benefits**:
- Backend services cannot initiate outbound connections
- Limits data exfiltration if a service is compromised
- Supports a zero-trust network architecture

## Volumes Configuration

**SSL/TLS Volumes**:
```yaml
certbot-conf:
  driver: local
certbot-www:
  driver: local
certbot-logs:
  driver: local
```

**Application Storage Volumes**:
```yaml
storage-logs:
  driver: local
storage-cache:
  driver: local
storage-queue:
  driver: local
storage-discovery:
  driver: local
storage-uploads:
  driver: local
```

**Database Volume**:
```yaml
db_data:
  driver: local
  # Optional: External volume for backups
  # driver_opts:
  #   type: none
  #   o: bind
  #   device: /mnt/db-backups/michaelschiemer-prod
```

**Volume Best Practices**:
- All named volumes use `driver: local` (not host mounts)
- For backups: optional external volume for the database
- No development host mounts in production

## Logging Strategy

**JSON Logging** for all services:
```yaml
logging:
  driver: json-file
  options:
    max-size: "10m"  # Service-dependent
    max-file: "5"    # Service-dependent
    compress: "true"
    labels: "service,environment"
```
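
The `max-size` and `max-file` options above cap the log storage of a container at roughly their product (rotated files are compressed on top of that). A minimal sketch of this arithmetic, using a hypothetical helper `log_bound_mb` that is not part of the project:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the project): upper bound in MB for one
# container's json-file logs, computed as max-size (MB) x max-file.
log_bound_mb() {
  local max_size_mb=$1 max_files=$2
  echo $(( max_size_mb * max_files ))
}

log_bound_mb 10 5    # Nginx settings from above: prints 50
log_bound_mb 20 10   # PostgreSQL settings from above: prints 200
```

The `json-file` driver rotates logs per container, so a service that runs several replicas can use up to this bound once per replica.
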

**Log Rotation**:

| Service | Max Size | Max Files | Total Storage |
|---------|----------|-----------|---------------|
| Nginx | 10MB | 5 | 50MB |
| PHP | 10MB | 10 | 100MB |
| PostgreSQL | 20MB | 10 | 200MB |
| Redis | 10MB | 5 | 50MB |
| Queue Worker | 20MB | 10 | 200MB |
| Certbot | 5MB | 3 | 15MB |
| **TOTAL** | | | **615MB** |

**Log Aggregation**:
- JSON format for the ELK Stack (Elasticsearch, Logstash, Kibana)
- Labels for service identification
- Compressed log files for storage efficiency

## Resource Allocation

**Total Resource Requirements**:

| Service | Memory Limit | Memory Reservation | CPU Limit | CPU Reservation |
|---------|--------------|-------------------|-----------|-----------------|
| Nginx | 512M | 256M | 1.0 | 0.5 |
| PHP | 1G | 512M | 2.0 | 1.0 |
| PostgreSQL | 2G | 1G | 2.0 | 1.0 |
| Redis | 512M | 256M | 1.0 | 0.5 |
| Queue Worker (x2) | 4G | 2G | 4.0 | 2.0 |
| **TOTAL** | **8GB** | **4GB** | **10 CPUs** | **5 CPUs** |

**Server Sizing Recommendations**:
- **Minimum**: 8GB RAM, 4 CPUs (matches the total memory limits; CPU is overcommitted)
- **Recommended**: 16GB RAM, 8 CPUs (headroom for OS and spikes)
- **Optimal**: 32GB RAM, 16 CPUs (production with monitoring)

## Health Checks

**Health Check Strategy**:

| Service | Check | Interval | Timeout | Retries | Start Period |
|---------|-------|----------|---------|---------|--------------|
| Nginx | HTTPS /health | 15s | 5s | 5 | 30s |
| PHP | php-fpm-healthcheck | 15s | 5s | 5 | 30s |
| PostgreSQL | pg_isready | 10s | 3s | 5 | 30s |
| Redis | redis-cli incr ping | 10s | 3s | 5 | 10s |

**Health Check Benefits**:
- Failures are detected automatically and reported as `unhealthy`
- The restart policy only reacts when a container exits; restarting an unhealthy but still running container needs an orchestrator or an external watchdog
- Health status is visible via `docker-compose ps`

## Deployment Workflow

### Initial Deployment

```bash
# 1. Prepare the server (see production-prerequisites.md)

# 2. Configure .env.production (see env-production-template.md)

# 3. Build and deploy
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --build

# 4. Initialize SSL certificates
docker exec php php console.php ssl:init

# 5. Database migrations
docker exec php php console.php db:migrate

# 6. Verify health checks
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               ps
```

### Rolling Update (Zero-Downtime)

```bash
# 1. Pull the new version
git pull origin main

# 2. Build new images
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               build --no-cache

# 3. Rolling update (service by service)
# Nginx
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --no-deps web

# PHP (after Nginx)
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --no-deps php

# Queue Worker (after PHP)
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --no-deps --scale queue-worker=2 queue-worker

# 4. Verify health checks
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               ps
```

### Rollback Strategy

```bash
# 1. Check out the previous Git commit
git log --oneline -5
git checkout <commit>

# 2. Rebuild and deploy
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --build

# 3. Database rollback (if needed)
docker exec php php console.php db:rollback 1
```

## Monitoring

### Container Status

```bash
# Status of all services
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               ps

# Detailed information
docker inspect <container>
```

### Resource Usage

```bash
# CPU/memory usage
docker stats

# Service-specific
docker stats php db redis
```

### Logs

```bash
# All logs (follow)
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               logs -f

# Service-specific
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               logs -f php

# Last N lines
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               logs --tail=100 php
```

### Health Check Status

```bash
# Health check logs
docker inspect --format='{{json .State.Health}}' php | jq

# Health history
docker inspect --format='{{range .State.Health.Log}}{{.Start}} {{.ExitCode}} {{.Output}}{{end}}' php
```

## Backup Strategy

### Database Backup

```bash
# Manual backup
docker exec db pg_dump -U postgres michaelschiemer_prod > backup_$(date +%Y%m%d_%H%M%S).sql

# Automated backup (cron)
# /etc/cron.daily/postgres-backup
#!/bin/bash
docker exec db pg_dump -U postgres michaelschiemer_prod | gzip > /mnt/backups/michaelschiemer_$(date +%Y%m%d).sql.gz
```

### Volume Backup

```bash
# Back up the database volume (repeat for each volume)
docker run --rm \
  -v michaelschiemer_db_data:/data:ro \
  -v $(pwd)/backups:/backup \
  alpine tar czf /backup/db_data_$(date +%Y%m%d).tar.gz -C /data .
```

## Troubleshooting

### Service Won't Start

```bash
# Check logs
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               logs <service>

# Check configuration
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               config
```

### Health Check Failing

```bash
# Manual health check
docker exec php php-fpm-healthcheck
docker exec db pg_isready -U postgres
docker exec redis redis-cli ping

# Check health logs
docker inspect --format='{{json .State.Health}}' <container> | jq
```

### Memory Issues

```bash
# Check memory usage
docker stats

# Increase limits in docker-compose.production.yml
# Then restart the service
docker-compose -f docker-compose.yml \
               -f docker-compose.production.yml \
               --env-file .env.production \
               up -d --no-deps <service>
```

### Network Issues

```bash
# Check networks
docker network ls
docker network inspect michaelschiemer-prod_backend

# Test connectivity
docker exec php ping db
docker exec php nc -zv db 5432
```

## Security Considerations

### 1. Network Isolation
- ✅ Backend network is internal (no internet access)
- ✅ Cache network is internal
- ✅ Only frontend services expose ports

### 2. Volume Security
- ✅ No host mounts for application code (code is in the image)
- ✅ Read-only mounts where possible (SSL certificates)
- ✅ Named Docker volumes (managed by Docker)

### 3. Secrets Management
- ✅ Use `.env.production` (not committed to git)
- ✅ Use Vault for sensitive data
- ✅ No secrets in docker-compose files

### 4. Resource Limits
- ✅ All services have memory limits (prevent OOM)
- ✅ CPU limits prevent resource starvation
- ✅ Restart policies for automatic recovery

### 5. Logging
- ✅ JSON logging for security monitoring
- ✅ Log rotation prevents disk exhaustion
- ✅ Compressed logs for storage efficiency

## Best Practices

1. **Always use `.env.production`** - Never commit production secrets
2. **Test updates in staging first** - Use the same docker-compose setup
3. **Monitor resource usage** - Adjust limits based on metrics
4. **Regular backups** - Automate database and volume backups
5. **Health checks** - Ensure all services have working health checks
6. **Log aggregation** - Send logs to a centralized logging system (ELK)
7. **SSL renewal** - Monitor Certbot logs for renewal issues
8. **Security updates** - Regularly update Docker images

## See Also

- **Prerequisites**: `docs/deployment/production-prerequisites.md`
- **Environment Configuration**: `docs/deployment/env-production-template.md`
- **SSL Setup**: `docs/deployment/ssl-setup.md`
- **Database Migrations**: `docs/deployment/database-migration-strategy.md`
- **Logging Configuration**: `docs/deployment/logging-configuration.md`
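
As a complement to best practice 5 and the health check commands in the Monitoring section, a deployment script can wait until a container reports `healthy` before it moves on to the next step. A minimal sketch, assuming container names such as `php` and `db` as used in the commands above; `wait_healthy` is a hypothetical helper, not part of the project, and the 5-second poll interval is an arbitrary choice:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the project).
# Usage: wait_healthy CONTAINER [TIMEOUT_SECONDS]
# Polls the container's health status every 5 seconds until it is "healthy".
# Returns 0 on success and 1 if the timeout elapses first.
wait_healthy() {
  local name=$1 timeout=${2:-120} elapsed=0 status=""
  while [ "$elapsed" -lt "$timeout" ]; do
    # .State.Health.Status is "starting", "healthy" or "unhealthy"
    status=$(docker inspect --format='{{.State.Health.Status}}' "$name" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
  echo "timeout: $name is ${status:-unknown}" >&2
  return 1
}
```

A deployment step could then be gated with, for example, `wait_healthy php 60 && wait_healthy db 60` before running the database migrations.
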