Production Docker Compose Configuration
Production Docker Compose configuration with security hardening, performance optimization, and monitoring for the Custom PHP Framework.
Overview
The project uses the Docker Compose overlay pattern:
- Base: docker-compose.yml - development environment
- Production: docker-compose.production.yml - production-specific overrides
Usage
# Start the production stack
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d
# With build (after changes)
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --build
# Stop the stack
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
down
# Show logs
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
logs -f [service]
# Service Health Check
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
ps
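Every command above repeats the same two -f flags. A small shell wrapper can cut that repetition. This is only a minimal sketch: the function name dc and the COMPOSE_BIN override variable are assumptions made for illustration and are not part of the project.

```shell
# Hypothetical helper: wraps docker-compose with both compose files and the
# production env file. COMPOSE_BIN is an assumed override for hosts where the
# binary has a different name.
dc() {
  "${COMPOSE_BIN:-docker-compose}" \
    -f docker-compose.yml \
    -f docker-compose.production.yml \
    --env-file .env.production \
    "$@"
}
```

With the function sourced into the shell, dc up -d starts the stack and dc logs -f php follows the PHP logs.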
Production Overrides
1. Web (Nginx) Service
Restart Policy:
restart: always # Restart automatically on failure
SSL/TLS Configuration:
volumes:
- certbot-conf:/etc/letsencrypt:ro
- certbot-www:/var/www/certbot:ro
- Let's Encrypt certificates via Certbot
- Read-only mounts for security
Health Checks:
healthcheck:
test: ["CMD", "curl", "-f", "https://localhost/health"]
interval: 15s
timeout: 5s
retries: 5
start_period: 30s
- HTTPS health check against the /health endpoint
- 15-second interval for fast failure detection
- 5 consecutive failed checks before the container is marked unhealthy
Resource Limits:
deploy:
resources:
limits:
memory: 512M
cpus: '1.0'
reservations:
memory: 256M
cpus: '0.5'
- Nginx is lightweight, so moderate limits suffice
Logging:
logging:
driver: json-file
options:
max-size: "10m"
max-file: "5"
compress: "true"
labels: "service,environment"
- JSON format for log aggregation (ELK Stack compatible)
- 10MB per file, 5 files = 50MB total
- Compressed rotation
2. PHP Service
Restart Policy:
restart: always
Build Configuration:
build:
args:
- ENV=production
- COMPOSER_INSTALL_FLAGS=--no-dev --optimize-autoloader --classmap-authoritative
- --no-dev: no development dependencies
- --optimize-autoloader: PSR-4 optimization (autoloading is converted to a classmap)
- --classmap-authoritative: classes are loaded from the classmap only, no filesystem lookups (performance)
Environment:
environment:
- APP_ENV=production
- APP_DEBUG=false # Debug OFF in production!
- PHP_MEMORY_LIMIT=512M
- PHP_MAX_EXECUTION_TIME=30
- XDEBUG_MODE=off # Xdebug off for performance
Health Checks:
healthcheck:
test: ["CMD", "php-fpm-healthcheck"]
interval: 15s
timeout: 5s
retries: 5
start_period: 30s
- PHP-FPM health check via a custom script
- Fast failure detection
Resource Limits:
deploy:
resources:
limits:
memory: 1G
cpus: '2.0'
reservations:
memory: 512M
cpus: '1.0'
- PHP needs more memory than Nginx
- 2 CPUs for parallel request handling
Volumes:
volumes:
- storage-logs:/var/www/html/storage/logs:rw
- storage-cache:/var/www/html/storage/cache:rw
- storage-queue:/var/www/html/storage/queue:rw
- storage-discovery:/var/www/html/storage/discovery:rw
- storage-uploads:/var/www/html/storage/uploads:rw
- Only the Docker volumes that are actually needed
- NO host mounts, for security
- Application code lives in the image (not mounted)
3. Database (PostgreSQL 16) Service
Restart Policy:
restart: always
Production Configuration:
volumes:
- db_data:/var/lib/postgresql/data
- ./docker/postgres/postgresql.production.conf:/etc/postgresql/postgresql.conf:ro
- ./docker/postgres/init:/docker-entrypoint-initdb.d:ro
- Production-tuned postgresql.production.conf
- Init scripts for schema setup
Resource Limits:
deploy:
resources:
limits:
memory: 2G
cpus: '2.0'
reservations:
memory: 1G
cpus: '1.0'
- PostgreSQL needs memory for shared_buffers (2GB in the config)
- 2 CPUs for parallel query processing
Health Checks:
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${DB_USERNAME:-postgres} -d ${DB_DATABASE:-michaelschiemer}"]
interval: 10s
timeout: 3s
retries: 5
start_period: 30s
- pg_isready for a fast connection check
- 10-second interval (more frequent than the 15 seconds used for Nginx and PHP)
Logging:
logging:
driver: json-file
options:
max-size: "20m" # Larger log files for PostgreSQL
max-file: "10"
compress: "true"
- PostgreSQL logs more (slow queries, checkpoints, etc.)
- 20MB per file, 10 files = 200MB total
4. Redis Service
Restart Policy:
restart: always
Resource Limits:
deploy:
resources:
limits:
memory: 512M
cpus: '1.0'
reservations:
memory: 256M
cpus: '0.5'
- Redis is memory-based; moderate limits
Health Checks:
healthcheck:
test: ["CMD", "redis-cli", "--raw", "incr", "ping"]
interval: 10s
timeout: 3s
retries: 5
start_period: 10s
- redis-cli --raw incr ping as a connection check (the command increments a counter key named ping)
- Fast start (10s start_period)
5. Queue Worker Service
Restart Policy:
restart: always
Environment:
environment:
- APP_ENV=production
- WORKER_DEBUG=false
- WORKER_SLEEP_TIME=100000
- WORKER_MAX_JOBS=10000
- Production mode without debug output
- 10,000 jobs per worker lifecycle
Resource Limits:
deploy:
resources:
limits:
memory: 2G
cpus: '2.0'
reservations:
memory: 1G
cpus: '1.0'
replicas: 2 # 2 worker instances
- Workers need memory for job processing
- 2 replicas for parallelism
Graceful Shutdown:
stop_grace_period: 60s
- 60 seconds for in-flight jobs to complete before shutdown
- Prevents jobs from being cut off mid-run
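The grace period only helps if the worker reacts to SIGTERM by finishing its current job and then exiting. The framework's worker code is not shown in this document, so the following is only a shell sketch of that pattern. The names run_worker, process_job, and MAX_JOBS are hypothetical.

```shell
# Sketch of a worker loop that stops cleanly on SIGTERM.
# process_job is a hypothetical function that handles exactly one job.
run_worker() {
  max_jobs="${MAX_JOBS:-10000}"   # mirrors WORKER_MAX_JOBS from the config above
  jobs_done=0
  stop_requested=0
  # docker stop sends SIGTERM first. Only record the request here, so the
  # job that is currently running can finish.
  trap 'stop_requested=1' TERM
  while [ "$stop_requested" -eq 0 ] && [ "$jobs_done" -lt "$max_jobs" ]; do
    process_job || break
    jobs_done=$((jobs_done + 1))
  done
  trap - TERM
  echo "worker exiting after $jobs_done jobs"
}
```

If a job can run longer than stop_grace_period, Docker sends SIGKILL once the period expires, so the 60 seconds should exceed the longest expected job.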
Logging:
logging:
driver: json-file
options:
max-size: "20m"
max-file: "10"
compress: "true"
- Workers log verbosely (job start, completion, errors)
- 200MB total log storage
6. Certbot Service
Restart Policy:
restart: always
Auto-Renewal:
entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew --webroot -w /var/www/certbot --quiet; sleep 12h & wait $${!}; done;'"
- Automatic renewal check every 12 hours (certbot renew only renews certificates that are close to expiry)
- Webroot challenge served through Nginx
Volumes:
volumes:
- certbot-conf:/etc/letsencrypt
- certbot-www:/var/www/certbot
- certbot-logs:/var/log/letsencrypt
- Certificates are shared with Nginx
Network Configuration
Security Isolation:
networks:
frontend:
driver: bridge
backend:
driver: bridge
internal: true # Backend network is internal (no internet access)
cache:
driver: bridge
internal: true # Cache network is internal
Network Segmentation:
- Frontend: Nginx, Certbot (internet access)
- Backend: PHP, PostgreSQL, Queue Worker (NO internet access)
- Cache: Redis (NO internet access)
Security Benefits:
- Backend services cannot initiate connections to the outside
- Blocks direct data exfiltration if a service is compromised
- Zero-trust network architecture
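Whether a network is really internal can be checked on the running host. The sketch below reads the Internal field reported by docker network inspect. The function names are hypothetical, and the full network name depends on the Compose project name, so it is passed in as an argument.

```shell
# Returns success only if the given Docker network is marked internal.
network_is_internal() {
  [ "$(docker network inspect --format '{{.Internal}}' "$1" 2>/dev/null)" = "true" ]
}

# Prints one line per network name given.
# Returns non-zero if any network is not internal.
check_internal_networks() {
  status=0
  for net in "$@"; do
    if network_is_internal "$net"; then
      echo "OK: $net is internal"
    else
      echo "WARN: $net is not internal or does not exist"
      status=1
    fi
  done
  return "$status"
}
```

For example, check_internal_networks michaelschiemer-prod_backend uses the network name that appears in the troubleshooting section.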
Volumes Configuration
SSL/TLS Volumes:
certbot-conf:
driver: local
certbot-www:
driver: local
certbot-logs:
driver: local
Application Storage Volumes:
storage-logs:
driver: local
storage-cache:
driver: local
storage-queue:
driver: local
storage-discovery:
driver: local
storage-uploads:
driver: local
Database Volume:
db_data:
driver: local
# Optional: External volume for backups
# driver_opts:
# type: none
# o: bind
# device: /mnt/db-backups/michaelschiemer-prod
Volume Best Practices:
- All volumes use driver: local (no host mounts)
- For backups: optional external volume for the database
- No development host mounts in production
Logging Strategy
JSON logging for all services:
logging:
driver: json-file
options:
max-size: "10m" # Service-dependent
max-file: "5" # Service-dependent
compress: "true"
labels: "service,environment"
Log Rotation:
| Service | Max Size | Max Files | Total Storage |
|---|---|---|---|
| Nginx | 10MB | 5 | 50MB |
| PHP | 10MB | 10 | 100MB |
| PostgreSQL | 20MB | 10 | 200MB |
| Redis | 10MB | 5 | 50MB |
| Queue Worker | 20MB | 10 | 200MB |
| Certbot | 5MB | 3 | 15MB |
| TOTAL | | | 615MB |
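The total is each service's max-size multiplied by its max-file, summed over all services. The arithmetic can be checked with a few lines of shell. The function name and the size:files argument format are made up for this sketch.

```shell
# Sums size_mb * file_count over arguments of the form "size_mb:file_count".
log_budget_mb() {
  total=0
  for pair in "$@"; do
    size_mb=${pair%%:*}
    file_count=${pair##*:}
    total=$((total + size_mb * file_count))
  done
  echo "$total"
}
```

Calling log_budget_mb 10:5 10:10 20:10 10:5 20:10 5:3 prints 615, matching the table. Rotation limits of the json-file driver apply per container, so every additional replica adds its own budget on top.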
Log Aggregation:
- JSON format for the ELK Stack (Elasticsearch, Logstash, Kibana)
- Labels for service identification
- Compressed log files for storage efficiency
Resource Allocation
Total Resource Requirements:
| Service | Memory Limit | Memory Reservation | CPU Limit | CPU Reservation |
|---|---|---|---|---|
| Nginx | 512M | 256M | 1.0 | 0.5 |
| PHP | 1G | 512M | 2.0 | 1.0 |
| PostgreSQL | 2G | 1G | 2.0 | 1.0 |
| Redis | 512M | 256M | 1.0 | 0.5 |
| Queue Worker (x2) | 4G | 2G | 4.0 | 2.0 |
| TOTAL | 8GB | 4GB | 10 CPUs | 5 CPUs |
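The TOTAL row can be re-derived when a limit changes. The sketch below converts Compose-style memory values to megabytes and sums them. Both function names are hypothetical, and only the M and G suffixes used in this document are handled.

```shell
# Converts a memory value such as 512M or 2G to megabytes.
to_mb() {
  case "$1" in
    *G) echo $(( ${1%G} * 1024 )) ;;
    *M) echo "${1%M}" ;;
    *)  echo "unsupported unit: $1" >&2; return 1 ;;
  esac
}

# Sums any number of memory values and prints the total in megabytes.
sum_memory_mb() {
  total=0
  for value in "$@"; do
    mb=$(to_mb "$value") || return 1
    total=$((total + mb))
  done
  echo "$total"
}
```

With the limits from the table, sum_memory_mb 512M 1G 2G 512M 4G prints 8192 (8GB). With the reservations, sum_memory_mb 256M 512M 1G 256M 2G prints 4096 (4GB).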
Server Sizing Recommendations:
- Minimum: 8GB RAM, 4 CPUs (covers the 8GB total of the memory limits)
- Recommended: 16GB RAM, 8 CPUs (headroom for the OS and load spikes)
- Optimal: 32GB RAM, 16 CPUs (production with monitoring)
Health Checks
Health Check Strategy:
| Service | Endpoint | Interval | Timeout | Retries | Start Period |
|---|---|---|---|---|---|
| Nginx | HTTPS /health | 15s | 5s | 5 | 30s |
| PHP | php-fpm-healthcheck | 15s | 5s | 5 | 30s |
| PostgreSQL | pg_isready | 10s | 3s | 5 | 30s |
| Redis | redis-cli ping | 10s | 3s | 5 | 10s |
Health Check Benefits:
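- Failing services are detected automatically
- Only services reported as unhealthy need a restart; plain Docker Compose reports the state, while Swarm mode replaces unhealthy tasks on its own
- Health status is visible via docker-compose ps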
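The values in the strategy table also determine how long a failure can go unnoticed. Docker runs each check interval seconds after the previous one has finished and marks the container unhealthy after retries consecutive failures. A rough upper bound is therefore retries × (interval + timeout). A small sketch of that estimate, with a hypothetical function name:

```shell
# Rough upper bound, in seconds, until a failing container is marked unhealthy.
# Arguments: interval timeout retries (whole seconds and a count).
unhealthy_after() {
  echo $(( $3 * ($1 + $2) ))
}
```

unhealthy_after 15 5 5 prints 100 for Nginx and PHP, and unhealthy_after 10 3 5 prints 65 for PostgreSQL and Redis.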
Deployment Workflow
Initial Deployment
# 1. Prepare the server (see production-prerequisites.md)
# 2. Configure .env.production (see env-production-template.md)
# 3. Build and deploy
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --build
# 4. Initialize SSL certificates
docker exec php php console.php ssl:init
# 5. Run database migrations
docker exec php php console.php db:migrate
# 6. Verify health checks
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
ps
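Before step 3, a quick pre-flight check can catch variables that are missing from .env.production. The sketch below assumes plain KEY=value lines. The function name is hypothetical, and the list of required keys has to come from the project's own template.

```shell
# Fails if any of the given keys is missing or empty in the env file.
# Usage: check_env_file <file> <KEY>...
check_env_file() {
  env_file=$1
  shift
  missing=0
  for key in "$@"; do
    # Accept only lines of the form KEY=value with a non-empty value.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_env_file .env.production DB_USERNAME DB_DATABASE checks the two variables referenced by the PostgreSQL health check.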
Rolling Update (Zero-Downtime)
# 1. Pull the new version
git pull origin main
# 2. Build new images
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
build --no-cache
# 3. Rolling update (service by service)
# Nginx
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --no-deps web
# PHP (after Nginx)
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --no-deps php
# Queue Worker (after PHP)
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --no-deps --scale queue-worker=2 queue-worker
# 4. Verify health checks
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
ps
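During a service-by-service update it helps to wait until the service just updated reports healthy before moving on. The sketch below polls the health status that docker inspect exposes. The function name and the default values are assumptions, and the container name must match the one used on the host.

```shell
# Polls a container's health status until it reports "healthy".
# Arguments: container name, max attempts (default 30),
#            seconds between polls (default 2).
wait_healthy() {
  container=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=""
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$container did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}
```

For example, running wait_healthy php between the PHP and Queue Worker steps stops the update if PHP does not come back.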
Rollback Strategy
# 1. Find and check out the previous Git commit
git log --oneline -5
git checkout <previous-commit>
# 2. Rebuild and deploy
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --build
# 3. Database rollback (if needed)
docker exec php php console.php db:rollback 1
Monitoring
Container Status
# Status of all services
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
ps
# Detailed information
docker inspect <container-name>
Resource Usage
# CPU/Memory Usage
docker stats
# Service-specific
docker stats php db redis
Logs
# All logs (follow)
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
logs -f
# Service-specific
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
logs -f php
# Last N lines
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
logs --tail=100 php
Health Check Status
# Health Check Logs
docker inspect --format='{{json .State.Health}}' php | jq
# Health History
docker inspect --format='{{range .State.Health.Log}}{{.Start}} {{.ExitCode}} {{.Output}}{{end}}' php
Backup Strategy
Database Backup
# Manual Backup
docker exec db pg_dump -U postgres michaelschiemer_prod > backup_$(date +%Y%m%d_%H%M%S).sql
# Automated backup (cron): contents of /etc/cron.daily/postgres-backup
#!/bin/bash
docker exec db pg_dump -U postgres michaelschiemer_prod | gzip > /mnt/backups/michaelschiemer_$(date +%Y%m%d).sql.gz
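A daily dump accumulates files, so the cron job usually needs a retention step as well. The following is only a sketch. The function name and the 14-day default are assumptions, and find's -maxdepth and -delete options are GNU/BSD extensions rather than POSIX.

```shell
# Deletes compressed dumps older than the given number of days.
# Arguments: backup directory, retention in days (default 14, an assumed value).
prune_backups() {
  backup_dir=$1
  keep_days=${2:-14}
  find "$backup_dir" -maxdepth 1 -type f -name '*.sql.gz' \
    -mtime +"$keep_days" -delete
}
```

For example, prune_backups /mnt/backups 14 could run as the last line of the cron script above.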
Volume Backup
# Back up the database volume (repeat for each volume that needs a backup)
docker run --rm \
-v michaelschiemer_db_data:/data:ro \
-v "$(pwd)/backups:/backup" \
alpine tar czf /backup/db_data_$(date +%Y%m%d).tar.gz -C /data .
Troubleshooting
Service Won't Start
# Check logs
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
logs <service>
# Check configuration
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
config
Health Check Failing
# Manual health check
docker exec php php-fpm-healthcheck
docker exec db pg_isready -U postgres
docker exec redis redis-cli ping
# Check health logs
docker inspect --format='{{json .State.Health}}' <container> | jq
Memory Issues
# Check memory usage
docker stats
# Increase limits in docker-compose.production.yml
# Then restart service
docker-compose -f docker-compose.yml \
-f docker-compose.production.yml \
--env-file .env.production \
up -d --no-deps <service>
Network Issues
# Check networks
docker network ls
docker network inspect michaelschiemer-prod_backend
# Test connectivity
docker exec php ping -c 3 db
docker exec php nc -zv db 5432
Security Considerations
1. Network Isolation
- ✅ Backend network is internal (no internet access)
- ✅ Cache network is internal
- ✅ Only frontend services expose ports
2. Volume Security
- ✅ No host mounts (application code in image)
- ✅ Read-only mounts where possible (SSL certificates)
- ✅ Named Docker volumes (managed by Docker)
3. Secrets Management
- ✅ Use .env.production (not committed to git)
- ✅ Use Vault for sensitive data
- ✅ No secrets in docker-compose files
4. Resource Limits
- ✅ All services have memory limits (a runaway service cannot exhaust host memory)
- ✅ CPU limits prevent resource starvation
- ✅ Restart policies for automatic recovery
5. Logging
- ✅ JSON logging for security monitoring
- ✅ Log rotation prevents disk exhaustion
- ✅ Compressed logs for storage efficiency
Best Practices
- Always use .env.production - Never commit production secrets
- Test updates in staging first - Use same docker-compose setup
- Monitor resource usage - Adjust limits based on metrics
- Regular backups - Automate database and volume backups
- Health checks - Ensure all services have working health checks
- Log aggregation - Send logs to centralized logging system (ELK)
- SSL renewal - Monitor Certbot logs for renewal issues
- Security updates - Regularly update Docker images
See Also
- Prerequisites: docs/deployment/production-prerequisites.md
- Environment Configuration: docs/deployment/env-production-template.md
- SSL Setup: docs/deployment/ssl-setup.md
- Database Migrations: docs/deployment/database-migration-strategy.md
- Logging Configuration: docs/deployment/logging-configuration.md