# Production Deployment Troubleshooting Checklist

Systematic troubleshooting for common deployment issues.

## Issue 1: Supervisor Log File Permission Denied

### Symptom

```
PermissionError: [Errno 13] Permission denied: '/var/log/supervisor/supervisord.log'
```

The container does not start because Supervisor cannot write its log file.

### Diagnosis

```bash
docker logs web
# Shows the permission error

docker exec web ls -la /var/log/supervisor/
# Directory does not exist or lacks write permissions
```

### Root Cause

- Supervisor tries to write to `/var/log/supervisor/supervisord.log`
- The directory does not exist or is not writable
- This can fail even as root in a containerized environment

### Solution 1 (DOES NOT WORK) ❌

**Attempt**: Use `/proc/self/fd/1`

`docker/supervisor/supervisord.conf`:
```ini
logfile=/proc/self/fd/1
```

**Error**: `PermissionError: [Errno 13] Permission denied: '/proc/self/fd/1'`

**Reason**: Supervisor's Python log handler opens the logfile for appending, and in this setup that open call was rejected for `/proc/self/fd/1` (and likewise for `/dev/stdout`). Errno 13 is a permission error, so this most likely depends on which user the container runs as (see Issue 2).
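Before choosing a log target, it can help to check whether the user Supervisor runs as is actually allowed to append to it. The following is a minimal sketch and not part of the original setup: `check_log_target` is a hypothetical helper name, and a POSIX shell inside the container is assumed. Note that the append test creates the file if it does not exist yet.

```shell
#!/bin/sh
# check_log_target PATH
# Hypothetical helper: prints "missing-dir", "not-writable" or "writable"
# for the current user. Side effect: creates PATH if it is missing.
check_log_target() {
    target="$1"
    dir=$(dirname "$target")

    # The parent directory must exist before a file can be created in it.
    if [ ! -d "$dir" ]; then
        echo "missing-dir"
        return 1
    fi

    # Open the target for appending without writing any data.
    # The subshell keeps a failed redirection from aborting the caller.
    if ( : >> "$target" ) 2>/dev/null; then
        echo "writable"
        return 0
    fi

    echo "not-writable"
    return 1
}
```

Pasted into a shell inside the container (for example via `docker exec -it web sh`), `check_log_target /var/log/supervisor/supervisord.log` reports which of the two root causes applies, and `check_log_target /proc/self/fd/1` shows whether stdout can be reopened by the current user.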
### Solution 2 (SUCCESSFUL) ✅

**Fix**: `/dev/null` with `silent=false`

`docker/supervisor/supervisord.conf`:
```ini
[supervisord]
nodaemon=true
; IMPORTANT: keeps logging to stdout even though logfile is /dev/null
silent=false
logfile=/dev/null
logfile_maxbytes=0
pidfile=/var/run/supervisord.pid
loglevel=info
```

**Why does this work?**
- `logfile=/dev/null`: no file logging
- `nodaemon=true` with `silent=false`: Supervisor runs in the foreground and writes its log output to stdout
- The logs show up in `docker logs web`

### Verification

```bash
docker logs web
# Output:
# 2025-10-28 16:29:59,976 INFO supervisord started with pid 1
# 2025-10-28 16:30:00,980 INFO spawned: 'nginx' with pid 7
# 2025-10-28 16:30:00,982 INFO spawned: 'php-fpm' with pid 8
# 2025-10-28 16:30:02,077 INFO success: nginx entered RUNNING state
# 2025-10-28 16:30:02,077 INFO success: php-fpm entered RUNNING state
```

### Related Files

- `docker/supervisor/supervisord.conf`
- `Dockerfile.production` (COPY supervisord.conf)

---

## Issue 2: Web Container EACCES Errors

### Symptom

```
2025-10-28 16:16:52,152 CRIT could not write pidfile /var/run/supervisord.pid
2025-10-28 16:16:53,154 INFO spawnerr: unknown error making dispatchers for 'nginx': EACCES
2025-10-28 16:16:53,154 INFO spawnerr: unknown error making dispatchers for 'php-fpm': EACCES
```

### Diagnosis

```bash
# Check the container user
docker exec web whoami
# If this is not "root", that is the problem

# Check the Docker Compose config
docker inspect web | grep -i user
# Shows the user inherited from the base config
```

### Root Cause

- The `web` service in `docker-compose.prod.yml` does **not** set `user: root`
- It inherits `user: 1000:1000` or `user: www-data` from the base `docker-compose.yml`
- Supervisor needs root to start the nginx/php-fpm master processes

### Solution ✅

**Fix**: Set `user: root` explicitly

`docker-compose.prod.yml`:
```yaml
web:
  image: 94.16.110.151:5000/framework:latest
  user: root  # ← ADD
  # ... rest of the config
```

Also add it to the `php` and `queue-worker` services:
```yaml
php:
  image: 94.16.110.151:5000/framework:latest
  user: root  # ← ADD

queue-worker:
  image: 94.16.110.151:5000/framework:latest
  user: root  # ← ADD
```

### Why user: root?

- **Container runs as root**: Supervisor master process
- **Nginx master**: root (worker processes run as www-data via nginx.conf)
- **PHP-FPM master**: root (pool workers run as www-data via php-fpm.conf)

`docker/php/zz-docker.production.conf`:
```ini
[www]
; Worker processes run as www-data
user = www-data
group = www-data
```

### Verification

```bash
docker exec web whoami
# root

docker exec web ps aux | grep -E 'supervisord|nginx|php-fpm'
# root         1  supervisord
# root         7  nginx: master process
# www-data    10  nginx: worker process
# root         8  php-fpm: master process
# www-data    11  php-fpm: pool www
```

### Related Files

- `docker-compose.prod.yml` (web, php, queue-worker services)
- `docker/php/zz-docker.production.conf`
- `docker/nginx/nginx.production.conf`

---

## Issue 3: Docker Entrypoint Override Does Not Work

### Symptom

The container command shows the entrypoint prepended:
```bash
docker ps
# COMMAND: "/usr/local/bin/docker-entrypoint.sh /usr/bin/supervisord -c ..."
```

Supervisor is not started directly but through a wrapper script.

### Diagnosis

```bash
# Check the container command
docker inspect web --format='{{.Config.Entrypoint}}'
# [/usr/local/bin/docker-entrypoint.sh]

docker inspect web --format='{{.Config.Cmd}}'
# [/usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf]
```

### Root Cause

1. The base `docker-compose.yml` defines the `web` service with a separate build:
   ```yaml
   web:
     build:
       context: docker/nginx
       dockerfile: Dockerfile
   ```
2. The production override sets `image:` but does **not** clear the inherited ENTRYPOINT:
   ```yaml
   web:
     image: 94.16.110.151:5000/framework:latest
     command: ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
   ```
3. The base PHP image defines an ENTRYPOINT, which gets prepended
4. `command:` only replaces the image's CMD, so the final command is ENTRYPOINT + CMD

### Solution - Iteration 1 (DOES NOT WORK) ❌

**Attempt**: Set only `command:`

`docker-compose.prod.yml`:
```yaml
web:
  image: 94.16.110.151:5000/framework:latest
  command: ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
```

**Result**: The entrypoint is still prepended

### Solution - Iteration 2 (DOES NOT WORK) ❌

**Attempt**: Add `pull_policy: always`

`docker-compose.prod.yml`:
```yaml
web:
  image: 94.16.110.151:5000/framework:latest
  pull_policy: always  # Force registry pull
  command: ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
```

**Result**: The image is pulled from the registry, but the entrypoint is still prepended

### Solution - Iteration 3 (SUCCESSFUL) ✅

**Fix**: Clear the entrypoint explicitly with `entrypoint: []`

`docker-compose.prod.yml`:
```yaml
web:
  image: 94.16.110.151:5000/framework:latest
  pull_policy: always  # Always pull from registry, never build
  entrypoint: []  # ← IMPORTANT: clear the entrypoint
  command: ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
  user: root
```

**Why `entrypoint: []`?**
- An empty array clears the inherited entrypoint completely
- `command:` is then started directly as PID 1
- No wrapper scripts, no indirection

### Verification

```bash
docker inspect web --format='{{.Config.Entrypoint}}'
# []  ← Empty!

docker inspect web --format='{{.Config.Cmd}}'
# [/usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf]

docker exec web ps aux
# PID 1: /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf
# No entrypoint wrapper!
```

### Related Files

- `docker-compose.prod.yml` (web service)

### Docker Compose Override Rules

```
Base Config + Override = Final Config

Base:
  web:
    build: docker/nginx
    → inherited ENTRYPOINT from base image

Override (insufficient):
  web:
    image: 94.16.110.151:5000/framework:latest
    command: [...]
    → ENTRYPOINT still prepended to command

Override (correct):
  web:
    image: 94.16.110.151:5000/framework:latest
    entrypoint: []  ← Clears inherited entrypoint
    command: [...]  ← Runs directly as PID 1
```

---

## Issue 4: Queue Worker Container Restarts

### Symptom

```bash
docker ps
# queue-worker   Restarting (1) 5 seconds ago
```

The container is stuck in a restart loop and never becomes healthy.

### Diagnosis

```bash
docker logs queue-worker
# Error: /var/www/html/worker.php not found
# or
# php: command not found
```

### Root Cause

The base `docker-compose.yml` defines the queue worker command for development:
```yaml
queue-worker:
  command: ["php", "/var/www/html/worker.php"]
```

`worker.php` does not exist in the production image.

### Solution - Option 1: Disable the Service ✅

**Quick Fix**: Disable the queue worker

`docker-compose.prod.yml`:
```yaml
queue-worker:
  deploy:
    replicas: 0  # Disable service
```

### Solution - Option 2: Set the Correct Command ✅

**Proper Fix**: Use the console command

`docker-compose.prod.yml`:
```yaml
queue-worker:
  image: 94.16.110.151:5000/framework:latest
  user: root
  command: ["php", "/var/www/html/console.php", "queue:work"]
  # or, for a Supervisor-managed worker:
  # entrypoint: []
  # command: ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/queue-worker-supervisord.conf"]
```

### Verification

```bash
docker logs queue-worker
# [timestamp] INFO Queue worker started
# [timestamp] INFO Processing job: ...
```

### Related Files

- `docker-compose.yml` (base queue-worker definition)
- `docker-compose.prod.yml` (production override)
- `console.php` (framework console application)

---

## Issue 5: HTTP Port 80 Not Reachable

### Symptom

```bash
curl http://94.16.110.151:8888/
# curl: (7) Failed to connect to 94.16.110.151 port 8888: Connection refused

docker exec web curl http://localhost/
# curl: (7) Failed to connect to localhost port 80: Connection refused
```

### Diagnosis

```bash
# Check the ports Nginx is listening on
docker exec web netstat -tlnp | grep nginx
# Shows only: 0.0.0.0:443

# Check the Nginx config
docker exec web cat /etc/nginx/http.d/default.conf
# No "listen 80;" block
```

### Root Cause - Option 1: Intentional HTTPS-only

HTTP may have been disabled on purpose (security best practice).

### Root Cause - Option 2: Missing HTTP Block

The Nginx config has no HTTP listener, only HTTPS.

### Solution - Add HTTP→HTTPS Redirect ✅

**Fix**: Configure an HTTP redirect

`docker/nginx/default.production.conf`:
```nginx
# HTTP → HTTPS Redirect
server {
    listen 80;
    server_name _;

    location / {
        # $host carries no port. If HTTPS is published on a non-default
        # host port (8443 here), use https://$host:8443$request_uri instead.
        return 301 https://$host$request_uri;
    }
}

# HTTPS Server
server {
    listen 443 ssl http2;
    server_name _;

    ssl_certificate /var/www/ssl/cert.pem;
    ssl_certificate_key /var/www/ssl/key.pem;

    root /var/www/html/public;
    index index.php;

    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    location ~ \.php$ {
        fastcgi_pass php:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```

### Verification

```bash
curl -I http://94.16.110.151:8888/
# HTTP/1.1 301 Moved Permanently
# Location: https://94.16.110.151/

curl -k -I https://94.16.110.151:8443/
# HTTP/2 200
# server: nginx
```

### Related Files

- `docker/nginx/default.production.conf`
- `Dockerfile.production` (COPY nginx config)

---

## General Debugging Commands

### Container Inspection

```bash
# Status of all containers
docker-compose -f docker-compose.yml -f docker-compose.prod.yml ps

# Container details
docker inspect web

# Container logs
docker logs -f web
docker logs --tail 100 web

# Shell inside the container
docker exec -it web sh
docker exec -it php sh
```

### Supervisor Debugging

```bash
# Supervisor status
docker exec web supervisorctl status

# Supervisor logs (logfile is /dev/null, so they go to the container's stdout)
docker logs -f web

# Test the Supervisor config by starting it in the foreground
# (may conflict with the supervisord that is already running)
docker exec web supervisord -c /etc/supervisor/conf.d/supervisord.conf -n
```

### Nginx Debugging

```bash
# Test the Nginx config
docker exec web nginx -t

# Reload Nginx
docker exec web nginx -s reload

# Nginx listening ports
docker exec web netstat -tlnp | grep nginx

# Nginx processes
docker exec web ps aux | grep nginx
```

### PHP-FPM Debugging

```bash
# PHP-FPM status
docker exec web curl http://localhost/php-fpm-status

# Test the PHP-FPM config
docker exec web php-fpm -t

# PHP-FPM processes
docker exec web ps aux | grep php-fpm

# PHP version
docker exec web php -v

# PHP modules
docker exec web php -m
```

### Network Debugging

```bash
# Listening ports
docker exec web netstat -tlnp

# DNS resolution
docker exec web nslookup db
docker exec web nslookup redis

# Network connectivity
docker exec web ping db
docker exec web ping redis

# HTTP request
docker exec web curl http://localhost/
```

### Database Debugging

```bash
# PostgreSQL connection
docker exec php php -r "new PDO('pgsql:host=db;dbname=framework_db', 'framework_user', 'password');"

# Database logs
docker logs db

# Connect to the DB
docker exec -it db psql -U framework_user -d framework_db

# Check connections
docker exec db psql -U framework_user -d framework_db -c "SELECT count(*) FROM pg_stat_activity;"
```

### Performance Monitoring

```bash
# Container resource usage
docker stats

# Disk usage
docker system df

# Image sizes
docker images

# Volume sizes
docker system df -v
```

---

## Checklist for a Successful Deploy

### Pre-Deployment

- [ ] Image built: `docker build -f Dockerfile.production -t 94.16.110.151:5000/framework:latest .`
- [ ] Image pushed: `docker push 94.16.110.151:5000/framework:latest`
- [ ] Registry available: `curl http://94.16.110.151:5000/v2/_catalog`
- [ ] WireGuard VPN active: `wg show`
- [ ] `.env.production` on the server is up to date
- [ ] `docker-compose.prod.yml` on the server is up to date

### Deployment

- [ ] SSH to the server: `ssh deploy@94.16.110.151`
- [ ] Pull the image: `docker-compose -f docker-compose.yml -f docker-compose.prod.yml pull`
- [ ] Start the stack: `docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d`

### Post-Deployment Verification

- [ ] Containers running: `docker-compose -f docker-compose.yml -f docker-compose.prod.yml ps` shows all as "Up (healthy)"
- [ ] Supervisor status: `docker exec web supervisorctl status` shows nginx/php-fpm RUNNING
- [ ] Nginx listening: `docker exec web netstat -tlnp | grep :443`
- [ ] PHP-FPM listening: `docker exec web netstat -tlnp | grep :9000`
- [ ] Application reachable: `curl -k -I https://94.16.110.151:8443/` → HTTP/2 200
- [ ] Database reachable: `docker exec php php -r "new PDO(...);"`
- [ ] Redis reachable: `docker exec php php -r "(new Redis())->connect('redis', 6379);"`
- [ ] Logs clean: `docker logs web` shows no errors

### Monitoring

- [ ] Prometheus: http://10.8.0.1:9090 reachable
- [ ] Grafana: http://10.8.0.1:3000 reachable
- [ ] Portainer: https://10.8.0.1:9443 reachable
- [ ] Watchtower active: `docker logs watchtower` shows checks

---

## Quick Reference

### Most Common Causes of Failure

1. **Supervisor Logging**: Use `logfile=/dev/null` + `silent=false`
2. **User Permissions**: Set `user: root` in docker-compose.prod.yml
3. **Entrypoint Override**: Set `entrypoint: []` to clear the inherited entrypoint
4. **Pull Policy**: Use `pull_policy: always` to force the registry image

### Most Important Config Changes

- `docker/supervisor/supervisord.conf`: `logfile=/dev/null`, `silent=false`
- `docker-compose.prod.yml`: `user: root`, `entrypoint: []`, `pull_policy: always`
- `docker/php/zz-docker.production.conf`: `user = www-data`, `group = www-data`
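
The Supervisor check from the post-deployment list can also be scripted instead of read by eye. The following is a minimal sketch, assuming `supervisorctl status` prints one program per line with the state in the second column; `all_running` is a hypothetical helper name and not part of the project.

```shell
#!/bin/sh
# all_running: reads `supervisorctl status` output on stdin.
# Succeeds only if at least one program is listed and every one is RUNNING.
all_running() {
    awk '
        NF == 0 { next }    # skip blank lines
        { total++ }
        $2 != "RUNNING" { bad++; print "not running: " $1 " (" $2 ")" }
        END { if (total == 0 || bad > 0) exit 1 }
    '
}
```

A possible use on the server is `docker exec web supervisorctl status | all_running && echo "supervisor OK"`. Treating empty input as a failure is a deliberate choice here, so that a container that is not running at all does not pass the check silently.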