# Docker Swarm + Traefik Deployment Guide

Production deployment guide for the Custom PHP Framework using Docker Swarm orchestration with the Traefik load balancer.

## Architecture Overview

```
Internet → Traefik (SSL Termination, Load Balancing)
                     ↓
         [Web Service - 3 Replicas]
            ↓                 ↓
        Database            Redis              Queue Workers
        (PostgreSQL)        (Cache/Sessions)   (2 Replicas)
```

**Key Components**:

- **Traefik v2.10**: Reverse proxy, SSL termination, automatic service discovery
- **Web Service**: 3 replicas of PHP-FPM + Nginx (HTTP only, Traefik handles HTTPS)
- **PostgreSQL 16**: Single instance database (manager node)
- **Redis 7**: Sessions and cache (manager node)
- **Queue Workers**: 2 replicas for background job processing
- **Docker Swarm**: Native container orchestration with rolling updates and health checks

## Prerequisites

1. **Docker Engine 28.0+** with Swarm mode enabled
2. **Production Server** with SSH access
3. **SSL Certificates** in `./ssl/` directory (cert.pem, key.pem)
4. **Environment Variables** in `.env` file on production server
5. **Docker Image** built and available

## Initial Setup

### 1. Initialize Docker Swarm

On production server:

```bash
docker swarm init
```

Verify:

```bash
docker node ls
# Should show 1 node as Leader
```

### 2. Create Docker Secrets

Create secrets from the values in the `.env` file. The commands below expect those values to be set as variables in the current shell:

```bash
cd /home/deploy/framework

# Create secrets (one-time setup)
# printf '%s' avoids the trailing newline that echo would store in the secret
printf '%s' "$DB_PASSWORD" | docker secret create db_password -
printf '%s' "$APP_KEY" | docker secret create app_key -
printf '%s' "$VAULT_ENCRYPTION_KEY" | docker secret create vault_encryption_key -
printf '%s' "$SHOPIFY_WEBHOOK_SECRET" | docker secret create shopify_webhook_secret -
printf '%s' "$RAPIDMAIL_PASSWORD" | docker secret create rapidmail_password -
```

Or use the automated script:

```bash
./scripts/setup-production-secrets.sh
```

Verify secrets:

```bash
docker secret ls
```

### 3. Build and Transfer Docker Image

On local machine:

**Option A: Via Private Registry** (if available):

```bash
# Build image
docker build -f Dockerfile.production -t 94.16.110.151:5000/framework:latest .

# Push to registry
docker push 94.16.110.151:5000/framework:latest
```

**Option B: Direct Transfer via SSH** (recommended for now):

```bash
# Build image
docker build -f Dockerfile.production -t 94.16.110.151:5000/framework:latest .

# Save and transfer to production
docker save 94.16.110.151:5000/framework:latest | \
  ssh -i ~/.ssh/production deploy@94.16.110.151 'docker load'
```

### 4. Deploy Stack

On production server:

```bash
cd /home/deploy/framework

# Deploy the stack
docker stack deploy -c docker-compose.prod.yml framework

# Monitor deployment
watch docker stack ps framework

# Check service status
docker stack services framework
```

## Health Monitoring

### Check Service Status

```bash
# List all services
docker stack services framework

# Check specific service
docker service ps framework_web

# View service logs
docker service logs framework_web -f
docker service logs framework_traefik -f
docker service logs framework_db -f
```

### Health Check Endpoints

- **Main Health**: http://localhost/health (via Traefik)
- **Traefik Dashboard**: http://traefik.localhost:8080 (manager node only)

### Expected Service Replicas

| Service | Replicas | Purpose |
|---------|----------|---------|
| traefik | 1 | Reverse proxy + SSL |
| web | 3 | Application servers |
| db | 1 | PostgreSQL database |
| redis | 1 | Cache + sessions |
| queue-worker | 2 | Background jobs |

## Rolling Updates

### Update Application

1. Build new image with updated code:

   ```bash
   docker build -f Dockerfile.production -t 94.16.110.151:5000/framework:latest .
   ```

2. Transfer to production (if no registry):

   ```bash
   docker save 94.16.110.151:5000/framework:latest | \
     ssh -i ~/.ssh/production deploy@94.16.110.151 'docker load'
   ```

3. Update the service:

   ```bash
   # On production server
   docker service update --image 94.16.110.151:5000/framework:latest framework_web
   ```

The update will:

- Roll out to 1 container at a time (`parallelism: 1`)
- Wait 10 seconds between updates (`delay: 10s`)
- Start the new container before stopping the old one (`order: start-first`)
- Automatically roll back on failure (`failure_action: rollback`)

### Monitor Update Progress

```bash
# Watch update status
watch docker service ps framework_web

# View update logs
docker service logs framework_web -f --tail 50
```

### Manual Rollback

If needed, roll back to the previous version:

```bash
docker service rollback framework_web
```

## Troubleshooting

### Service Won't Start

Check service logs:

```bash
docker service logs framework_web --tail 100
```

Check task failures:

```bash
docker service ps framework_web --no-trunc
```

### Container Crashing

Inspect individual container:

```bash
# Get container ID
docker ps -a | grep framework_web

# View logs
docker logs <container_id>

# Exec into running container
docker exec -it <container_id> bash
```

### SSL/TLS Issues

Traefik handles SSL termination.
Check Traefik logs:

```bash
docker service logs framework_traefik -f
```

Verify SSL certificates are mounted in docker-compose.prod.yml:

```yaml
volumes:
  - ./ssl:/ssl:ro
```

### Database Connection Issues

Check PostgreSQL health:

```bash
docker service logs framework_db --tail 50

# Exec into db container
docker exec -it $(docker ps -q -f name=framework_db) psql -U postgres -d framework_prod
```

### Redis Connection Issues

Check Redis availability:

```bash
docker service logs framework_redis --tail 50

# Test Redis connection
docker exec -it $(docker ps -q -f name=framework_redis) redis-cli ping
```

### Performance Issues

Check resource usage:

```bash
# Service resource limits
docker service inspect framework_web --format='{{json .Spec.TaskTemplate.Resources}}' | jq

# Container stats
docker stats
```

## Scaling

### Scale Web Service

```bash
# Scale up to 5 replicas
docker service scale framework_web=5

# Scale down to 2 replicas
docker service scale framework_web=2
```

### Scale Queue Workers

```bash
# Scale workers based on queue backlog
docker service scale framework_queue-worker=4
```

## Cleanup

### Remove Stack

```bash
# Remove entire stack
docker stack rm framework

# Verify removal
docker stack ls
```

### Remove Secrets

```bash
# List secrets
docker secret ls

# Remove specific secret
docker secret rm db_password

# Remove all framework secrets (the five created during initial setup)
docker secret ls \
  | grep -E "db_password|app_key|vault_encryption_key|shopify_webhook_secret|rapidmail_password" \
  | awk '{print $2}' \
  | xargs docker secret rm
```

### Leave Swarm

```bash
# Force leave Swarm (removes all services and secrets)
docker swarm leave --force
```

## Network Architecture

### Overlay Networks

- **traefik-public**: External network for Traefik ↔ Web communication
- **backend**: Internal network for Web ↔ Database/Redis communication

### Port Mappings

| Port | Service | Purpose |
|------|---------|---------|
| 80 | Traefik | HTTP (redirects to 443) |
| 443 | Traefik | HTTPS (production traffic) |
| 8080 | Traefik | Dashboard (manager node only) |

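
To confirm that the overlay networks in this section exist after deployment, a small helper can compare the expected names against the output of `docker network ls`. This is a minimal sketch, not part of the project's scripts: the function name `missing_networks` is hypothetical, and `framework_backend` assumes the `backend` network is created by the stack and therefore carries the stack-name prefix. Adjust both names to whatever `docker-compose.prod.yml` actually declares.

```bash
# Print every expected network name that is absent from the list on stdin.
# stdin: network names, one per line (hypothetical helper, see lead-in).
missing_networks() {
  mn_existing="$(cat)"
  for mn_name in "$@"; do
    # -x matches the whole line, -F treats the name as a fixed string
    if ! printf '%s\n' "$mn_existing" | grep -qxF -- "$mn_name"; then
      printf '%s\n' "$mn_name"
    fi
  done
}
```

For example, `docker network ls --filter driver=overlay --format '{{.Name}}' | missing_networks traefik-public framework_backend` prints nothing when both networks are present and prints the name of each one that is missing otherwise.
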
## Volume Management

### Named Volumes

| Volume | Purpose | Mounted In |
|--------|---------|------------|
| traefik-logs | Traefik access logs | traefik |
| storage-logs | Application logs | web, queue-worker |
| storage-uploads | User uploads | web |
| storage-queue | Queue data | queue-worker |
| db-data | PostgreSQL data | db |
| redis-data | Redis persistence | redis |

### Backup Volumes

```bash
# Backup database
docker exec $(docker ps -q -f name=framework_db) pg_dump -U postgres framework_prod > backup.sql

# Backup Redis (if persistence enabled)
docker exec $(docker ps -q -f name=framework_redis) redis-cli --rdb /data/dump.rdb
```

## Security Best Practices

1. **Secrets Management**: Never commit secrets to version control, use Docker Secrets
2. **Network Isolation**: Backend network is internal-only, no external access
3. **SSL/TLS**: Traefik enforces HTTPS, redirects HTTP → HTTPS
4. **Health Checks**: All services have health checks with automatic restart
5. **Resource Limits**: Production services have memory/CPU limits
6. **Least Privilege**: Containers run as www-data (not root) where possible

## Phase 2 - Monitoring (Coming Soon)

- Prometheus for metrics collection
- Grafana dashboards
- Automated PostgreSQL backups
- Email/Slack alerting

## Phase 3 - CI/CD (Coming Soon)

- Gitea Actions workflow
- Loki + Promtail for log aggregation
- Performance tuning

## Phase 4 - High Availability (Future)

- Multi-node Swarm cluster
- Varnish CDN cache layer
- PostgreSQL Primary/Replica with pgpool
- MinIO object storage

## References

- [Docker Swarm Documentation](https://docs.docker.com/engine/swarm/)
- [Traefik v2 Documentation](https://doc.traefik.io/traefik/)
- [Docker Secrets Management](https://docs.docker.com/engine/swarm/secrets/)