michaelschiemer/docs/deployment/deployment-automation.md
Michael Schiemer fc3d7e6357 feat(Production): Complete production deployment infrastructure
- Add comprehensive health check system with multiple endpoints
- Add Prometheus metrics endpoint
- Add production logging configurations (5 strategies)
- Add complete deployment documentation suite:
  * QUICKSTART.md - 30-minute deployment guide
  * DEPLOYMENT_CHECKLIST.md - Printable verification checklist
  * DEPLOYMENT_WORKFLOW.md - Complete deployment lifecycle
  * PRODUCTION_DEPLOYMENT.md - Comprehensive technical reference
  * production-logging.md - Logging configuration guide
  * ANSIBLE_DEPLOYMENT.md - Infrastructure as Code automation
  * README.md - Navigation hub
  * DEPLOYMENT_SUMMARY.md - Executive summary
- Add deployment scripts and automation
- Add DEPLOYMENT_PLAN.md - Concrete plan for immediate deployment
- Update README with production-ready features

All production infrastructure is now complete and ready for deployment.
2025-10-25 19:18:37 +02:00


# Production Deployment Automation
Comprehensive guide to automated production deployment scripts for the Custom PHP Framework.
## Overview
The framework includes three main automation scripts for production operations:
1. **`production-deploy.sh`** - Full deployment automation (initial, update, rollback)
2. **`health-check.sh`** - Comprehensive health monitoring
3. **`backup.sh`** - Automated backup system
All scripts are located in the `scripts/` directory and are designed for Docker-based production deployments.
## Production Deployment Script
**Location**: `scripts/production-deploy.sh`
### Usage
```bash
# Initial deployment (first time)
./scripts/production-deploy.sh initial
# Update deployment (zero-downtime rolling update)
./scripts/production-deploy.sh update
# Rollback to previous version
./scripts/production-deploy.sh rollback
```
### Initial Deployment
First-time production setup with complete environment initialization:
```bash
# Prerequisites:
# 1. .env.production configured with VAULT_ENCRYPTION_KEY
# 2. docker-compose.production.yml present
# 3. Server meets hardware requirements (8GB RAM, 4 CPUs)
./scripts/production-deploy.sh initial
```
**What it does**:
1. ✅ Checks prerequisites (Docker, Compose, configuration files)
2. ✅ Verifies VAULT_ENCRYPTION_KEY is configured
3. ✅ Builds Docker images with production optimizations
4. ✅ Starts all services (web, php, db, redis, queue-worker, certbot)
5. ✅ Waits for services to be ready (20s)
6. ✅ Runs database migrations
7. ✅ Initializes SSL certificates via PHP console command
8. ✅ Verifies Vault is accessible
9. ✅ Runs health checks with retries (30 attempts)
10. ✅ Displays deployment summary
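The configuration check in step 2 could look roughly like the following. This is a minimal sketch, not the code from `production-deploy.sh`; the helper name `require_env_key` is hypothetical.

```bash
# Hypothetical helper: succeed only if KEY has a non-empty value in an env file.
require_env_key() {
    local key="$1" file="$2"
    # Accepts KEY=value, KEY="value" or KEY='value'.
    # Rejects missing keys, empty values, and commented-out lines.
    grep -Eq "^${key}=[\"']?[^\"'[:space:]]+" "$file"
}

# Usage (file name taken from the prerequisites above):
# require_env_key VAULT_ENCRYPTION_KEY .env.production \
#     || { echo "❌ VAULT_ENCRYPTION_KEY not configured in .env.production"; exit 1; }
```

Anchoring the pattern at the start of the line keeps a commented-out `#VAULT_ENCRYPTION_KEY=...` entry from passing the check.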
**Output Example**:
```
[12:34:56] 🚀 Starting initial production deployment...
[12:34:56] Checking prerequisites...
✅ Prerequisites check passed
[12:35:10] Building Docker images...
✅ Docker images built
[12:36:00] Starting Docker services...
[12:36:20] Running database migrations...
✅ Database migrations completed
[12:36:30] Initializing SSL certificates...
✅ SSL certificate initialized
[12:36:40] Running health checks...
✅ Health check passed
========================================
Deployment Summary
========================================
📋 Mode: initial
⏰ Timestamp: 2024-10-25 12:37:00
📁 Project: /home/michael/dev/michaelschiemer
💾 Backup: (none - initial deployment)
🐳 Docker Services:
NAME STATUS
web Up (healthy)
php Up (healthy)
db Up (healthy)
redis Up (healthy)
queue-worker Up (2 replicas)
certbot Up
🔒 Security Checks:
[ ] APP_ENV=production in .env.production
[ ] APP_DEBUG=false in .env.production
[ ] VAULT_ENCRYPTION_KEY configured
[ ] ADMIN_ALLOWED_IPS configured
[ ] SSL certificates valid
✅ 🎉 Initial deployment completed successfully!
```
### Update Deployment (Zero-Downtime)
Rolling update with automatic backup and health checks:
```bash
./scripts/production-deploy.sh update
```
**What it does**:
1. ✅ Checks prerequisites
2. ✅ **Creates full backup** (database, .env, storage)
3. ✅ Pulls latest images (if using registry)
4. ✅ Builds new Docker images
5. ✅ Runs database migrations
6. ✅ **Rolling restart** with minimal downtime:
   - PHP-FPM first (10s wait)
   - Web server next (5s wait)
   - Queue workers last (graceful shutdown via 60s grace period)
7. ✅ Runs health checks
8. ✅ Cleans up old Docker images
9. ✅ Displays summary
**Zero-Downtime Strategy**:
- **PHP-FPM** restarted first while old web server still serves requests
- **Web Server** restarted after PHP is ready
- **Queue Workers** restarted last with 60s graceful shutdown for jobs to complete
- **Health checks** verify each step before proceeding
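The restart order described above can be sketched as follows. This is an illustration under assumptions rather than the script's actual code: the `compose` wrapper and `rolling_restart` names are hypothetical, the Compose file names are taken from the restart command in the restoration section of this guide, and the per-step health checks are omitted for brevity.

```bash
# Hypothetical wrapper around Docker Compose with the production files.
compose() {
    docker compose -f docker-compose.yml -f docker-compose.production.yml \
        --env-file .env.production "$@"
}

# Restart services in dependency order; waits can be overridden via PHP_WAIT / WEB_WAIT.
rolling_restart() {
    compose restart php || return 1             # PHP-FPM first
    sleep "${PHP_WAIT:-10}"
    compose restart web || return 1             # then the web server
    sleep "${WEB_WAIT:-5}"
    compose restart --timeout 60 queue-worker   # workers last, 60s to finish running jobs
}
```

The `--timeout` option of `docker compose restart` sets how long Compose waits for a container to stop before killing it, which is what gives queue workers time to finish in-flight jobs.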
**Automatic Rollback on Failure**:
If any step fails, the script automatically rolls back to the backup:
```
❌ Health check failed after 30 attempts
[12:40:00] Cleaning up after error...
⚠️ Rolling back to previous version...
[12:40:10] Restoring from backup: /backups/backup_20241025_123456
✅ Database restored
✅ .env restored
✅ Storage restored
✅ Backup restored successfully
```
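The "failed after 30 attempts" message implies a bounded retry loop around the health check. A generic version might look like this; the `retry` helper is hypothetical, and the delay between attempts is an assumption because this guide does not state it.

```bash
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0                      # command succeeded, stop retrying
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"                # no sleep after the final attempt
        fi
        i=$((i + 1))
    done
    return 1                              # all attempts failed
}

# Example: poll the health endpoint 30 times, 2 seconds apart (delay is assumed).
# retry 30 2 curl -fsS -o /dev/null https://your-domain.com/health \
#     || echo "❌ Health check failed after 30 attempts"
```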
### Rollback Deployment
Restore from latest backup:
```bash
./scripts/production-deploy.sh rollback
```
**What it does**:
1. ✅ Finds latest backup
2. ✅ Prompts for confirmation
3. ✅ Restores database from backup
4. ✅ Restores .env configuration
5. ✅ Restores storage directory
6. ✅ Restarts all services
7. ✅ Runs health checks
**Interactive Confirmation**:
```
⏪ Starting rollback...
⚠️ Rolling back to: /backups/backup_20241025_123456
Continue? (yes/no): yes
[12:45:00] Restoring from backup...
✅ Database restored
✅ .env restored
✅ Storage restored
✅ Backup restored successfully
✅ Health check passed
✅ 🎉 Rollback completed successfully!
```
### Backup Strategy
Backups are created automatically during update deployments:
**Backup Location**: `../backups/backup_YYYYMMDD_HHMMSS_*`
**Backup Contents**:
- `backup_YYYYMMDD_HHMMSS_database.sql.gz` - PostgreSQL database dump
- `backup_YYYYMMDD_HHMMSS_env` - .env configuration
- `backup_YYYYMMDD_HHMMSS_storage.tar.gz` - Storage directory (logs, cache, queue)
**Retention Policy**: The last 5 backups are retained; older backups are removed automatically.
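A keep-the-last-N policy over the `backup_YYYYMMDD_HHMMSS_*` naming scheme could be implemented roughly as below. The `prune_backups` function is a hypothetical sketch, not the script's own cleanup routine; it relies on the timestamp format sorting chronologically when sorted as text.

```bash
# Hypothetical helper: keep only the newest N backup sets in a directory.
# Usage: prune_backups <backup-dir> [keep]
prune_backups() {
    local dir="$1" keep="${2:-5}" f
    for f in "$dir"/backup_*; do
        if [ -e "$f" ]; then basename "$f"; fi
    done \
        | sed -n 's/^\(backup_[0-9]\{8\}_[0-9]\{6\}\).*/\1/p' \
        | sort -u \
        | awk -v keep="$keep" '{ p[NR] = $0 } END { for (i = 1; i <= NR - keep; i++) print p[i] }' \
        | while read -r prefix; do
              # Remove every file belonging to this (old) backup set.
              rm -rf "$dir/$prefix"*
          done
}

# Example: prune_backups ../backups 5
```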
### Error Handling
The script includes comprehensive error handling:
```bash
# Automatic cleanup on error
trap cleanup_on_error ERR
cleanup_on_error() {
    log "Cleaning up after error..."
    # Only roll back if a backup was created earlier in this run
    if [[ -d "$BACKUP_PATH" ]]; then
        warning "Rolling back to previous version..."
        restore_backup "$BACKUP_PATH"
    fi
}
```
**Common Errors**:
1. **Missing VAULT_ENCRYPTION_KEY**:
```
❌ VAULT_ENCRYPTION_KEY not configured in .env.production
```
**Fix**: Generate key with `docker exec php php console.php vault:generate-key`
2. **Prerequisites not met**:
```
❌ Docker is not installed
```
**Fix**: Install Docker and Docker Compose
3. **Health check failed**:
```
❌ Health check failed after 30 attempts
```
**Fix**: Check logs with `docker compose logs -f --tail=100`
---
## Health Check Script
**Location**: `scripts/health-check.sh`
### Usage
```bash
# Basic health check
./scripts/health-check.sh
# Verbose output
./scripts/health-check.sh --verbose
# JSON output (for monitoring systems)
./scripts/health-check.sh --json
```
### Health Check Components
The script performs 12 comprehensive health checks:
1. **Docker Daemon** - Verifies Docker is running
2. **Docker Services** - Checks all 5 services (web, php, db, redis, queue-worker)
3. **Web Response** - HTTP response from Nginx (3 retries)
4. **Health Endpoint** - `/health` endpoint availability
5. **Database** - PostgreSQL connectivity via `pg_isready`
6. **Redis** - Redis ping command
7. **SSL Certificate** - Certificate validity via PHP console
8. **Vault** - Vault accessibility check
9. **Disk Space** - Disk usage monitoring (warn >80%, critical >90%)
10. **Memory** - System memory usage (warn >80%, critical >90%)
11. **Queue Workers** - Verify 2 workers running
12. **Recent Errors** - Log analysis for error frequency
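The disk and memory thresholds above (warn above 80%, critical above 90%) map naturally onto a small classifier. The following is a sketch with hypothetical function names, not the code in `health-check.sh`.

```bash
# Hypothetical helper: map a usage percentage to a health status.
classify_usage() {
    local pct="$1"
    if [ "$pct" -gt 90 ]; then
        echo "critical"
    elif [ "$pct" -gt 80 ]; then
        echo "warning"
    else
        echo "healthy"
    fi
}

# Hypothetical helper: disk usage of a mount point as a bare integer.
# `df -P` forces the POSIX single-line format, so column 5 is the use percentage.
disk_usage_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example: classify_usage "$(disk_usage_pct /)"
```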
### Output Format
**Standard Output**:
```
[12:50:00] 🔍 Starting production health check...
[12:50:01] Checking Docker daemon...
✅ Docker daemon is running
[12:50:02] Checking Docker Compose services...
✅ All Docker services are running
[12:50:03] Checking web server response...
✅ Web server is responding
[12:50:04] Checking /health endpoint...
✅ Health endpoint is responding
[12:50:05] Checking database connectivity...
✅ Database is accepting connections
[12:50:06] Checking Redis connectivity...
✅ Redis is responding
[12:50:07] Checking SSL certificate...
✅ SSL certificate is valid
[12:50:08] Checking Vault connectivity...
✅ Vault is accessible
[12:50:09] Checking disk space...
✅ Disk space usage: 45%
[12:50:10] Checking memory usage...
✅ Memory usage: 62%
[12:50:11] Checking queue workers...
✅ Queue workers: 2 running
[12:50:12] Checking recent errors in logs...
✅ Recent errors: 2 (last 1000 lines)
========================================
Production Health Check Summary
========================================
📊 Health Status:
✅ Healthy: 12
⚠️ Warnings: 0
❌ Unhealthy: 0
Overall Status: HEALTHY ✅
🎉 All critical systems are operational
========================================
```
**JSON Output** (for monitoring systems):
```json
{
"timestamp": "2024-10-25T12:50:12+00:00",
"overall_status": "healthy",
"checks": {
"docker": "healthy",
"service_web": "healthy",
"service_php": "healthy",
"service_db": "healthy",
"service_redis": "healthy",
"service_queue-worker": "healthy",
"web_response": "healthy",
"health_endpoint": "healthy",
"database": "healthy",
"redis": "healthy",
"ssl": "healthy",
"vault": "healthy",
"disk_space": "healthy",
"memory": "healthy",
"queue_workers": "healthy",
"recent_errors": "healthy"
}
}
```
### Verbose Mode
Provides additional details:
```bash
./scripts/health-check.sh --verbose
```
**Additional Information**:
- Active database connections
- Redis memory usage
- Full SSL certificate status
- Detailed error log excerpt
### Exit Codes
- **0** - All checks healthy
- **1** - One or more critical checks failed
### Integration with Monitoring
**Cron Job** (every 5 minutes):
```cron
*/5 * * * * /path/to/scripts/health-check.sh --json > /var/log/health-check.json
```
**Alerting Integration**:
```bash
# Send alert if unhealthy
if ! ./scripts/health-check.sh &>/dev/null; then
    ./scripts/send-alert.sh "Production health check failed"
fi
```
---
## Backup Script
**Location**: `scripts/backup.sh`
### Usage
```bash
# Full backup (database + vault + files)
./scripts/backup.sh --full
# Database only
./scripts/backup.sh --database-only
# Vault only
./scripts/backup.sh --vault-only
# Encrypted backup (GPG)
./scripts/backup.sh --full --encrypt
```
### Backup Components
1. **Database Backup**
- PostgreSQL dump via `pg_dump`
- Gzipped for compression
- Optional GPG encryption
2. **Vault Backup**
- Vault secrets table (`vault_secrets`)
- Vault audit table (`vault_audit`)
- **Encryption highly recommended**
3. **Environment Configuration**
- `.env.production` file backup
- Contains sensitive configuration
4. **Storage Directory**
- Logs, cache, queue, discovery, uploads
- Tar.gz compression
5. **Uploaded Files**
- `public/uploads/` directory
- Media and user-uploaded content
### Backup Process
```bash
./scripts/backup.sh --full --encrypt
```
**Output**:
```
[13:00:00] 🔐 Starting production backup (type: full)...
[13:00:01] Preparing backup directory...
✅ Backup directory created: /backups/20241025_130000
[13:00:02] Backing up database...
✅ Database backup created: database.sql.gz (245M)
[13:00:05] Encrypting /backups/20241025_130000/database.sql.gz...
✅ File encrypted: database.sql.gz.gpg
[13:00:10] Backing up Vault secrets...
✅ Vault backup created: vault_secrets.sql.gz (2.3M)
⚠️ Vault backup is not encrypted - consider using --encrypt
[13:00:12] Backing up environment configuration...
✅ Environment configuration backed up
[13:00:13] Backing up storage directory...
✅ Storage backup created: storage.tar.gz (120M)
[13:00:18] Backing up uploaded files...
✅ Uploads backup created: uploads.tar.gz (1.5G)
[13:00:45] Creating backup manifest...
✅ Backup manifest created
[13:00:46] Verifying backup integrity...
✓ database.sql.gz.gpg is valid
✓ storage.tar.gz is valid
✓ uploads.tar.gz is valid
✅ All backup files verified successfully
[13:00:47] Cleaning up old backups...
✅ Old backups cleaned up (kept last 7 days)
========================================
Backup Summary
========================================
📋 Backup Type: full
⏰ Timestamp: 2024-10-25 13:00:47
📁 Location: /backups/20241025_130000
🔒 Encrypted: true
📦 Backup Contents:
1.5G uploads.tar.gz
245M database.sql.gz.gpg
120M storage.tar.gz
2.3M vault_secrets.sql.gz
💾 Total Size: 1.9G
📝 Restoration Commands:
Database:
gpg -d database.sql.gz.gpg | gunzip | docker compose exec -T db psql -U postgres michaelschiemer_prod
Vault:
gunzip -c vault_secrets.sql.gz | docker compose exec -T db psql -U postgres michaelschiemer_prod
Storage:
tar -xzf storage.tar.gz -C /path/to/project
========================================
✅ 🎉 Backup completed successfully!
```
### Backup Encryption
GPG symmetric encryption (AES-256):
```bash
./scripts/backup.sh --full --encrypt
```
**What gets encrypted**:
- Database dumps
- Vault backups
- Environment configuration
**Decryption**:
```bash
# Decrypt file
gpg -d database.sql.gz.gpg > database.sql.gz
# Restore database
gunzip -c database.sql.gz | docker compose exec -T db psql -U postgres michaelschiemer_prod
```
**Encryption Password**:
- Prompted during backup
- **Store securely** in password manager
- Required for restoration
### Backup Retention
**Automatic Cleanup**:
- Backups older than 7 days are deleted automatically
- The retention period is configurable in the script
**Manual Cleanup**:
```bash
# Remove specific backup
rm -rf /backups/20241025_130000
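# Remove all backup directories older than 30 days
# (-maxdepth 1 stops find from descending into directories it has just deleted)
find /backups -mindepth 1 -maxdepth 1 -type d -name "20*" -mtime +30 -exec rm -rf {} +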
```
### Backup Verification
All backups are automatically verified:
**Verification Checks**:
- Gzip integrity (`gzip -t`)
- Tar.gz integrity (`tar -tzf`)
- File completeness
**Failed Verification**:
```
✗ database.sql.gz is corrupted
❌ Some backup files are corrupted
```
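The integrity checks listed above can be sketched per file as follows. The `verify_backup_file` name is hypothetical; the fallback for other file types (non-empty file) is an assumption, since this guide does not say how encrypted or plain files are verified.

```bash
# Hypothetical helper: return 0 if a backup file passes a basic integrity check.
verify_backup_file() {
    case "$1" in
        *.tar.gz) tar -tzf "$1" >/dev/null 2>&1 ;;  # list archive members without extracting
        *.gz)     gzip -t "$1" 2>/dev/null ;;       # test the compressed stream
        *)        [ -s "$1" ] ;;                    # assumed fallback: file exists and is non-empty
    esac
}

# Example:
# for f in /backups/20241025_130000/*; do
#     if verify_backup_file "$f"; then echo "✓ $(basename "$f") is valid"
#     else echo "✗ $(basename "$f") is corrupted"; fi
# done
```

The `*.tar.gz` pattern must come before `*.gz`, because `case` takes the first matching branch.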
### Restoration Procedures
**Full System Restoration**:
1. **Restore Database**:
```bash
cd /backups/20241025_130000
# If encrypted
gpg -d database.sql.gz.gpg | gunzip | docker compose exec -T db psql -U postgres michaelschiemer_prod
# If not encrypted
gunzip -c database.sql.gz | docker compose exec -T db psql -U postgres michaelschiemer_prod
```
2. **Restore Vault**:
```bash
gunzip -c vault_secrets.sql.gz | docker compose exec -T db psql -U postgres michaelschiemer_prod
```
3. **Restore Environment**:
```bash
cp env.production /path/to/project/.env.production
```
4. **Restore Storage**:
```bash
tar -xzf storage.tar.gz -C /path/to/project
```
5. **Restore Uploads**:
```bash
tar -xzf uploads.tar.gz -C /path/to/project/public
```
6. **Restart Services**:
```bash
docker compose -f docker-compose.yml -f docker-compose.production.yml --env-file .env.production restart
```
### Automated Backup Schedule
**Recommended Cron Jobs**:
```cron
# Daily full backup at 2 AM (encrypted)
0 2 * * * /path/to/scripts/backup.sh --full --encrypt >> /var/log/backup.log 2>&1
# Hourly database backup
0 * * * * /path/to/scripts/backup.sh --database-only >> /var/log/backup.log 2>&1
# Weekly Vault backup (encrypted)
0 3 * * 0 /path/to/scripts/backup.sh --vault-only --encrypt >> /var/log/backup.log 2>&1
```
**Backup Monitoring**:
```bash
# Check if backup succeeded
if ! tail -1 /var/log/backup.log | grep -q "completed successfully"; then
    ./scripts/send-alert.sh "Backup failed"
fi
```
---
## Integration with Production Workflow
### Complete Production Deployment Workflow
**Step 1: Initial Setup**
```bash
# Prerequisites:
# 1. Configure .env.production with all required values
# 2. Generate VAULT_ENCRYPTION_KEY: docker exec php php console.php vault:generate-key
# 3. Update ADMIN_ALLOWED_IPS for IP-based access control
# 4. Configure SSL_DOMAIN and SSL_EMAIL for Let's Encrypt
# Deploy
./scripts/production-deploy.sh initial
```
**Step 2: Verify Deployment**
```bash
# Run health check
./scripts/health-check.sh --verbose
# Check logs
docker compose logs -f --tail=100
# Test application
curl -H "User-Agent: Mozilla/5.0" https://your-domain.com/health
```
**Step 3: Create Initial Backup**
```bash
# Full encrypted backup
./scripts/backup.sh --full --encrypt
```
**Step 4: Regular Updates**
```bash
# Zero-downtime update
./scripts/production-deploy.sh update
# Verify health
./scripts/health-check.sh
```
### Automated Operations
**Daily Operations Cron**:
```cron
# Health check every 5 minutes
*/5 * * * * /path/to/scripts/health-check.sh --json > /var/log/health-check.json
# Full backup daily at 2 AM
0 2 * * * /path/to/scripts/backup.sh --full --encrypt >> /var/log/backup.log 2>&1
# Cleanup old logs at 3 AM
0 3 * * * find /path/to/project/storage/logs -name "*.log" -mtime +30 -delete
```
### Monitoring Integration
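**Prometheus Metrics**: Prometheus scrapes an HTTP endpoint that serves metrics in its text exposition format. It cannot read the JSON file written by `health-check.sh` directly, and `file_sd_configs` only discovers scrape targets, so it is not a way to ingest that file. A scrape job for a metrics endpoint looks like this:
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'health-check'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['localhost:9090']  # replace with the host:port that serves /metrics
```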
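One way to expose the health-check results to Prometheus is the node_exporter textfile collector, which reads `*.prom` files from the directory given by `--collector.textfile.directory`. The converter below is a sketch under assumptions: the `json_to_prom` function and the `health_check_status` metric name are hypothetical, and the parser only handles the flat, one-key-per-line layout shown in the JSON example earlier in this guide.

```bash
# Hypothetical helper: convert the "checks" object of the health-check JSON
# (read from stdin) into Prometheus text format, 1 = healthy, 0 = anything else.
json_to_prom() {
    awk -F'"' '
        /"checks"[[:space:]]*:/ { in_checks = 1; next }   # start of the checks object
        in_checks && /}/        { in_checks = 0 }         # end of the checks object
        in_checks && NF >= 5    {
            printf "health_check_status{check=\"%s\"} %d\n", $2, ($4 == "healthy") ? 1 : 0
        }
    '
}

# Example (the output directory is an assumption, adjust to your node_exporter setup):
# ./scripts/health-check.sh --json | json_to_prom \
#     > /var/lib/node_exporter/textfile/health_check.prom
```

For anything beyond this fixed layout, a real JSON parser such as `jq` would be the more robust choice.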
**Grafana Dashboard**:
- Service status panels
- Resource usage graphs
- Error rate trends
- SSL certificate expiry countdown
---
## Troubleshooting
### Deployment Script Issues
**Problem**: Prerequisites check fails
```
❌ .env.production not found
```
**Solution**: Copy `.env.example` to `.env.production` and configure all values
**Problem**: Health check fails
```
❌ Health check failed after 30 attempts
```
**Solutions**:
1. Check Docker logs: `docker compose logs -f php`
2. Verify all services are up: `docker compose ps`
3. Check firewall: `sudo ufw status`
4. Verify DNS: `dig your-domain.com`
**Problem**: SSL initialization fails
```
❌ SSL initialization failed
```
**Solutions**:
1. Verify DNS A record points to server
2. Check ports 80/443 are open
3. Verify SSL_DOMAIN in .env.production
4. Check Certbot logs: `docker compose logs certbot`
### Health Check Script Issues
**Problem**: Web response check fails
```
❌ Web server is not responding
```
**Solutions**:
1. Check Nginx status: `docker compose ps web`
2. Check Nginx logs: `docker compose logs web`
3. Verify SSL certificates: `docker compose exec php php console.php ssl:status`
**Problem**: Database check fails
```
❌ Database is not accepting connections
```
**Solutions**:
1. Check PostgreSQL status: `docker compose ps db`
2. Check PostgreSQL logs: `docker compose logs db`
3. Verify database credentials in .env.production
### Backup Script Issues
**Problem**: Database backup fails
```
❌ Database backup failed
```
**Solutions**:
1. Check database container: `docker compose ps db`
2. Verify database credentials
3. Check disk space: `df -h`
**Problem**: GPG encryption fails
```
⚠️ GPG not installed - skipping encryption
```
**Solution**: Install GPG: `sudo apt install gnupg`
**Problem**: Backup verification fails
```
✗ database.sql.gz is corrupted
```
**Solutions**:
1. Run backup again
2. Check disk space
3. Verify database health
---
## Best Practices
### Deployment Best Practices
1. **Always test in staging first**
2. **Run health check before deployment**
3. **Create backup before updates**
4. **Monitor logs during deployment**
5. **Verify health after deployment**
6. **Have rollback plan ready**
7. **Document all configuration changes**
### Backup Best Practices
1. **Encrypt Vault backups** (always)
2. **Store backups off-site** (AWS S3, external server)
3. **Test restoration regularly** (monthly)
4. **Verify backup integrity** (automated)
5. **Monitor backup size growth**
6. **Rotate encryption passwords** (quarterly)
7. **Keep multiple backup copies** (3-2-1 rule: 3 copies, 2 media types, 1 off-site)
### Monitoring Best Practices
1. **Run health checks every 5 minutes**
2. **Alert on critical failures** (email, Slack, PagerDuty)
3. **Monitor resource usage trends**
4. **Track deployment success rate**
5. **Review logs daily**
6. **Set up uptime monitoring** (UptimeRobot, Pingdom)
7. **Document incident responses**
---
## See Also
- **Prerequisites**: `docs/deployment/production-prerequisites.md`
- **Environment Configuration**: `docs/deployment/env-production-template.md`
- **Docker Compose Production**: `docs/deployment/docker-compose-production.md`
- **Database Migrations**: `docs/deployment/database-migration-strategy.md`
- **SSL Setup**: `docs/deployment/ssl-setup.md`
- **Secrets Management**: `docs/deployment/secrets-management.md`
- **Logging Configuration**: `docs/deployment/logging-configuration.md`