michaelschiemer/deployment/infrastructure/DEPLOYMENT_ANALYSIS.md
Michael Schiemer 70e45fb56e fix(Discovery): Add comprehensive debug logging for router initialization
- Add initializer count logging in DiscoveryServiceBootstrapper
- Add route structure analysis in RouterSetup
- Add request parameter logging in HttpRouter
- Update PHP production config for better OPcache handling
- Fix various config and error handling improvements
2025-10-27 22:23:18 +01:00

Production Deployment Analysis & Fix Strategy

Date: 2025-10-27
Status: CRITICAL - Production website returning HTTP 500 errors
Root Cause: Database connection configuration error (DB_PORT mismatch)


1. Complete Deployment Flow Analysis

Deployment Architecture

The project uses a release-based deployment pattern with shared configuration:

/home/deploy/michaelschiemer/
├── releases/
│   ├── 1761566515/          # Current release (timestamped)
│   ├── 1761565432/          # Previous releases
│   └── ...
├── shared/
│   └── .env.production      # Shared configuration file
└── current -> releases/1761566515/  # Symlink to active release

Key Characteristics:

  • Releases Directory: Each deployment creates a new timestamped release
  • Shared Directory: Configuration files persist across deployments
  • Current Symlink: Points to the active release
  • Symlink Chain: current/.env.production → shared/.env.production → used by application
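
A minimal sketch of how the symlink chain could be verified, assuming the layout shown above. The function name `check_env_symlink` is hypothetical and not part of the project; it relies on `readlink -f`, which is available on GNU/Linux.

```shell
#!/bin/bash
# Hypothetical helper: confirm that <base>/current/.env.production resolves
# to <base>/shared/.env.production. Assumes the directory layout shown above.
check_env_symlink() (
  base=$1
  link="$base/current/.env.production"
  expected="$base/shared/.env.production"

  # The entry inside the release must be a symlink, not a regular file.
  if [ ! -L "$link" ]; then
    echo "not a symlink: $link" >&2
    exit 1
  fi

  # readlink -f follows the whole chain (current -> release -> shared).
  resolved=$(readlink -f "$link")
  wanted=$(readlink -f "$expected")
  if [ "$resolved" != "$wanted" ] || [ ! -f "$wanted" ]; then
    echo "unexpected target: $resolved" >&2
    exit 1
  fi
  echo "ok: $link -> $resolved"
)
```

On the server this would be called as `check_env_symlink /home/deploy/michaelschiemer`; a non-zero exit status means the release is not reading the shared file.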

.env File Sources (3 Different Files Identified)

1. Root Directory: /home/michael/dev/michaelschiemer/.env.production

  • Size: 2.9K
  • Checksum: 9f33068713432c1dc4008724dc6923b0
  • DB_PORT: 5432 (CORRECT for PostgreSQL)
  • DB_USERNAME: mdb_user (with underscore)
  • DB_PASSWORD: Qo2KNgGqeYksEhKr57pgugakxlothn8J
  • Purpose: Framework default configuration
  • Status: CORRECT database configuration

2. Deployment Directory: /home/michael/dev/michaelschiemer/deployment/applications/environments/.env.production

  • Size: 4.3K
  • Checksum: b516bf86beed813df03a30f655687b72
  • DB_PORT: 5432 (CORRECT for PostgreSQL)
  • DB_USERNAME: mdb_user (with underscore)
  • DB_PASSWORD: Qo2KNgGqeYksEhKr57pgugakxlothn8J
  • Purpose: Application-specific production configuration
  • Status: CORRECT and MORE COMPLETE (includes Redis, Queue, Mail, Monitoring configs)

3. Production Server: /home/deploy/michaelschiemer/shared/.env.production

  • Size: 3.0K (modified Oct 26 20:56)
  • Line 15: DB_PORT=3306 (WRONG - MySQL port instead of PostgreSQL)
  • Line 67: DB_PORT= (duplicate empty entry)
  • DB_USERNAME: mdb-user (with hyphen - likely wrong)
  • DB_PASSWORD: StartSimple2024! (different from local configs)
  • Status: CORRUPTED - Wrong database configuration causing HTTP 500 errors
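
The server file assigns DB_PORT twice (lines 15 and 67). Which assignment wins depends on how the application parses the file, so duplicates are worth flagging on their own. A small sketch, with `env_duplicate_keys` as a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: list keys that are assigned more than once in an env file.
env_duplicate_keys() {
  # Keep only KEY=... lines (this skips comments and blank lines),
  # cut out the key before the first "=", then print each key
  # that occurs more than once.
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort | uniq -d
}
```

Run against a copy of the broken server file, this would be expected to report `DB_PORT`; an empty result means no key is defined twice.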

Deployment Playbook Flow

File: /home/michael/dev/michaelschiemer/deployment/infrastructure/playbooks/deploy-rsync-based.yml

Critical Configuration:

local_project_path: "{{ playbook_dir }}/../../.."  # 3 dirs up = /home/michael/dev/michaelschiemer
shared_files:
  - .env.production  # Marked as SHARED file
rsync_excludes:
  - .env
  - .env.local
  - .env.development

Deployment Steps:

  1. Rsync files from {{ local_project_path }} (framework root) to release directory
    • Excludes: .env, .env.local, .env.development
    • Includes: .env.production from root directory
  2. Create release directory: /home/deploy/michaelschiemer/releases/{{ timestamp }}
  3. Copy files to release directory
  4. Create symlinks:
    • release/.env.production → ../../shared/.env.production
    • release/.env → ../../shared/.env.production
  5. Update current symlink → latest release
  6. Restart containers via docker-compose
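
Steps 4 and 5 can be sketched in plain shell. This is an illustration of the symlink handling only, not the playbook's actual tasks; `activate_release` and its arguments are hypothetical names.

```shell
#!/bin/bash
# Hypothetical sketch of steps 4 and 5: link the shared env file into a
# release and point "current" at it. Paths follow the layout described above.
activate_release() (
  base=$1      # e.g. /home/deploy/michaelschiemer
  release=$2   # e.g. a timestamp such as 1761566515
  dir="$base/releases/$release"

  [ -d "$dir" ] || { echo "missing release: $dir" >&2; exit 1; }
  [ -f "$base/shared/.env.production" ] || { echo "missing shared env" >&2; exit 1; }

  # -f replaces any .env.production that rsync copied into the release.
  ln -sfn ../../shared/.env.production "$dir/.env.production"
  ln -sfn ../../shared/.env.production "$dir/.env"

  # -n stops ln from descending into the directory the old symlink points to.
  ln -sfn "releases/$release" "$base/current"
)
```

Note that the sketch refuses to activate a release when shared/.env.production does not exist, which is the precondition the playbook currently takes for granted.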

CRITICAL ISSUE IDENTIFIED: The playbook does NOT have a task to initially copy .env.production to shared/.env.production. It only creates symlinks assuming the file already exists. This means:

  • Initial setup requires MANUAL copy of .env.production to shared/
  • Updates to .env.production require MANUAL sync to production server
  • The rsync'd .env.production in release directory is IGNORED (symlink overrides it)
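
The missing initial copy could look roughly like the following when run on the server. This is a sketch under the assumption that an existing shared file must never be overwritten silently; `seed_shared_env` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: seed shared/.env.production only when it is missing.
seed_shared_env() (
  src=$1    # e.g. a copy of deployment/applications/environments/.env.production
  base=$2   # e.g. /home/deploy/michaelschiemer
  dest="$base/shared/.env.production"

  [ -f "$src" ] || { echo "missing source: $src" >&2; exit 1; }

  # Leave an existing file untouched so server-side state is not lost.
  if [ -e "$dest" ]; then
    echo "kept existing $dest"
    exit 0
  fi

  mkdir -p "$base/shared"
  cp "$src" "$dest"
  chmod 0644 "$dest"   # same mode this document uses for the file elsewhere
  echo "seeded $dest"
)
```

Seeding only covers first-time setup; later updates still need an explicit sync step.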

2. Production Server .env Status

Current State (BROKEN)

# /home/deploy/michaelschiemer/shared/.env.production

Line 15: DB_PORT=3306          # WRONG - MySQL port (should be 5432 for PostgreSQL)
Line 67: DB_PORT=               # Duplicate empty entry

DB_USERNAME=mdb-user            # Wrong format (should be mdb_user with underscore)
DB_PASSWORD=StartSimple2024!    # Wrong password (doesn't match local configs)

Container Status

CONTAINER     STATUS                    ISSUE
php           Up 27 minutes (healthy)   -
db            Up 40 minutes (healthy)   PostgreSQL running on port 5432
redis         Up 40 minutes (healthy)   -
web           Up 40 minutes (UNHEALTHY) Nginx cannot connect to PHP due to DB error
queue-worker  Restarting (1) 4s ago     PHP crashing due to DB connection error

Error Pattern

  • HTTP 500 on all requests (/, /impressum, etc.)
  • Root Cause: PHP application cannot connect to database because:
    1. DB_PORT=3306 (MySQL) instead of 5432 (PostgreSQL)
    2. Wrong username format (mdb-user vs mdb_user)
    3. Wrong password
  • Impact: All PHP processes fail to initialize → Nginx returns 500

3. Deployment Command Documentation

WORKING Commands (Current Playbook)

Deploy via Ansible Playbook

cd /home/michael/dev/michaelschiemer/deployment/infrastructure

# Full production deployment
ansible-playbook \
  -i inventories/production/hosts.yml \
  playbooks/deploy-rsync-based.yml \
  --vault-password-file .vault_pass

# With specific variables
ansible-playbook \
  -i inventories/production/hosts.yml \
  playbooks/deploy-rsync-based.yml \
  --vault-password-file .vault_pass \
  -e "deployment_branch=main"

Check Production Status

# Check containers
ansible web_servers \
  -i inventories/production/hosts.yml \
  -m shell -a "docker ps -a" \
  --vault-password-file .vault_pass

# Check .env configuration
ansible web_servers \
  -i inventories/production/hosts.yml \
  -m shell -a "cat /home/deploy/michaelschiemer/shared/.env.production" \
  --vault-password-file .vault_pass

# Check application logs
ansible web_servers \
  -i inventories/production/hosts.yml \
  -m shell -a "docker logs web --tail 50" \
  --vault-password-file .vault_pass

COMMANDS TO CREATE (User Requirements)

1. Simple Manual Deploy Script

#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/deploy.sh

set -e

cd "$(dirname "$0")/.."

echo "🚀 Deploying to production..."

ansible-playbook \
  -i inventories/production/hosts.yml \
  playbooks/deploy-rsync-based.yml \
  --vault-password-file .vault_pass

echo "✅ Deployment complete!"
echo "🔍 Check status: docker ps"

2. .env Update Script

#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/update-env.sh

set -e

cd "$(dirname "$0")/../.."

SOURCE_ENV="deployment/applications/environments/.env.production"
REMOTE_PATH="/home/deploy/michaelschiemer/shared/.env.production"

if [[ ! -f "$SOURCE_ENV" ]]; then
    echo "❌ Source .env.production not found at: $SOURCE_ENV"
    exit 1
fi

echo "📤 Uploading .env.production to production server..."

ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m copy \
  -a "src=$SOURCE_ENV dest=$REMOTE_PATH mode=0644" \
  --vault-password-file deployment/infrastructure/.vault_pass

echo "🔄 Restarting containers..."

ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web queue-worker" \
  --vault-password-file deployment/infrastructure/.vault_pass

echo "✅ .env.production updated and containers restarted!"

3. Quick Production Sync

#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/quick-sync.sh

set -e

cd "$(dirname "$0")/../.."

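# Sync code changes (no .env update)
# .env.production must be excluded as well: otherwise the copy in the
# project root replaces the symlink to shared/.env.production in current/.
rsync -avz \
  --exclude '.env' \
  --exclude '.env.local' \
  --exclude '.env.production' \
  --exclude 'node_modules/' \
  --exclude '.git/' \
  ./ deploy@94.16.110.151:/home/deploy/michaelschiemer/current/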

# Restart containers
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web" \
  --vault-password-file deployment/infrastructure/.vault_pass

echo "✅ Quick sync complete!"

SCRIPTS TO REMOVE (Unused/Deprecated)

  1. /home/michael/dev/michaelschiemer/deploy.sh (if exists in root)
    • Reason: Conflicting with playbook-based deployment
  2. /home/michael/dev/michaelschiemer/.env.local (if exists)
    • Reason: Not used in production, causes confusion
  3. Duplicate .env files in root:
    • Keep: .env.production (source of truth for framework defaults)
    • Remove: .env.backup.*, .env.old, etc.

4. Fix Strategy (Step-by-Step)

IMMEDIATE FIX (Restore Production)

Step 1: Update Production .env.production File

cd /home/michael/dev/michaelschiemer

# Copy correct .env.production to production server
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m copy \
  -a "src=deployment/applications/environments/.env.production dest=/home/deploy/michaelschiemer/shared/.env.production mode=0644" \
  --vault-password-file deployment/infrastructure/.vault_pass

Why this file?

  • Most complete configuration (4.3K vs 2.9K)
  • Includes Redis, Queue, Mail, Monitoring configs
  • Correct DB_PORT=5432
  • Correct DB credentials

Step 2: Verify .env.production on Server

ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "grep -E '(DB_PORT|DB_USERNAME|DB_PASSWORD)' /home/deploy/michaelschiemer/shared/.env.production" \
  --vault-password-file deployment/infrastructure/.vault_pass

Expected Output:

DB_PORT=5432
DB_USERNAME=mdb_user
DB_PASSWORD=Qo2KNgGqeYksEhKr57pgugakxlothn8J

Step 3: Restart Containers

ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web queue-worker" \
  --vault-password-file deployment/infrastructure/.vault_pass

Step 4: Verify Website Functionality

# Check HTTP status
curl -I https://michaelschiemer.de

# Expected: a first line of "HTTP/2 200" (instead of 500); HTTP/2 responses carry no "OK" reason phrase

# Check container health
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "docker ps | grep -E '(web|php|queue-worker)'" \
  --vault-password-file deployment/infrastructure/.vault_pass

Expected: All containers should be "Up" and "healthy"
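
Containers may need a few seconds after a restart before the site responds, so a single `curl -I` can give a misleading result. A hedged sketch of a polling check follows; `wait_for_http_200` and the `CURL_CMD` override are assumptions introduced here, while the curl flags used (`-s`, `-o`, `-w '%{http_code}'`) are standard.

```shell
#!/bin/bash
# Hypothetical helper: poll a URL until it answers with HTTP 200.
# CURL_CMD is an assumption added here so the command can be swapped out.
wait_for_http_200() (
  url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -w '%{http_code}' prints only the numeric status; 000 means no response.
    code=$(${CURL_CMD:-curl} -s -o /dev/null -w '%{http_code}' "$url" || true)
    if [ "$code" = "200" ]; then
      echo "ok after $i attempt(s)"
      exit 0
    fi
    echo "attempt $i: got ${code:-no response}" >&2
    # Pause between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  exit 1
)
```

For example, `wait_for_http_200 https://michaelschiemer.de 10 3` would retry for roughly half a minute before giving up with a non-zero exit status.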

LONG-TERM FIX (Prevent Future Issues)

1. Update Playbook to Sync .env.production

Add task to deploy-rsync-based.yml:

# After "Synchronize project files" task, add:

- name: Sync .env.production to shared directory
  copy:
    src: "{{ local_project_path }}/deployment/applications/environments/.env.production"
    dest: "{{ project_path }}/shared/.env.production"
    mode: '0644'
  when: sync_env_to_shared | default(true)
  tags:
    - deploy
    - config

2. Create Helper Scripts

Create the 3 scripts documented in section 3:

  • scripts/deploy.sh - Simple wrapper for playbook
  • scripts/update-env.sh - Update .env.production only
  • scripts/quick-sync.sh - Quick code sync without full deployment

3. Establish Source of Truth

Decision: Use deployment/applications/environments/.env.production as source of truth

  • Most complete configuration
  • Application-specific settings
  • Includes all production services

Action: Document in README.md:

## Production Configuration

**Source of Truth**: `deployment/applications/environments/.env.production`

To update production .env:
1. Edit `deployment/applications/environments/.env.production`
2. Run `./deployment/infrastructure/scripts/update-env.sh`
3. The script restarts the php, web, and queue-worker containers so they load the new config

4. Add .env Validation

Create pre-deployment validation script:

#!/bin/bash
# scripts/validate-env.sh

ENV_FILE="deployment/applications/environments/.env.production"

echo "🔍 Validating .env.production..."

# Check required variables
REQUIRED_VARS=(
    "DB_DRIVER"
    "DB_HOST"
    "DB_PORT"
    "DB_DATABASE"
    "DB_USERNAME"
    "DB_PASSWORD"
)

for var in "${REQUIRED_VARS[@]}"; do
    # Require a non-empty value; a bare "DB_PORT=" must not pass
    if ! grep -q "^${var}=." "$ENV_FILE"; then
        echo "❌ Missing or empty required variable: $var"
        exit 1
    fi
done

# Check PostgreSQL port (anchored so that e.g. 54321 does not match)
if ! grep -q '^DB_PORT=5432$' "$ENV_FILE"; then
    echo "⚠️  Warning: DB_PORT should be 5432 for PostgreSQL"
fi

echo "✅ .env.production validation passed"

5. Cleanup Recommendations

Files to Remove

In Framework Root (/home/michael/dev/michaelschiemer/)

# List files to remove
find . -maxdepth 1 -name ".env.backup*" -o -name ".env.old*" -o -name ".env.local"

# Remove after confirmation
rm -f .env.backup* .env.old* .env.local

In Deployment Directory

# Check for duplicate/old deployment scripts
find deployment/ -name "deploy-old.yml" -o -name "*.backup"

On Production Server

# Clean up old releases (keep last 5)
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "cd /home/deploy/michaelschiemer/releases && ls -t | tail -n +6 | xargs rm -rf" \
  --vault-password-file deployment/infrastructure/.vault_pass

# Remove duplicate .env files in current release
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell \
  -a "cd /home/deploy/michaelschiemer/current && rm -f .env.backup* .env.old*" \
  --vault-password-file deployment/infrastructure/.vault_pass
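
The release cleanup above removes everything except the five most recently modified directories, which could include the active release if `current` points at an older one (for example after a rollback). A more defensive sketch, assuming release names are numeric timestamps as in the layout shown earlier; `prune_releases` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: delete old releases but never the active one.
prune_releases() (
  base=$1
  keep=${2:-5}
  cd "$base/releases" || exit 1

  # Name of the release directory that "current" points to (may be empty).
  active=$(basename "$(readlink "$base/current" 2>/dev/null)")

  # Release names are numeric timestamps, so a reverse numeric sort lists
  # the newest first; everything after the first $keep is a candidate.
  ls -1 | sort -rn | tail -n +"$((keep + 1))" | while read -r name; do
    [ "$name" = "$active" ] && continue
    rm -rf -- "./$name"
    echo "removed $name"
  done
)
```

Unlike `ls -t`, this orders releases by name rather than modification time, so touching files inside an old release does not change which ones are kept.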

Configuration to Keep

Essential Files:

  • /.env.production - Framework defaults (keep for reference)
  • /deployment/applications/environments/.env.production - Source of truth
  • /deployment/infrastructure/playbooks/deploy-rsync-based.yml - Main playbook
  • /deployment/infrastructure/inventories/production/hosts.yml - Inventory

Symlinks (Do Not Remove):

  • /home/deploy/michaelschiemer/current/.env.production → shared/.env.production
  • /home/deploy/michaelschiemer/current/.env → shared/.env.production

6. Post-Fix Verification Checklist

# 1. Website accessible
curl -I https://michaelschiemer.de
# Expected: HTTP/2 200

# 2. All containers healthy
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell -a "docker ps" \
  --vault-password-file deployment/infrastructure/.vault_pass
# Expected: All "Up" and "(healthy)"

# 3. Database connection working
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell -a "docker exec php php -r \"new PDO('pgsql:host=db;port=5432;dbname=michaelschiemer', 'mdb_user', 'Qo2KNgGqeYksEhKr57pgugakxlothn8J');\"" \
  --vault-password-file deployment/infrastructure/.vault_pass
# Expected: No errors

# 4. Application logs clean
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell -a "docker logs web --tail 20" \
  --vault-password-file deployment/infrastructure/.vault_pass
# Expected: HTTP 200 responses, no 500 errors

# 5. Queue worker stable
ansible web_servers \
  -i deployment/infrastructure/inventories/production/hosts.yml \
  -m shell -a "docker ps | grep queue-worker" \
  --vault-password-file deployment/infrastructure/.vault_pass
# Expected: "Up" status (not "Restarting")

7. Future Deployment Best Practices

  1. Always validate .env before deployment
    • Run scripts/validate-env.sh pre-deployment
    • Check DB_PORT=5432 for PostgreSQL
    • Verify credentials match database server
  2. Use playbook for all deployments
    • Consistent process
    • Automated rollback capability
    • Proper symlink management
  3. Monitor container health post-deployment
    • Check docker ps output
    • Verify all containers "(healthy)"
    • Check application logs for errors
  4. Keep .env.production in sync
    • Single source of truth: deployment/applications/environments/.env.production
    • Use update-env.sh script for updates
    • Never manually edit on production server
  5. Regular backups
    • Backup shared/.env.production before changes
    • Keep last 5 releases for quick rollback
    • Document any manual production changes
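
The backup recommendation can be as small as the following sketch, meant to run on the server before shared/.env.production is replaced. The helper name `backup_env_file` and the `.bak.<timestamp>` suffix are assumptions, not existing project conventions.

```shell
#!/bin/bash
# Hypothetical helper: copy an env file to a timestamped backup next to it.
backup_env_file() (
  file=$1
  [ -f "$file" ] || { echo "missing file: $file" >&2; exit 1; }

  # The timestamp keeps successive backups from overwriting each other.
  backup="$file.bak.$(date +%Y%m%d-%H%M%S)"

  # -p preserves mode and modification time of the original.
  cp -p "$file" "$backup"
  echo "$backup"
)
```

The function prints the backup path, so a caller can record it, for instance `backup_env_file /home/deploy/michaelschiemer/shared/.env.production`.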

Summary

Current Status: Production broken due to DB_PORT configuration error
Root Cause: Manual edits to shared/.env.production with wrong PostgreSQL port
Fix Time: ~5 minutes (copy correct .env + restart containers)
Prevention: Automated .env sync in playbook + validation scripts

Next Steps:

  1. Execute Steps 1-4 of the Fix Strategy (IMMEDIATE)
  2. Verify website returns HTTP 200
  3. Implement long-term fixes (playbook updates, scripts)
  4. Document deployment process in README.md