fix(Discovery): Add comprehensive debug logging for router initialization

- Add initializer count logging in DiscoveryServiceBootstrapper
- Add route structure analysis in RouterSetup
- Add request parameter logging in HttpRouter
- Update PHP production config for better OPcache handling
- Apply various config and error handling improvements
2025-10-27 22:23:18 +01:00
parent e326e3d6c6
commit 70e45fb56e
56 changed files with 1519 additions and 355 deletions


@@ -0,0 +1,554 @@
# Production Deployment Analysis & Fix Strategy
**Date**: 2025-10-27
**Status**: CRITICAL - Production website returning HTTP 500 errors
**Root Cause**: Database connection configuration error (DB_PORT mismatch)
---
## 1. Complete Deployment Flow Analysis
### Deployment Architecture
The project uses a **release-based deployment pattern** with shared configuration:
```
/home/deploy/michaelschiemer/
├── releases/
│   ├── 1761566515/                  # Current release (timestamped)
│   ├── 1761565432/                  # Previous releases
│   └── ...
├── shared/
│   └── .env.production              # Shared configuration file
└── current -> releases/1761566515/  # Symlink to active release
```
**Key Characteristics**:
- **Releases Directory**: Each deployment creates a new timestamped release
- **Shared Directory**: Configuration files persist across deployments
- **Current Symlink**: Points to the active release
- **Symlink Chain**: `current/.env.production` → `shared/.env.production` → used by the application
### .env File Sources (3 Different Files Identified)
#### 1. Root Directory: `/home/michael/dev/michaelschiemer/.env.production`
- **Size**: 2.9K
- **Checksum**: 9f33068713432c1dc4008724dc6923b0
- **DB_PORT**: 5432 (CORRECT for PostgreSQL)
- **DB_USERNAME**: mdb_user (with underscore)
- **DB_PASSWORD**: Qo2KNgGqeYksEhKr57pgugakxlothn8J
- **Purpose**: Framework default configuration
- **Status**: CORRECT database configuration
#### 2. Deployment Directory: `/home/michael/dev/michaelschiemer/deployment/applications/environments/.env.production`
- **Size**: 4.3K
- **Checksum**: b516bf86beed813df03a30f655687b72
- **DB_PORT**: 5432 (CORRECT for PostgreSQL)
- **DB_USERNAME**: mdb_user (with underscore)
- **DB_PASSWORD**: Qo2KNgGqeYksEhKr57pgugakxlothn8J
- **Purpose**: Application-specific production configuration
- **Status**: CORRECT and MORE COMPLETE (includes Redis, Queue, Mail, Monitoring configs)
#### 3. Production Server: `/home/deploy/michaelschiemer/shared/.env.production`
- **Size**: 3.0K (modified Oct 26 20:56)
- **Line 15**: `DB_PORT=3306` (WRONG - MySQL port instead of PostgreSQL)
- **Line 67**: `DB_PORT=` (duplicate empty entry)
- **DB_USERNAME**: mdb-user (with hyphen - likely wrong)
- **DB_PASSWORD**: StartSimple2024! (different from local configs)
- **Status**: CORRUPTED - Wrong database configuration causing HTTP 500 errors
### Deployment Playbook Flow
**File**: `/home/michael/dev/michaelschiemer/deployment/infrastructure/playbooks/deploy-rsync-based.yml`
**Critical Configuration**:
```yaml
local_project_path: "{{ playbook_dir }}/../../.."  # 3 dirs up = /home/michael/dev/michaelschiemer
shared_files:
  - .env.production  # Marked as SHARED file
rsync_excludes:
  - .env
  - .env.local
  - .env.development
```
**Deployment Steps**:
1. **Rsync files** from `{{ local_project_path }}` (framework root) to release directory
   - Excludes: `.env`, `.env.local`, `.env.development`
   - Includes: `.env.production` from root directory
2. **Create release directory**: `/home/deploy/michaelschiemer/releases/{{ timestamp }}`
3. **Copy files** to release directory
4. **Create symlinks**:
   - `release/.env.production` → `../../shared/.env.production`
   - `release/.env` → `../../shared/.env.production`
5. **Update current** symlink → latest release
6. **Restart containers** via docker-compose
**CRITICAL ISSUE IDENTIFIED**:
The playbook does NOT have a task to initially copy `.env.production` to `shared/.env.production`. It only creates symlinks assuming the file already exists. This means:
- Initial setup requires MANUAL copy of `.env.production` to `shared/`
- Updates to `.env.production` require MANUAL sync to production server
- The rsync'd `.env.production` in release directory is IGNORED (symlink overrides it)
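Because the symlinks are created whether or not the shared file exists, a pre-flight check can catch a dangling link before containers are restarted. This is a minimal sketch under the directory layout described above; the function name `check_env_symlink` and its return codes are hypothetical.

```bash
# Hypothetical pre-flight check: succeed only if <release>/.env.production is
# a symlink whose target actually exists.
# Returns 1 if the path is not a symlink, 2 if the symlink is dangling.
check_env_symlink() {
    link="$1/.env.production"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    if [ ! -e "$link" ]; then          # -e follows the link; false if dangling
        echo "dangling symlink: $link" >&2
        return 2
    fi
    return 0
}
```

On the server this could be called as `check_env_symlink /home/deploy/michaelschiemer/current` right after the symlink step.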
---
## 2. Production Server .env Status
### Current State (BROKEN)
```bash
# /home/deploy/michaelschiemer/shared/.env.production
Line 15: DB_PORT=3306 # WRONG - MySQL port (should be 5432 for PostgreSQL)
Line 67: DB_PORT= # Duplicate empty entry
DB_USERNAME=mdb-user # Wrong format (should be mdb_user with underscore)
DB_PASSWORD=StartSimple2024! # Wrong password (doesn't match local configs)
```
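The duplicate `DB_PORT` entry is easy to miss by eye. A small helper can report every key that is assigned more than once; this is a sketch only, the name `env_duplicate_keys` is hypothetical, and which of the duplicate assignments the application honours depends on its env loader, which is not documented here.

```bash
# Hypothetical helper: print every key that is assigned more than once in an
# env file, followed by the line numbers of the assignments.
env_duplicate_keys() {
    awk -F= '
        /^[ \t]*(#|$)/      { next }             # skip comments and blank lines
        index($0, "=") == 0 { next }             # skip lines without "="
        { count[$1]++; lines[$1] = lines[$1] " " FNR }
        END { for (k in count) if (count[k] > 1) print k ":" lines[k] }
    ' "$1" | sort
}
```

For a file like the broken one, the output would name `DB_PORT` together with both line numbers.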
### Container Status
```
CONTAINER      STATUS                     ISSUE
php            Up 27 minutes (healthy)    -
db             Up 40 minutes (healthy)    PostgreSQL running on port 5432
redis          Up 40 minutes (healthy)    -
web            Up 40 minutes (UNHEALTHY)  Nginx cannot connect to PHP due to DB error
queue-worker   Restarting (1) 4s ago      PHP crashing due to DB connection error
```
### Error Pattern
- **HTTP 500** on all requests (/, /impressum, etc.)
- **Root Cause**: PHP application cannot connect to database because:
  1. `DB_PORT=3306` (MySQL) instead of `5432` (PostgreSQL)
  2. Wrong username format (`mdb-user` vs `mdb_user`)
  3. Wrong password
- **Impact**: All PHP processes fail to initialize → Nginx returns 500
---
## 3. Deployment Command Documentation
### WORKING Commands (Current Playbook)
#### Deploy via Ansible Playbook
```bash
cd /home/michael/dev/michaelschiemer/deployment/infrastructure
# Full production deployment
ansible-playbook \
-i inventories/production/hosts.yml \
playbooks/deploy-rsync-based.yml \
--vault-password-file .vault_pass
# With specific variables
ansible-playbook \
-i inventories/production/hosts.yml \
playbooks/deploy-rsync-based.yml \
--vault-password-file .vault_pass \
-e "deployment_branch=main"
```
#### Check Production Status
```bash
# Check containers
ansible web_servers \
-i inventories/production/hosts.yml \
-m shell -a "docker ps -a" \
--vault-password-file .vault_pass
# Check .env configuration
ansible web_servers \
-i inventories/production/hosts.yml \
-m shell -a "cat /home/deploy/michaelschiemer/shared/.env.production" \
--vault-password-file .vault_pass
# Check application logs
ansible web_servers \
-i inventories/production/hosts.yml \
-m shell -a "docker logs web --tail 50" \
--vault-password-file .vault_pass
```
### COMMANDS TO CREATE (User Requirements)
#### 1. Simple Manual Deploy Script
```bash
#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/deploy.sh
set -e
cd "$(dirname "$0")/.."
echo "🚀 Deploying to production..."
ansible-playbook \
-i inventories/production/hosts.yml \
playbooks/deploy-rsync-based.yml \
--vault-password-file .vault_pass
echo "✅ Deployment complete!"
echo "🔍 Check status: docker ps"
```
#### 2. .env Update Script
```bash
#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/update-env.sh
set -e
cd "$(dirname "$0")/../.."
SOURCE_ENV="deployment/applications/environments/.env.production"
REMOTE_PATH="/home/deploy/michaelschiemer/shared/.env.production"
if [[ ! -f "$SOURCE_ENV" ]]; then
    echo "❌ Source .env.production not found at: $SOURCE_ENV"
    exit 1
fi
echo "📤 Uploading .env.production to production server..."
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m copy \
-a "src=$SOURCE_ENV dest=$REMOTE_PATH mode=0644" \
--vault-password-file deployment/infrastructure/.vault_pass
echo "🔄 Restarting containers..."
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web queue-worker" \
--vault-password-file deployment/infrastructure/.vault_pass
echo "✅ .env.production updated and containers restarted!"
```
#### 3. Quick Production Sync
```bash
#!/bin/bash
# File: /home/michael/dev/michaelschiemer/deployment/infrastructure/scripts/quick-sync.sh
set -e
cd "$(dirname "$0")/../.."
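# Sync code changes (no .env update)
# .env.production is excluded too: rsync would otherwise replace the
# symlink to shared/.env.production with the copy from the project root
rsync -avz \
--exclude '.env' \
--exclude '.env.local' \
--exclude '.env.production' \
--exclude 'node_modules/' \
--exclude '.git/' \
./ deploy@94.16.110.151:/home/deploy/michaelschiemer/current/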
# Restart containers
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web" \
--vault-password-file deployment/infrastructure/.vault_pass
echo "✅ Quick sync complete!"
```
### SCRIPTS TO REMOVE (Unused/Deprecated)
1. **`/home/michael/dev/michaelschiemer/deploy.sh`** (if exists in root)
   - Reason: Conflicting with playbook-based deployment
2. **`/home/michael/dev/michaelschiemer/.env.local`** (if exists)
   - Reason: Not used in production, causes confusion
3. **Duplicate .env files** in root:
   - Keep: `.env.production` (source of truth for framework defaults)
   - Remove: `.env.backup.*`, `.env.old`, etc.
---
## 4. Fix Strategy (Step-by-Step)
### IMMEDIATE FIX (Restore Production)
#### Step 1: Update Production .env.production File
```bash
cd /home/michael/dev/michaelschiemer
# Copy correct .env.production to production server
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m copy \
-a "src=deployment/applications/environments/.env.production dest=/home/deploy/michaelschiemer/shared/.env.production mode=0644" \
--vault-password-file deployment/infrastructure/.vault_pass
```
**Why this file?**
- Most complete configuration (4.3K vs 2.9K)
- Includes Redis, Queue, Mail, Monitoring configs
- Correct DB_PORT=5432
- Correct DB credentials
#### Step 2: Verify .env.production on Server
```bash
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "grep -E '(DB_PORT|DB_USERNAME|DB_PASSWORD)' /home/deploy/michaelschiemer/shared/.env.production" \
--vault-password-file deployment/infrastructure/.vault_pass
```
**Expected Output**:
```
DB_PORT=5432
DB_USERNAME=mdb_user
DB_PASSWORD=Qo2KNgGqeYksEhKr57pgugakxlothn8J
```
#### Step 3: Restart Containers
```bash
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "cd /home/deploy/michaelschiemer/current && docker-compose restart php web queue-worker" \
--vault-password-file deployment/infrastructure/.vault_pass
```
#### Step 4: Verify Website Functionality
```bash
# Check HTTP status
curl -I https://michaelschiemer.de
# Expected: HTTP/2 200 (instead of 500)
# Check container health
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "docker ps | grep -E '(web|php|queue-worker)'" \
--vault-password-file deployment/infrastructure/.vault_pass
```
**Expected**: All containers should be "Up" and "healthy"
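Reading `docker ps` output by eye works for five containers, but the same check can be scripted. The sketch below is an illustration rather than existing tooling: the function name `unhealthy_containers` is hypothetical, and it assumes the input was produced with `docker ps -a --format '{{.Names}}|{{.Status}}'`.

```bash
# Hypothetical helper: read "name|status" lines on stdin and print the names
# of containers that are unhealthy or not in the "Up" state (for example
# restarting or exited).
unhealthy_containers() {
    awk -F'|' '$2 ~ /unhealthy/ || $2 !~ /^Up / { print $1 }'
}

# Example (on the server):
#   docker ps -a --format '{{.Names}}|{{.Status}}' | unhealthy_containers
```

An empty result means every container reports `Up` and none is flagged unhealthy. Containers without a health check also pass, so this does not replace the HTTP check.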
### LONG-TERM FIX (Prevent Future Issues)
#### 1. Update Playbook to Sync .env.production
Add task to `deploy-rsync-based.yml`:
```yaml
# After "Synchronize project files" task, add:
- name: Sync .env.production to shared directory
  copy:
    src: "{{ local_project_path }}/deployment/applications/environments/.env.production"
    dest: "{{ project_path }}/shared/.env.production"
    mode: '0644'
  when: sync_env_to_shared | default(true)
  tags:
    - deploy
    - config
```
#### 2. Create Helper Scripts
Create the 3 scripts documented in section 3:
- `scripts/deploy.sh` - Simple wrapper for playbook
- `scripts/update-env.sh` - Update .env.production only
- `scripts/quick-sync.sh` - Quick code sync without full deployment
#### 3. Establish Source of Truth
**Decision**: Use `deployment/applications/environments/.env.production` as source of truth
- Most complete configuration
- Application-specific settings
- Includes all production services
**Action**: Document in README.md:
```markdown
## Production Configuration
**Source of Truth**: `deployment/applications/environments/.env.production`
To update production .env:
1. Edit `deployment/applications/environments/.env.production`
2. Run `./deployment/infrastructure/scripts/update-env.sh`
3. The script restarts the `php`, `web`, and `queue-worker` containers so they pick up the new config
```
#### 4. Add .env Validation
Create pre-deployment validation script:
```bash
#!/bin/bash
# scripts/validate-env.sh
ENV_FILE="deployment/applications/environments/.env.production"
echo "🔍 Validating .env.production..."
# Check required variables
REQUIRED_VARS=(
"DB_DRIVER"
"DB_HOST"
"DB_PORT"
"DB_DATABASE"
"DB_USERNAME"
"DB_PASSWORD"
)
for var in "${REQUIRED_VARS[@]}"; do
    if ! grep -q "^${var}=" "$ENV_FILE"; then
        echo "❌ Missing required variable: $var"
        exit 1
    fi
done
# Check PostgreSQL port (anchored so that values such as 54321 do not match)
if ! grep -q '^DB_PORT=5432$' "$ENV_FILE"; then
    echo "⚠️ Warning: DB_PORT should be 5432 for PostgreSQL"
fi
echo "✅ .env.production validation passed"
```
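The presence check in the script above passes for a line such as `DB_PORT=`, which is one of the defects found on the server. A complementary check for empty values could look like the sketch below; the function name `env_empty_or_missing` is hypothetical, and treating the last assignment as the effective one is an assumption about the env loader.

```bash
# Hypothetical helper: print each required key that is missing or whose last
# assignment is empty. Usage: env_empty_or_missing FILE KEY...
env_empty_or_missing() {
    file="$1"; shift
    for key in "$@"; do
        # Take the last assignment; which one a loader honours is an assumption.
        value=$(grep "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)
        [ -n "$value" ] || echo "$key"
    done
}
```

Called with the six `DB_*` names from `REQUIRED_VARS`, any output would indicate a key that needs attention before deploying.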
---
## 5. Cleanup Recommendations
### Files to Remove
#### In Framework Root (`/home/michael/dev/michaelschiemer/`)
```bash
# List files to remove
find . -maxdepth 1 -name ".env.backup*" -o -name ".env.old*" -o -name ".env.local"
# Remove after confirmation
rm -f .env.backup* .env.old* .env.local
```
#### In Deployment Directory
```bash
# Check for duplicate/old deployment scripts
find deployment/ -name "deploy-old.yml" -o -name "*.backup"
```
#### On Production Server
```bash
# Clean up old releases (keep last 5)
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "cd /home/deploy/michaelschiemer/releases && ls -t | tail -n +6 | xargs rm -rf" \
--vault-password-file deployment/infrastructure/.vault_pass
# Remove duplicate .env files in current release
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell \
-a "cd /home/deploy/michaelschiemer/current && rm -f .env.backup* .env.old*" \
--vault-password-file deployment/infrastructure/.vault_pass
```
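The `ls -t | tail` pipeline above orders releases by modification time and does not look at where `current` points, so after a rollback it could, for instance, select the active release for deletion. A more defensive listing is sketched below. The function name `old_releases` is hypothetical, and it assumes release directory names are numeric timestamps as in the layout shown earlier. It only prints candidates; deleting them stays a separate, deliberate step.

```bash
# Hypothetical helper: print release directories that are candidates for
# deletion, i.e. everything except the newest KEEP releases (by name) and
# the release that `current` points to.
# Usage: old_releases BASE_DIR [KEEP]   (KEEP defaults to 5)
old_releases() {
    base="$1"; keep="${2:-5}"
    active=$(basename "$(readlink "$base/current" 2>/dev/null)")
    ls -1 "$base/releases" | sort -rn | tail -n +"$((keep + 1))" |
        while read -r name; do
            [ "$name" = "$active" ] || echo "$base/releases/$name"
        done
}
```

On the server, `old_releases /home/deploy/michaelschiemer 5` would list the paths to review before removal.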
### Configuration to Keep
**Essential Files**:
- `/.env.production` - Framework defaults (keep for reference)
- `/deployment/applications/environments/.env.production` - Source of truth
- `/deployment/infrastructure/playbooks/deploy-rsync-based.yml` - Main playbook
- `/deployment/infrastructure/inventories/production/hosts.yml` - Inventory
**Symlinks (Do Not Remove)**:
- `/home/deploy/michaelschiemer/current/.env.production` → `shared/.env.production`
- `/home/deploy/michaelschiemer/current/.env` → `shared/.env.production`
---
## 6. Post-Fix Verification Checklist
```bash
# 1. Website accessible
curl -I https://michaelschiemer.de
# Expected: HTTP/2 200
# 2. All containers healthy
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell -a "docker ps" \
--vault-password-file deployment/infrastructure/.vault_pass
# Expected: All "Up" and "(healthy)"
# 3. Database connection working
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell -a "docker exec php php -r \"new PDO('pgsql:host=db;port=5432;dbname=michaelschiemer', 'mdb_user', 'Qo2KNgGqeYksEhKr57pgugakxlothn8J');\"" \
--vault-password-file deployment/infrastructure/.vault_pass
# Expected: No errors
# 4. Application logs clean
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell -a "docker logs web --tail 20" \
--vault-password-file deployment/infrastructure/.vault_pass
# Expected: HTTP 200 responses, no 500 errors
# 5. Queue worker stable
ansible web_servers \
-i deployment/infrastructure/inventories/production/hosts.yml \
-m shell -a "docker ps | grep queue-worker" \
--vault-password-file deployment/infrastructure/.vault_pass
# Expected: "Up" status (not "Restarting")
```
---
## 7. Future Deployment Best Practices
1. **Always validate .env before deployment**
   - Run `scripts/validate-env.sh` pre-deployment
   - Check DB_PORT=5432 for PostgreSQL
   - Verify credentials match database server
2. **Use playbook for all deployments**
   - Consistent process
   - Automated rollback capability
   - Proper symlink management
3. **Monitor container health post-deployment**
   - Check `docker ps` output
   - Verify all containers "(healthy)"
   - Check application logs for errors
4. **Keep .env.production in sync**
   - Single source of truth: `deployment/applications/environments/.env.production`
   - Use `update-env.sh` script for updates
   - Never manually edit on production server
5. **Regular backups**
   - Backup `shared/.env.production` before changes
   - Keep last 5 releases for quick rollback
   - Document any manual production changes
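The backup step in item 5 can be a one-line habit. The following is a minimal sketch, assuming a timestamped sibling file is an acceptable backup location; the function name `backup_env` and the `.bak.<timestamp>` suffix are hypothetical choices.

```bash
# Hypothetical helper: copy an env file to a timestamped sibling before it is
# overwritten, and print the backup path. -p preserves mode and timestamps.
backup_env() {
    src="$1"
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}
```

Running `backup_env /home/deploy/michaelschiemer/shared/.env.production` on the server before an update leaves a copy that can be restored if the new configuration fails.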
---
## Summary
**Current Status**: Production broken due to DB_PORT configuration error
**Root Cause**: Manual edits to `shared/.env.production` left `DB_PORT` set to 3306 (the MySQL default) instead of 5432 (PostgreSQL)
**Fix Time**: ~5 minutes (copy correct .env + restart containers)
**Prevention**: Automated .env sync in playbook + validation scripts
**Next Steps**:
1. Execute Steps 1-4 of Fix Strategy (IMMEDIATE)
2. Verify website returns HTTP 200
3. Implement long-term fixes (playbook updates, scripts)
4. Document deployment process in README.md