feat: CI/CD pipeline setup complete - Ansible playbooks updated, secrets configured, workflow ready
643
.deployment-archive-20251030-111806/README.md
Normal file
@@ -0,0 +1,643 @@
# Automated Deployment System

Ansible-based deployment automation for the framework.

## Overview

This system enables automated deployments directly on the production server, eliminating the problematic SSH transfers of large Docker images.

## Benefits

- **No image transfer**: Builds run directly on the production server
- **Reliable**: No more "broken pipe" SSH errors
- **Fast**: Building on the server makes optimal use of its resources
- **Repeatable**: Idempotent Ansible playbooks
- **Versioned**: All deployment configuration lives in Git
## Architecture

### Primary: Gitea Actions (Automated CI/CD)

```
Local development → Git push → Gitea
        ↓
Gitea Actions Runner (on production)
        ↓
Build & Test & Deploy
        ↓
Docker Swarm Rolling Update
        ↓
Health Check & Auto-Rollback
```
### Fallback: Manual Ansible Deployment

```
Local development → manual trigger → Ansible playbook
        ↓
Docker build (on the server)
        ↓
Docker Swarm update
        ↓
Health check
```
## Components

### 1. Gitea Actions Workflow (Primary)

**Location**: `.gitea/workflows/deploy.yml`

**Trigger**: Push to the `main` branch

**Stages**:
1. **Checkout**: Check out the repository on the runner
2. **Build**: Build the Docker image with production optimizations
3. **Push to registry**: Push the image to the local registry
4. **Deploy**: Rolling update via Docker Swarm
5. **Health check**: Automatic availability check (3 attempts)
6. **Auto-rollback**: Automatic rollback if the health check fails

**Secrets** (configured in Gitea):
- `DOCKER_REGISTRY`: localhost:5000
- `STACK_NAME`: framework
- `HEALTH_CHECK_URL`: https://michaelschiemer.de/health
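The health-check and auto-rollback stages above can be sketched in shell. This is a minimal sketch only: the function names, curl flags, and the 5-second pause between attempts are assumptions, not the workflow's actual implementation.

```shell
#!/usr/bin/env bash
# Sketch of the health-check + auto-rollback stages (names are assumptions).

# Poll a URL up to $2 times; succeed on the first response curl accepts.
health_check() {
  local url="$1" attempts="${2:-3}" i
  for i in $(seq 1 "$attempts"); do
    curl -fsk -o /dev/null "$url" && return 0
    [ "$i" -lt "$attempts" ] && sleep 5   # pause only if attempts remain
  done
  return 1
}

# Roll a service back if the health check keeps failing.
deploy_gate() {
  local url="$1" service="$2"
  health_check "$url" 3 || docker service rollback "$service"
}
```

`docker service rollback` reverts a Swarm service to its previous spec, which is what makes the automatic rollback a single command here.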
### 2. Gitea Runner Setup (Production Server)

**Location**: `deployment/ansible/playbooks/setup-gitea-runner.yml`

**Installation**:
```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml playbooks/setup-gitea-runner.yml
```

**Features**:
- Systemd service for automatic startup
- Docker-in-Docker support
- Isolation via the `gitea-runner` user
- Logs: `journalctl -u gitea-runner -f`
### 3. Emergency Deployment Scripts

**Fallback scenarios** for when Gitea Actions is unavailable:

#### `scripts/deployment-diagnostics.sh`
- Comprehensive system diagnostics
- Status of SSH, Docker Swarm, services, images, and networks
- Health checks and resource usage
- Quick mode: `--quick`, verbose: `--verbose`

#### `scripts/service-recovery.sh`
- Service status check
- Service restart
- Full recovery procedure (5 steps)
- Cache clearing

#### `scripts/manual-deploy-fallback.sh`
- Manual deployment without Gitea Actions
- Local image build
- Push to the registry
- Ansible deployment
- Health checks

#### `scripts/emergency-rollback.sh`
- Fast rollback to a previous version
- Lists available image tags
- Direct rollback without health checks
- Manual verification required
### 4. Script Framework (Shared Libraries)

**Libraries**:
- `scripts/lib/common.sh` - logging, error handling, utilities
- `scripts/lib/ansible.sh` - Ansible integration

**Features**:
- Color-coded logging functions (info, success, warning, error, debug)
- Automatic pre-deployment checks
- User confirmation prompts
- Post-deployment health checks
- Performance metrics (deployment duration)
- Retry logic with exponential backoff
- Cleanup handlers via `trap`
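The retry-with-exponential-backoff helper can be sketched as follows. The function name and the doubling schedule (1s, 2s, 4s, ...) are assumptions about `scripts/lib/common.sh`, not its actual code.

```shell
# Sketch: retry a command, doubling the pause after each failure.
retry_with_backoff() {
  local max_attempts="$1"; shift
  local attempt=1 delay=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1              # give up after max_attempts tries
    fi
    sleep "$delay"
    delay=$((delay * 2))    # exponential backoff: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}
# usage: retry_with_backoff 5 curl -fsk -o /dev/null "$HEALTH_CHECK_URL"
```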
### Ansible Configuration

- `ansible/ansible.cfg` - base Ansible configuration
- `ansible/inventory/production.yml` - production server inventory
- `ansible/playbooks/deploy.yml` - main deployment playbook

### Deployment Workflow

1. **Code push**: Push code changes to Git
2. **SSH to the server**: Connect to the production server
3. **Run Ansible**: Start the deployment playbook
4. **Automatic build**: The Docker image is built on the server
5. **Service update**: Docker Swarm services are updated
6. **Health check**: Automatic availability check
## Usage

### Primary: Automated Deployment via Gitea Actions (Recommended)

The standard workflow is fully automated via a Git push:

```bash
# 1. Finish local development
git add .
git commit -m "feat: new feature implementation"

# 2. Pushing to the main branch triggers an automatic deployment
git push origin main

# 3. Gitea Actions automatically runs:
# - Docker image build (on the production server)
# - Push to the local registry (localhost:5000)
# - Docker Swarm rolling update
# - Health check (3 attempts)
# - Auto-rollback on failure

# 4. Monitor the deployment status
# Gitea UI: https://git.michaelschiemer.de/<user>/<repo>/actions
# Or via SSH on the server:
ssh -i ~/.ssh/production deploy@94.16.110.151
journalctl -u gitea-runner -f
```

**Deployment time**: ~3-4 minutes from push to live
### Deployment Monitoring

```bash
# Gitea Actions logs (via the Gitea UI):
# https://git.michaelschiemer.de/<user>/<repo>/actions

# Gitea Runner logs (on the production server)
ssh -i ~/.ssh/production deploy@94.16.110.151
journalctl -u gitea-runner -f

# Check service status
ssh -i ~/.ssh/production deploy@94.16.110.151
docker stack services framework
docker service logs framework_web --tail 50
```
### Emergency/Fallback: Diagnostic & Recovery Scripts

If problems occur, the emergency scripts are available:

#### System Diagnostics
```bash
# Comprehensive system diagnostics
./scripts/deployment-diagnostics.sh

# Quick check (critical checks only)
./scripts/deployment-diagnostics.sh --quick

# Verbose mode (with logs)
./scripts/deployment-diagnostics.sh --verbose
```

**Diagnostics covers**:
- Local environment (Git, Docker, Ansible, SSH)
- SSH connectivity to production
- Docker Swarm status (manager/worker nodes)
- Framework service status (web, queue-worker)
- Docker images & registry
- Gitea Runner service status
- Resource usage (disk, memory, Docker)
- Application health endpoints
#### Service Recovery

```bash
# Check service status
./scripts/service-recovery.sh status

# Restart services
./scripts/service-recovery.sh restart

# Full recovery procedure (5 steps)
./scripts/service-recovery.sh recover

# Clear caches
./scripts/service-recovery.sh clear-cache
```

**5-Step Recovery Procedure**:
1. Check current status
2. Verify Docker Swarm health (reinit if needed)
3. Verify networks and volumes
4. Force restart services
5. Run health checks
#### Manual Deployment Fallback

If Gitea Actions is unavailable:

```bash
# Manual deployment (current branch)
./scripts/manual-deploy-fallback.sh

# Manual deployment (specific branch)
./scripts/manual-deploy-fallback.sh feature/new-deployment

# Workflow:
# 1. Prerequisites check (Git clean, Docker, Ansible, SSH)
# 2. Docker image build (local)
# 3. Push to the registry
# 4. Ansible deployment
# 5. Health checks
```
#### Emergency Rollback

Fast rollback to a previous version:

```bash
# Interactive mode - choose a version from the list
./scripts/emergency-rollback.sh

# List available versions
./scripts/emergency-rollback.sh list

# Roll back directly to a specific version
./scripts/emergency-rollback.sh abc1234-1234567890

# Workflow:
# 1. Shows the current version
# 2. Shows available image tags
# 3. Confirmation: type 'ROLLBACK' to confirm
# 4. Ansible emergency rollback
# 5. Manual verification required
```

**⚠️ Important**: The emergency rollback performs NO automatic health check - manual verification is required!
### Tertiary Fallback: Ansible Directly

As a last resort, run Ansible directly:

```bash
cd /home/michael/dev/michaelschiemer/deployment/ansible
ansible-playbook -i inventory/production.yml playbooks/deploy.yml
```
## Configuration

### Production Server

Server details in `ansible/inventory/production.yml`:
- **Host**: 94.16.110.151
- **User**: deploy
- **SSH key**: ~/.ssh/production

### Gitea Actions Secrets (Primary Deployment)

Configured in the Gitea repository settings → Actions → Secrets:

- **DOCKER_REGISTRY**: `localhost:5000` (local registry on the production server)
- **STACK_NAME**: `framework` (Docker Swarm stack name)
- **HEALTH_CHECK_URL**: `https://michaelschiemer.de/health` (health check endpoint)

**Adding secrets**:
1. Gitea UI → repository settings → Actions → Secrets
2. Add a secret for each variable
3. The Gitea Runner must have access to the registry (localhost:5000)
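In a workflow step, an unconfigured secret typically surfaces as an empty environment variable rather than an error. A fail-fast guard can be sketched like this; the guard itself is an assumption, not part of the actual workflow:

```shell
# Sketch: fail fast when a required secret/variable is empty or unset.
require_vars() {
  local missing=0 name
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then        # bash indirect expansion
      echo "missing secret: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
# usage: require_vars DOCKER_REGISTRY STACK_NAME HEALTH_CHECK_URL
```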
### Gitea Runner Setup (Production Server)

**Systemd service**:
```bash
# Check status
sudo systemctl status gitea-runner

# Follow logs
journalctl -u gitea-runner -f

# Start/stop the service
sudo systemctl start gitea-runner
sudo systemctl stop gitea-runner
```

**Runner configuration**:
- **Location**: Runs on the production server (94.16.110.151)
- **User**: `gitea-runner` (isolated service user)
- **Docker access**: Docker-in-Docker support enabled
- **Logs**: `journalctl -u gitea-runner -f`

**Setup via Ansible**:
```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml playbooks/setup-gitea-runner.yml
```
### Docker Registry

**Primary registry** (local on the production server):
- **URL**: `localhost:5000` (for the runner on the production server)
- **External**: `git.michaelschiemer.de:5000` (for external access)
- **Image name**: `framework`
- **Tags**:
  - `latest` - current version
  - `{commit-sha}-{timestamp}` - versioned images for rollbacks

**Registry access**:
- The runner uses `localhost:5000` (local access)
- Manual deployments use `git.michaelschiemer.de:5000` (external)
- Authentication via `docker login` (if required)
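The versioned tag format `{commit-sha}-{timestamp}` can be produced in one line; the helper name below is hypothetical, but the format matches the tags listed above.

```shell
# Sketch: build a rollback-friendly image tag, <short-sha>-<unix-timestamp>.
make_image_tag() {
  printf '%s-%s\n' "$1" "$2"
}
# usage: make_image_tag "$(git rev-parse --short HEAD)" "$(date +%s)"
```

Because the timestamp is part of the tag, every deployment produces a distinct, sortable image reference that `emergency-rollback.sh list` can enumerate.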
### Docker Swarm Stack

**Stack configuration**: `docker-compose.prod.yml`

**Services**:
- **framework_web**: web service (3 replicas for high availability)
- **framework_queue-worker**: queue worker (2 replicas)

**Rolling update config**:
```yaml
deploy:
  replicas: 3
  update_config:
    parallelism: 1     # one container per step
    delay: 10s         # 10-second pause between updates
    order: start-first # the new container starts before the old one stops
  rollback_config:
    parallelism: 1
    delay: 5s
```
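With `parallelism: 1` and `delay: 10s`, a rough lower bound on update duration follows from simple arithmetic. This is a back-of-envelope sketch; the container start time is an assumed parameter, and real durations also include health-check time.

```shell
# Sketch: lower-bound duration of a sequential (parallelism: 1) rolling update.
estimate_update_seconds() {
  local replicas="$1" delay_s="$2" start_s="$3"
  echo $(( replicas * start_s + (replicas - 1) * delay_s ))
}
# e.g. 3 replicas, 10s delay, ~15s container start:
# estimate_update_seconds 3 10 15   # -> 65
```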
**Stack management**:
```bash
# Stack status
docker stack services framework

# Service logs
docker service logs framework_web --tail 50

# Update the stack (manually)
docker stack deploy -c docker-compose.prod.yml framework
```
## Troubleshooting

### Troubleshooting Workflow

If problems occur with the deployment system, follow this structured workflow:

**Level 1: Quick Diagnostics** (first stop)
```bash
# Comprehensive system diagnostics
./scripts/deployment-diagnostics.sh

# Quick check (critical checks only)
./scripts/deployment-diagnostics.sh --quick

# Verbose mode (with detailed logs)
./scripts/deployment-diagnostics.sh --verbose
```

**Level 2: Service Recovery** (for service outages)
```bash
# Check service status
./scripts/service-recovery.sh status

# Restart services
./scripts/service-recovery.sh restart

# Full recovery procedure (5 automated steps)
./scripts/service-recovery.sh recover

# Clear caches (for cache problems)
./scripts/service-recovery.sh clear-cache
```

**Level 3: Manual Deployment Fallback** (for Gitea Actions problems)
```bash
# Manual deployment (current branch)
./scripts/manual-deploy-fallback.sh

# Manual deployment (specific branch)
./scripts/manual-deploy-fallback.sh feature/new-feature
```

**Level 4: Emergency Rollback** (for critical production problems)
```bash
# Interactive mode - choose a version from the list
./scripts/emergency-rollback.sh

# Show available versions
./scripts/emergency-rollback.sh list

# Roll back directly to a specific version
./scripts/emergency-rollback.sh abc1234-1234567890
```
### Common Problems

#### Gitea Actions Workflow Fails

**Diagnosis**:
```bash
# Check the Gitea Runner status (on the production server)
ssh -i ~/.ssh/production deploy@94.16.110.151
journalctl -u gitea-runner -f
```

**Solutions**:
- Runner not active: `sudo systemctl start gitea-runner`
- Missing secrets: check Gitea UI → repository settings → Actions → Secrets
- Docker registry unreachable: `docker login localhost:5000`
#### Services Are Unreachable

**Diagnosis**:
```bash
# Quick health check
./scripts/deployment-diagnostics.sh --quick
```

**Solutions**:
```bash
# Recover services automatically
./scripts/service-recovery.sh recover
```
#### Deployment Hangs or Is Slow

**Diagnosis**:
```bash
# Comprehensive diagnostics with resource checks
./scripts/deployment-diagnostics.sh --verbose
```

**Solutions**:
- Disk full: clean up old Docker images (`docker system prune -a`)
- Memory issues: restart services (`./scripts/service-recovery.sh restart`)
- Network problems: check the Docker Swarm overlay network
#### Health Checks Fail

**Diagnosis**:
```bash
# Test the application health endpoints directly
curl -k https://michaelschiemer.de/health
curl -k https://michaelschiemer.de/health/database
curl -k https://michaelschiemer.de/health/redis
```

**Solutions**:
```bash
# Check service logs
ssh -i ~/.ssh/production deploy@94.16.110.151
docker service logs framework_web --tail 100

# Clear caches if the health check points to cache issues
./scripts/service-recovery.sh clear-cache
```
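When scripting these endpoint checks, comparing the HTTP status code is more robust than grepping the response body. A minimal sketch follows; treating only 2xx as healthy is an assumption about the `/health` endpoints, not documented behavior.

```shell
# Sketch: fetch a health endpoint's status code and require a 2xx response.
is_success_status() {
  case "$1" in
    2??) return 0 ;;   # any 2xx counts as healthy (assumption)
    *)   return 1 ;;
  esac
}

fetch_status() {
  curl -k -s -o /dev/null -w '%{http_code}' "$1"
}
# usage: is_success_status "$(fetch_status https://michaelschiemer.de/health)"
```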
#### Rollback After a Failed Deployment

**Fast emergency rollback**:
```bash
# 1. Show available versions
./scripts/emergency-rollback.sh list

# 2. Roll back to the last working version
./scripts/emergency-rollback.sh <previous-tag>

# 3. Verify manually
curl -k https://michaelschiemer.de/health
```

**⚠️ Important**: The emergency rollback performs NO automatic health check - manual verification is required!
## Next Steps

### Git Integration ✅ Completed

Gitea Actions CI/CD is fully implemented and operational:
- ✅ Automatic trigger on push to the main branch
- ✅ Gitea webhook integration
- ✅ Automated build, test & deploy pipeline
- ✅ Health checks with auto-rollback

**Current features**:
- Zero-downtime rolling updates
- Automatic rollback on deployment failures
- Versioned image tagging for manual rollbacks
- Comprehensive emergency recovery scripts
### Monitoring (Planned Improvements)

**Short-term** (1-2 months):
- Deployment notifications via email/Slack
- Prometheus/Grafana integration for metrics
- Application performance monitoring (APM)
- Automated health check dashboards

**Mid-term** (3-6 months):
- Log aggregation with an ELK/Loki stack
- Distributed tracing for microservices
- Alerting rules for critical metrics
- Capacity planning & resource forecasting

**Long-term** (6-12 months):
- Cost optimization dashboards
- Predictive failure detection
- Automated performance tuning
- Multi-region deployment support
## Security

### Production Security Measures

- **SSH key-based authentication**: Access only with the authorized private key (~/.ssh/production)
- **No passwords in configuration**: All credentials via Gitea Actions secrets or Docker secrets
- **Docker secrets for sensitive data**: Database credentials, API keys, encryption keys
- **Gitea Runner isolation**: Dedicated `gitea-runner` service user with minimal permissions
- **Registry access control**: Localhost-only registry for additional security
- **HTTPS-only communication**: All deployments over encrypted connections

### Deployment Authorization

- **Gitea repository access**: Push rights are required for automatic deployment
- **Emergency script access**: SSH key + authorized_keys on the production server
- **Manual rollback**: Manual intervention via an authorized SSH key
## Performance

### Deployment Performance Metrics

- **Build time**: ~2-3 minutes (depending on Docker layer caching)
- **Registry push**: ~30-60 seconds (image size: ~500MB)
- **Deployment time**: ~60-90 seconds (rolling update with 3 replicas)
- **Health check duration**: ~10-15 seconds (3 retry attempts)
- **Total**: ~3-4 minutes from push to live (for a successful deployment)

### Rollback Performance

- **Automated rollback**: ~30 seconds (on health check failure)
- **Manual emergency rollback**: ~60 seconds (via emergency-rollback.sh)
- **Service recovery**: ~90 seconds (via service-recovery.sh recover)

### Optimizations in Place

- **Docker layer caching**: Reuses unchanged layers
- **Multi-stage builds**: Smaller production images
- **Rolling replica updates**: Minimal downtime via the start-first strategy
- **Local registry**: No external network bottleneck
## Support

### First Steps When Problems Occur

**1. Use the emergency scripts** (recommended):
```bash
# Quick diagnostics - check system health
./scripts/deployment-diagnostics.sh --quick

# Service recovery - automatic restoration
./scripts/service-recovery.sh recover

# Manual deployment - fallback if Gitea Actions is down
./scripts/manual-deploy-fallback.sh

# Emergency rollback - fast rollback to a previous version
./scripts/emergency-rollback.sh list
```

**2. Check the Gitea Actions logs**:
- Gitea UI → repository → Actions tab
- Or via SSH: `journalctl -u gitea-runner -f`

**3. Check service logs directly**:
```bash
ssh -i ~/.ssh/production deploy@94.16.110.151
docker service logs framework_web --tail 100
docker service logs framework_queue-worker --tail 100
```

**4. Docker stack status**:
```bash
ssh -i ~/.ssh/production deploy@94.16.110.151
docker stack services framework
docker stack ps framework --no-trunc
```
### Escalation Path

1. **Level 1**: Automated diagnostics → `./scripts/deployment-diagnostics.sh`
2. **Level 2**: Service recovery → `./scripts/service-recovery.sh recover`
3. **Level 3**: Manual deployment → `./scripts/manual-deploy-fallback.sh`
4. **Level 4**: Emergency rollback → `./scripts/emergency-rollback.sh`
5. **Level 5**: Direct Ansible → `cd deployment/ansible && ansible-playbook -i inventory/production.yml playbooks/deploy.yml`
### Contacts

- **Production server**: deploy@94.16.110.151 (SSH key required)
- **Documentation**: `/home/michael/dev/michaelschiemer/deployment/README.md`
- **Emergency scripts**: `/home/michael/dev/michaelschiemer/deployment/scripts/`
10
.deployment-archive-20251030-111806/ansible/ansible.cfg
Normal file
@@ -0,0 +1,10 @@
[defaults]
inventory = inventory
host_key_checking = False
retry_files_enabled = False
roles_path = roles
interpreter_python = auto_silent

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ServerAliveInterval=30 -o ServerAliveCountMax=3
pipelining = True
@@ -0,0 +1,20 @@
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3
    ansible_user: deploy
    ansible_ssh_private_key_file: ~/.ssh/production

production:
  hosts:
    production_server:
      ansible_host: 94.16.110.151
      docker_registry: localhost:5000
      docker_image_name: framework
      docker_image_tag: latest
      docker_swarm_stack_name: framework
      docker_services:
        - framework_web
        - framework_queue-worker
      git_repo_path: /home/deploy/framework-app
      build_dockerfile: Dockerfile.production
      build_target: production
@@ -0,0 +1,181 @@
---
# Git-Based Production Deployment Playbook
# Uses Git to sync files, builds image, and updates services
# Usage: ansible-playbook -i inventory/production.yml playbooks/deploy-complete-git.yml

- name: Git-Based Production Deployment
  hosts: production_server
  become: no
  vars:
    # Calculate project root: playbook is in deployment/ansible/playbooks/, go up 3 levels
    local_project_path: "{{ playbook_dir }}/../../.."
    remote_project_path: /home/deploy/framework-app
    docker_registry: localhost:5000
    docker_image_name: framework
    docker_image_tag: latest
    docker_stack_name: framework
    build_timestamp: "{{ ansible_date_time.epoch }}"

  tasks:
    - name: Display deployment information
      debug:
        msg:
          - "🚀 Starting Git-Based Deployment"
          - "Local Path: {{ local_project_path }}"
          - "Remote Path: {{ remote_project_path }}"
          - "Image: {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
          - "Timestamp: {{ build_timestamp }}"

    - name: Create remote project directory
      file:
        path: "{{ remote_project_path }}"
        state: directory
        mode: '0755'

    - name: Check if Git repository exists on production
      stat:
        path: "{{ remote_project_path }}/.git"
      register: git_repo

    - name: Initialize Git repository if not exists
      shell: |
        cd {{ remote_project_path }}
        git init
        git config user.email 'deploy@michaelschiemer.de'
        git config user.name 'Deploy User'
      when: not git_repo.stat.exists

    - name: Create tarball of current code (excluding unnecessary files)
      delegate_to: localhost
      shell: |
        cd {{ local_project_path }}
        tar czf /tmp/framework-deploy-{{ build_timestamp }}.tar.gz \
          --exclude='.git' \
          --exclude='node_modules' \
          --exclude='vendor' \
          --exclude='storage/logs/*' \
          --exclude='storage/cache/*' \
          --exclude='.env' \
          --exclude='.env.*' \
          --exclude='tests' \
          --exclude='.deployment-backup' \
          --exclude='deployment' \
          .
      register: tarball_creation
      changed_when: true

    - name: Transfer tarball to production
      copy:
        src: "/tmp/framework-deploy-{{ build_timestamp }}.tar.gz"
        dest: "/tmp/framework-deploy-{{ build_timestamp }}.tar.gz"
      register: tarball_transfer

    - name: Extract tarball to production (preserving Git)
      shell: |
        cd {{ remote_project_path }}
        tar xzf /tmp/framework-deploy-{{ build_timestamp }}.tar.gz
        rm -f /tmp/framework-deploy-{{ build_timestamp }}.tar.gz
      register: extraction_result
      changed_when: true

    - name: Commit changes to Git repository
      shell: |
        cd {{ remote_project_path }}
        git add -A
        git commit -m "Deployment {{ build_timestamp }}" || echo "No changes to commit"
        git log --oneline -5
      register: git_commit
      changed_when: true

    - name: Display Git status
      debug:
        msg: "{{ git_commit.stdout_lines }}"

    - name: Clean up local tarball
      delegate_to: localhost
      file:
        path: "/tmp/framework-deploy-{{ build_timestamp }}.tar.gz"
        state: absent

    - name: Build Docker image on production server
      shell: |
        cd {{ remote_project_path }}
        docker build \
          -f docker/php/Dockerfile \
          --target production \
          -t {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          -t {{ docker_registry }}/{{ docker_image_name }}:{{ build_timestamp }} \
          --no-cache \
          --progress=plain \
          .
      register: build_result
      changed_when: true

    - name: Display build output (last 20 lines)
      debug:
        msg: "{{ build_result.stdout_lines[-20:] }}"

    - name: Update web service with rolling update
      shell: |
        docker service update \
          --image {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          --force \
          --update-parallelism 1 \
          --update-delay 10s \
          {{ docker_stack_name }}_web
      register: web_update
      changed_when: true

    - name: Update queue-worker service
      shell: |
        docker service update \
          --image {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          --force \
          {{ docker_stack_name }}_queue-worker
      register: worker_update
      changed_when: true

    - name: Wait for services to stabilize (30 seconds)
      pause:
        seconds: 30
        prompt: "Waiting for services to stabilize..."

    - name: Check service status
      shell: docker stack services {{ docker_stack_name }} --format "table {{`{{.Name}}\t{{.Replicas}}\t{{.Image}}`}}"
      register: service_status
      changed_when: false

    - name: Check website availability
      shell: curl -k -s -o /dev/null -w '%{http_code}' https://michaelschiemer.de/
      register: website_check
      changed_when: false
      failed_when: false

    - name: Get recent web service logs
      shell: docker service logs {{ docker_stack_name }}_web --tail 10 --no-trunc 2>&1 | tail -20
      register: web_logs
      changed_when: false
      failed_when: false

    - name: Display deployment summary
      debug:
        msg:
          - "✅ Git-Based Deployment Completed"
          - ""
          - "Build Timestamp: {{ build_timestamp }}"
          - "Image: {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
          - ""
          - "Git Commit Info:"
          - "{{ git_commit.stdout_lines }}"
          - ""
          - "Service Status:"
          - "{{ service_status.stdout_lines }}"
          - ""
          - "Website HTTP Status: {{ website_check.stdout }}"
          - ""
          - "Recent Logs:"
          - "{{ web_logs.stdout_lines }}"
          - ""
          - "🌐 Website: https://michaelschiemer.de"
          - "📊 Portainer: https://michaelschiemer.de:9000"
          - "📈 Grafana: https://michaelschiemer.de:3000"
@@ -0,0 +1,135 @@
---
# Complete Production Deployment Playbook
# Syncs files, builds image, and updates services
# Usage: ansible-playbook -i inventory/production.yml playbooks/deploy-complete.yml

- name: Complete Production Deployment
  hosts: production_server
  become: no
  vars:
    # Calculate project root: playbook is in deployment/ansible/playbooks/, go up 3 levels
    local_project_path: "{{ playbook_dir }}/../../.."
    remote_project_path: /home/deploy/framework-app
    docker_registry: localhost:5000
    docker_image_name: framework
    docker_image_tag: latest
    docker_stack_name: framework
    build_timestamp: "{{ ansible_date_time.epoch }}"

  tasks:
    - name: Display deployment information
      debug:
        msg:
          - "🚀 Starting Complete Deployment"
          - "Local Path: {{ local_project_path }}"
          - "Remote Path: {{ remote_project_path }}"
          - "Image: {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
          - "Timestamp: {{ build_timestamp }}"

    - name: Create remote project directory
      file:
        path: "{{ remote_project_path }}"
        state: directory
        mode: '0755'

    - name: Sync project files to production server
      synchronize:
        src: "{{ local_project_path }}/"
        dest: "{{ remote_project_path }}/"
        delete: no
        rsync_opts:
          - "--exclude=.git"
          - "--exclude=.gitignore"
          - "--exclude=node_modules"
          - "--exclude=vendor"
          - "--exclude=storage/logs/*"
          - "--exclude=storage/cache/*"
          - "--exclude=.env"
          - "--exclude=.env.*"
          - "--exclude=tests"
          - "--exclude=.deployment-backup"
          - "--exclude=deployment"
      register: sync_result

    - name: Display sync results
      debug:
        msg: "Files synced: {{ sync_result.changed }}"

    - name: Build Docker image on production server
      shell: |
        cd {{ remote_project_path }}
        docker build \
          -f Dockerfile.production \
          -t {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          -t {{ docker_registry }}/{{ docker_image_name }}:{{ build_timestamp }} \
          --no-cache \
          --progress=plain \
          .
      register: build_result
      changed_when: true

    - name: Display build output (last 20 lines)
      debug:
        msg: "{{ build_result.stdout_lines[-20:] }}"

    - name: Update web service with rolling update
      shell: |
        docker service update \
          --image {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          --force \
          --update-parallelism 1 \
          --update-delay 10s \
          {{ docker_stack_name }}_web
      register: web_update
      changed_when: true

    - name: Update queue-worker service
      shell: |
        docker service update \
          --image {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }} \
          --force \
          {{ docker_stack_name }}_queue-worker
      register: worker_update
      changed_when: true

    - name: Wait for services to stabilize (30 seconds)
      pause:
        seconds: 30
        prompt: "Waiting for services to stabilize..."

    - name: Check service status
      shell: docker stack services {{ docker_stack_name }} --format "table {{`{{.Name}}\t{{.Replicas}}\t{{.Image}}`}}"
      register: service_status
      changed_when: false

    - name: Check website availability
      shell: curl -k -s -o /dev/null -w '%{http_code}' https://michaelschiemer.de/
      register: website_check
      changed_when: false
      failed_when: false

    - name: Get recent web service logs
      shell: docker service logs {{ docker_stack_name }}_web --tail 10 --no-trunc 2>&1 | tail -20
      register: web_logs
      changed_when: false
      failed_when: false

    - name: Display deployment summary
|
||||
debug:
|
||||
msg:
|
||||
- "✅ Deployment Completed"
|
||||
- ""
|
||||
- "Build Timestamp: {{ build_timestamp }}"
|
||||
- "Image: {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
|
||||
- ""
|
||||
- "Service Status:"
|
||||
- "{{ service_status.stdout_lines }}"
|
||||
- ""
|
||||
- "Website HTTP Status: {{ website_check.stdout }}"
|
||||
- ""
|
||||
- "Recent Logs:"
|
||||
- "{{ web_logs.stdout_lines }}"
|
||||
- ""
|
||||
- "🌐 Website: https://michaelschiemer.de"
|
||||
- "📊 Portainer: https://michaelschiemer.de:9000"
|
||||
- "📈 Grafana: https://michaelschiemer.de:3000"
|
||||
@@ -0,0 +1,120 @@
---
# Ansible Playbook: Update Production Deployment
# Purpose: Pull new Docker image and update services with zero downtime
# Usage: Called by Gitea Actions or manual deployment

- name: Update Production Services with New Image
  hosts: production_server
  become: no
  vars:
    # image_tag, git_commit_sha and deployment_timestamp are normally passed
    # via --extra-vars; plain defaults here avoid the Jinja2 recursion error
    # caused by a variable referencing itself in its own definition
    image_tag: "latest"
    git_commit_sha: "unknown"
    deployment_timestamp: "{{ ansible_date_time.iso8601 }}"
    registry_url: "git.michaelschiemer.de:5000"
    image_name: "framework"
    stack_name: "framework"

  tasks:
    - name: Log deployment start
      debug:
        msg: |
          🚀 Starting deployment
          Image: {{ registry_url }}/{{ image_name }}:{{ image_tag }}
          Commit: {{ git_commit_sha }}
          Time: {{ deployment_timestamp }}

    - name: Pull new Docker image
      docker_image:
        name: "{{ registry_url }}/{{ image_name }}"
        tag: "{{ image_tag }}"
        source: pull
        force_source: yes
      register: image_pull
      retries: 3
      delay: 5
      until: image_pull is succeeded

    - name: Tag image as latest locally
      docker_image:
        name: "{{ registry_url }}/{{ image_name }}:{{ image_tag }}"
        repository: "{{ registry_url }}/{{ image_name }}"
        tag: latest
        source: local

    - name: Update web service with rolling update
      docker_swarm_service:
        name: "{{ stack_name }}_web"
        image: "{{ registry_url }}/{{ image_name }}:{{ image_tag }}"
        force_update: yes
        update_config:
          parallelism: 1
          delay: 10s
          failure_action: rollback
          monitor: 30s
          max_failure_ratio: 0.3
        rollback_config:
          parallelism: 1
          delay: 5s
        state: present
      register: web_update

    - name: Update queue-worker service
      docker_swarm_service:
        name: "{{ stack_name }}_queue-worker"
        image: "{{ registry_url }}/{{ image_name }}:{{ image_tag }}"
        force_update: yes
        update_config:
          parallelism: 1
          delay: 10s
          failure_action: rollback
        state: present
      register: worker_update

    - name: Wait for services to stabilize
      pause:
        seconds: 20

    - name: Verify service status
      shell: |
        docker service ps {{ stack_name }}_web --filter "desired-state=running" --format '{% raw %}{{.CurrentState}}{% endraw %}' | head -1
      register: service_state
      changed_when: false

    - name: Check if deployment succeeded
      fail:
        msg: "Service deployment failed: {{ service_state.stdout }}"
      when: "'Running' not in service_state.stdout"

    - name: Get running replicas count
      shell: |
        docker service ls --filter "name={{ stack_name }}_web" --format '{% raw %}{{.Replicas}}{% endraw %}'
      register: replicas
      changed_when: false

    - name: Ensure deployment history directory exists
      file:
        path: /home/deploy/deployments
        state: directory
        mode: '0755'

    - name: Record deployment in history
      copy:
        content: |
          Deployment: {{ deployment_timestamp }}
          Image: {{ registry_url }}/{{ image_name }}:{{ image_tag }}
          Commit: {{ git_commit_sha }}
          Status: SUCCESS
          Replicas: {{ replicas.stdout }}
        dest: "/home/deploy/deployments/{{ image_tag }}.log"
        mode: '0644'

    - name: Display deployment summary
      debug:
        msg: |
          ✅ Deployment completed successfully

          Image: {{ registry_url }}/{{ image_name }}:{{ image_tag }}
          Commit: {{ git_commit_sha }}
          Web Service: {{ web_update.changed | ternary('UPDATED', 'NO CHANGE') }}
          Worker Service: {{ worker_update.changed | ternary('UPDATED', 'NO CHANGE') }}
          Replicas: {{ replicas.stdout }}
          Time: {{ deployment_timestamp }}

  handlers:
    - name: Cleanup old images
      shell: docker image prune -af --filter "until=72h"
      changed_when: false
@@ -0,0 +1,90 @@
---
- name: Deploy Framework Application to Production
  hosts: production_server
  become: no
  vars:
    # git_repo_path, docker_registry, docker_image_name, docker_image_tag,
    # build_dockerfile, build_target and docker_swarm_stack_name are expected
    # to come from the inventory / group_vars
    git_repo_url: "{{ lookup('env', 'GIT_REPO_URL') | default('') }}"
    build_timestamp: "{{ ansible_date_time.epoch }}"

  tasks:
    - name: Ensure git repo path exists
      file:
        path: "{{ git_repo_path }}"
        state: directory
        mode: '0755'

    - name: Pull latest code from git
      git:
        repo: "{{ git_repo_url }}"
        dest: "{{ git_repo_path }}"
        version: main
        force: yes
      when: git_repo_url != ''
      register: git_pull_result

    - name: Build Docker image on production server
      docker_image:
        name: "{{ docker_registry }}/{{ docker_image_name }}"
        tag: "{{ docker_image_tag }}"
        build:
          path: "{{ git_repo_path }}"
          dockerfile: "{{ build_dockerfile }}"
          # 'target' selects the multi-stage build stage; it is a build
          # option, not a build-arg, so it must not go under 'args'
          target: "{{ build_target }}"
        source: build
        force_source: yes
        push: no
      register: build_result

    - name: Tag image with timestamp for rollback capability
      docker_image:
        name: "{{ docker_registry }}/{{ docker_image_name }}"
        repository: "{{ docker_registry }}/{{ docker_image_name }}"
        tag: "{{ build_timestamp }}"
        source: local

    - name: Update Docker Swarm service - web
      docker_swarm_service:
        name: "{{ docker_swarm_stack_name }}_web"
        image: "{{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
        force_update: yes
        state: present
      register: web_update_result

    - name: Update Docker Swarm service - queue-worker
      docker_swarm_service:
        name: "{{ docker_swarm_stack_name }}_queue-worker"
        image: "{{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
        force_update: yes
        state: present
      register: worker_update_result

    - name: Wait for services to stabilize
      pause:
        seconds: 60

    - name: Check service status
      shell: docker stack services {{ docker_swarm_stack_name }} | grep -E "NAME|{{ docker_swarm_stack_name }}"
      register: service_status
      changed_when: false

    - name: Display deployment results
      debug:
        msg:
          - "Deployment completed successfully"
          - "Build timestamp: {{ build_timestamp }}"
          - "Image: {{ docker_registry }}/{{ docker_image_name }}:{{ docker_image_tag }}"
          - "Services status: {{ service_status.stdout_lines }}"

    - name: Test website availability
      uri:
        url: "https://michaelschiemer.de/"
        validate_certs: no
        status_code: [200, 302]
        timeout: 10
      register: website_health
      ignore_errors: yes

    - name: Display website health check
      debug:
        msg: "Website responded with status: {{ website_health.status | default('FAILED') }}"
@@ -0,0 +1,110 @@
---
# Ansible Playbook: Emergency Rollback
# Purpose: Fast rollback without health checks for emergency situations
# Usage: ansible-playbook -i inventory/production.yml playbooks/emergency-rollback.yml -e "rollback_tag=<tag>"

- name: Emergency Rollback (Fast Mode)
  hosts: production_server
  become: no
  vars:
    registry_url: "git.michaelschiemer.de:5000"
    image_name: "framework"
    stack_name: "framework"
    # rollback_tag is normally passed via -e; a plain default avoids the
    # Jinja2 recursion error caused by a self-referencing definition
    rollback_tag: "latest"
    skip_health_check: true

  pre_tasks:
    - name: Emergency rollback warning
      debug:
        msg: |
          🚨 EMERGENCY ROLLBACK IN PROGRESS 🚨

          This will immediately revert to: {{ rollback_tag }}
          Health checks will be SKIPPED for speed.

          Press Ctrl+C now if you want to abort.

    - name: Record rollback initiation
      shell: |
        mkdir -p /home/deploy/deployments
        echo "[$(date)] Emergency rollback initiated to {{ rollback_tag }}" >> /home/deploy/deployments/emergency-rollback.log

  tasks:
    - name: Get current running image tag
      shell: |
        docker service inspect {{ stack_name }}_web --format '{% raw %}{{.Spec.TaskTemplate.ContainerSpec.Image}}{% endraw %}'
      register: current_image
      changed_when: false

    - name: Display current vs target
      debug:
        msg: |
          Current: {{ current_image.stdout }}
          Target: {{ registry_url }}/{{ image_name }}:{{ rollback_tag }}

    - name: Pull rollback image (skip verification)
      docker_image:
        name: "{{ registry_url }}/{{ image_name }}"
        tag: "{{ rollback_tag }}"
        source: pull
      register: rollback_image
      ignore_errors: yes

    - name: Warn if image pull failed
      debug:
        msg: "⚠️ Image pull failed, attempting rollback with cached image"
      when: rollback_image is failed

    - name: Immediate rollback - web service
      shell: |
        docker service update \
          --image {{ registry_url }}/{{ image_name }}:{{ rollback_tag }} \
          --force \
          --update-parallelism 999 \
          --update-delay 0s \
          {{ stack_name }}_web
      register: web_rollback

    - name: Immediate rollback - queue-worker service
      shell: |
        docker service update \
          --image {{ registry_url }}/{{ image_name }}:{{ rollback_tag }} \
          --force \
          --update-parallelism 999 \
          --update-delay 0s \
          {{ stack_name }}_queue-worker
      register: worker_rollback

    - name: Wait for rollback to propagate (minimal wait)
      pause:
        seconds: 15

    - name: Quick service status check
      shell: |
        docker service ps {{ stack_name }}_web --filter "desired-state=running" --format '{% raw %}{{.CurrentState}}{% endraw %}' | head -1
      register: rollback_state
      changed_when: false

    - name: Display rollback status
      debug:
        msg: |
          🚨 Emergency rollback completed (fast mode)

          Web Service: {{ web_rollback.changed | ternary('ROLLED BACK', 'NO CHANGE') }}
          Worker Service: {{ worker_rollback.changed | ternary('ROLLED BACK', 'NO CHANGE') }}
          Service State: {{ rollback_state.stdout }}

          ⚠️ MANUAL VERIFICATION REQUIRED:
          1. Check application: https://michaelschiemer.de
          2. Check service logs: docker service logs {{ stack_name }}_web
          3. Verify database connectivity
          4. Run full health check: ansible-playbook playbooks/health-check.yml

    - name: Record rollback completion
      shell: |
        echo "[$(date)] Emergency rollback completed: {{ rollback_tag }}, Status: {{ rollback_state.stdout }}" >> /home/deploy/deployments/emergency-rollback.log

    - name: Alert - manual verification required
      debug:
        msg: |
          ⚠️ IMPORTANT: This was an emergency rollback without health checks.
          You MUST manually verify application functionality before considering this successful.
@@ -0,0 +1,140 @@
---
# Ansible Playbook: Production Health Check
# Purpose: Comprehensive health verification for production deployment
# Usage: ansible-playbook -i inventory/production.yml playbooks/health-check.yml

- name: Production Health Check
  hosts: production_server
  become: no
  vars:
    app_url: "https://michaelschiemer.de"
    stack_name: "framework"
    health_timeout: 30
    max_retries: 10

  tasks:
    - name: Check Docker Swarm status
      shell: docker info | grep "Swarm: active"
      register: swarm_status
      failed_when: swarm_status.rc != 0
      changed_when: false

    - name: Check running services
      shell: docker service ls --filter "name={{ stack_name }}" --format '{% raw %}{{.Name}} {{.Replicas}}{% endraw %}'
      register: service_list
      changed_when: false

    - name: Display service status
      debug:
        msg: "{{ service_list.stdout_lines }}"

    - name: Verify web service is running
      shell: |
        docker service ps {{ stack_name }}_web \
          --filter "desired-state=running" \
          --format '{% raw %}{{.CurrentState}}{% endraw %}' | head -1
      register: web_state
      changed_when: false

    - name: Fail if web service not running
      fail:
        msg: "Web service is not in Running state: {{ web_state.stdout }}"
      when: "'Running' not in web_state.stdout"

    - name: Verify worker service is running
      shell: |
        docker service ps {{ stack_name }}_queue-worker \
          --filter "desired-state=running" \
          --format '{% raw %}{{.CurrentState}}{% endraw %}' | head -1
      register: worker_state
      changed_when: false

    - name: Fail if worker service not running
      fail:
        msg: "Worker service is not in Running state: {{ worker_state.stdout }}"
      when: "'Running' not in worker_state.stdout"

    - name: Wait for application to be ready
      uri:
        url: "{{ app_url }}/health"
        validate_certs: no
        status_code: [200, 302]
        timeout: "{{ health_timeout }}"
      register: health_response
      retries: "{{ max_retries }}"
      delay: 3
      until: health_response.status in [200, 302]

    - name: Check database connectivity
      uri:
        url: "{{ app_url }}/health/database"
        validate_certs: no
        status_code: 200
        timeout: "{{ health_timeout }}"
      register: db_health
      ignore_errors: yes

    - name: Check Redis connectivity
      uri:
        url: "{{ app_url }}/health/redis"
        validate_certs: no
        status_code: 200
        timeout: "{{ health_timeout }}"
      register: redis_health
      ignore_errors: yes

    - name: Check queue system
      uri:
        url: "{{ app_url }}/health/queue"
        validate_certs: no
        status_code: 200
        timeout: "{{ health_timeout }}"
      register: queue_health
      ignore_errors: yes

    - name: Get service replicas count
      shell: |
        docker service ls --filter "name={{ stack_name }}_web" --format '{% raw %}{{.Replicas}}{% endraw %}'
      register: replicas
      changed_when: false

    - name: Check for service errors
      shell: |
        docker service ps {{ stack_name }}_web --filter "desired-state=running" | grep -c Error || true
      register: error_count
      changed_when: false

    - name: Warn if errors detected
      debug:
        msg: "⚠️ Warning: {{ error_count.stdout }} errors detected in service tasks"
      when: error_count.stdout | int > 0

    - name: Display health check summary
      debug:
        msg: |
          ✅ Health Check Summary:

          Services:
          - Web Service: {{ web_state.stdout }}
          - Worker Service: {{ worker_state.stdout }}
          - Replicas: {{ replicas.stdout }}

          Endpoints:
          - Application: {{ health_response.status }}
          - Database: {{ db_health.status | default('SKIPPED') }}
          - Redis: {{ redis_health.status | default('SKIPPED') }}
          - Queue: {{ queue_health.status | default('SKIPPED') }}

          Errors: {{ error_count.stdout }}

    - name: Overall health assessment
      debug:
        msg: "✅ All health checks PASSED"
      when:
        - health_response.status in [200, 302]
        - error_count.stdout | int == 0

    - name: Fail if critical health checks failed
      fail:
        msg: "❌ Health check FAILED - manual intervention required"
      when: health_response.status not in [200, 302]
@@ -0,0 +1,123 @@
---
# Ansible Playbook: Rollback Production Deployment
# Purpose: Rollback to a previous working deployment
# Usage: ansible-playbook -i inventory/production.yml playbooks/rollback.yml -e "rollback_tag=<tag>"

- name: Rollback Production Deployment
  hosts: production_server
  become: no
  vars:
    registry_url: "git.michaelschiemer.de:5000"
    image_name: "framework"
    stack_name: "framework"
    # rollback_tag is normally passed via -e; a plain default avoids the
    # Jinja2 recursion error caused by a self-referencing definition
    rollback_tag: "latest"

  tasks:
    - name: Display rollback warning
      debug:
        msg: |
          ⚠️ ROLLBACK IN PROGRESS

          This will revert services to a previous image.
          Current target: {{ rollback_tag }}

    - name: Pause for confirmation (interactive runs only)
      pause:
        prompt: "Press ENTER to continue with rollback, or Ctrl+C to abort"
      # The original condition 'ansible_check_mode is not defined' was always
      # false (the magic variable always exists), so the prompt never ran.
      # Gate on an explicit flag instead so CI can skip the prompt with
      # -e unattended=true (flag name is a suggested convention)
      when: not (unattended | default(false))

    - name: Get list of available image tags
      shell: |
        docker images {{ registry_url }}/{{ image_name }} --format '{% raw %}{{.Tag}}{% endraw %}' | grep -v buildcache | head -10
      register: available_tags
      changed_when: false

    - name: Display available tags
      debug:
        msg: |
          Available image tags for rollback:
          {{ available_tags.stdout_lines | join('\n') }}

    - name: Verify rollback image exists
      docker_image:
        name: "{{ registry_url }}/{{ image_name }}"
        tag: "{{ rollback_tag }}"
        source: pull
      register: rollback_image
      ignore_errors: yes

    - name: Fail if image doesn't exist
      fail:
        msg: "Rollback image {{ registry_url }}/{{ image_name }}:{{ rollback_tag }} not found"
      when: rollback_image is failed

    - name: Rollback web service
      docker_swarm_service:
        name: "{{ stack_name }}_web"
        image: "{{ registry_url }}/{{ image_name }}:{{ rollback_tag }}"
        force_update: yes
        update_config:
          parallelism: 2
          delay: 5s
        state: present
      register: web_rollback

    - name: Rollback queue-worker service
      docker_swarm_service:
        name: "{{ stack_name }}_queue-worker"
        image: "{{ registry_url }}/{{ image_name }}:{{ rollback_tag }}"
        force_update: yes
        update_config:
          parallelism: 1
          delay: 5s
        state: present
      register: worker_rollback

    - name: Wait for rollback to complete
      pause:
        seconds: 30

    - name: Verify rollback success
      shell: |
        docker service ps {{ stack_name }}_web --filter "desired-state=running" --format '{% raw %}{{.CurrentState}}{% endraw %}' | head -1
      register: rollback_state
      changed_when: false

    - name: Test service health
      uri:
        url: "https://michaelschiemer.de/health"
        validate_certs: no
        status_code: [200, 302]
        timeout: 10
      register: health_check
      ignore_errors: yes

    - name: Record rollback in history
      copy:
        content: |
          Rollback: {{ ansible_date_time.iso8601 }}
          Previous Image: {{ registry_url }}/{{ image_name }}:latest
          Rollback Image: {{ registry_url }}/{{ image_name }}:{{ rollback_tag }}
          Status: {{ health_check.status | default('UNKNOWN') }}
          Reason: Manual rollback or deployment failure
        dest: "/home/deploy/deployments/rollback-{{ ansible_date_time.epoch }}.log"
        mode: '0644'

    - name: Display rollback summary
      debug:
        msg: |
          {% if health_check is succeeded %}
          ✅ Rollback completed successfully
          {% else %}
          ❌ Rollback completed but health check failed
          {% endif %}

          Image: {{ registry_url }}/{{ image_name }}:{{ rollback_tag }}
          Web Service: {{ web_rollback.changed | ternary('ROLLED BACK', 'NO CHANGE') }}
          Worker Service: {{ worker_rollback.changed | ternary('ROLLED BACK', 'NO CHANGE') }}
          Health Status: {{ health_check.status | default('FAILED') }}

    - name: Alert if rollback failed
      fail:
        msg: "Rollback completed but health check failed. Manual intervention required."
      when: health_check is failed
@@ -0,0 +1,116 @@
---
# Ansible Playbook: Setup Gitea Actions Runner on Production Server
# Purpose: Install and configure Gitea Actions runner for automated deployments
# Usage: ansible-playbook -i inventory/production.yml playbooks/setup-gitea-runner.yml

- name: Setup Gitea Actions Runner for Production Deployments
  hosts: production_server
  become: yes
  vars:
    gitea_url: "https://git.michaelschiemer.de"
    runner_name: "production-runner"
    runner_labels: "docker,production,ubuntu"
    runner_version: "0.2.6"
    runner_install_dir: "/opt/gitea-runner"
    runner_work_dir: "/home/deploy/gitea-runner-work"
    runner_user: "deploy"

  tasks:
    # Docker must be present before the runner service starts,
    # so the Docker tasks run first
    - name: Install Docker (if not present)
      apt:
        name:
          - docker.io
          - docker-compose
        state: present
        update_cache: yes

    - name: Ensure Docker service is running
      systemd:
        name: docker
        state: started
        enabled: yes

    - name: Add runner user to docker group
      user:
        name: "{{ runner_user }}"
        groups: docker
        append: yes

    - name: Create Docker network for builds
      docker_network:
        name: gitea-runner-network
        driver: bridge

    - name: Create runner directories
      file:
        path: "{{ item }}"
        state: directory
        owner: "{{ runner_user }}"
        group: "{{ runner_user }}"
        mode: '0755'
      loop:
        - "{{ runner_install_dir }}"
        - "{{ runner_work_dir }}"

    - name: Download Gitea Act Runner binary
      get_url:
        url: "https://dl.gitea.com/act_runner/{{ runner_version }}/act_runner-{{ runner_version }}-linux-amd64"
        dest: "{{ runner_install_dir }}/act_runner"
        mode: '0755'
        owner: "{{ runner_user }}"

    - name: Check if runner is already registered
      stat:
        path: "{{ runner_install_dir }}/.runner"
      register: runner_config

    - name: Register runner with Gitea (manual step required)
      debug:
        msg: |
          ⚠️ MANUAL STEP REQUIRED:

          1. Generate a registration token in Gitea:
             - Navigate to {{ gitea_url }}/admin/runners
             - Click "Create new runner"
             - Copy the registration token

          2. SSH to the production server and run:
             sudo -u {{ runner_user }} {{ runner_install_dir }}/act_runner register \
               --instance {{ gitea_url }} \
               --token YOUR_REGISTRATION_TOKEN \
               --name {{ runner_name }} \
               --labels {{ runner_labels }}

          3. Re-run this playbook to complete the setup
      when: not runner_config.stat.exists

    - name: Create systemd service for runner
      template:
        src: ../templates/gitea-runner.service.j2
        dest: /etc/systemd/system/gitea-runner.service
        mode: '0644'
      notify: Reload systemd

    - name: Enable and start Gitea runner service
      systemd:
        name: gitea-runner
        enabled: yes
        state: started
      when: runner_config.stat.exists

    - name: Display runner status
      debug:
        msg: |
          ✅ Gitea Runner Setup Complete

          Runner Name: {{ runner_name }}
          Install Dir: {{ runner_install_dir }}
          Work Dir: {{ runner_work_dir }}

          Check status: systemctl status gitea-runner
          View logs: journalctl -u gitea-runner -f

  handlers:
    - name: Reload systemd
      systemd:
        daemon_reload: yes
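The playbook above renders `../templates/gitea-runner.service.j2`, which is not part of this diff. A minimal unit template consistent with the playbook's vars might look like this (a sketch, not the actual template):

```ini
[Unit]
Description=Gitea Actions Runner
After=network.target docker.service

[Service]
User={{ runner_user }}
WorkingDirectory={{ runner_install_dir }}
ExecStart={{ runner_install_dir }}/act_runner daemon
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```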
@@ -0,0 +1,57 @@
---
# Ansible Playbook: Setup Production Secrets
# Purpose: Deploy Docker Secrets and environment configuration to production
# Usage: ansible-playbook -i inventory/production.yml playbooks/setup-production-secrets.yml --ask-vault-pass

- name: Setup Production Secrets and Environment
  hosts: production_server
  become: no
  vars_files:
    - ../secrets/production-vault.yml  # Encrypted with ansible-vault

  tasks:
    - name: Ensure secrets directory exists
      file:
        path: /home/deploy/secrets
        state: directory
        mode: '0700'
        owner: deploy
        group: deploy

    - name: Deploy environment file from vault
      template:
        src: ../templates/production.env.j2
        dest: /home/deploy/secrets/.env.production
        mode: '0600'
        owner: deploy
        group: deploy
      notify: Restart services

    - name: Create Docker secrets (if swarm is initialized)
      docker_secret:
        name: "{{ item.name }}"
        data: "{{ item.value }}"
        state: present
      loop:
        - { name: "db_password", value: "{{ vault_db_password }}" }
        - { name: "redis_password", value: "{{ vault_redis_password }}" }
        - { name: "app_key", value: "{{ vault_app_key }}" }
        - { name: "jwt_secret", value: "{{ vault_jwt_secret }}" }
        - { name: "registry_password", value: "{{ vault_registry_password }}" }
      no_log: true  # Don't log secret values

    - name: Verify secrets are accessible
      shell: docker secret ls
      register: secret_list
      changed_when: false

    - name: Display deployed secrets (names only)
      debug:
        msg: "Deployed secrets: {{ secret_list.stdout_lines }}"

  handlers:
    - name: Restart services
      shell: |
        docker service update --force framework_web
        docker service update --force framework_queue-worker
      # The original condition 'ansible_check_mode is not defined' was always
      # false (the magic variable always exists), so the restart never ran;
      # skip it only in check mode, as intended
      when: not ansible_check_mode
8
.deployment-archive-20251030-111806/ansible/secrets/.gitignore
vendored
Normal file
@@ -0,0 +1,8 @@
# SECURITY: Never commit decrypted vault files
production-vault.yml.decrypted
*.backup
*.tmp

# Keep encrypted vault in git
# Encrypted files are safe to commit
!production-vault.yml
238
.deployment-archive-20251030-111806/ansible/secrets/README.md
Normal file
@@ -0,0 +1,238 @@
# Production Secrets Management

## Overview

This directory contains encrypted production secrets managed with Ansible Vault.

**Security Model**:
- Secrets are encrypted at rest with AES256
- Vault password is required for deployment
- Decrypted files are NEVER committed to git
- Production deployment uses secure SSH key authentication

## Files

- `production-vault.yml` - **Encrypted** secrets vault (safe to commit)
- `.gitignore` - Prevents accidental commit of decrypted files

## Quick Start

### 1. Initialize Secrets (First Time)

```bash
cd deployment
./scripts/setup-production-secrets.sh init
```

This will:
- Generate secure random passwords/keys
- Create the encrypted vault file
- Prompt for a vault password (store it in a password manager!)

### 2. Deploy Secrets to Production

```bash
./scripts/setup-production-secrets.sh deploy
```

Or via Gitea Actions:
1. Go to: https://git.michaelschiemer.de/michael/framework/actions
2. Select the "Update Production Secrets" workflow
3. Click "Run workflow"
4. Enter the vault password
5. Click "Run"

### 3. Update Secrets Manually

```bash
# Edit encrypted vault
ansible-vault edit deployment/ansible/secrets/production-vault.yml

# Deploy changes
./scripts/setup-production-secrets.sh deploy
```

### 4. Rotate Secrets (Monthly Recommended)

```bash
./scripts/setup-production-secrets.sh rotate
```

This will:
- Generate new passwords
- Update the vault
- Deploy to production
- Restart services

## Vault Structure

```yaml
# Database
vault_db_name: framework_production
vault_db_user: framework_app
vault_db_password: [auto-generated 32 chars]

# Redis
vault_redis_password: [auto-generated 32 chars]

# Application
vault_app_key: [auto-generated base64 key]
vault_jwt_secret: [auto-generated 64 chars]

# Docker Registry
vault_registry_url: git.michaelschiemer.de:5000
vault_registry_user: deploy
vault_registry_password: [auto-generated 24 chars]

# Security
vault_admin_allowed_ips: "127.0.0.1,::1,94.16.110.151"
```
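Values with the lengths listed above can be produced with standard `openssl` one-liners. This is an illustrative sketch; the actual generation logic lives in `setup-production-secrets.sh`, which is not shown in this diff:

```bash
# Illustrative generators matching the vault field lengths above
db_password="$(openssl rand -hex 16)"        # 32 hex chars
redis_password="$(openssl rand -hex 16)"     # 32 hex chars
app_key="base64:$(openssl rand -base64 32)"  # base64-encoded key
jwt_secret="$(openssl rand -hex 32)"         # 64 hex chars
registry_password="$(openssl rand -hex 12)"  # 24 hex chars
```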
|
||||
## Security Best Practices

### DO ✅

- **DO** encrypt the vault with a strong password
- **DO** store the vault password in a password manager
- **DO** rotate secrets monthly
- **DO** use `--ask-vault-pass` for deployments
- **DO** commit the encrypted vault to git
- **DO** use different vault passwords per environment

### DON'T ❌

- **DON'T** commit decrypted vault files
- **DON'T** share the vault password via email/chat
- **DON'T** use weak vault passwords
- **DON'T** decrypt the vault on untrusted systems
- **DON'T** hardcode secrets in code

## Ansible Vault Commands

```bash
# Encrypt file
ansible-vault encrypt production-vault.yml

# Decrypt file (writes plaintext to disk - prefer `view`, and re-encrypt before committing)
ansible-vault decrypt production-vault.yml

# Edit encrypted file
ansible-vault edit production-vault.yml

# Change vault password
ansible-vault rekey production-vault.yml

# View encrypted file content
ansible-vault view production-vault.yml
```

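To update a single value without decrypting the whole file, `ansible-vault encrypt_string` produces an inline-encrypted `!vault` block that can be pasted into a YAML file. A sketch — the variable name and demo password file are illustrative, and note that this vault is otherwise encrypted as a whole file, so mixing styles is a design choice:

```bash
# Write a throwaway vault password for the demo (real use: your actual vault password)
printf 'demo-pass' > /tmp/.vp

# Emit an inline-encrypted value; requires ansible-vault on PATH
if command -v ansible-vault >/dev/null 2>&1; then
    ansible-vault encrypt_string 'SuperSecret123' \
        --name 'vault_db_password' \
        --vault-password-file /tmp/.vp
fi

rm -f /tmp/.vp
```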
## Deployment Integration

### Local Deployment

```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml \
    playbooks/setup-production-secrets.yml \
    --ask-vault-pass
```

### CI/CD Deployment (Gitea Actions)

The vault password is stored as a Gitea Secret:

- Secret name: `ANSIBLE_VAULT_PASSWORD`
- Used in workflow: `.gitea/workflows/update-production-secrets.yml`

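Inside the workflow, the secret is typically materialized as a temporary password file so `ansible-vault` can run non-interactively. A sketch of the pattern, assuming the secret is exposed to the step as the `ANSIBLE_VAULT_PASSWORD` environment variable (the workflow's actual steps may differ):

```bash
# Write the vault password to a file only the runner user can read,
# run the playbook non-interactively, then clean up.
umask 077
printf '%s' "${ANSIBLE_VAULT_PASSWORD:-}" > /tmp/.vault-pass

if command -v ansible-playbook >/dev/null 2>&1; then
    ansible-playbook -i inventory/production.yml \
        playbooks/setup-production-secrets.yml \
        --vault-password-file /tmp/.vault-pass
fi

rm -f /tmp/.vault-pass
```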
### Docker Secrets Integration

Secrets are deployed as Docker Secrets for secure runtime access:

```bash
# List deployed secrets on production
ssh deploy@94.16.110.151 "docker secret ls"
```

Services consume the secrets automatically via their compose definition:

```yaml
services:
  web:
    secrets:
      - db_password
      - redis_password
      - app_key
```

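At runtime, Docker Swarm mounts each declared secret as a file under `/run/secrets/<name>` inside the container. A minimal bash sketch of how application entrypoints often read them — the env-var fallback convention here is hypothetical, not part of this framework:

```bash
# Read a Docker secret from its mount point; fall back to a
# SECRET_<NAME> environment variable (hypothetical local-dev convention).
read_secret() {
    local name="$1"
    if [[ -r "/run/secrets/${name}" ]]; then
        cat "/run/secrets/${name}"
    else
        local var="SECRET_${name^^}"
        printf '%s' "${!var:-}"
    fi
}

db_password="$(read_secret db_password)"
```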
## Troubleshooting

### "Decryption failed" Error

**Cause**: Wrong vault password

**Solution**:

```bash
# Verify the password works
ansible-vault view deployment/ansible/secrets/production-vault.yml

# If the password is forgotten, you must reinitialize (data loss!)
./scripts/setup-production-secrets.sh init
```

### Secrets Not Applied After Deployment

**Solution**:

```bash
# Manually restart services
ssh deploy@94.16.110.151 "docker service update --force framework_web"

# Or use Ansible
cd deployment/ansible
ansible-playbook -i inventory/production.yml playbooks/restart-services.yml
```

### Verify Secrets on Production

```bash
./scripts/setup-production-secrets.sh verify

# Or manually
ssh deploy@94.16.110.151 "docker secret ls"
ssh deploy@94.16.110.151 "grep -v PASSWORD /home/deploy/secrets/.env.production"
```

## Emergency Procedures

### Lost Vault Password

**Recovery Steps**:

1. Back up the current vault: `cp production-vault.yml production-vault.yml.lost`
2. Reinitialize the vault: `./scripts/setup-production-secrets.sh init`
3. Update database passwords manually on production
4. Deploy the new secrets: `./scripts/setup-production-secrets.sh deploy`

### Compromised Secrets

**Immediate Response**:

1. Rotate all secrets: `./scripts/setup-production-secrets.sh rotate`
2. Review access logs on production
3. Update the vault password: `ansible-vault rekey production-vault.yml`
4. Audit the git commit history
5. Investigate the source of the compromise

## Monitoring

Check secrets deployment status:

```bash
# Via script
./scripts/setup-production-secrets.sh verify

# Manual check (-q prints only IDs, so the header row is not counted)
ansible production_server -i inventory/production.yml \
    -m shell -a "docker secret ls -q | wc -l"

# Should show 5 secrets: db_password, redis_password, app_key, jwt_secret, registry_password
```

## Related Documentation

- [Ansible Vault Documentation](https://docs.ansible.com/ansible/latest/user_guide/vault.html)
- [Docker Secrets Best Practices](https://docs.docker.com/engine/swarm/secrets/)
- Main Deployment Guide: `../README.md`

@@ -0,0 +1,41 @@
---
# Production Secrets Vault
# IMPORTANT: This file must be encrypted with ansible-vault
#
# Encrypt this file:
#   ansible-vault encrypt deployment/ansible/secrets/production-vault.yml
#
# Edit encrypted file:
#   ansible-vault edit deployment/ansible/secrets/production-vault.yml
#
# Decrypt file (for debugging only, never commit decrypted):
#   ansible-vault decrypt deployment/ansible/secrets/production-vault.yml
#
# Use in playbook:
#   ansible-playbook playbooks/setup-production-secrets.yml --ask-vault-pass

# Database Credentials
vault_db_name: framework_production
vault_db_user: framework_app
vault_db_password: CHANGE_ME_STRONG_DB_PASSWORD_HERE

# Redis Credentials
vault_redis_password: CHANGE_ME_STRONG_REDIS_PASSWORD_HERE

# Application Secrets
vault_app_key: CHANGE_ME_BASE64_ENCODED_32_BYTE_KEY
vault_jwt_secret: CHANGE_ME_STRONG_JWT_SECRET_HERE

# Docker Registry Credentials
vault_registry_url: git.michaelschiemer.de:5000
vault_registry_user: deploy
vault_registry_password: CHANGE_ME_REGISTRY_PASSWORD_HERE

# Security Configuration
vault_admin_allowed_ips: "127.0.0.1,::1,94.16.110.151"

# SMTP Configuration (optional)
vault_smtp_host: smtp.example.com
vault_smtp_port: 587
vault_smtp_user: noreply@michaelschiemer.de
vault_smtp_password: CHANGE_ME_SMTP_PASSWORD_HERE
@@ -0,0 +1,26 @@
[Unit]
Description=Gitea Actions Runner
After=network.target docker.service
Requires=docker.service

[Service]
Type=simple
User={{ runner_user }}
WorkingDirectory={{ runner_install_dir }}
ExecStart={{ runner_install_dir }}/act_runner daemon --config {{ runner_install_dir }}/.runner
Restart=always
RestartSec=10

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths={{ runner_work_dir }}

# Resource limits
LimitNOFILE=65536
LimitNPROC=4096

[Install]
WantedBy=multi-user.target
@@ -0,0 +1,50 @@
# Production Environment Configuration
# Generated by Ansible - DO NOT EDIT MANUALLY
# Last updated: {{ ansible_date_time.iso8601 }}

# Application
APP_ENV=production
APP_DEBUG=false
APP_KEY={{ vault_app_key }}
APP_URL=https://michaelschiemer.de

# Database
DB_CONNECTION=mysql
DB_HOST=mysql
DB_PORT=3306
DB_DATABASE={{ vault_db_name }}
DB_USERNAME={{ vault_db_user }}
DB_PASSWORD={{ vault_db_password }}

# Redis
REDIS_HOST=redis
REDIS_PASSWORD={{ vault_redis_password }}
REDIS_PORT=6379

# Cache
CACHE_DRIVER=redis
QUEUE_CONNECTION=redis

# Session
SESSION_DRIVER=redis
SESSION_LIFETIME=120

# JWT
JWT_SECRET={{ vault_jwt_secret }}
JWT_TTL=60

# Docker Registry
REGISTRY_URL={{ vault_registry_url }}
REGISTRY_USER={{ vault_registry_user }}
REGISTRY_PASSWORD={{ vault_registry_password }}

# Logging
LOG_CHANNEL=stack
LOG_LEVEL=warning

# Security
ADMIN_ALLOWED_IPS={{ vault_admin_allowed_ips }}

# Performance
OPCACHE_ENABLE=1
OPCACHE_VALIDATE_TIMESTAMPS=0
241
.deployment-archive-20251030-111806/scripts/deploy.sh
Executable file
@@ -0,0 +1,241 @@
#!/bin/bash
#
# Main Deployment Script
# Uses the script framework for professional deployment automation
#

set -euo pipefail

# Determine script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# Source libraries
# shellcheck source=./lib/common.sh
source "${SCRIPT_DIR}/lib/common.sh"
# shellcheck source=./lib/ansible.sh
source "${SCRIPT_DIR}/lib/ansible.sh"

# Configuration
readonly DEPLOYMENT_NAME="Framework Production Deployment"
START_TIME=$(date +%s)
readonly START_TIME

# Usage information
usage() {
    cat << EOF
Usage: $0 [OPTIONS] [GIT_REPO_URL]

Professional deployment automation using Ansible.

OPTIONS:
    -h, --help           Show this help message
    -c, --check          Run in check mode (dry-run)
    -v, --verbose        Enable verbose output
    -d, --debug          Enable debug logging
    -f, --force          Skip confirmation prompts
    --no-health-check    Skip health checks

EXAMPLES:
    # Deploy from existing code on server
    $0

    # Deploy from specific Git repository
    $0 https://github.com/user/repo.git

    # Dry-run to see what would happen
    $0 --check

    # Debug mode
    $0 --debug

EOF
    exit 0
}

# Parse command line arguments
parse_args() {
    local git_repo_url=""
    local check_mode=false
    local force=false
    local health_check=true

    while [[ $# -gt 0 ]]; do
        case "$1" in
            -h|--help)
                usage
                ;;
            -c|--check)
                check_mode=true
                shift
                ;;
            -v|--verbose)
                set -x
                shift
                ;;
            -d|--debug)
                export DEBUG=1
                shift
                ;;
            -f|--force)
                force=true
                shift
                ;;
            --no-health-check)
                health_check=false
                shift
                ;;
            *)
                if [[ -z "$git_repo_url" ]]; then
                    git_repo_url="$1"
                else
                    log_error "Unknown argument: $1"
                    usage
                fi
                shift
                ;;
        esac
    done

    # Return the parsed values as one pipe-delimited string
    echo "$check_mode|$force|$health_check|$git_repo_url"
}

# Pre-deployment checks
pre_deployment_checks() {
    log_step "Running pre-deployment checks..."

    # Check Ansible
    check_ansible || die "Ansible check failed"

    # Test connectivity
    test_ansible_connectivity || die "Connectivity check failed"

    # Check playbook syntax
    local playbook="${ANSIBLE_PLAYBOOK_DIR}/deploy.yml"
    if [[ -f "$playbook" ]]; then
        check_playbook_syntax "$playbook" || log_warning "Playbook syntax check failed"
    fi

    log_success "Pre-deployment checks passed"
}

# Deployment summary
show_deployment_summary() {
    local git_repo_url="$1"
    local check_mode="$2"

    echo ""
    echo "========================================="
    echo "  ${DEPLOYMENT_NAME}"
    echo "========================================="
    echo ""
    echo "Mode: $([ "$check_mode" = "true" ] && echo "CHECK (Dry-Run)" || echo "PRODUCTION")"
    echo "Target: 94.16.110.151 (production)"
    echo "Services: framework_web, framework_queue-worker"

    if [[ -n "$git_repo_url" ]]; then
        echo "Git Repo: $git_repo_url"
    else
        echo "Source: Existing code on server"
    fi

    echo "Ansible: $(ansible --version | head -1)"
    echo "Timestamp: $(timestamp)"
    echo ""
}

# Post-deployment health check
post_deployment_health_check() {
    log_step "Running post-deployment health checks..."

    log_info "Checking service status..."
    if ansible_adhoc production_server shell "docker stack services framework" &> /dev/null; then
        log_success "Services are running"
    else
        log_warning "Could not verify service status"
    fi

    log_info "Testing website availability..."
    if ansible_adhoc production_server shell "curl -k -s -o /dev/null -w '%{http_code}' https://michaelschiemer.de/" | grep -q "200\|302"; then
        log_success "Website is responding"
    else
        log_warning "Website health check failed"
    fi

    log_success "Health checks completed"
}

# Main deployment function
main() {
    # Parse arguments
    IFS='|' read -r check_mode force health_check git_repo_url <<< "$(parse_args "$@")"

    # Show summary
    show_deployment_summary "$git_repo_url" "$check_mode"

    # Confirm deployment
    if [[ "$force" != "true" ]] && [[ "$check_mode" != "true" ]]; then
        if ! confirm "Proceed with deployment?" "n"; then
            log_warning "Deployment cancelled by user"
            exit 0
        fi
        echo ""
    fi

    # Pre-deployment checks
    pre_deployment_checks

    # Run deployment
    log_step "Starting deployment..."
    echo ""

    # Capture the exit code explicitly: under `set -e` an unguarded failing
    # command would terminate the script before the code could be read.
    local deployment_exit_code=0
    if [[ "$check_mode" = "true" ]]; then
        local playbook="${ANSIBLE_PLAYBOOK_DIR}/deploy.yml"
        ansible_dry_run "$playbook" ${git_repo_url:+-e "git_repo_url=$git_repo_url"} || deployment_exit_code=$?
    else
        run_deployment "$git_repo_url" || deployment_exit_code=$?
    fi

    if [[ $deployment_exit_code -eq 0 ]]; then
        echo ""
        log_success "Deployment completed successfully!"

        # Post-deployment health check
        if [[ "$health_check" = "true" ]] && [[ "$check_mode" != "true" ]]; then
            echo ""
            post_deployment_health_check
        fi

        # Show deployment stats
        local end_time
        end_time=$(date +%s)
        local elapsed
        elapsed=$(duration "$START_TIME" "$end_time")

        echo ""
        echo "========================================="
        echo "  Deployment Summary"
        echo "========================================="
        echo "Status: SUCCESS ✅"
        echo "Duration: $elapsed"
        echo "Website: https://michaelschiemer.de"
        echo "Timestamp: $(timestamp)"
        echo "========================================="
        echo ""

        return 0
    else
        echo ""
        log_error "Deployment failed!"
        echo ""
        log_info "Troubleshooting:"
        log_info "  1. Check Ansible logs above"
        log_info "  2. SSH to server: ssh -i ~/.ssh/production deploy@94.16.110.151"
        log_info "  3. Check services: docker stack services framework"
        log_info "  4. View logs: docker service logs framework_web --tail 50"
        echo ""

        return 1
    fi
}

# Execute main function
main "$@"
361
.deployment-archive-20251030-111806/scripts/deployment-diagnostics.sh
Executable file
@@ -0,0 +1,361 @@
#!/bin/bash
#
# Deployment Diagnostics Script
# Purpose: Comprehensive diagnostics for troubleshooting deployment issues
#
# Usage:
#   ./scripts/deployment-diagnostics.sh            # Run all diagnostics
#   ./scripts/deployment-diagnostics.sh --quick    # Quick checks only
#   ./scripts/deployment-diagnostics.sh --verbose  # Verbose output
#

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"

PRODUCTION_SERVER="94.16.110.151"
REGISTRY="git.michaelschiemer.de:5000"
STACK_NAME="framework"
IMAGE="framework"

QUICK_MODE=false
VERBOSE=false

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'

log_error() {
    echo -e "${RED}✗${NC} $1"
}

log_success() {
    echo -e "${GREEN}✓${NC} $1"
}

log_warn() {
    echo -e "${YELLOW}⚠${NC} $1"
}

log_info() {
    echo -e "${BLUE}ℹ${NC} $1"
}

log_section() {
    echo ""
    echo -e "${CYAN}═══ $1 ═══${NC}"
}

# SSH helper
ssh_exec() {
    ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" "$@" 2>/dev/null || echo "SSH_FAILED"
}

# Check local prerequisites
check_local() {
    log_section "Local Environment"

    # Git status
    if git status &> /dev/null; then
        log_success "Git repository detected"
        BRANCH=$(git rev-parse --abbrev-ref HEAD)
        log_info "Current branch: ${BRANCH}"

        if [[ -n $(git status --porcelain) ]]; then
            log_warn "Working directory has uncommitted changes"
        else
            log_success "Working directory is clean"
        fi
    else
        log_error "Not in a git repository"
    fi

    # Docker
    if command -v docker &> /dev/null; then
        log_success "Docker installed"
        DOCKER_VERSION=$(docker --version | cut -d' ' -f3 | tr -d ',')
        log_info "Version: ${DOCKER_VERSION}"
    else
        log_error "Docker not found"
    fi

    # Ansible
    if command -v ansible-playbook &> /dev/null; then
        log_success "Ansible installed"
        ANSIBLE_VERSION=$(ansible-playbook --version | head -1 | cut -d' ' -f2)
        log_info "Version: ${ANSIBLE_VERSION}"
    else
        log_error "Ansible not found"
    fi

    # SSH key
    if [[ -f ~/.ssh/production ]]; then
        log_success "Production SSH key found"
    else
        log_error "Production SSH key not found at ~/.ssh/production"
    fi
}

# Check SSH connectivity
check_ssh() {
    log_section "SSH Connectivity"

    RESULT=$(ssh_exec "echo 'OK'")

    if [[ "$RESULT" == "OK" ]]; then
        log_success "SSH connection to production server"
    else
        log_error "Cannot connect to production server via SSH"
        log_info "Check: ssh -i ~/.ssh/production deploy@${PRODUCTION_SERVER}"
        return 1
    fi
}

# Check Docker Swarm
check_docker_swarm() {
    log_section "Docker Swarm Status"

    SWARM_STATUS=$(ssh_exec "docker info | grep 'Swarm:' | awk '{print \$2}'")

    if [[ "$SWARM_STATUS" == "active" ]]; then
        log_success "Docker Swarm is active"

        # Manager nodes
        MANAGERS=$(ssh_exec "docker node ls --filter role=manager --format '{{.Hostname}}'")
        log_info "Manager nodes: ${MANAGERS}"

        # Worker nodes
        WORKERS=$(ssh_exec "docker node ls --filter role=worker --format '{{.Hostname}}' | wc -l")
        log_info "Worker nodes: ${WORKERS}"
    else
        log_error "Docker Swarm is not active"
        return 1
    fi
}

# Check services
check_services() {
    log_section "Framework Services"

    # List services
    SERVICES=$(ssh_exec "docker service ls --filter 'name=${STACK_NAME}' --format '{{.Name}}: {{.Replicas}}'")

    if [[ -n "$SERVICES" ]]; then
        log_success "Framework services found"
        echo "$SERVICES" | while read -r line; do
            log_info "$line"
        done
    else
        log_error "No framework services found"
        return 1
    fi

    # Check web service
    WEB_STATUS=$(ssh_exec "docker service ps ${STACK_NAME}_web --filter 'desired-state=running' --format '{{.CurrentState}}' | head -1")

    if [[ "$WEB_STATUS" =~ Running ]]; then
        log_success "Web service is running"
    else
        log_error "Web service is not running: ${WEB_STATUS}"
    fi

    # Check worker service
    WORKER_STATUS=$(ssh_exec "docker service ps ${STACK_NAME}_queue-worker --filter 'desired-state=running' --format '{{.CurrentState}}' | head -1")

    if [[ "$WORKER_STATUS" =~ Running ]]; then
        log_success "Queue worker is running"
    else
        log_error "Queue worker is not running: ${WORKER_STATUS}"
    fi
}

# Check Docker images
check_images() {
    log_section "Docker Images"

    # Current running image
    CURRENT_IMAGE=$(ssh_exec "docker service inspect ${STACK_NAME}_web --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}'")

    if [[ -n "$CURRENT_IMAGE" ]]; then
        log_success "Current image: ${CURRENT_IMAGE}"
    else
        log_error "Cannot determine current image"
    fi

    # Available images (last 5)
    log_info "Available images (last 5):"
    ssh_exec "docker images ${REGISTRY}/${IMAGE} --format '  {{.Tag}} ({{.CreatedAt}})' | grep -v buildcache | head -5"
}

# Check networks
check_networks() {
    log_section "Docker Networks"

    NETWORKS=$(ssh_exec "docker network ls --filter 'name=${STACK_NAME}' --format '{{.Name}}: {{.Driver}}'")

    if [[ -n "$NETWORKS" ]]; then
        log_success "Framework networks found"
        echo "$NETWORKS" | while read -r line; do
            log_info "$line"
        done
    else
        log_warn "No framework-specific networks found"
    fi
}

# Check volumes
check_volumes() {
    log_section "Docker Volumes"

    VOLUMES=$(ssh_exec "docker volume ls --filter 'name=${STACK_NAME}' --format '{{.Name}}'")

    if [[ -n "$VOLUMES" ]]; then
        log_success "Framework volumes found"
        echo "$VOLUMES" | while read -r line; do
            log_info "$line"
        done
    else
        log_warn "No framework-specific volumes found"
    fi
}

# Check application health
check_app_health() {
    log_section "Application Health"

    # Main health endpoint
    HTTP_CODE=$(curl -k -s -o /dev/null -w "%{http_code}" https://michaelschiemer.de/health || echo "000")

    if [[ "$HTTP_CODE" == "200" ]] || [[ "$HTTP_CODE" == "302" ]]; then
        log_success "Application health endpoint: ${HTTP_CODE}"
    else
        log_error "Application health endpoint failed: ${HTTP_CODE}"
    fi

    # Database health
    DB_CODE=$(curl -k -s -o /dev/null -w "%{http_code}" https://michaelschiemer.de/health/database || echo "000")

    if [[ "$DB_CODE" == "200" ]]; then
        log_success "Database connectivity: OK"
    else
        log_warn "Database connectivity: ${DB_CODE}"
    fi

    # Redis health
    REDIS_CODE=$(curl -k -s -o /dev/null -w "%{http_code}" https://michaelschiemer.de/health/redis || echo "000")

    if [[ "$REDIS_CODE" == "200" ]]; then
        log_success "Redis connectivity: OK"
    else
        log_warn "Redis connectivity: ${REDIS_CODE}"
    fi
}

# Check Docker secrets
check_secrets() {
    log_section "Docker Secrets"

    SECRETS=$(ssh_exec "docker secret ls --format '{{.Name}}' | wc -l")

    if [[ "$SECRETS" -gt 0 ]]; then
        log_success "Docker secrets configured: ${SECRETS} secrets"
    else
        log_warn "No Docker secrets found"
    fi
}

# Check recent logs
check_logs() {
    log_section "Recent Logs"

    log_info "Last 20 lines from web service:"
    ssh_exec "docker service logs ${STACK_NAME}_web --tail 20"
}

# Check Gitea runner
check_gitea_runner() {
    log_section "Gitea Actions Runner"

    RUNNER_STATUS=$(ssh_exec "systemctl is-active gitea-runner 2>/dev/null || echo 'not-found'")

    if [[ "$RUNNER_STATUS" == "active" ]]; then
        log_success "Gitea runner service is active"
    elif [[ "$RUNNER_STATUS" == "not-found" ]]; then
        log_warn "Gitea runner service not found (may not be installed yet)"
    else
        log_error "Gitea runner service is ${RUNNER_STATUS}"
    fi
}

# Resource usage
check_resources() {
    log_section "Resource Usage"

    # Disk usage
    DISK_USAGE=$(ssh_exec "df -h / | tail -1 | awk '{print \$5}'")
    log_info "Disk usage: ${DISK_USAGE}"

    # Memory usage
    MEMORY_USAGE=$(ssh_exec "free -h | grep Mem | awk '{print \$3\"/\"\$2}'")
    log_info "Memory usage: ${MEMORY_USAGE}"

    # Docker disk usage
    log_info "Docker disk usage:"
    ssh_exec "docker system df"
}

# Parse arguments
for arg in "$@"; do
    case $arg in
        --quick)
            QUICK_MODE=true
            ;;
        --verbose)
            VERBOSE=true
            ;;
    esac
done

# Main diagnostics
main() {
    echo ""
    echo -e "${CYAN}╔════════════════════════════════════════════════════════╗${NC}"
    echo -e "${CYAN}║            DEPLOYMENT DIAGNOSTICS REPORT               ║${NC}"
    echo -e "${CYAN}╚════════════════════════════════════════════════════════╝${NC}"
    echo ""

    check_local
    check_ssh || { log_error "SSH connectivity failed - cannot continue"; exit 1; }
    check_docker_swarm
    check_services
    check_images
    check_app_health

    if [[ "$QUICK_MODE" == false ]]; then
        check_networks
        check_volumes
        check_secrets
        check_gitea_runner
        check_resources

        if [[ "$VERBOSE" == true ]]; then
            check_logs
        fi
    fi

    echo ""
    echo -e "${CYAN}╔════════════════════════════════════════════════════════╗${NC}"
    echo -e "${CYAN}║            DIAGNOSTICS COMPLETED                       ║${NC}"
    echo -e "${CYAN}╚════════════════════════════════════════════════════════╝${NC}"
    echo ""
    log_info "For detailed logs: ./scripts/deployment-diagnostics.sh --verbose"
    log_info "For service recovery: ./scripts/service-recovery.sh recover"
    echo ""
}

main "$@"
171
.deployment-archive-20251030-111806/scripts/emergency-rollback.sh
Executable file
@@ -0,0 +1,171 @@
#!/bin/bash
#
# Emergency Rollback Script
# Purpose: Fast rollback with minimal user interaction
#
# Usage:
#   ./scripts/emergency-rollback.sh              # Interactive mode
#   ./scripts/emergency-rollback.sh <image-tag>  # Direct rollback
#   ./scripts/emergency-rollback.sh list         # List available tags
#

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
ANSIBLE_DIR="${PROJECT_ROOT}/deployment/ansible"
INVENTORY="${ANSIBLE_DIR}/inventory/production.yml"

PRODUCTION_SERVER="94.16.110.151"
REGISTRY="git.michaelschiemer.de:5000"
IMAGE="framework"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

log_error() {
    echo -e "${RED}[ERROR]${NC} $1" >&2
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

# List available image tags
list_tags() {
    log_info "Fetching available image tags from production..."

    ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" \
        "docker images ${REGISTRY}/${IMAGE} --format '{{.Tag}}' | grep -v buildcache | head -20"

    echo ""
    log_info "Current running version:"
    ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" \
        "docker service inspect framework_web --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}'"
}

# Get current image tag
get_current_tag() {
    # The registry URL contains a port, so the image reference has two colons;
    # take the last colon-separated field rather than the second one.
    ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" \
        "docker service inspect framework_web --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}' | awk -F: '{print \$NF}'"
}

# Emergency rollback
emergency_rollback() {
    local target_tag="$1"

    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║           🚨 EMERGENCY ROLLBACK INITIATED 🚨           ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""

    local current_tag
    current_tag=$(get_current_tag)

    echo "Current Version: ${current_tag}"
    echo "Target Version: ${target_tag}"
    echo ""

    if [[ "${current_tag}" == "${target_tag}" ]]; then
        log_warn "Already running ${target_tag}. No rollback needed."
        exit 0
    fi

    log_warn "This will immediately roll back production WITHOUT health checks."
    log_warn "Use only in emergency situations."
    echo ""
    read -p "Type 'ROLLBACK' to confirm: " -r

    if [[ ! "$REPLY" == "ROLLBACK" ]]; then
        log_info "Rollback cancelled"
        exit 0
    fi

    log_info "Executing emergency rollback via Ansible..."

    cd "${ANSIBLE_DIR}"

    ansible-playbook \
        -i "${INVENTORY}" \
        playbooks/emergency-rollback.yml \
        -e "rollback_tag=${target_tag}"

    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║           MANUAL VERIFICATION REQUIRED                 ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""
    log_warn "1. Check application: https://michaelschiemer.de"
    log_warn "2. Run health check: cd deployment && ansible-playbook -i ansible/inventory/production.yml ansible/playbooks/health-check.yml"
    log_warn "3. Check service logs: ssh deploy@${PRODUCTION_SERVER} 'docker service logs framework_web --tail 100'"
    echo ""
}

# Interactive mode
interactive_rollback() {
    log_info "🚨 Emergency Rollback - Interactive Mode"
    echo ""

    log_info "Available image tags (last 20):"
    list_tags
    echo ""

    read -p "Enter image tag to rollback to: " -r target_tag

    if [[ -z "$target_tag" ]]; then
        log_error "No tag provided"
        exit 1
    fi

    emergency_rollback "$target_tag"
}

# Main
main() {
    case "${1:-interactive}" in
        list)
            list_tags
            ;;
        interactive)
            interactive_rollback
            ;;
        help|--help|-h)
            cat <<EOF
Emergency Rollback Script

Usage: $0 [command|tag]

Commands:
  list         List available image tags on production
  interactive  Interactive rollback mode (default)
  <image-tag>  Direct rollback to specific tag
  help         Show this help

Examples:
  $0 list              # List available versions
  $0                   # Interactive mode
  $0 abc1234-123456    # Rollback to specific tag

Emergency Procedures:
  1. List versions: $0 list
  2. Choose version: $0 <tag>
  3. Verify manually: https://michaelschiemer.de
  4. Run health check: cd deployment && ansible-playbook -i ansible/inventory/production.yml ansible/playbooks/health-check.yml

EOF
            ;;
        *)
            # Direct rollback with provided tag
            emergency_rollback "$1"
            ;;
    esac
}

main "$@"
160
.deployment-archive-20251030-111806/scripts/lib/ansible.sh
Executable file
@@ -0,0 +1,160 @@
#!/bin/bash
#
# Ansible Integration Library
# Provides helpers for Ansible operations
#

# Source common library
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=./common.sh
source "${SCRIPT_DIR}/common.sh"

# Default Ansible paths
readonly ANSIBLE_DIR="${ANSIBLE_DIR:-${SCRIPT_DIR}/../../ansible}"
readonly ANSIBLE_INVENTORY="${ANSIBLE_INVENTORY:-${ANSIBLE_DIR}/inventory/production.yml}"
readonly ANSIBLE_PLAYBOOK_DIR="${ANSIBLE_PLAYBOOK_DIR:-${ANSIBLE_DIR}/playbooks}"

# Check Ansible installation
check_ansible() {
    log_step "Checking Ansible installation..."

    require_command "ansible" "sudo apt install ansible" || return 1
    require_command "ansible-playbook" || return 1

    local version
    version=$(ansible --version | head -1)
    log_success "Ansible installed: $version"
}

# Test Ansible connectivity
|
||||
test_ansible_connectivity() {
|
||||
local inventory="${1:-$ANSIBLE_INVENTORY}"
|
||||
|
||||
log_step "Testing Ansible connectivity..."
|
||||
|
||||
if ! ansible all -i "$inventory" -m ping &> /dev/null; then
|
||||
log_error "Cannot connect to production server"
|
||||
log_info "Check:"
|
||||
log_info " - SSH key: ~/.ssh/production"
|
||||
log_info " - Network connectivity"
|
||||
log_info " - Server availability"
|
||||
return 1
|
||||
fi
|
||||
|
||||
log_success "Connection successful"
|
||||
return 0
|
||||
}
|
||||
|
||||
# Run Ansible playbook
|
||||
run_ansible_playbook() {
|
||||
local playbook="$1"
|
||||
shift
|
||||
local extra_args=("$@")
|
||||
|
||||
log_step "Running Ansible playbook: $(basename "$playbook")"
|
||||
|
||||
# Build command
|
||||
local cmd="ansible-playbook -i ${ANSIBLE_INVENTORY} ${playbook}"
|
||||
|
||||
# Add extra args
|
||||
if [[ ${#extra_args[@]} -gt 0 ]]; then
|
||||
cmd="${cmd} ${extra_args[*]}"
|
||||
fi
|
||||
|
||||
log_debug "Command: $cmd"
|
||||
|
||||
# Execute with proper error handling
|
||||
if eval "$cmd"; then
|
||||
log_success "Playbook completed successfully"
|
||||
return 0
|
||||
else
|
||||
local exit_code=$?
|
||||
log_error "Playbook failed with exit code $exit_code"
|
||||
return $exit_code
|
||||
fi
|
||||
}
|
||||
|
||||
# Run deployment playbook
|
||||
run_deployment() {
|
||||
local git_repo_url="${1:-}"
|
||||
local playbook="${ANSIBLE_PLAYBOOK_DIR}/deploy.yml"
|
||||
|
||||
if [[ ! -f "$playbook" ]]; then
|
||||
log_error "Deployment playbook not found: $playbook"
|
||||
return 1
|
||||
fi
|
||||
|
||||
log_step "Starting deployment..."
|
||||
|
||||
local extra_args=()
|
||||
if [[ -n "$git_repo_url" ]]; then
|
||||
extra_args+=("-e" "git_repo_url=${git_repo_url}")
|
||||
log_info "Git repository: $git_repo_url"
|
||||
else
|
||||
log_info "Using existing code on server"
|
||||
fi
|
||||
|
||||
run_ansible_playbook "$playbook" "${extra_args[@]}"
|
||||
}
|
||||
|
||||
# Get Ansible facts
|
||||
get_ansible_facts() {
|
||||
local inventory="${1:-$ANSIBLE_INVENTORY}"
|
||||
local host="${2:-production_server}"
|
||||
|
||||
ansible "$host" -i "$inventory" -m setup
|
||||
}
|
||||
|
||||
# Ansible dry-run
|
||||
ansible_dry_run() {
|
||||
local playbook="$1"
|
||||
shift
|
||||
local extra_args=("$@")
|
||||
|
||||
log_step "Running dry-run (check mode)..."
|
||||
|
||||
extra_args+=("--check" "--diff")
|
||||
|
||||
run_ansible_playbook "$playbook" "${extra_args[@]}"
|
||||
}
|
||||
|
||||
# List Ansible hosts
|
||||
list_ansible_hosts() {
|
||||
local inventory="${1:-$ANSIBLE_INVENTORY}"
|
||||
|
||||
log_step "Listing Ansible hosts..."
|
||||
|
||||
ansible-inventory -i "$inventory" --list
|
||||
}
|
||||
|
||||
# Check playbook syntax
|
||||
check_playbook_syntax() {
|
||||
local playbook="$1"
|
||||
|
||||
log_step "Checking playbook syntax..."
|
||||
|
||||
if ansible-playbook --syntax-check "$playbook" &> /dev/null; then
|
||||
log_success "Syntax check passed"
|
||||
return 0
|
||||
else
|
||||
log_error "Syntax check failed"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
# Execute Ansible ad-hoc command
|
||||
ansible_adhoc() {
|
||||
local host="$1"
|
||||
local module="$2"
|
||||
shift 2
|
||||
local args=("$@")
|
||||
|
||||
log_step "Running ad-hoc command on $host..."
|
||||
|
||||
ansible "$host" -i "$ANSIBLE_INVENTORY" -m "$module" -a "${args[*]}"
|
||||
}
|
||||
|
||||
# Export functions
|
||||
export -f check_ansible test_ansible_connectivity run_ansible_playbook
|
||||
export -f run_deployment get_ansible_facts ansible_dry_run
|
||||
export -f list_ansible_hosts check_playbook_syntax ansible_adhoc
|
||||
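
Note that `run_ansible_playbook` above joins its pieces into a single string and `eval`s it, so an extra argument containing a space would be re-split into separate words. A minimal, standalone sketch (not part of the library) of the same assembly done with a bash array, where whitespace-containing arguments survive intact; the file names are illustrative:

```bash
#!/bin/bash
# Sketch: assemble an ansible-playbook invocation as an array instead of a
# string, so arguments like -e "image_tag=abc1234 extra" stay single words.
set -euo pipefail

build_playbook_cmd() {
    local inventory="$1" playbook="$2"
    shift 2
    # Each element remains one word, even if it contains spaces.
    local cmd=(ansible-playbook -i "$inventory" "$playbook" "$@")
    # Print one element per line to show the word boundaries.
    printf '%s\n' "${cmd[@]}"
}

build_playbook_cmd production.yml deploy.yml -e "image_tag=abc1234 extra"
```

The real invocation would then be `"${cmd[@]}"` instead of `eval "$cmd"`.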
215
.deployment-archive-20251030-111806/scripts/lib/common.sh
Executable file
@@ -0,0 +1,215 @@
#!/bin/bash
#
# Common Library Functions for Deployment Scripts
# Provides unified logging, error handling, and utilities
#

set -euo pipefail

# Colors for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly BLUE='\033[0;34m'
readonly CYAN='\033[0;36m'
readonly MAGENTA='\033[0;35m'
readonly NC='\033[0m' # No Color

# Logging functions
log_info() {
    echo -e "${BLUE}ℹ️  ${1}${NC}"
}

log_success() {
    echo -e "${GREEN}✅ ${1}${NC}"
}

log_warning() {
    echo -e "${YELLOW}⚠️  ${1}${NC}"
}

log_error() {
    echo -e "${RED}❌ ${1}${NC}"
}

log_debug() {
    if [[ "${DEBUG:-0}" == "1" ]]; then
        echo -e "${CYAN}🔍 ${1}${NC}"
    fi
}

log_step() {
    echo -e "${MAGENTA}▶️  ${1}${NC}"
}

# Error handling
die() {
    log_error "$1"
    exit "${2:-1}"
}

# Check if command exists
command_exists() {
    command -v "$1" &> /dev/null
}

# Validate prerequisites
require_command() {
    local cmd="$1"
    local install_hint="${2:-}"

    if ! command_exists "$cmd"; then
        log_error "Required command not found: $cmd"
        [[ -n "$install_hint" ]] && log_info "Install with: $install_hint"
        return 1
    fi
    return 0
}

# Run command with retry logic
run_with_retry() {
    local max_attempts="${1}"
    local delay="${2}"
    shift 2
    local cmd=("$@")
    local attempt=1

    while [[ $attempt -le $max_attempts ]]; do
        if "${cmd[@]}"; then
            return 0
        fi

        if [[ $attempt -lt $max_attempts ]]; then
            log_warning "Command failed (attempt $attempt/$max_attempts). Retrying in ${delay}s..."
            sleep "$delay"
        fi

        ((attempt++))
    done

    log_error "Command failed after $max_attempts attempts"
    return 1
}

# Execute command and capture output
execute() {
    local cmd="$1"
    log_debug "Executing: $cmd"
    eval "$cmd"
}

# Spinner for long-running operations
spinner() {
    local pid=$1
    local delay=0.1
    local spinstr='⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏'

    while ps -p "$pid" > /dev/null 2>&1; do
        local temp=${spinstr#?}
        printf " [%c] " "$spinstr"
        spinstr=$temp${spinstr%"$temp"}
        sleep $delay
        printf "\b\b\b\b\b\b"
    done
    printf "    \b\b\b\b"
}

# Progress bar
progress_bar() {
    local current=$1
    local total=$2
    local width=50
    local percentage=$((current * 100 / total))
    local completed=$((width * current / total))
    local remaining=$((width - completed))

    printf "\r["
    printf "%${completed}s" | tr ' ' '█'
    printf "%${remaining}s" | tr ' ' '░'
    printf "] %3d%%" "$percentage"

    if [[ $current -eq $total ]]; then
        echo ""
    fi
}

# Confirm action
confirm() {
    local prompt="${1:-Are you sure?}"
    local default="${2:-n}"

    if [[ "$default" == "y" ]]; then
        prompt="$prompt [Y/n] "
    else
        prompt="$prompt [y/N] "
    fi

    read -rp "$prompt" response
    response=${response:-$default}

    [[ "$response" =~ ^[Yy]$ ]]
}

# Parse YAML-like config
parse_config() {
    local config_file="$1"
    local key="$2"

    if [[ ! -f "$config_file" ]]; then
        log_error "Config file not found: $config_file"
        return 1
    fi

    grep "^${key}:" "$config_file" | sed "s/^${key}:[[:space:]]*//" | tr -d '"'
}

# Timestamp functions
timestamp() {
    date '+%Y-%m-%d %H:%M:%S'
}

timestamp_file() {
    date '+%Y%m%d_%H%M%S'
}

# Duration calculation
duration() {
    local start=$1
    local end=${2:-$(date +%s)}
    local elapsed=$((end - start))

    local hours=$((elapsed / 3600))
    local minutes=$(((elapsed % 3600) / 60))
    local seconds=$((elapsed % 60))

    if [[ $hours -gt 0 ]]; then
        printf "%dh %dm %ds" "$hours" "$minutes" "$seconds"
    elif [[ $minutes -gt 0 ]]; then
        printf "%dm %ds" "$minutes" "$seconds"
    else
        printf "%ds" "$seconds"
    fi
}

# Cleanup handler
cleanup_handlers=()

register_cleanup() {
    cleanup_handlers+=("$1")
}

cleanup() {
    log_info "Running cleanup handlers..."
    for handler in "${cleanup_handlers[@]}"; do
        eval "$handler" || log_warning "Cleanup handler failed: $handler"
    done
}

trap cleanup EXIT

# Export functions for use in other scripts
export -f log_info log_success log_warning log_error log_debug log_step
export -f die command_exists require_command run_with_retry execute
export -f spinner progress_bar confirm parse_config
export -f timestamp timestamp_file duration
export -f register_cleanup cleanup
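
The contract of `run_with_retry` above (try up to N times, sleep a fixed delay between failures, stop on first success) can be exercised in isolation. A self-contained sketch with a deliberately flaky command that fails twice before succeeding; `flaky` and the counter are illustrative, not part of the library:

```bash
#!/bin/bash
# Sketch of the run_with_retry contract: retry a command with a fixed delay,
# return 0 on the first success, nonzero after max attempts.
set -euo pipefail

attempts_left=3
flaky() {                         # fails twice, then succeeds
    attempts_left=$((attempts_left - 1))
    [[ $attempts_left -le 0 ]]
}

retry() {
    local max="$1" delay="$2"; shift 2
    local i
    for ((i = 1; i <= max; i++)); do
        "$@" && { echo "succeeded on attempt $i"; return 0; }
        (( i < max )) && sleep "$delay"
    done
    echo "failed after $max attempts" >&2
    return 1
}

retry 5 0 flaky   # prints "succeeded on attempt 3"
```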
184
.deployment-archive-20251030-111806/scripts/manual-deploy-fallback.sh
Executable file
@@ -0,0 +1,184 @@
#!/bin/bash
#
# Manual Deployment Fallback Script
# Purpose: Deploy manually when Gitea Actions is unavailable
#
# Usage:
#   ./scripts/manual-deploy-fallback.sh [branch]   # Deploy specific branch
#   ./scripts/manual-deploy-fallback.sh            # Deploy current branch
#

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
ANSIBLE_DIR="${PROJECT_ROOT}/deployment/ansible"
INVENTORY="${ANSIBLE_DIR}/inventory/production.yml"

PRODUCTION_SERVER="94.16.110.151"
REGISTRY="git.michaelschiemer.de:5000"
IMAGE="framework"
BRANCH="${1:-$(git rev-parse --abbrev-ref HEAD)}"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

log_error() {
    echo -e "${RED}[ERROR]${NC} $1" >&2
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_step() {
    echo -e "${BLUE}[STEP]${NC} $1"
}

# Check prerequisites
check_prerequisites() {
    log_step "Checking prerequisites..."

    # Check if git is clean
    if [[ -n $(git status --porcelain) ]]; then
        log_error "Git working directory is not clean. Commit or stash changes first."
        exit 1
    fi

    # Check if ansible is installed
    if ! command -v ansible-playbook &> /dev/null; then
        log_error "ansible-playbook not found. Install Ansible first."
        exit 1
    fi

    # Check if docker is available
    if ! command -v docker &> /dev/null; then
        log_error "docker not found. Install Docker first."
        exit 1
    fi

    # Check SSH access to production server
    if ! ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" "echo 'SSH OK'" &> /dev/null; then
        log_error "Cannot SSH to production server. Check your SSH key."
        exit 1
    fi

    log_info "Prerequisites check passed"
}

# Build Docker image locally
build_image() {
    log_step "Building Docker image for branch: ${BRANCH}"

    cd "${PROJECT_ROOT}"

    # Checkout branch
    git checkout "${BRANCH}"
    git pull origin "${BRANCH}"

    # Get commit SHA
    COMMIT_SHA=$(git rev-parse --short HEAD)
    IMAGE_TAG="${COMMIT_SHA}-$(date +%s)"

    log_info "Building image with tag: ${IMAGE_TAG}"

    # Build image
    docker build \
        --file Dockerfile.production \
        --tag "${REGISTRY}/${IMAGE}:${IMAGE_TAG}" \
        --tag "${REGISTRY}/${IMAGE}:latest" \
        --build-arg BUILD_DATE="$(date -u +'%Y-%m-%dT%H:%M:%SZ')" \
        --build-arg VCS_REF="${COMMIT_SHA}" \
        .

    log_info "Image built successfully"
}

# Push image to registry
push_image() {
    log_step "Pushing image to registry..."

    # Login to registry (prompt for password if needed)
    log_info "Logging in to registry..."
    docker login "${REGISTRY}"

    # Push image
    docker push "${REGISTRY}/${IMAGE}:${IMAGE_TAG}"
    docker push "${REGISTRY}/${IMAGE}:latest"

    log_info "Image pushed successfully"
}

# Deploy via Ansible
deploy_ansible() {
    log_step "Deploying via Ansible..."

    cd "${ANSIBLE_DIR}"

    ansible-playbook \
        -i "${INVENTORY}" \
        playbooks/deploy-update.yml \
        -e "image_tag=${IMAGE_TAG}" \
        -e "git_commit_sha=${COMMIT_SHA}"

    log_info "Ansible deployment completed"
}

# Run health checks
run_health_checks() {
    log_step "Running health checks..."

    cd "${ANSIBLE_DIR}"

    ansible-playbook \
        -i "${INVENTORY}" \
        playbooks/health-check.yml

    log_info "Health checks passed"
}

# Main deployment flow
main() {
    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║   MANUAL DEPLOYMENT FALLBACK (No Gitea Actions)        ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""

    log_info "Branch: ${BRANCH}"
    echo ""

    read -p "Continue with manual deployment? (yes/no): " -r
    if [[ ! "$REPLY" =~ ^[Yy][Ee][Ss]$ ]]; then
        log_info "Deployment cancelled"
        exit 0
    fi

    check_prerequisites
    build_image
    push_image
    deploy_ansible
    run_health_checks

    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║   MANUAL DEPLOYMENT COMPLETED                          ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""
    log_info "Deployed: ${REGISTRY}/${IMAGE}:${IMAGE_TAG}"
    log_info "Commit: ${COMMIT_SHA}"
    log_info "Branch: ${BRANCH}"
    echo ""
    log_info "Verify deployment: https://michaelschiemer.de"
    echo ""
}

main "$@"
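
The tag scheme used in `build_image` above (short commit SHA plus a Unix timestamp) guarantees a unique, chronologically sortable tag per build. A standalone sketch; the SHA and timestamp below are illustrative values:

```bash
#!/bin/bash
# Sketch of the image tag scheme from build_image above:
# <short commit SHA>-<unix timestamp>, unique per build and sortable by time.
set -euo pipefail

make_image_tag() {
    local sha="$1" ts="${2:-$(date +%s)}"
    printf '%s-%s\n' "$sha" "$ts"
}

make_image_tag abc1234 1730280000   # -> abc1234-1730280000
```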
230
.deployment-archive-20251030-111806/scripts/service-recovery.sh
Executable file
@@ -0,0 +1,230 @@
#!/bin/bash
#
# Service Recovery Script
# Purpose: Quick recovery for common service failures
#
# Usage:
#   ./scripts/service-recovery.sh status    # Check service status
#   ./scripts/service-recovery.sh restart   # Restart services
#   ./scripts/service-recovery.sh recover   # Full recovery procedure
#

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"

PRODUCTION_SERVER="94.16.110.151"
STACK_NAME="framework"

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

log_error() {
    echo -e "${RED}[ERROR]${NC} $1" >&2
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_step() {
    echo -e "${BLUE}[STEP]${NC} $1"
}

# SSH helper
ssh_exec() {
    ssh -i ~/.ssh/production deploy@"${PRODUCTION_SERVER}" "$@"
}

# Check service status
check_status() {
    log_step "Checking service status..."

    echo ""
    log_info "Docker Swarm Services:"
    ssh_exec "docker service ls --filter 'name=${STACK_NAME}'"

    echo ""
    log_info "Web Service Details:"
    ssh_exec "docker service ps ${STACK_NAME}_web --no-trunc"

    echo ""
    log_info "Queue Worker Details:"
    ssh_exec "docker service ps ${STACK_NAME}_queue-worker --no-trunc"

    echo ""
    log_info "Service Logs (last 50 lines):"
    ssh_exec "docker service logs ${STACK_NAME}_web --tail 50"
}

# Restart services
restart_services() {
    log_step "Restarting services..."

    echo ""
    log_warn "This will restart all framework services"
    read -p "Continue? (yes/no): " -r

    if [[ ! "$REPLY" =~ ^[Yy][Ee][Ss]$ ]]; then
        log_info "Restart cancelled"
        exit 0
    fi

    # Restart web service
    log_info "Restarting web service..."
    ssh_exec "docker service update --force ${STACK_NAME}_web"

    # Restart worker service
    log_info "Restarting queue worker..."
    ssh_exec "docker service update --force ${STACK_NAME}_queue-worker"

    # Wait for services to stabilize
    log_info "Waiting for services to stabilize (30 seconds)..."
    sleep 30

    # Check status
    check_status
}

# Full recovery procedure
full_recovery() {
    log_step "Running full recovery procedure..."

    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║   FULL SERVICE RECOVERY PROCEDURE                      ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""

    # Step 1: Check current status
    log_info "Step 1/5: Check current status"
    check_status

    # Step 2: Check Docker Swarm health
    log_info "Step 2/5: Check Docker Swarm health"
    SWARM_STATUS=$(ssh_exec "docker info | grep 'Swarm: active' || echo 'inactive'")

    if [[ "$SWARM_STATUS" == "inactive" ]]; then
        log_error "Docker Swarm is not active!"
        log_info "Attempting to reinitialize Swarm..."
        ssh_exec "docker swarm init --advertise-addr ${PRODUCTION_SERVER}" || true
    else
        log_info "Docker Swarm is active"
    fi

    # Step 3: Verify network and volumes
    log_info "Step 3/5: Verify Docker resources"
    ssh_exec "docker network ls | grep ${STACK_NAME} || docker network create --driver overlay ${STACK_NAME}_network"

    # Step 4: Restart services
    log_info "Step 4/5: Restart services"
    ssh_exec "docker service update --force ${STACK_NAME}_web"
    ssh_exec "docker service update --force ${STACK_NAME}_queue-worker"

    log_info "Waiting for services to stabilize (45 seconds)..."
    sleep 45

    # Step 5: Health check
    log_info "Step 5/5: Run health checks"

    # Discard the response body so only OK/FAILED is captured
    HEALTH_CHECK=$(curl -sf -k -o /dev/null https://michaelschiemer.de/health 2>/dev/null && echo "OK" || echo "FAILED")

    if [[ "$HEALTH_CHECK" == "OK" ]]; then
        log_info "✅ Health check passed"
    else
        log_error "❌ Health check failed"
        log_warn "Manual intervention may be required"
        log_warn "Check logs: ssh deploy@${PRODUCTION_SERVER} 'docker service logs ${STACK_NAME}_web --tail 100'"
        exit 1
    fi

    echo ""
    log_warn "╔════════════════════════════════════════════════════════╗"
    log_warn "║   RECOVERY PROCEDURE COMPLETED                         ║"
    log_warn "╚════════════════════════════════════════════════════════╝"
    echo ""
    log_info "Application: https://michaelschiemer.de"
    log_info "Services recovered successfully"
    echo ""
}

# Clear caches
clear_caches() {
    log_step "Clearing application caches..."

    # Clear Redis cache
    log_info "Clearing Redis cache..."
    ssh_exec "docker exec \$(docker ps -q -f name=${STACK_NAME}_redis) redis-cli FLUSHALL" || log_warn "Redis cache clear failed"

    # Clear file caches
    log_info "Clearing file caches..."
    ssh_exec "docker exec \$(docker ps -q -f name=${STACK_NAME}_web | head -1) rm -rf /var/www/html/storage/cache/*" || log_warn "File cache clear failed"

    log_info "Caches cleared"
}

# Show help
show_help() {
    cat <<EOF
Service Recovery Script

Usage: $0 [command]

Commands:
  status         Check service status and logs
  restart        Restart all services
  recover        Run full recovery procedure (recommended)
  clear-cache    Clear application caches
  help           Show this help

Examples:
  $0 status        # Quick status check
  $0 recover       # Full automated recovery
  $0 restart       # Just restart services
  $0 clear-cache   # Clear caches only

Emergency Recovery:
  1. Check status: $0 status
  2. Run recovery: $0 recover
  3. If still failing, check logs manually:
     ssh deploy@${PRODUCTION_SERVER} 'docker service logs ${STACK_NAME}_web --tail 200'

EOF
}

# Main
main() {
    case "${1:-help}" in
        status)
            check_status
            ;;
        restart)
            restart_services
            ;;
        recover)
            full_recovery
            ;;
        clear-cache)
            clear_caches
            ;;
        help|--help|-h)
            show_help
            ;;
        *)
            log_error "Unknown command: $1"
            show_help
            exit 1
            ;;
    esac
}

main "$@"
262
.deployment-archive-20251030-111806/scripts/setup-production-secrets.sh
Executable file
@@ -0,0 +1,262 @@
#!/bin/bash
#
# Production Secrets Setup Script
# Purpose: Initialize and manage production secrets with Ansible Vault
#
# Usage:
#   ./scripts/setup-production-secrets.sh init     # Initialize new vault
#   ./scripts/setup-production-secrets.sh deploy   # Deploy secrets to production
#   ./scripts/setup-production-secrets.sh rotate   # Rotate secrets
#   ./scripts/setup-production-secrets.sh verify   # Verify secrets on server
#

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
ANSIBLE_DIR="${PROJECT_ROOT}/deployment/ansible"
VAULT_FILE="${ANSIBLE_DIR}/secrets/production-vault.yml"
INVENTORY="${ANSIBLE_DIR}/inventory/production.yml"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Logging functions
log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Check prerequisites
check_prerequisites() {
    log_info "Checking prerequisites..."

    if ! command -v ansible-vault &> /dev/null; then
        log_error "ansible-vault not found. Please install Ansible."
        exit 1
    fi

    if ! command -v openssl &> /dev/null; then
        log_error "openssl not found. Please install OpenSSL."
        exit 1
    fi

    log_info "Prerequisites OK"
}

# Generate secure random password
generate_password() {
    local length="${1:-32}"
    openssl rand -base64 "$length" | tr -d "=+/" | cut -c1-"$length"
}

# Generate base64 encoded app key
generate_app_key() {
    openssl rand -base64 32
}

# Initialize vault with secure defaults
init_vault() {
    log_info "Initializing production secrets vault..."

    if [[ -f "$VAULT_FILE" ]]; then
        log_warn "Vault file already exists: $VAULT_FILE"
        read -p "Do you want to overwrite it? (yes/no): " -r
        if [[ ! $REPLY =~ ^[Yy]es$ ]]; then
            log_info "Aborting initialization"
            exit 0
        fi
    fi

    # Generate secure secrets
    log_info "Generating secure secrets..."
    DB_PASSWORD=$(generate_password 32)
    REDIS_PASSWORD=$(generate_password 32)
    APP_KEY=$(generate_app_key)
    JWT_SECRET=$(generate_password 64)
    REGISTRY_PASSWORD=$(generate_password 24)

    # Create vault file
    cat > "$VAULT_FILE" <<EOF
---
# Production Secrets Vault
# Generated: $(date -u +"%Y-%m-%d %H:%M:%S UTC")

# Database Credentials
vault_db_name: framework_production
vault_db_user: framework_app
vault_db_password: ${DB_PASSWORD}

# Redis Credentials
vault_redis_password: ${REDIS_PASSWORD}

# Application Secrets
vault_app_key: ${APP_KEY}
vault_jwt_secret: ${JWT_SECRET}

# Docker Registry Credentials
vault_registry_url: git.michaelschiemer.de:5000
vault_registry_user: deploy
vault_registry_password: ${REGISTRY_PASSWORD}

# Security Configuration
vault_admin_allowed_ips: "127.0.0.1,::1,94.16.110.151"

# SMTP Configuration (update these manually)
vault_smtp_host: smtp.example.com
vault_smtp_port: 587
vault_smtp_user: noreply@michaelschiemer.de
vault_smtp_password: CHANGE_ME_SMTP_PASSWORD_HERE
EOF

    log_info "Vault file created with generated secrets"
    log_warn "IMPORTANT: Update SMTP credentials manually if needed"

    # Encrypt vault
    log_info "Encrypting vault file..."
    ansible-vault encrypt "$VAULT_FILE"

    log_info "✅ Vault initialized successfully"
    log_warn "Store the vault password securely (e.g., in a password manager)"
}

# Deploy secrets to production
deploy_secrets() {
    log_info "Deploying secrets to production..."

    if [[ ! -f "$VAULT_FILE" ]]; then
        log_error "Vault file not found: $VAULT_FILE"
        log_error "Run './setup-production-secrets.sh init' first"
        exit 1
    fi

    cd "$ANSIBLE_DIR"

    log_info "Running Ansible playbook..."
    ansible-playbook \
        -i "$INVENTORY" \
        playbooks/setup-production-secrets.yml \
        --ask-vault-pass

    log_info "✅ Secrets deployed successfully"
}

# Rotate secrets (regenerate and redeploy)
rotate_secrets() {
    log_warn "⚠️  Secret rotation will:"
    log_warn "  1. Generate new passwords/keys"
    log_warn "  2. Update vault file"
    log_warn "  3. Deploy to production"
    log_warn "  4. Restart services"
    log_warn ""
    read -p "Continue with rotation? (yes/no): " -r

    if [[ ! $REPLY =~ ^[Yy]es$ ]]; then
        log_info "Rotation cancelled"
        exit 0
    fi

    # Backup current vault
    BACKUP_FILE="${VAULT_FILE}.backup.$(date +%Y%m%d_%H%M%S)"
    log_info "Creating backup: $BACKUP_FILE"
    cp "$VAULT_FILE" "$BACKUP_FILE"

    # Decrypt vault
    log_info "Decrypting vault..."
    ansible-vault decrypt "$VAULT_FILE"

    # Generate new secrets
    log_info "Generating new secrets..."
    DB_PASSWORD=$(generate_password 32)
    REDIS_PASSWORD=$(generate_password 32)
    APP_KEY=$(generate_app_key)
    JWT_SECRET=$(generate_password 64)

    # Update vault file (keep registry password).
    # The app key is raw base64 and may contain '/', so use '|' as the
    # sed delimiter for that substitution.
    sed -i "s/vault_db_password: .*/vault_db_password: ${DB_PASSWORD}/" "$VAULT_FILE"
    sed -i "s/vault_redis_password: .*/vault_redis_password: ${REDIS_PASSWORD}/" "$VAULT_FILE"
    sed -i "s|vault_app_key: .*|vault_app_key: ${APP_KEY}|" "$VAULT_FILE"
    sed -i "s/vault_jwt_secret: .*/vault_jwt_secret: ${JWT_SECRET}/" "$VAULT_FILE"

    # Re-encrypt vault
    log_info "Re-encrypting vault..."
    ansible-vault encrypt "$VAULT_FILE"

    log_info "✅ Secrets rotated"
    log_info "Backup saved to: $BACKUP_FILE"

    # Deploy rotated secrets
    deploy_secrets
}

# Verify secrets on server
verify_secrets() {
    log_info "Verifying secrets on production server..."

    cd "$ANSIBLE_DIR"

    ansible production_server \
        -i "$INVENTORY" \
        -m shell \
        -a "docker secret ls"

    log_info "Checking environment file..."
    ansible production_server \
        -i "$INVENTORY" \
        -m stat \
        -a "path=/home/deploy/secrets/.env.production"

    log_info "✅ Verification complete"
}

# Main command dispatcher
main() {
    check_prerequisites

    case "${1:-help}" in
        init)
            init_vault
            ;;
        deploy)
            deploy_secrets
            ;;
        rotate)
            rotate_secrets
            ;;
        verify)
            verify_secrets
            ;;
        help|*)
            cat <<EOF
Production Secrets Management

Usage: $0 <command>

Commands:
  init      Initialize new secrets vault with auto-generated secure values
  deploy    Deploy secrets from vault to production server
  rotate    Rotate secrets (generate new values and redeploy)
  verify    Verify secrets are properly deployed on server

Examples:
  $0 init      # First time setup
  $0 deploy    # Deploy after manual vault updates
  $0 rotate    # Monthly security rotation
  $0 verify    # Check deployment status

EOF
            ;;
    esac
}

main "$@"
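
The rotation step above rewrites individual `key: value` lines with `sed -i` while leaving the others (e.g. the registry password) untouched. A self-contained sketch of that edit on a throwaway file; the key names and values below are illustrative, and GNU sed's `-i` is assumed as in the script:

```bash
#!/bin/bash
# Sketch of the sed-based rotation edit: replace the value of one
# "vault_*: value" line in place, leave the rest of the file unchanged.
set -euo pipefail

vault="$(mktemp)"
cat > "$vault" <<'EOF'
vault_db_password: old_secret
vault_registry_password: keep_me
EOF

# Rewrite only the db password line.
sed -i "s/vault_db_password: .*/vault_db_password: new_secret/" "$vault"

cat "$vault"
rm -f "$vault"
```

Note the delimiter matters: a base64 value containing `/` would break an `s/.../.../` expression, which is why the app key substitution uses `|` instead.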