fix(console): comprehensive TUI rendering fixes
- Fix Enter key detection: handle multiple Enter key formats (\n, \r, \r\n)
- Reduce flickering: lower render frequency from 60 FPS to 30 FPS
- Fix menu bar visibility: re-render menu bar after content to prevent overwriting
- Fix content positioning: explicit line positioning for categories and commands
- Fix line shifting: clear lines before writing, control newlines manually
- Limit visible items: prevent overflow with maxVisibleCategories/Commands
- Improve CPU usage: increase sleep interval when no events processed

This fixes:
- Enter key not working for selection
- Strong flickering of the application
- Menu bar not visible or being overwritten
- Top half of selection list not displayed
- Lines being shifted/misaligned
deployment/ansible/playbooks/CLEANUP_SUMMARY.md (new file, 208 lines)
@@ -0,0 +1,208 @@
# Playbook Cleanup & Server Redeploy - Summary

## Completed Tasks

### Phase 1: Playbook Cleanup ✅

#### 1.1 Consolidated redundant diagnosis playbooks
- ✅ Created `diagnose/gitea.yml` - Consolidates:
  - `diagnose-gitea-timeouts.yml`
  - `diagnose-gitea-timeout-deep.yml`
  - `diagnose-gitea-timeout-live.yml`
  - `diagnose-gitea-timeouts-complete.yml`
  - `comprehensive-gitea-diagnosis.yml`
- ✅ Uses the tags `deep` and `complete` for selective execution (see the sketch below)
- ✅ Removed the redundant playbooks
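A minimal sketch of how such a consolidated playbook can gate checks behind tags; the task names and commands are illustrative assumptions, not the actual file contents:

```yaml
---
# Illustrative structure only; not the real diagnose/gitea.yml.
- name: Consolidated Gitea diagnosis
  hosts: production
  tasks:
    - name: Basic container status check          # runs on every invocation
      ansible.builtin.command: docker ps --filter name=gitea
      changed_when: false
      tags: [always]

    - name: Resource usage check                  # --tags deep or --tags complete
      ansible.builtin.command: docker stats gitea --no-stream
      changed_when: false
      tags: [never, deep, complete]

    - name: Inspect app.ini                       # --tags complete only
      ansible.builtin.command: docker exec gitea cat /data/gitea/conf/app.ini
      changed_when: false
      tags: [never, complete]
```

With no `--tags`, only the `always` tasks run; `--tags deep` adds the deep checks, and `--tags complete` runs everything. The special `never` tag keeps the heavier checks out of the default run.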
#### 1.2 Consolidated redundant fix playbooks
- ✅ Created `manage/gitea.yml` - Consolidates:
  - `fix-gitea-timeouts.yml`
  - `fix-gitea-traefik-connection.yml`
  - `fix-gitea-ssl-routing.yml`
  - `fix-gitea-servers-transport.yml`
  - `fix-gitea-complete.yml`
  - `restart-gitea-complete.yml`
  - `restart-gitea-with-cache.yml`
- ✅ Uses tags: `restart`, `fix-timeouts`, `fix-ssl`, `fix-servers-transport`, `complete`
- ✅ Removed the redundant playbooks
#### 1.3 Consolidated Traefik diagnosis/fix playbooks
- ✅ Created `diagnose/traefik.yml` - Consolidates:
  - `diagnose-traefik-restarts.yml`
  - `find-traefik-restart-source.yml`
  - `monitor-traefik-restarts.yml`
  - `monitor-traefik-continuously.yml`
  - `verify-traefik-fix.yml`
- ✅ Created `manage/traefik.yml` - Consolidates:
  - `stabilize-traefik.yml`
  - `disable-traefik-auto-restarts.yml`
- ✅ Uses tags: `restart-source`, `monitor`, `stabilize`, `disable-auto-restart`
- ✅ Removed the redundant playbooks
#### 1.4 Removed obsolete/redundant playbooks
- ✅ Removed `update-gitea-traefik-service.yml` (deprecated)
- ✅ Removed `ensure-gitea-traefik-discovery.yml` (redundant)
- ✅ Removed `test-gitea-after-fix.yml` (temporary)
- ✅ Removed `find-ansible-automation-source.yml` (temporary)
#### 1.5 Created new directory structure
- ✅ Created `playbooks/diagnose/` directory
- ✅ Created `playbooks/manage/` directory
- ✅ Created `playbooks/setup/` directory
- ✅ Created `playbooks/maintenance/` directory
- ✅ Created `playbooks/deploy/` directory
#### 1.6 Moved playbooks
- ✅ `setup-infrastructure.yml` → `setup/infrastructure.yml`
- ✅ `deploy-complete.yml` → `deploy/complete.yml`
- ✅ `deploy-image.yml` → `deploy/image.yml`
- ✅ `deploy-application-code.yml` → `deploy/code.yml`
- ✅ `setup-ssl-certificates.yml` → `setup/ssl.yml`
- ✅ `setup-gitea-initial-config.yml` → `setup/gitea.yml`
- ✅ `cleanup-all-containers.yml` → `maintenance/cleanup.yml`
#### 1.7 Updated README
- ✅ Updated `playbooks/README.md` with the new structure
- ✅ Documented the consolidated playbooks
- ✅ Added usage examples with tags
- ✅ Listed removed/consolidated playbooks
### Phase 2: Server Redeploy Preparation ✅

#### 2.1 Created backup script
- ✅ Created `maintenance/backup-before-redeploy.yml`
- ✅ Backs up:
  - Gitea data (volumes)
  - SSL certificates (acme.json)
  - Gitea configuration (app.ini)
  - Traefik configuration
  - PostgreSQL data (if applicable)
- ✅ Includes backup verification (a task sketch follows this list)
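A minimal sketch of the kind of backup and verification tasks such a playbook might contain; the volume path, backup directory, and `backup_name` variable are assumptions, not the actual file contents:

```yaml
# Illustrative only; paths and variable names are assumed.
- name: Archive Gitea data volume
  community.general.archive:
    path: /var/lib/docker/volumes/gitea_data      # assumed volume location
    dest: "/opt/backups/{{ backup_name }}/gitea-data.tar.gz"
    format: gz

- name: Verify the backup archive exists and is non-empty
  ansible.builtin.stat:
    path: "/opt/backups/{{ backup_name }}/gitea-data.tar.gz"
  register: backup_file
  failed_when: not backup_file.stat.exists or backup_file.stat.size | default(0) == 0
```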
#### 2.2 Created redeploy playbook
- ✅ Created `setup/redeploy-traefik-gitea-clean.yml`
- ✅ Features:
  - Automatic backup (optional)
  - Stop and remove containers (preserves volumes/acme.json; see the sketch below)
  - Sync configurations
  - Redeploy stacks
  - Restore Gitea configuration
  - Verify service discovery
  - Final tests
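A sketch of how the stop/remove step can preserve data, assuming the stack layout used elsewhere in this repository (`stacks_base_path` appears in the diagnosis playbooks below; the rest is illustrative):

```yaml
# Illustrative only; 'docker compose down' without -v keeps named volumes,
# and acme.json lives on the host filesystem, so both survive the teardown.
- name: Stop and remove the Gitea stack (volumes preserved)
  ansible.builtin.command:
    cmd: docker compose down --remove-orphans
    chdir: "{{ stacks_base_path }}/gitea"
  changed_when: true
```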
#### 2.3 Created redeploy guide
- ✅ Created `setup/REDEPLOY_GUIDE.md`
- ✅ Includes:
  - Step-by-step guide
  - Prerequisites
  - Backup verification
  - Rollback procedure
  - Troubleshooting
  - Common issues
#### 2.4 Created rollback playbook
- ✅ Created `maintenance/rollback-redeploy.yml`
- ✅ Features:
  - Restore from backup (a task sketch follows this list)
  - Restore volumes, configurations, SSL certificates
  - Restart stacks
  - Verification
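The restore direction mirrors the backup sketch above; again, the paths and the `backup_name` variable are assumptions:

```yaml
# Illustrative only; unpacks the backup archive created before the redeploy.
- name: Restore Gitea data volume from backup
  ansible.builtin.unarchive:
    src: "/opt/backups/{{ backup_name }}/gitea-data.tar.gz"
    dest: /var/lib/docker/volumes/
    remote_src: true      # the archive already lives on the managed host
```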
## New Playbook Structure

```
playbooks/
├── setup/                          # Initial setup
│   ├── infrastructure.yml
│   ├── gitea.yml
│   ├── ssl.yml
│   ├── redeploy-traefik-gitea-clean.yml
│   └── REDEPLOY_GUIDE.md
├── deploy/                         # Deployment
│   ├── complete.yml
│   ├── image.yml
│   └── code.yml
├── manage/                         # Management (consolidated)
│   ├── traefik.yml
│   └── gitea.yml
├── diagnose/                       # Diagnosis (consolidated)
│   ├── gitea.yml
│   └── traefik.yml
└── maintenance/                    # Maintenance
    ├── backup.yml
    ├── backup-before-redeploy.yml
    ├── cleanup.yml
    ├── rollback-redeploy.yml
    └── system.yml
```
## Usage Examples

### Gitea Diagnosis
```bash
# Basic
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml

# Deep
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags deep

# Complete
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags complete
```
### Gitea Management
```bash
# Restart
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags restart

# Fix timeouts
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags fix-timeouts

# Complete fix
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags complete
```
### Redeploy
```bash
# With automatic backup
ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass

# With existing backup
ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890" \
  -e "skip_backup=true"
```
### Rollback
```bash
ansible-playbook -i inventory/production.yml playbooks/maintenance/rollback-redeploy.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890"
```
## Statistics

- **Consolidated playbooks created**: 4 (`diagnose/gitea.yml`, `diagnose/traefik.yml`, `manage/gitea.yml`, `manage/traefik.yml`)
- **Redeploy playbooks created**: 3 (`redeploy-traefik-gitea-clean.yml`, `backup-before-redeploy.yml`, `rollback-redeploy.yml`)
- **Redundant playbooks removed**: ~20+
- **Playbooks moved to the new structure**: 7
- **Documentation created**: 2 (`README.md` updated, `REDEPLOY_GUIDE.md`)
## Next Steps

1. ✅ Test the consolidated playbooks (dry-run where possible)
2. ✅ Verify that the redeploy playbook works correctly
3. ✅ Update CI/CD workflows to use the new playbook paths if needed
4. ⏳ Perform the actual server redeploy when ready
## Notes

- All consolidated playbooks use tags for selective execution
- Old wrapper playbooks (e.g., `restart-traefik.yml`) still exist and work
- The backup playbook preserves all critical data
- The redeploy playbook includes comprehensive verification
- The rollback playbook allows quick recovery if needed
deployment/ansible/playbooks/README.md
@@ -1,42 +1,81 @@
# Ansible Playbooks - Overview

## New Structure

The playbooks have been reorganized into a clear directory structure:

```
playbooks/
├── setup/                          # Initial setup
│   ├── infrastructure.yml
│   ├── gitea.yml
│   └── ssl.yml
├── deploy/                         # Deployment
│   ├── complete.yml
│   ├── image.yml
│   └── code.yml
├── manage/                         # Management (consolidated)
│   ├── traefik.yml
│   ├── gitea.yml
│   └── application.yml
├── diagnose/                       # Diagnosis (consolidated)
│   ├── gitea.yml
│   ├── traefik.yml
│   └── application.yml
└── maintenance/                    # Maintenance
    ├── backup.yml
    ├── backup-before-redeploy.yml
    ├── cleanup.yml
    ├── rollback-redeploy.yml
    └── system.yml
```
## Available Playbooks

> **Note**: Most playbooks have been refactored into reusable roles. The playbooks are now thin wrappers that invoke the corresponding role tasks. This improves reusability and maintainability and follows Ansible best practices. A sketch of the wrapper shape follows.
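A minimal sketch of such a wrapper, modeled on the `setup/ssl.yml` entry described below (a wrapper for the `traefik` role with `tasks_from: ssl`); the play name and host group are assumptions:

```yaml
---
# Illustrative wrapper shape only; not the actual file contents.
- name: Setup SSL certificates (wrapper)
  hosts: production
  tasks:
    - name: Run the ssl task file of the traefik role
      ansible.builtin.include_role:
        name: traefik
        tasks_from: ssl
```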
### Infrastructure Setup
- **`setup-infrastructure.yml`** - Deploys all stacks (Traefik, PostgreSQL, Redis, Registry, Gitea, Monitoring, Production)
- **`setup-production-secrets.yml`** - Deploys secrets to production
- **`setup-ssl-certificates.yml`** - SSL certificate setup (wrapper for the `traefik` role, `tasks_from: ssl`)
- **`setup-wireguard-host.yml`** - WireGuard VPN setup
- **`sync-stacks.yml`** - Synchronizes stack configurations to the server

### Setup (Initial Setup)
### Deployment & Updates
- **`rollback.yml`** - Rollback to the previous version
- **`backup.yml`** - Creates backups of PostgreSQL, application data, Gitea, Registry
- **`deploy-image.yml`** - Docker image deployment (used by CI/CD workflows)
- **`setup/infrastructure.yml`** - Deploys all stacks (Traefik, PostgreSQL, Redis, Registry, Gitea, Monitoring, Production)
- **`setup/gitea.yml`** - Gitea initial configuration (wrapper for the `gitea` role, `tasks_from: setup`)
- **`setup/ssl.yml`** - SSL certificate setup (wrapper for the `traefik` role, `tasks_from: ssl`)
- **`setup/redeploy-traefik-gitea-clean.yml`** - Clean redeployment of the Traefik and Gitea stacks
- **`setup/REDEPLOY_GUIDE.md`** - Step-by-step guide for redeployment
### Traefik Management (role-based)

### Deployment

- **`deploy/complete.yml`** - Complete deployment (code + image + dependencies)
- **`deploy/image.yml`** - Docker image deployment (used by CI/CD workflows)
- **`deploy/code.yml`** - Deploys application code via Git (wrapper for the `application` role, `tasks_from: deploy_code`)

### Management (Consolidated)

#### Traefik Management
- **`manage/traefik.yml`** - Consolidated Traefik management (see the sketch after this list)
  - `--tags stabilize`: Fix acme.json, ensure Traefik is running, monitor stability
  - `--tags disable-auto-restart`: Check and document auto-restart mechanisms
- **`restart-traefik.yml`** - Restarts the Traefik container (wrapper for the `traefik` role, `tasks_from: restart`)
- **`recreate-traefik.yml`** - Recreates the Traefik container (wrapper for the `traefik` role, `tasks_from: restart` with `traefik_restart_action: recreate`)
- **`deploy-traefik-config.yml`** - Deploys Traefik configuration files (wrapper for the `traefik` role, `tasks_from: config`)
- **`check-traefik-acme-logs.yml`** - Checks Traefik ACME challenge logs (wrapper for the `traefik` role, `tasks_from: logs`)
- **`setup-ssl-certificates.yml`** - Sets up Let's Encrypt SSL certificates (wrapper for the `traefik` role, `tasks_from: ssl`)
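A sketch of the kind of task a `stabilize` tag might run; Traefik refuses to use an acme.json with loose permissions, and the path under `stacks_base_path` is an assumption:

```yaml
# Illustrative only; Traefik requires acme.json to be mode 0600.
- name: Ensure acme.json has strict permissions
  ansible.builtin.file:
    path: "{{ stacks_base_path }}/traefik/acme.json"
    mode: "0600"
  tags: [stabilize]
```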
### Gitea Management (role-based)

#### Gitea Management
- **`manage/gitea.yml`** - Consolidated Gitea management
  - `--tags restart`: Restart the Gitea container
  - `--tags fix-timeouts`: Restart Gitea and Traefik to fix timeouts
  - `--tags fix-ssl`: Fix SSL/routing issues
  - `--tags fix-servers-transport`: Update the ServersTransport configuration
  - `--tags complete`: Complete fix (stop runner, restart services, verify)
- **`check-and-restart-gitea.yml`** - Checks and restarts Gitea if unhealthy (wrapper for the `gitea` role, `tasks_from: restart`)
- **`fix-gitea-runner-config.yml`** - Fixes the Gitea runner configuration (wrapper for the `gitea` role, `tasks_from: runner` with `gitea_runner_action: fix`)
- **`register-gitea-runner.yml`** - Registers the Gitea runner (wrapper for the `gitea` role, `tasks_from: runner` with `gitea_runner_action: register`)
- **`update-gitea-config.yml`** - Updates the Gitea configuration (wrapper for the `gitea` role, `tasks_from: config`)
- **`setup-gitea-initial-config.yml`** - Gitea initial configuration (wrapper for the `gitea` role, `tasks_from: setup`)
- **`setup-gitea-repository.yml`** - Sets up a Gitea repository (wrapper for the `gitea` role, `tasks_from: repository`)
### Application Deployment (role-based)
- **`deploy-application-code.yml`** - Deploys application code via Git (wrapper for the `application` role, `tasks_from: deploy_code` with `application_deployment_method: git`)

#### Application Management
- **`manage/application.yml`** - Consolidated application management (to be created)
- **`sync-application-code.yml`** - Synchronizes application code via rsync (wrapper for the `application` role, `tasks_from: deploy_code` with `application_deployment_method: rsync`)
- **`install-composer-dependencies.yml`** - Installs Composer dependencies (wrapper for the `application` role, `tasks_from: composer`)

### Application Container Management (role-based)
- **`check-container-status.yml`** - Checks container status (wrapper for the `application` role, `tasks_from: health_check`)
- **`check-container-logs.yml`** - Checks container logs (wrapper for the `application` role, `tasks_from: logs`)
- **`check-worker-logs.yml`** - Checks worker and scheduler logs (wrapper for the `application` role, `tasks_from: logs` with `application_logs_check_vendor: true`)
@@ -46,28 +85,89 @@
- **`recreate-containers-with-env.yml`** - Recreates containers with environment variables (wrapper for the `application` role, `tasks_from: containers` with `application_container_action: recreate-with-env`)
- **`sync-and-recreate-containers.yml`** - Syncs and recreates containers (wrapper for the `application` role, `tasks_from: containers` with `application_container_action: sync-recreate`)

### Diagnosis (Consolidated)

#### Gitea Diagnosis
- **`diagnose/gitea.yml`** - Consolidated Gitea diagnosis
  - Basic checks (always): container status, health endpoints, network connectivity, service discovery
  - `--tags deep`: resource usage, multiple connection tests, log analysis
  - `--tags complete`: all checks including app.ini, ServersTransport, etc.

#### Traefik Diagnosis
- **`diagnose/traefik.yml`** - Consolidated Traefik diagnosis
  - Basic checks (always): container status, restart count, recent logs
  - `--tags restart-source`: Find the source of restart loops (cron jobs, systemd, scripts)
  - `--tags monitor`: Monitor for restarts over time
### Maintenance
- **`cleanup-all-containers.yml`** - Stops and removes all containers, cleans up networks and volumes (for a full server reset)
- **`system-maintenance.yml`** - System updates, unattended upgrades, Docker pruning
- **`troubleshoot.yml`** - Unified troubleshooting with tags

- **`maintenance/backup.yml`** - Creates backups of PostgreSQL, application data, Gitea, Registry
- **`maintenance/backup-before-redeploy.yml`** - Backup before redeploy (Gitea data, SSL certificates, configurations)
- **`maintenance/rollback-redeploy.yml`** - Rollback from a redeploy backup
- **`maintenance/cleanup.yml`** - Stops and removes all containers, cleans up networks and volumes (for a full server reset)
- **`maintenance/system.yml`** - System updates, unattended upgrades, Docker pruning
- **`rollback.yml`** - Rollback to the previous version
### WireGuard

- **`generate-wireguard-client.yml`** - Generates a WireGuard client config
- **`wireguard-routing.yml`** - Configures WireGuard routing
- **`setup-wireguard-host.yml`** - WireGuard VPN setup

### Initial Deployment

- **`build-initial-image.yml`** - Builds and pushes the initial Docker image (for the first deployment)

### CI/CD & Development

- **`setup-gitea-runner-ci.yml`** - Gitea runner CI setup
- **`install-docker.yml`** - Docker installation on the server
## Removed/Legacy Playbooks

## Removed/Consolidated Playbooks

The following playbooks were removed because they are no longer needed:
- ~~`build-and-push.yml`~~ - Replaced by the CI/CD pipeline
- ~~`remove-framework-production-stack.yml`~~ - Temporary playbook
- ~~`remove-temporary-grafana-ip.yml`~~ - Temporary playbook

The following playbooks were consolidated or removed:
### Consolidated into `diagnose/gitea.yml`:
- ~~`diagnose-gitea-timeouts.yml`~~
- ~~`diagnose-gitea-timeout-deep.yml`~~
- ~~`diagnose-gitea-timeout-live.yml`~~
- ~~`diagnose-gitea-timeouts-complete.yml`~~
- ~~`comprehensive-gitea-diagnosis.yml`~~
### Consolidated into `manage/gitea.yml`:
- ~~`fix-gitea-timeouts.yml`~~
- ~~`fix-gitea-traefik-connection.yml`~~
- ~~`fix-gitea-ssl-routing.yml`~~
- ~~`fix-gitea-servers-transport.yml`~~
- ~~`fix-gitea-complete.yml`~~
- ~~`restart-gitea-complete.yml`~~
- ~~`restart-gitea-with-cache.yml`~~
### Consolidated into `diagnose/traefik.yml`:
- ~~`diagnose-traefik-restarts.yml`~~
- ~~`find-traefik-restart-source.yml`~~
- ~~`monitor-traefik-restarts.yml`~~
- ~~`monitor-traefik-continuously.yml`~~
- ~~`verify-traefik-fix.yml`~~
### Consolidated into `manage/traefik.yml`:
- ~~`stabilize-traefik.yml`~~
- ~~`disable-traefik-auto-restarts.yml`~~
### Removed (obsolete/redundant):
- ~~`update-gitea-traefik-service.yml`~~ - Deprecated (as documented in the code)
- ~~`ensure-gitea-traefik-discovery.yml`~~ - Redundant
- ~~`test-gitea-after-fix.yml`~~ - Temporary
- ~~`find-ansible-automation-source.yml`~~ - Temporary
### Moved:
- `setup-infrastructure.yml` → `setup/infrastructure.yml`
- `deploy-complete.yml` → `deploy/complete.yml`
- `deploy-image.yml` → `deploy/image.yml`
- `deploy-application-code.yml` → `deploy/code.yml`
- `setup-ssl-certificates.yml` → `setup/ssl.yml`
- `setup-gitea-initial-config.yml` → `setup/gitea.yml`
- `cleanup-all-containers.yml` → `maintenance/cleanup.yml`
## Usage

@@ -78,6 +178,69 @@ cd deployment/ansible
ansible-playbook -i inventory/production.yml playbooks/<playbook>.yml --vault-password-file secrets/.vault_pass
```
### Consolidated Playbooks with Tags

**Gitea diagnosis:**
```bash
# Basic diagnosis (default)
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --vault-password-file secrets/.vault_pass

# Deep diagnosis
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags deep --vault-password-file secrets/.vault_pass

# Complete diagnosis
ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags complete --vault-password-file secrets/.vault_pass
```

**Gitea management:**
```bash
# Restart Gitea
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags restart --vault-password-file secrets/.vault_pass

# Fix timeouts
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags fix-timeouts --vault-password-file secrets/.vault_pass

# Complete fix
ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags complete --vault-password-file secrets/.vault_pass
```

**Traefik diagnosis:**
```bash
# Basic diagnosis
ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --vault-password-file secrets/.vault_pass

# Find restart source
ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags restart-source --vault-password-file secrets/.vault_pass

# Monitor restarts
ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags monitor --vault-password-file secrets/.vault_pass
```

**Traefik management:**
```bash
# Stabilize Traefik
ansible-playbook -i inventory/production.yml playbooks/manage/traefik.yml --tags stabilize --vault-password-file secrets/.vault_pass
```

**Redeploy:**
```bash
# With automatic backup
ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml --vault-password-file secrets/.vault_pass

# With existing backup
ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890" \
  -e "skip_backup=true"
```

**Rollback:**
```bash
ansible-playbook -i inventory/production.yml playbooks/maintenance/rollback-redeploy.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890"
```
### Role-Based Playbooks

Most playbooks are now wrappers that use roles. The functionality stays the same, but the implementation is organized in reusable roles:

@@ -99,7 +262,7 @@ ansible-playbook -i inventory/production.yml playbooks/fix-gitea-runner-config.y
**Example: Application code deployment**
```bash
# Git-based (default):
ansible-playbook -i inventory/production.yml playbooks/deploy-application-code.yml \
ansible-playbook -i inventory/production.yml playbooks/deploy/code.yml \
  -e "deployment_environment=staging" \
  -e "git_branch=staging" \
  --vault-password-file secrets/.vault_pass
@@ -109,21 +272,6 @@ ansible-playbook -i inventory/production.yml playbooks/sync-application-code.yml
  --vault-password-file secrets/.vault_pass
```
### Using Tags

Many playbooks support tags for selective execution:

```bash
# Only Traefik-related tasks:
ansible-playbook -i inventory/production.yml playbooks/restart-traefik.yml --tags traefik,restart

# Only Gitea-related tasks:
ansible-playbook -i inventory/production.yml playbooks/check-and-restart-gitea.yml --tags gitea,restart

# Only application-related tasks:
ansible-playbook -i inventory/production.yml playbooks/deploy-application-code.yml --tags application,deploy
```

## Role Structure

The playbooks now use the following roles:
@@ -143,11 +291,11 @@ The playbooks now use the following roles:
- **Location**: `roles/application/tasks/`
- **Defaults**: `roles/application/defaults/main.yml`

## Benefits of the Role-Based Structure

1. **Reusability**: Tasks can be used in multiple playbooks
2. **Maintainability**: Changes are made centrally in roles
3. **Testability**: Roles can be tested in isolation
4. **Clarity**: Clear structure by component
5. **Best practices**: Follows Ansible recommendations

## Benefits of the New Structure

1. **Clarity**: Clear directory structure by function
2. **Consolidation**: Redundant playbooks merged
3. **Tags**: Selective execution with tags
4. **Reusability**: Tasks can be used in multiple playbooks
5. **Maintainability**: Changes are made centrally in roles
6. **Best practices**: Follows Ansible recommendations
playbooks/comprehensive-gitea-diagnosis.yml (deleted)
@@ -1,195 +0,0 @@
---
# Comprehensive Gitea Timeout Diagnosis
# Checks all aspects of the intermittent Gitea timeout problem
- name: Comprehensive Gitea Timeout Diagnosis
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Check Traefik container uptime and restart count
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: traefik_info
      changed_when: false

    - name: Check Gitea container uptime and restart count
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: gitea_info
      changed_when: false

    - name: Check Traefik logs for recent restarts (last 2 hours)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since 2h 2>&1 | grep -iE "stopping server gracefully|I have to go|restart|shutdown" | tail -20 || echo "No restart messages in the last 2 hours"
      register: traefik_restart_logs
      changed_when: false

    - name: Check Gitea logs for errors/timeouts (last 2 hours)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose logs gitea --since 2h 2>&1 | grep -iE "error|timeout|failed|panic|fatal|slow" | tail -30 || echo "No errors in the last 2 hours"
      register: gitea_error_logs
      changed_when: false

    - name: Test Gitea direct connection (multiple attempts)
      ansible.builtin.shell: |
        for i in {1..5}; do
          echo "=== Attempt $i ==="
          cd {{ gitea_stack_path }}
          timeout 5 docker compose exec -T gitea curl -f http://localhost:3000/api/healthz 2>&1 || echo "FAILED"
          sleep 1
        done
      register: gitea_direct_tests
      changed_when: false

    - name: Test Gitea via Traefik (multiple attempts)
      ansible.builtin.shell: |
        for i in {1..5}; do
          echo "=== Attempt $i ==="
          timeout 10 curl -k -s -o /dev/null -w "%{http_code}" {{ gitea_url }}/api/healthz 2>&1 || echo "TIMEOUT"
          sleep 2
        done
      register: gitea_traefik_tests
      changed_when: false

    - name: Check Traefik service discovery for Gitea (using CLI)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik traefik show providers docker 2>/dev/null | grep -i "gitea" || echo "Gitea service not found in Traefik providers"
      register: traefik_gitea_service
      changed_when: false
      failed_when: false

    - name: Check Traefik routers for Gitea (using CLI)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik traefik show providers docker 2>/dev/null | grep -i "gitea" || echo "Gitea router not found in Traefik providers"
      register: traefik_gitea_router
      changed_when: false
      failed_when: false

    - name: Check network connectivity Traefik -> Gitea
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        for i in {1..3}; do
          echo "=== Attempt $i ==="
          docker compose exec -T traefik wget -qO- --timeout=5 http://gitea:3000/api/healthz 2>&1 || echo "CONNECTION_FAILED"
          sleep 1
        done
      register: traefik_gitea_network
      changed_when: false

    - name: Check Gitea container resources (CPU/Memory)
      ansible.builtin.shell: |
        docker stats gitea --no-stream --format 'CPU: {{ '{{' }}.CPUPerc{{ '}}' }} | Memory: {{ '{{' }}.MemUsage{{ '}}' }}' 2>/dev/null || echo "Could not get stats"
      register: gitea_resources
      changed_when: false
      failed_when: false

    - name: Check Traefik container resources (CPU/Memory)
      ansible.builtin.shell: |
        docker stats traefik --no-stream --format 'CPU: {{ '{{' }}.CPUPerc{{ '}}' }} | Memory: {{ '{{' }}.MemUsage{{ '}}' }}' 2>/dev/null || echo "Could not get stats"
      register: traefik_resources
      changed_when: false
      failed_when: false

    - name: Check if Gitea is in traefik-public network
      ansible.builtin.shell: |
        docker network inspect traefik-public --format '{{ '{{' }}range .Containers{{ '}}' }}{{ '{{' }}.Name{{ '}}' }} {{ '{{' }}end{{ '}}' }}' 2>/dev/null | grep -q gitea && echo "YES" || echo "NO"
      register: gitea_in_network
      changed_when: false

    - name: Check Traefik access logs for Gitea requests (last 100 lines)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        tail -100 logs/access.log 2>/dev/null | grep -i "git.michaelschiemer.de" | tail -20 || echo "No access logs found"
      register: traefik_access_logs
      changed_when: false
      failed_when: false

    - name: Check Traefik error logs for Gitea-related errors
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        tail -100 logs/traefik.log 2>/dev/null | grep -iE "gitea|git\.michaelschiemer\.de|timeout|error.*gitea" | tail -20 || echo "No Gitea errors in Traefik logs"
      register: traefik_error_logs
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          COMPREHENSIVE GITEA TIMEOUT DIAGNOSIS:
          ================================================================================

          Container status:
          - Traefik: {{ traefik_info.stdout }}
          - Gitea: {{ gitea_info.stdout }}

          Traefik restart logs (last 2h):
          {{ traefik_restart_logs.stdout }}

          Gitea error logs (last 2h):
          {{ gitea_error_logs.stdout }}

          Direct Gitea connection (5 attempts):
          {{ gitea_direct_tests.stdout }}

          Gitea via Traefik (5 attempts):
          {{ gitea_traefik_tests.stdout }}

          Traefik service discovery:
          - Gitea service: {{ traefik_gitea_service.stdout }}
          - Gitea router: {{ traefik_gitea_router.stdout }}

          Network connection Traefik -> Gitea (3 attempts):
          {{ traefik_gitea_network.stdout }}

          Container resources:
          - Gitea: {{ gitea_resources.stdout }}
          - Traefik: {{ traefik_resources.stdout }}

          Network:
          - Gitea in traefik-public: {% if gitea_in_network.stdout == 'YES' %}✅{% else %}❌{% endif %}

          Traefik access logs (last 20 Gitea requests):
          {{ traefik_access_logs.stdout }}

          Traefik error logs (Gitea-related):
          {{ traefik_error_logs.stdout }}

          ================================================================================
          ANALYSIS:
          ================================================================================

          {% if 'stopping server gracefully' in traefik_restart_logs.stdout | lower or 'I have to go' in traefik_restart_logs.stdout %}
          ❌ PROBLEM: Traefik is being stopped regularly!
             → This is the main cause of the timeouts
             → Run 'find-traefik-restart-source.yml' to find the source
          {% endif %}

          {% if 'CONNECTION_FAILED' in traefik_gitea_network.stdout %}
          ❌ PROBLEM: Traefik cannot reach Gitea
             → Network problem between Traefik and Gitea
             → Check whether both containers are in the traefik-public network
          {% endif %}

          {% if 'not found' in traefik_gitea_service.stdout | lower or 'not found' in traefik_gitea_router.stdout | lower %}
          ❌ PROBLEM: Gitea is not in Traefik service discovery
             → Traefik has not detected Gitea
             → Run 'fix-gitea-timeouts.yml' to restart both
          {% endif %}

          {% if 'TIMEOUT' in gitea_traefik_tests.stdout %}
          ⚠️ PROBLEM: Intermittent timeouts via Traefik
             → Possible causes: Traefik restarts, Gitea performance, network problems
          {% endif %}

          ================================================================================
playbooks/diagnose-gitea-timeout-deep.yml (deleted)
@@ -1,499 +0,0 @@
|
||||
---
|
||||
# Diagnose Gitea Timeout - Deep Analysis während Request
|
||||
# Führt alle Checks während eines tatsächlichen Requests durch, inkl. pg_stat_activity, Redis, Backpressure-Tests
|
||||
- name: Diagnose Gitea Timeout Deep Analysis During Request
|
||||
hosts: production
|
||||
gather_facts: yes
|
||||
become: no
|
||||
vars:
|
||||
gitea_stack_path: "{{ stacks_base_path }}/gitea"
|
||||
traefik_stack_path: "{{ stacks_base_path }}/traefik"
|
||||
gitea_url: "https://{{ gitea_domain }}"
|
||||
test_duration_seconds: 60 # Wie lange wir testen
|
||||
test_timestamp: "{{ ansible_date_time.epoch }}"
|
||||
postgres_max_connections: 300
|
||||
|
||||
tasks:
|
||||
- name: Display diagnostic plan
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
GITEA TIMEOUT DEEP DIAGNOSE - LIVE WÄHREND REQUEST
|
||||
================================================================================
|
||||
|
||||
Diese erweiterte Diagnose führt alle Checks während eines tatsächlichen Requests durch:
|
||||
|
||||
1. Docker Stats (CPU/RAM/IO) während Request
|
||||
2. pg_stat_activity: Connection Count vs max_connections ({{ postgres_max_connections }})
|
||||
3. Redis Ping Check (Session-Store-Blockaden)
|
||||
4. Gitea localhost Test (Backpressure-Analyse)
|
||||
5. Gitea Logs (DB-Timeouts, Panics, "context deadline exceeded", SESSION: context canceled)
|
||||
6. Postgres Logs (Connection issues, authentication timeouts)
|
||||
7. Traefik Logs ("backend connection error", "EOF")
|
||||
8. Runner Status und git-upload-pack/git gc Jobs
|
||||
|
||||
Test-Dauer: {{ test_duration_seconds }} Sekunden
|
||||
Timestamp: {{ test_timestamp }}
|
||||
================================================================================
|
||||
|
||||
- name: Get initial container stats (baseline)
|
||||
ansible.builtin.shell: |
|
||||
docker stats --no-stream --format "table {{ '{{' }}.Name{{ '}}' }}\t{{ '{{' }}.CPUPerc{{ '}}' }}\t{{ '{{' }}.MemUsage{{ '}}' }}\t{{ '{{' }}.NetIO{{ '}}' }}\t{{ '{{' }}.BlockIO{{ '}}' }}" gitea gitea-postgres gitea-redis traefik 2>/dev/null || echo "Stats collection failed"
|
||||
register: initial_stats
|
||||
changed_when: false
|
||||
|
||||
- name: Get initial PostgreSQL connection count
|
||||
ansible.builtin.shell: |
|
||||
cd {{ gitea_stack_path }}
|
||||
docker compose exec -T postgres psql -U gitea -d gitea -c "SELECT count(*) as connection_count FROM pg_stat_activity;" 2>&1 | grep -E "^[[:space:]]*[0-9]+" | head -1 || echo "0"
|
||||
register: initial_pg_connections
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Start collecting Docker stats in background
|
||||
ansible.builtin.shell: |
|
||||
timeout {{ test_duration_seconds }} docker stats --format "{{ '{{' }}.Name{{ '}}' }},{{ '{{' }}.CPUPerc{{ '}}' }},{{ '{{' }}.MemUsage{{ '}}' }},{{ '{{' }}.NetIO{{ '}}' }},{{ '{{' }}.BlockIO{{ '}}' }}" gitea gitea-postgres gitea-redis traefik 2>/dev/null | while read line; do
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
|
||||
done > /tmp/gitea_stats_{{ test_timestamp }}.log 2>&1 &
|
||||
STATS_PID=$!
|
||||
echo $STATS_PID
|
||||
register: stats_pid
|
||||
changed_when: false
|
||||
|
||||
- name: Start collecting Gitea logs in background
|
||||
ansible.builtin.shell: |
|
||||
cd {{ gitea_stack_path }}
|
||||
timeout {{ test_duration_seconds }} docker compose logs -f gitea 2>&1 | while read line; do
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
|
||||
done > /tmp/gitea_logs_{{ test_timestamp }}.log 2>&1 &
|
||||
echo $!
|
||||
register: gitea_logs_pid
|
||||
changed_when: false
|
||||
|
||||
- name: Start collecting Postgres logs in background
|
||||
ansible.builtin.shell: |
|
||||
cd {{ gitea_stack_path }}
|
||||
timeout {{ test_duration_seconds }} docker compose logs -f postgres 2>&1 | while read line; do
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
|
||||
done > /tmp/postgres_logs_{{ test_timestamp }}.log 2>&1 &
|
||||
echo $!
|
||||
register: postgres_logs_pid
|
||||
changed_when: false
|
||||
|
||||
- name: Start collecting Traefik logs in background
|
||||
ansible.builtin.shell: |
|
||||
cd {{ traefik_stack_path }}
|
||||
timeout {{ test_duration_seconds }} docker compose logs -f traefik 2>&1 | while read line; do
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
|
||||
done > /tmp/traefik_logs_{{ test_timestamp }}.log 2>&1 &
|
||||
echo $!
|
||||
register: traefik_logs_pid
|
||||
changed_when: false
|
||||
|
||||
- name: Start monitoring pg_stat_activity in background
|
||||
ansible.builtin.shell: |
|
||||
cd {{ gitea_stack_path }}
|
||||
for i in $(seq 1 {{ (test_duration_seconds / 5) | int }}); do
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $(docker compose exec -T postgres psql -U gitea -d gitea -t -c 'SELECT count(*) FROM pg_stat_activity;' 2>&1 | tr -d ' ' || echo 'ERROR')"
|
||||
sleep 5
|
||||
done > /tmp/pg_stat_activity_{{ test_timestamp }}.log 2>&1 &
|
||||
echo $!
|
||||
register: pg_stat_pid
|
||||
changed_when: false
|
||||
|
||||
- name: Wait a moment for log collection to start
|
||||
ansible.builtin.pause:
|
||||
seconds: 2
|
||||
|
||||
- name: Trigger Gitea request via Traefik (with timeout)
|
||||
ansible.builtin.shell: |
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Starting request to {{ gitea_url }}/api/healthz"
|
||||
timeout 35 curl -k -v -s -o /tmp/gitea_response_{{ test_timestamp }}.log -w "\nHTTP_CODE:%{http_code}\nTIME_TOTAL:%{time_total}\nTIME_CONNECT:%{time_connect}\nTIME_STARTTRANSFER:%{time_starttransfer}\n" "{{ gitea_url }}/api/healthz" 2>&1 | tee /tmp/gitea_curl_{{ test_timestamp }}.log
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Request completed"
|
||||
register: gitea_request
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Test Gitea localhost (Backpressure-Test)
|
||||
ansible.builtin.shell: |
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Starting localhost test"
|
||||
cd {{ gitea_stack_path }}
|
||||
timeout 35 docker compose exec -T gitea curl -f -s -w "\nHTTP_CODE:%{http_code}\nTIME_TOTAL:%{time_total}\n" http://localhost:3000/api/healthz 2>&1 | tee /tmp/gitea_localhost_{{ test_timestamp }}.log || echo "LOCALHOST_TEST_FAILED" > /tmp/gitea_localhost_{{ test_timestamp }}.log
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Localhost test completed"
|
||||
register: gitea_localhost_test
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Test direct connection Traefik → Gitea (parallel)
|
||||
ansible.builtin.shell: |
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Starting direct test Traefik → Gitea"
|
||||
cd {{ traefik_stack_path }}
|
||||
timeout 35 docker compose exec -T traefik wget -qO- --timeout=30 http://gitea:3000/api/healthz 2>&1 | tee /tmp/traefik_gitea_direct_{{ test_timestamp }}.log || echo "DIRECT_TEST_FAILED" > /tmp/traefik_gitea_direct_{{ test_timestamp }}.log
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Direct test completed"
|
||||
register: traefik_direct_test
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Test Redis connection during request
|
||||
ansible.builtin.shell: |
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Testing Redis connection"
|
||||
cd {{ gitea_stack_path }}
|
||||
docker compose exec -T redis redis-cli ping 2>&1 | tee /tmp/redis_ping_{{ test_timestamp }}.log || echo "REDIS_PING_FAILED" > /tmp/redis_ping_{{ test_timestamp }}.log
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Redis ping completed"
|
||||
register: redis_ping_test
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Check Gitea Runner status
|
||||
ansible.builtin.shell: |
|
||||
docker ps --format "{{ '{{' }}.Names{{ '}}' }}" | grep -q "gitea-runner" && echo "RUNNING" || echo "STOPPED"
|
||||
register: runner_status
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Wait for log collection to complete
|
||||
ansible.builtin.pause:
|
||||
seconds: "{{ test_duration_seconds - 5 }}"
|
||||
|
||||
- name: Stop background processes
|
||||
ansible.builtin.shell: |
|
||||
pkill -f "docker.*stats.*gitea" || true
|
||||
pkill -f "docker compose logs.*gitea" || true
|
||||
pkill -f "docker compose logs.*postgres" || true
|
||||
pkill -f "docker compose logs.*traefik" || true
|
||||
pkill -f "pg_stat_activity" || true
|
||||
sleep 2
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Get final PostgreSQL connection count
|
||||
ansible.builtin.shell: |
|
||||
cd {{ gitea_stack_path }}
|
||||
docker compose exec -T postgres psql -U gitea -d gitea -c "SELECT count(*) as connection_count FROM pg_stat_activity;" 2>&1 | grep -E "^[[:space:]]*[0-9]+" | head -1 || echo "0"
|
||||
register: final_pg_connections
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Collect stats results
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/gitea_stats_{{ test_timestamp }}.log"
|
||||
register: stats_results
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Collect pg_stat_activity results
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/pg_stat_activity_{{ test_timestamp }}.log"
|
||||
register: pg_stat_results
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Collect Gitea logs results
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/gitea_logs_{{ test_timestamp }}.log"
|
||||
register: gitea_logs_results
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Collect Postgres logs results
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/postgres_logs_{{ test_timestamp }}.log"
|
||||
register: postgres_logs_results
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Collect Traefik logs results
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/traefik_logs_{{ test_timestamp }}.log"
|
||||
register: traefik_logs_results
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Get request result
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/gitea_curl_{{ test_timestamp }}.log"
|
||||
register: request_result
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Get localhost test result
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/gitea_localhost_{{ test_timestamp }}.log"
|
||||
register: localhost_result
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Get direct test result
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/traefik_gitea_direct_{{ test_timestamp }}.log"
|
||||
register: direct_test_result
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Get Redis ping result
|
||||
ansible.builtin.slurp:
|
||||
src: "/tmp/redis_ping_{{ test_timestamp }}.log"
|
||||
register: redis_ping_result
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Analyze pg_stat_activity for connection count
|
||||
ansible.builtin.shell: |
|
||||
if [ -f /tmp/pg_stat_activity_{{ test_timestamp }}.log ]; then
|
||||
echo "=== POSTGRES CONNECTION COUNT ANALYSIS ==="
|
||||
echo "Initial connections: {{ initial_pg_connections.stdout }}"
|
||||
echo "Final connections: {{ final_pg_connections.stdout }}"
|
||||
echo "Max connections: {{ postgres_max_connections }}"
|
||||
echo ""
|
||||
echo "=== CONNECTION COUNT TIMELINE ==="
|
||||
cat /tmp/pg_stat_activity_{{ test_timestamp }}.log | tail -20 || echo "No connection count data"
|
||||
echo ""
|
||||
echo "=== CONNECTION COUNT ANALYSIS ==="
|
||||
MAX_COUNT=$(cat /tmp/pg_stat_activity_{{ test_timestamp }}.log | grep -E "^\[.*\] [0-9]+" | awk -F'] ' '{print $2}' | sort -n | tail -1 || echo "0")
|
||||
if [ "$MAX_COUNT" != "0" ] && [ "$MAX_COUNT" != "" ]; then
|
||||
echo "Maximum connections during test: $MAX_COUNT"
|
||||
WARNING_THRESHOLD=$(({{ postgres_max_connections }} * 80 / 100))
|
||||
if [ "$MAX_COUNT" -gt "$WARNING_THRESHOLD" ]; then
|
||||
echo "⚠️ WARNING: Connection count ($MAX_COUNT) is above 80% of max_connections ({{ postgres_max_connections }})"
|
||||
echo " Consider reducing MAX_OPEN_CONNS or increasing max_connections"
|
||||
else
|
||||
echo "✅ Connection count is within safe limits"
|
||||
fi
|
||||
fi
|
||||
else
|
||||
echo "pg_stat_activity log file not found"
|
||||
fi
|
||||
register: pg_stat_analysis
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Analyze stats for high CPU/Memory/IO
|
||||
ansible.builtin.shell: |
|
||||
if [ -f /tmp/gitea_stats_{{ test_timestamp }}.log ]; then
|
||||
echo "=== STATS SUMMARY ==="
|
||||
echo "Total samples: $(wc -l < /tmp/gitea_stats_{{ test_timestamp }}.log)"
|
||||
echo ""
|
||||
echo "=== HIGH CPU (>80%) ==="
|
||||
grep -E "gitea|gitea-postgres" /tmp/gitea_stats_{{ test_timestamp }}.log | awk -F',' '{cpu=$2; gsub(/%/, "", cpu); if (cpu+0 > 80) print $0}' | head -10 || echo "No high CPU usage found"
|
||||
echo ""
|
||||
echo "=== MEMORY USAGE ==="
|
||||
grep -E "gitea" /tmp/gitea_stats_{{ test_timestamp }}.log | tail -5 || echo "No memory stats"
|
||||
else
|
||||
echo "Stats file not found"
|
||||
fi
|
||||
register: stats_analysis
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Analyze Gitea logs for errors (including SESSION context canceled, panic, git-upload-pack)
|
||||
ansible.builtin.shell: |
|
||||
if [ -f /tmp/gitea_logs_{{ test_timestamp }}.log ]; then
|
||||
echo "=== DB-TIMEOUTS / CONNECTION ERRORS ==="
|
||||
grep -iE "timeout|deadline exceeded|connection.*failed|database.*error|postgres.*error|context.*deadline" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -20 || echo "No DB-timeouts found"
|
||||
echo ""
|
||||
echo "=== SESSION: CONTEXT CANCELED ==="
|
||||
grep -iE "SESSION.*context canceled|session.*release.*context canceled" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No SESSION: context canceled found"
|
||||
echo ""
|
||||
echo "=== PANICS / FATAL ERRORS ==="
|
||||
grep -iE "panic|fatal|error.*fatal" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No panics found"
|
||||
echo ""
|
||||
echo "=== GIT-UPLOAD-PACK REQUESTS (can block) ==="
|
||||
grep -iE "git-upload-pack|ServiceUploadPack" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No git-upload-pack requests found"
|
||||
echo ""
|
||||
echo "=== GIT GC JOBS (can hold connections) ==="
|
||||
grep -iE "git.*gc|garbage.*collect" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No git gc jobs found"
|
||||
echo ""
|
||||
echo "=== SLOW QUERIES / PERFORMANCE ==="
|
||||
grep -iE "slow|performance|took.*ms|duration" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No slow queries found"
|
||||
else
|
||||
echo "Gitea logs file not found"
|
||||
fi
|
||||
register: gitea_logs_analysis
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Analyze Postgres logs for errors
|
||||
ansible.builtin.shell: |
|
||||
if [ -f /tmp/postgres_logs_{{ test_timestamp }}.log ]; then
|
||||
echo "=== POSTGRES ERRORS ==="
|
||||
grep -iE "error|timeout|deadlock|connection.*refused|too many connections|authentication.*timeout" /tmp/postgres_logs_{{ test_timestamp }}.log | tail -20 || echo "No Postgres errors found"
|
||||
echo ""
|
||||
echo "=== SLOW QUERIES ==="
|
||||
grep -iE "slow|duration|statement.*took" /tmp/postgres_logs_{{ test_timestamp }}.log | tail -10 || echo "No slow queries found"
|
||||
else
|
||||
echo "Postgres logs file not found"
|
||||
fi
|
||||
register: postgres_logs_analysis
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Analyze Traefik logs for backend errors
|
||||
ansible.builtin.shell: |
|
||||
if [ -f /tmp/traefik_logs_{{ test_timestamp }}.log ]; then
|
||||
echo "=== BACKEND CONNECTION ERRORS ==="
|
||||
grep -iE "backend.*error|connection.*error|EOF|gitea.*error|git\.michaelschiemer\.de.*error" /tmp/traefik_logs_{{ test_timestamp }}.log | tail -20 || echo "No backend errors found"
|
||||
echo ""
|
||||
echo "=== TIMEOUT ERRORS ==="
|
||||
grep -iE "timeout|504|gateway.*timeout" /tmp/traefik_logs_{{ test_timestamp }}.log | tail -10 || echo "No timeout errors found"
|
||||
else
|
||||
echo "Traefik logs file not found"
|
||||
fi
|
||||
register: traefik_logs_analysis
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Display comprehensive diagnosis
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
GITEA TIMEOUT DEEP DIAGNOSE - ERGEBNISSE
|
||||
================================================================================
|
||||
|
||||
BASELINE STATS (vor Request):
|
||||
{{ initial_stats.stdout }}
|
||||
|
||||
POSTGRES CONNECTION COUNT:
|
||||
{{ pg_stat_analysis.stdout }}
|
||||
|
||||
REQUEST ERGEBNIS (Traefik → Gitea):
|
||||
{% if request_result.content is defined and request_result.content != '' %}
|
||||
{{ request_result.content | b64decode }}
|
||||
{% else %}
|
||||
Request-Ergebnis nicht verfügbar
|
||||
{% endif %}
|
||||
|
||||
BACKPRESSURE TEST - GITEA LOCALHOST:
|
||||
{% if localhost_result.content is defined and localhost_result.content != '' %}
|
||||
{{ localhost_result.content | b64decode }}
|
||||
{% else %}
|
||||
Localhost-Test-Ergebnis nicht verfügbar
|
||||
{% endif %}
|
||||
|
||||
DIREKTER TEST TRAEFIK → GITEA:
|
||||
{% if direct_test_result.content is defined and direct_test_result.content != '' %}
|
||||
{{ direct_test_result.content | b64decode }}
|
||||
{% else %}
|
||||
Direkter Test-Ergebnis nicht verfügbar
|
||||
{% endif %}
|
||||
|
||||
REDIS PING TEST:
|
||||
{% if redis_ping_result.content is defined and redis_ping_result.content != '' %}
|
||||
{{ redis_ping_result.content | b64decode }}
|
||||
{% else %}
|
||||
Redis-Ping-Ergebnis nicht verfügbar
|
||||
{% endif %}
|
||||
|
||||
RUNNER STATUS:
|
||||
- Status: {{ runner_status.stdout }}
|
||||
|
||||
================================================================================
|
||||
STATS-ANALYSE (während Request):
|
||||
================================================================================
|
||||
{{ stats_analysis.stdout }}
|
||||
|
||||
================================================================================
|
||||
GITEA LOGS-ANALYSE:
|
||||
================================================================================
|
||||
{{ gitea_logs_analysis.stdout }}
|
||||
|
||||
================================================================================
|
||||
POSTGRES LOGS-ANALYSE:
|
||||
================================================================================
|
||||
{{ postgres_logs_analysis.stdout }}
|
||||
|
||||
================================================================================
|
||||
TRAEFIK LOGS-ANALYSE:
|
||||
================================================================================
|
||||
{{ traefik_logs_analysis.stdout }}
|
||||
|
||||
================================================================================
|
||||
INTERPRETATION:
|
||||
================================================================================
|
||||
|
||||
          {% set request_content = request_result.content | default('') | b64decode | default('') %}
          {% set localhost_content = localhost_result.content | default('') | b64decode | default('') %}
          {% set direct_content = direct_test_result.content | default('') | b64decode | default('') %}
          {% set redis_content = redis_ping_result.content | default('') | b64decode | default('') %}
          {% set traefik_errors = traefik_logs_analysis.stdout | default('') %}
          {% set gitea_errors = gitea_logs_analysis.stdout | default('') %}
          {% set postgres_errors = postgres_logs_analysis.stdout | default('') %}
          {% set stats_content = stats_analysis.stdout | default('') %}

          {% if 'timeout' in request_content or '504' in request_content or 'HTTP_CODE:504' in request_content %}
          ⚠️ REQUEST HIT A TIMEOUT/504:

          BACKPRESSURE ANALYSIS:
          {% if 'LOCALHOST_TEST_FAILED' in localhost_content or localhost_content == '' %}
          → The Gitea localhost test fails or blocks
          → The problem is INSIDE Gitea/the DB itself, not between Traefik and Gitea
          {% elif 'HTTP_CODE:200' in localhost_content or '200 OK' in localhost_content %}
          → The Gitea localhost test responds quickly
          → The problem is BETWEEN Traefik and Gitea (network, firewall, limits)
          {% endif %}

          {% if 'REDIS_PING_FAILED' in redis_content or redis_content == '' or 'PONG' not in redis_content %}
          → Redis is not reachable
          → The session store is blocking; Gitea runs into "context canceled"
          {% else %}
          → Redis is reachable
          {% endif %}

          {% if 'SESSION.*context canceled' in gitea_errors or 'session.*release.*context canceled' in gitea_errors %}
          → Gitea logs SESSION: context canceled errors
          → The session store (Redis) may be blocking, or session locks are hanging
          {% endif %}

          {% if 'git-upload-pack' in gitea_errors %}
          → git-upload-pack requests found (these can block)
          → Check whether a runner is active and performing many Git operations
          {% endif %}

          {% if 'git.*gc' in gitea_errors %}
          → git gc jobs found (these can hold connections open)
          → Check whether git gc jobs are hanging
          {% endif %}

          {% if 'EOF' in traefik_errors or 'backend' in traefik_errors | lower or 'connection.*error' in traefik_errors | lower %}
          → Traefik reports backend connection errors
          → Gitea does not respond to Traefik's connection attempts
          {% endif %}

          {% if 'timeout' in gitea_errors | lower or 'deadline exceeded' in gitea_errors | lower %}
          → Gitea logs DB timeouts or context-deadline-exceeded errors
          → Postgres may be blocking or too slow
          {% endif %}

          {% if 'too many connections' in postgres_errors | lower %}
          → Postgres reports too many connections
          → The connection pool may be exhausted
          {% endif %}

          {% if 'HIGH CPU' in stats_content or '>80' in stats_content %}
          → Gitea or Postgres show high CPU load
          → This is a performance problem, not a timeout-configuration problem
          {% endif %}
          {% else %}
          ✅ REQUEST SUCCEEDED:
          → The problem only occurs intermittently
          → Check the logs for sporadic errors
          {% endif %}

          ================================================================================
          NEXT STEPS:
          ================================================================================

          1. Check pg_stat_activity: is the connection count close to max_connections? (see the sketch after this file)
          2. Check whether Redis is reachable (session-store blockage; also covered by the sketch below)
          3. Check for backpressure: localhost fast but Traefik slow = network problem
          4. Check for SESSION: context canceled errors (session locks)
          5. Check git-upload-pack requests (runner overload)
          6. Check git gc jobs (hanging and holding connections)

          ================================================================================

    - name: Cleanup temporary files
      ansible.builtin.file:
        path: "/tmp/gitea_{{ test_timestamp }}.log"
        state: absent
      failed_when: false
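
Next-steps items 1 and 2 can be scripted. A minimal sketch, assuming the gitea_stack_path var defined in the playbooks below, the compose service names postgres and redis, and the DB user gitea used elsewhere in this stack (add -a <password> to redis-cli if Redis auth is enabled):

    - name: Check Postgres connection count against max_connections (sketch)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T postgres psql -U gitea -t -A -c \
          "SELECT count(*) || ' of ' || current_setting('max_connections') || ' connections in use' FROM pg_stat_activity;"
      register: pg_connection_usage
      changed_when: false
      failed_when: false

    - name: Ping Redis to rule out a blocked session store (sketch)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T redis redis-cli ping 2>&1 || echo "REDIS_PING_FAILED"
      register: redis_ping
      changed_when: false
      failed_when: false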
@@ -1,343 +0,0 @@

---
# Diagnose Gitea timeout - live during a request
# Runs all checks while an actual request is in flight
- name: Diagnose Gitea Timeout During Request
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"
    test_duration_seconds: 60  # how long to test
    test_timestamp: "{{ ansible_date_time.epoch }}"

  tasks:
    - name: Display diagnostic plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA TIMEOUT DIAGNOSIS - LIVE DURING A REQUEST
          ================================================================================

          This diagnosis runs all checks while an actual request is in flight:

          1. Docker stats (CPU/RAM/IO) during the request
          2. Gitea logs (DB timeouts, panics, "context deadline exceeded")
          3. Postgres logs (connection issues)
          4. Traefik logs ("backend connection error", "EOF")
          5. Direct test Traefik → Gitea

          Test duration: {{ test_duration_seconds }} seconds
          Timestamp: {{ test_timestamp }}
          ================================================================================

    - name: Get initial container stats (baseline)
      ansible.builtin.shell: |
        docker stats --no-stream --format "table {{ '{{' }}.Name{{ '}}' }}\t{{ '{{' }}.CPUPerc{{ '}}' }}\t{{ '{{' }}.MemUsage{{ '}}' }}\t{{ '{{' }}.NetIO{{ '}}' }}\t{{ '{{' }}.BlockIO{{ '}}' }}" gitea gitea-postgres gitea-redis traefik 2>/dev/null || echo "Stats collection failed"
      register: initial_stats
      changed_when: false

    - name: Start collecting Docker stats in background
      ansible.builtin.shell: |
        timeout {{ test_duration_seconds }} docker stats --format "{{ '{{' }}.Name{{ '}}' }},{{ '{{' }}.CPUPerc{{ '}}' }},{{ '{{' }}.MemUsage{{ '}}' }},{{ '{{' }}.NetIO{{ '}}' }},{{ '{{' }}.BlockIO{{ '}}' }}" gitea gitea-postgres gitea-redis traefik 2>/dev/null | while read line; do
          echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
        done > /tmp/gitea_stats_{{ test_timestamp }}.log 2>&1 &
        STATS_PID=$!
        echo $STATS_PID
      register: stats_pid
      changed_when: false
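
    # A hedged sketch, assuming the stats_pid register above: the collector is
    # detached with '&' and only its PID is echoed, so a failed 'docker stats'
    # dies silently. kill -0 probes the PID without sending a signal:
    - name: Verify the stats collector is still running (sketch)
      ansible.builtin.shell: |
        kill -0 {{ stats_pid.stdout | trim }} 2>/dev/null && echo "RUNNING" || echo "DEAD"
      register: stats_collector_state
      changed_when: false
      failed_when: false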

    - name: Start collecting Gitea logs in background
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        timeout {{ test_duration_seconds }} docker compose logs -f gitea 2>&1 | while read line; do
          echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
        done > /tmp/gitea_logs_{{ test_timestamp }}.log 2>&1 &
        echo $!
      register: gitea_logs_pid
      changed_when: false

    - name: Start collecting Postgres logs in background
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        timeout {{ test_duration_seconds }} docker compose logs -f gitea-postgres 2>&1 | while read line; do
          echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
        done > /tmp/postgres_logs_{{ test_timestamp }}.log 2>&1 &
        echo $!
      register: postgres_logs_pid
      changed_when: false

    - name: Start collecting Traefik logs in background
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        timeout {{ test_duration_seconds }} docker compose logs -f traefik 2>&1 | while read line; do
          echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] $line"
        done > /tmp/traefik_logs_{{ test_timestamp }}.log 2>&1 &
        echo $!
      register: traefik_logs_pid
      changed_when: false

    - name: Wait a moment for log collection to start
      ansible.builtin.pause:
        seconds: 2

    - name: Trigger Gitea request via Traefik (with timeout)
      ansible.builtin.shell: |
        echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Starting request to {{ gitea_url }}/api/healthz"
        timeout 35 curl -k -v -s -o /tmp/gitea_response_{{ test_timestamp }}.log -w "\nHTTP_CODE:%{http_code}\nTIME_TOTAL:%{time_total}\nTIME_CONNECT:%{time_connect}\nTIME_STARTTRANSFER:%{time_starttransfer}\n" "{{ gitea_url }}/api/healthz" 2>&1 | tee /tmp/gitea_curl_{{ test_timestamp }}.log
        echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Request completed"
      register: gitea_request
      changed_when: false
      failed_when: false

    - name: Test direct connection Traefik → Gitea (parallel)
      ansible.builtin.shell: |
        echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Starting direct test Traefik → Gitea"
        cd {{ traefik_stack_path }}
        timeout 35 docker compose exec -T traefik wget -qO- --timeout=30 http://gitea:3000/api/healthz 2>&1 | tee /tmp/traefik_gitea_direct_{{ test_timestamp }}.log || echo "DIRECT_TEST_FAILED" > /tmp/traefik_gitea_direct_{{ test_timestamp }}.log
        echo "[$(date '+%Y-%m-%d %H:%M:%S.%3N')] Direct test completed"
      register: traefik_direct_test
      changed_when: false
      failed_when: false

    - name: Wait for log collection to complete
      ansible.builtin.pause:
        seconds: "{{ test_duration_seconds - 5 }}"

    - name: Stop background processes
      ansible.builtin.shell: |
        pkill -f "docker.*stats.*gitea" || true
        pkill -f "docker compose logs.*gitea" || true
        pkill -f "docker compose logs.*postgres" || true
        pkill -f "docker compose logs.*traefik" || true
        sleep 2
      changed_when: false
      failed_when: false

    - name: Collect stats results
      ansible.builtin.slurp:
        src: "/tmp/gitea_stats_{{ test_timestamp }}.log"
      register: stats_results
      changed_when: false
      failed_when: false

    - name: Collect Gitea logs results
      ansible.builtin.slurp:
        src: "/tmp/gitea_logs_{{ test_timestamp }}.log"
      register: gitea_logs_results
      changed_when: false
      failed_when: false

    - name: Collect Postgres logs results
      ansible.builtin.slurp:
        src: "/tmp/postgres_logs_{{ test_timestamp }}.log"
      register: postgres_logs_results
      changed_when: false
      failed_when: false

    - name: Collect Traefik logs results
      ansible.builtin.slurp:
        src: "/tmp/traefik_logs_{{ test_timestamp }}.log"
      register: traefik_logs_results
      changed_when: false
      failed_when: false

    - name: Get request result
      ansible.builtin.slurp:
        src: "/tmp/gitea_curl_{{ test_timestamp }}.log"
      register: request_result
      changed_when: false
      failed_when: false

    - name: Get direct test result
      ansible.builtin.slurp:
        src: "/tmp/traefik_gitea_direct_{{ test_timestamp }}.log"
      register: direct_test_result
      changed_when: false
      failed_when: false

    - name: Analyze stats for high CPU/Memory/IO
      ansible.builtin.shell: |
        if [ -f /tmp/gitea_stats_{{ test_timestamp }}.log ]; then
          echo "=== STATS SUMMARY ==="
          echo "Total samples: $(wc -l < /tmp/gitea_stats_{{ test_timestamp }}.log)"
          echo ""
          echo "=== HIGH CPU (>80%) ==="
          grep -E "gitea|gitea-postgres" /tmp/gitea_stats_{{ test_timestamp }}.log | awk -F',' '{cpu=$2; gsub(/%/, "", cpu); if (cpu+0 > 80) print $0}' | head -10 || echo "No high CPU usage found"
          echo ""
          echo "=== MEMORY USAGE ==="
          grep -E "gitea" /tmp/gitea_stats_{{ test_timestamp }}.log | tail -5 || echo "No memory stats"
          echo ""
          echo "=== NETWORK IO ==="
          grep -E "gitea" /tmp/gitea_stats_{{ test_timestamp }}.log | tail -5 || echo "No network activity"
        else
          echo "Stats file not found"
        fi
      register: stats_analysis
      changed_when: false
      failed_when: false

    - name: Analyze Gitea logs for errors
      ansible.builtin.shell: |
        if [ -f /tmp/gitea_logs_{{ test_timestamp }}.log ]; then
          echo "=== DB TIMEOUTS / CONNECTION ERRORS ==="
          grep -iE "timeout|deadline exceeded|connection.*failed|database.*error|postgres.*error|context.*deadline" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -20 || echo "No DB timeouts found"
          echo ""
          echo "=== PANICS / FATAL ERRORS ==="
          grep -iE "panic|fatal|error.*fatal" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No panics found"
          echo ""
          echo "=== SLOW QUERIES / PERFORMANCE ==="
          grep -iE "slow|performance|took.*ms|duration" /tmp/gitea_logs_{{ test_timestamp }}.log | tail -10 || echo "No slow queries found"
          echo ""
          echo "=== RECENT LOG ENTRIES (last 10) ==="
          tail -10 /tmp/gitea_logs_{{ test_timestamp }}.log || echo "No recent logs"
        else
          echo "Gitea logs file not found"
        fi
      register: gitea_logs_analysis
      changed_when: false
      failed_when: false

    - name: Analyze Postgres logs for errors
      ansible.builtin.shell: |
        if [ -f /tmp/postgres_logs_{{ test_timestamp }}.log ]; then
          echo "=== POSTGRES ERRORS ==="
          grep -iE "error|timeout|deadlock|connection.*refused|too many connections" /tmp/postgres_logs_{{ test_timestamp }}.log | tail -20 || echo "No Postgres errors found"
          echo ""
          echo "=== SLOW QUERIES ==="
          grep -iE "slow|duration|statement.*took" /tmp/postgres_logs_{{ test_timestamp }}.log | tail -10 || echo "No slow queries found"
          echo ""
          echo "=== RECENT LOG ENTRIES (last 10) ==="
          tail -10 /tmp/postgres_logs_{{ test_timestamp }}.log || echo "No recent logs"
        else
          echo "Postgres logs file not found"
        fi
      register: postgres_logs_analysis
      changed_when: false
      failed_when: false

    - name: Analyze Traefik logs for backend errors
      ansible.builtin.shell: |
        if [ -f /tmp/traefik_logs_{{ test_timestamp }}.log ]; then
          echo "=== BACKEND CONNECTION ERRORS ==="
          grep -iE "backend.*error|connection.*error|EOF|gitea.*error|git\.michaelschiemer\.de.*error" /tmp/traefik_logs_{{ test_timestamp }}.log | tail -20 || echo "No backend errors found"
          echo ""
          echo "=== TIMEOUT ERRORS ==="
          grep -iE "timeout|504|gateway.*timeout" /tmp/traefik_logs_{{ test_timestamp }}.log | tail -10 || echo "No timeout errors found"
          echo ""
          echo "=== RECENT LOG ENTRIES (last 10) ==="
          tail -10 /tmp/traefik_logs_{{ test_timestamp }}.log || echo "No recent logs"
        else
          echo "Traefik logs file not found"
        fi
      register: traefik_logs_analysis
      changed_when: false
      failed_when: false

    - name: Display comprehensive diagnosis
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA TIMEOUT DIAGNOSIS - RESULTS
          ================================================================================

          BASELINE STATS (before the request):
          {{ initial_stats.stdout }}

          REQUEST RESULT:
          {% if request_result.content is defined and request_result.content != '' %}
          {{ request_result.content | b64decode }}
          {% else %}
          Request result not available
          {% endif %}

          DIRECT TEST TRAEFIK → GITEA:
          {% if direct_test_result.content is defined and direct_test_result.content != '' %}
          {{ direct_test_result.content | b64decode }}
          {% else %}
          Direct test result not available
          {% endif %}

          ================================================================================
          STATS ANALYSIS (during the request):
          ================================================================================
          {{ stats_analysis.stdout }}

          ================================================================================
          GITEA LOG ANALYSIS:
          ================================================================================
          {{ gitea_logs_analysis.stdout }}

          ================================================================================
          POSTGRES LOG ANALYSIS:
          ================================================================================
          {{ postgres_logs_analysis.stdout }}

          ================================================================================
          TRAEFIK LOG ANALYSIS:
          ================================================================================
          {{ traefik_logs_analysis.stdout }}

          ================================================================================
          INTERPRETATION:
          ================================================================================

          {% set request_content = request_result.content | default('') | b64decode | default('') %}
          {% set direct_content = direct_test_result.content | default('') | b64decode | default('') %}
          {% set traefik_errors = traefik_logs_analysis.stdout | default('') %}
          {% set gitea_errors = gitea_logs_analysis.stdout | default('') %}
          {% set postgres_errors = postgres_logs_analysis.stdout | default('') %}
          {% set stats_content = stats_analysis.stdout | default('') %}

          {% if 'timeout' in request_content or '504' in request_content or 'HTTP_CODE:504' in request_content %}
          ⚠️ REQUEST HIT A TIMEOUT/504:

          {% if 'EOF' in traefik_errors or 'backend' in traefik_errors | lower or 'connection.*error' in traefik_errors | lower %}
          → Traefik reports backend connection errors
          → Gitea does not respond to Traefik's connection attempts
          {% endif %}

          {% if 'timeout' in gitea_errors | lower or 'deadline exceeded' in gitea_errors | lower %}
          → Gitea logs DB timeouts or context-deadline-exceeded errors
          → Postgres may be blocking or too slow
          {% endif %}

          {% if 'too many connections' in postgres_errors | lower %}
          → Postgres reports too many connections
          → The connection pool may be exhausted
          {% endif %}

          {% if 'HIGH CPU' in stats_content or '>80' in stats_content %}
          → Gitea or Postgres show high CPU load
          → This is a performance problem, not a timeout-configuration problem
          {% endif %}

          {% if 'DIRECT_TEST_FAILED' in direct_content or direct_content == '' %}
          → The direct test Traefik → Gitea fails
          → The problem is in Gitea itself, not in Traefik routing
          {% endif %}
          {% else %}
          ✅ REQUEST SUCCEEDED:
          → The problem only occurs intermittently
          → Check the logs for sporadic errors
          {% endif %}

          ================================================================================
          NEXT STEPS:
          ================================================================================

          1. Check for high CPU/memory on Gitea or Postgres
          2. Check for DB timeouts in the Gitea logs
          3. Check whether Postgres reports "too many connections"
          4. Check whether Traefik reports "backend connection error" or "EOF"
          5. Check whether the direct test Traefik → Gitea works

          ================================================================================

    - name: Cleanup temporary files
      ansible.builtin.file:
        path: "{{ item }}"
        state: absent
      loop:
        - "/tmp/gitea_stats_{{ test_timestamp }}.log"
        - "/tmp/gitea_logs_{{ test_timestamp }}.log"
        - "/tmp/postgres_logs_{{ test_timestamp }}.log"
        - "/tmp/traefik_logs_{{ test_timestamp }}.log"
        - "/tmp/gitea_response_{{ test_timestamp }}.log"
        - "/tmp/gitea_curl_{{ test_timestamp }}.log"
        - "/tmp/traefik_gitea_direct_{{ test_timestamp }}.log"
      failed_when: false
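
The curl probe above separates connect time from time to first byte: a low TIME_CONNECT with a high TIME_STARTTRANSFER points at a slow backend rather than network or TLS setup. A minimal stand-alone sketch of the same probe, reusing the playbook's gitea_url var:

    - name: Reproduce the timing probe ad hoc (sketch)
      ansible.builtin.shell: |
        curl -k -s -o /dev/null \
          -w "code=%{http_code} connect=%{time_connect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n" \
          "{{ gitea_url }}/api/healthz"
      register: timing_probe
      changed_when: false
      failed_when: false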
@@ -1,325 +0,0 @@

---
# Diagnose Gitea Timeouts
# Checks Gitea status, Traefik routing, and network connections, and helps remedy problems
- name: Diagnose Gitea Timeouts
  hosts: production
  gather_facts: yes
  become: no

  tasks:
    - name: Check Gitea container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose ps gitea
      register: gitea_status
      changed_when: false

    - name: Display Gitea container status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Container Status:
          ================================================================================
          {{ gitea_status.stdout }}
          ================================================================================

    - name: Check Gitea health endpoint (direct from container)
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose exec -T gitea curl -f http://localhost:3000/api/healthz 2>&1 || echo "HEALTH_CHECK_FAILED"
      register: gitea_health_direct
      changed_when: false
      failed_when: false

    - name: Display Gitea health (direct)
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Health Check (direct from container):
          ================================================================================
          {% if 'HEALTH_CHECK_FAILED' not in gitea_health_direct.stdout %}
          ✅ Gitea is healthy (direct check)
          Response: {{ gitea_health_direct.stdout }}
          {% else %}
          ❌ Gitea health check failed (direct)
          Error: {{ gitea_health_direct.stdout }}
          {% endif %}
          ================================================================================

    - name: Check Gitea health endpoint (via Traefik)
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_traefik
      failed_when: false
      changed_when: false

    - name: Display Gitea health (via Traefik)
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Health Check (via Traefik):
          ================================================================================
          {% if gitea_health_traefik.status == 200 %}
          ✅ Gitea is reachable via Traefik
          Status: {{ gitea_health_traefik.status }}
          {% else %}
          ❌ Gitea is NOT reachable via Traefik
          Status: {{ gitea_health_traefik.status | default('TIMEOUT/ERROR') }}
          Message: {{ gitea_health_traefik.msg | default('No response') }}
          {% endif %}
          ================================================================================

    - name: Check Traefik container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose ps traefik
      register: traefik_status
      changed_when: false

    - name: Display Traefik container status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Container Status:
          ================================================================================
          {{ traefik_status.stdout }}
          ================================================================================

    - name: Check Redis container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose ps redis
      register: redis_status
      changed_when: false

    - name: Display Redis container status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Redis Container Status:
          ================================================================================
          {{ redis_status.stdout }}
          ================================================================================

    - name: Check PostgreSQL container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose ps postgres
      register: postgres_status
      changed_when: false

    - name: Display PostgreSQL container status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          PostgreSQL Container Status:
          ================================================================================
          {{ postgres_status.stdout }}
          ================================================================================

    - name: Check Gitea container IP in traefik-public network
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}with index .NetworkSettings.Networks "traefik-public"{{ '}}' }}{{ '{{' }}.IPAddress{{ '}}' }}{{ '{{' }}end{{ '}}' }}' 2>/dev/null || echo "NOT_FOUND"
      register: gitea_ip
      changed_when: false
      failed_when: false

    - name: Display Gitea IP in traefik-public network
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea IP in traefik-public Network:
          ================================================================================
          {% if gitea_ip.stdout and gitea_ip.stdout != 'NOT_FOUND' %}
          ✅ Gitea IP: {{ gitea_ip.stdout }}
          {% else %}
          ❌ Gitea IP not found in traefik-public network
          {% endif %}
          ================================================================================

    - name: Test connection from Traefik to Gitea
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose exec -T traefik wget -qO- --timeout=5 http://gitea:3000/api/healthz 2>&1 || echo "CONNECTION_FAILED"
      register: traefik_gitea_connection
      changed_when: false
      failed_when: false

    - name: Display Traefik-Gitea connection test
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik → Gitea Connection Test:
          ================================================================================
          {% if 'CONNECTION_FAILED' in traefik_gitea_connection.stdout %}
          ❌ Traefik cannot reach Gitea
          Error: {{ traefik_gitea_connection.stdout }}
          {% else %}
          ✅ Traefik can reach Gitea
          Response: {{ traefik_gitea_connection.stdout }}
          {% endif %}
          ================================================================================
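
    # A hedged follow-up sketch: if the wget test above fails, distinguish DNS
    # failure from a refused or hanging connection. Assumes the official
    # Alpine-based Traefik image, whose BusyBox provides nslookup:
    - name: Resolve the gitea service name from inside Traefik (sketch)
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose exec -T traefik nslookup gitea 2>&1 || echo "DNS_LOOKUP_FAILED"
      register: traefik_gitea_dns
      changed_when: false
      failed_when: false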

    - name: Check Traefik routing configuration for Gitea
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}json .Config.Labels{{ '}}' }}' 2>/dev/null | grep -i "traefik" || echo "NO_TRAEFIK_LABELS"
      register: traefik_labels
      changed_when: false
      failed_when: false

    - name: Display Traefik labels for Gitea
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Labels for Gitea:
          ================================================================================
          {{ traefik_labels.stdout }}
          ================================================================================

    - name: Check Gitea logs for errors
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose logs gitea --tail=50 2>&1 | grep -iE "error|timeout|failed|panic|fatal" | tail -20 || echo "No errors in recent logs"
      register: gitea_errors
      changed_when: false
      failed_when: false

    - name: Display Gitea errors
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Error Logs (last 50 lines):
          ================================================================================
          {{ gitea_errors.stdout }}
          ================================================================================

    - name: Check Traefik logs for Gitea-related errors
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose logs traefik --tail=50 2>&1 | grep -iE "gitea|git\.michaelschiemer\.de|timeout|error" | tail -20 || echo "No Gitea-related errors in Traefik logs"
      register: traefik_gitea_errors
      changed_when: false
      failed_when: false

    - name: Display Traefik Gitea errors
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Gitea-Related Error Logs (last 50 lines):
          ================================================================================
          {{ traefik_gitea_errors.stdout }}
          ================================================================================

    - name: Check if Gitea is in traefik-public network
      ansible.builtin.shell: |
        docker network inspect traefik-public --format '{{ '{{' }}range .Containers{{ '}}' }}{{ '{{' }}.Name{{ '}}' }} {{ '{{' }}end{{ '}}' }}' 2>/dev/null | grep -q gitea && echo "YES" || echo "NO"
      register: gitea_in_traefik_network
      changed_when: false
      failed_when: false

    - name: Display Gitea network membership
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea in traefik-public Network:
          ================================================================================
          {% if gitea_in_traefik_network.stdout == 'YES' %}
          ✅ Gitea is in traefik-public network
          {% else %}
          ❌ Gitea is NOT in traefik-public network
          {% endif %}
          ================================================================================

    - name: Check Redis connection from Gitea
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose exec -T gitea sh -c "redis-cli -h redis -p 6379 -a gitea_redis_password ping 2>&1" || echo "REDIS_CONNECTION_FAILED"
      register: gitea_redis_connection
      changed_when: false
      failed_when: false

    - name: Display Gitea-Redis connection
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea → Redis Connection:
          ================================================================================
          {% if 'REDIS_CONNECTION_FAILED' in gitea_redis_connection.stdout %}
          ❌ Gitea cannot connect to Redis
          Error: {{ gitea_redis_connection.stdout }}
          {% else %}
          ✅ Gitea can connect to Redis
          Response: {{ gitea_redis_connection.stdout }}
          {% endif %}
          ================================================================================

    - name: Check PostgreSQL connection from Gitea
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose exec -T gitea sh -c "pg_isready -h postgres -p 5432 -U gitea 2>&1" || echo "POSTGRES_CONNECTION_FAILED"
      register: gitea_postgres_connection
      changed_when: false
      failed_when: false

    - name: Display Gitea-PostgreSQL connection
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea → PostgreSQL Connection:
          ================================================================================
          {% if 'POSTGRES_CONNECTION_FAILED' in gitea_postgres_connection.stdout %}
          ❌ Gitea cannot connect to PostgreSQL
          Error: {{ gitea_postgres_connection.stdout }}
          {% else %}
          ✅ Gitea can connect to PostgreSQL
          Response: {{ gitea_postgres_connection.stdout }}
          {% endif %}
          ================================================================================

    - name: Summary and recommendations
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Gitea Timeout Diagnosis:
          ================================================================================

          Gitea Status: {{ gitea_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          Gitea Health (direct): {% if 'HEALTH_CHECK_FAILED' not in gitea_health_direct.stdout %}✅{% else %}❌{% endif %}
          Gitea Health (via Traefik): {% if gitea_health_traefik.status == 200 %}✅{% else %}❌{% endif %}
          Traefik Status: {{ traefik_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          Redis Status: {{ redis_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          PostgreSQL Status: {{ postgres_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}

          Network:
          - Gitea in traefik-public: {% if gitea_in_traefik_network.stdout == 'YES' %}✅{% else %}❌{% endif %}
          - Traefik → Gitea: {% if 'CONNECTION_FAILED' not in traefik_gitea_connection.stdout %}✅{% else %}❌{% endif %}
          - Gitea → Redis: {% if 'REDIS_CONNECTION_FAILED' not in gitea_redis_connection.stdout %}✅{% else %}❌{% endif %}
          - Gitea → PostgreSQL: {% if 'POSTGRES_CONNECTION_FAILED' not in gitea_postgres_connection.stdout %}✅{% else %}❌{% endif %}

          Recommended actions:
          {% if gitea_health_traefik.status != 200 %}
          1. ❌ Gitea is not reachable via Traefik
          → Run 'fix-gitea-timeouts.yml' to restart Gitea and Traefik
          {% endif %}
          {% if gitea_in_traefik_network.stdout != 'YES' %}
          2. ❌ Gitea is not in the traefik-public network
          → Restart the Gitea container to refresh its network attachment (or reattach it; see the sketch after this file)
          {% endif %}
          {% if 'CONNECTION_FAILED' in traefik_gitea_connection.stdout %}
          3. ❌ Traefik cannot reach Gitea
          → Restart both containers
          {% endif %}
          {% if 'REDIS_CONNECTION_FAILED' in gitea_redis_connection.stdout %}
          4. ❌ Gitea cannot reach Redis
          → Check and restart the Redis container
          {% endif %}
          {% if 'POSTGRES_CONNECTION_FAILED' in gitea_postgres_connection.stdout %}
          5. ❌ Gitea cannot reach PostgreSQL
          → Check and restart the PostgreSQL container
          {% endif %}

          ================================================================================
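
When recommended action 2 fires, a full restart is not always necessary: Docker can attach a running container to a network in place. A minimal sketch, assuming the container name gitea used throughout this playbook:

    - name: Reattach Gitea to the traefik-public network without a restart (sketch)
      ansible.builtin.shell: |
        docker network connect traefik-public gitea 2>&1 || echo "ALREADY_CONNECTED_OR_FAILED"
      register: gitea_network_reattach
      changed_when: false
      failed_when: false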
@@ -1,477 +0,0 @@

---
# Diagnosis: find the source of the Traefik restart loop
# Checks all plausible causes of periodic Traefik restarts
- name: Diagnose Traefik Restart Loop
  hosts: production
  gather_facts: yes
  become: yes

  tasks:
    - name: Check systemd timers
      ansible.builtin.shell: |
        systemctl list-timers --all --no-pager
      register: systemd_timers
      changed_when: false

    - name: Display systemd timers
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Systemd Timers (can stop containers):
          ================================================================================
          {{ systemd_timers.stdout }}
          ================================================================================

    - name: Check root crontab
      ansible.builtin.shell: |
        crontab -l 2>/dev/null || echo "No root crontab"
      register: root_crontab
      changed_when: false

    - name: Display root crontab
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Root Crontab:
          ================================================================================
          {{ root_crontab.stdout }}
          ================================================================================

    - name: Check deploy user crontab
      ansible.builtin.shell: |
        crontab -l -u deploy 2>/dev/null || echo "No deploy user crontab"
      register: deploy_crontab
      changed_when: false

    - name: Display deploy user crontab
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Deploy User Crontab:
          ================================================================================
          {{ deploy_crontab.stdout }}
          ================================================================================

    - name: Check system-wide cron jobs
      ansible.builtin.shell: |
        echo "=== /etc/cron.d ==="
        ls -la /etc/cron.d 2>/dev/null || echo "Directory not found"
        grep -r "traefik\|docker.*compose.*traefik\|docker.*stop\|docker.*restart" /etc/cron.d 2>/dev/null || echo "No matches"
        echo ""
        echo "=== /etc/cron.daily ==="
        ls -la /etc/cron.daily 2>/dev/null || echo "Directory not found"
        grep -r "traefik\|docker.*compose.*traefik\|docker.*stop\|docker.*restart" /etc/cron.daily 2>/dev/null || echo "No matches"
        echo ""
        echo "=== /etc/cron.hourly ==="
        ls -la /etc/cron.hourly 2>/dev/null || echo "Directory not found"
        grep -r "traefik\|docker.*compose.*traefik\|docker.*stop\|docker.*restart" /etc/cron.hourly 2>/dev/null || echo "No matches"
        echo ""
        echo "=== /etc/cron.weekly ==="
        ls -la /etc/cron.weekly 2>/dev/null || echo "Directory not found"
        grep -r "traefik\|docker.*compose.*traefik\|docker.*stop\|docker.*restart" /etc/cron.weekly 2>/dev/null || echo "No matches"
        echo ""
        echo "=== /etc/cron.monthly ==="
        ls -la /etc/cron.monthly 2>/dev/null || echo "Directory not found"
        grep -r "traefik\|docker.*compose.*traefik\|docker.*stop\|docker.*restart" /etc/cron.monthly 2>/dev/null || echo "No matches"
      register: system_cron
      changed_when: false

    - name: Display system cron jobs
      ansible.builtin.debug:
        msg: |
          ================================================================================
          System-Wide Cron Jobs:
          ================================================================================
          {{ system_cron.stdout }}
          ================================================================================

    - name: Check for scripts that might restart Traefik
      ansible.builtin.shell: |
        find /home/deploy -type f -name "*.sh" -exec grep -l "traefik\|docker.*compose.*restart\|docker.*stop.*traefik\|docker.*down.*traefik" {} \; 2>/dev/null | head -20
      register: traefik_scripts
      changed_when: false

    - name: Display scripts that might restart Traefik
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Scripts That Could Stop/Restart Traefik:
          ================================================================================
          {% if traefik_scripts.stdout %}
          {{ traefik_scripts.stdout }}
          {% else %}
          No scripts found
          {% endif %}
          ================================================================================

    - name: Check Docker events for Traefik container (last 24h)
      ansible.builtin.shell: |
        timeout 5 docker events --since 24h --filter container=traefik --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.name{{ '}}' }}" 2>/dev/null | tail -50 || echo "No recent events or docker events not available"
      register: docker_events
      changed_when: false

    - name: Display Docker events
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Events for Traefik (last 24h):
          ================================================================================
          {{ docker_events.stdout }}
          ================================================================================
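
    # A hedged follow-up sketch: a scheduled restart usually shows up as stop
    # events clustered at the same hour. This buckets a week of stop events by
    # hour of day; it assumes GNU awk (strftime) on the host and reuses the
    # timeout pattern from the task above:
    - name: Bucket Traefik stop events by hour of day (sketch)
      ansible.builtin.shell: |
        timeout 10 docker events --since 168h --filter container=traefik --filter event=stop \
          --format "{{ '{{' }}.Time{{ '}}' }}" 2>/dev/null \
          | awk '{ h = strftime("%H", $1); count[h]++ } END { for (h in count) print h ":00 -> " count[h] " stops" }' \
          || echo "No stop events collected"
      register: traefik_stop_buckets
      changed_when: false
      failed_when: false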

    - name: Check Traefik container exit history
      ansible.builtin.shell: |
        docker ps -a --filter "name=traefik" --format "{{ '{{' }}.ID{{ '}}' }} | {{ '{{' }}.Status{{ '}}' }} | {{ '{{' }}.CreatedAt{{ '}}' }}" | head -10
      register: traefik_exits
      changed_when: false

    - name: Display Traefik container exit history
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Container Exit History:
          ================================================================================
          {{ traefik_exits.stdout }}
          ================================================================================

    - name: Check Docker daemon logs for Traefik stops
      ansible.builtin.shell: |
        journalctl -u docker.service --since "24h ago" --no-pager | grep -i "traefik\|stop\|kill" | tail -50 || echo "No relevant logs in journalctl"
      register: docker_daemon_logs
      changed_when: false

    - name: Display Docker daemon logs
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Daemon Logs (Traefik/Stop/Kill):
          ================================================================================
          {{ docker_daemon_logs.stdout }}
          ================================================================================

    - name: Check if there's a health check script running
      ansible.builtin.shell: |
        ps aux | grep -E "traefik|health.*check|monitor.*docker|auto.*heal|watchdog" | grep -v grep || echo "No health check processes found"
      register: health_check_processes
      changed_when: false

    - name: Display health check processes
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Running Health-Check/Monitoring Processes:
          ================================================================================
          {{ health_check_processes.stdout }}
          ================================================================================

    - name: Check for monitoring/auto-heal scripts
      ansible.builtin.shell: |
        find /home/deploy -type f \( -name "*monitor*" -o -name "*health*" -o -name "*auto*heal*" -o -name "*watchdog*" \) 2>/dev/null | head -20
      register: monitoring_scripts
      changed_when: false

    - name: Display monitoring scripts
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Monitoring/Auto-Heal Scripts:
          ================================================================================
          {% if monitoring_scripts.stdout %}
          {{ monitoring_scripts.stdout }}
          {% else %}
          No monitoring scripts found
          {% endif %}
          ================================================================================

    - name: Check Docker Compose file for restart policies
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik && grep -A 5 "restart:" docker-compose.yml || echo "No restart policy found"
      register: restart_policy
      changed_when: false

    - name: Display restart policy
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Compose Restart Policy:
          ================================================================================
          {{ restart_policy.stdout }}
          ================================================================================

    - name: Check if Traefik is managed by systemd
      ansible.builtin.shell: |
        systemctl list-units --type=service --all | grep -i traefik || echo "No Traefik systemd service found"
      register: traefik_systemd
      changed_when: false

    - name: Display Traefik systemd service
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Systemd Service:
          ================================================================================
          {{ traefik_systemd.stdout }}
          ================================================================================

    - name: Check recent Traefik container logs for stop messages
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik && docker compose logs traefik --since 24h 2>&1 | grep -E "I have to go|Stopping server gracefully|SIGTERM|SIGINT|received signal" | tail -20 || echo "No stop messages in logs"
      register: traefik_stop_logs
      changed_when: false

    - name: Display Traefik stop messages
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Stop Messages (last 24h):
          ================================================================================
          {{ traefik_stop_logs.stdout }}
          ================================================================================

    - name: Check Traefik container uptime and restart count
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.StartedAt{{ '}}' }} | {{ '{{' }}.State.FinishedAt{{ '}}' }} | Restarts: {{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "Container not found"
      register: traefik_uptime
      changed_when: false

    - name: Display Traefik uptime and restart count
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Container Uptime & Restart Count:
          ================================================================================
          {{ traefik_uptime.stdout }}
          ================================================================================

    - name: Check for unattended-upgrades activity
      ansible.builtin.shell: |
        journalctl -u unattended-upgrades --since "24h ago" --no-pager | tail -20 || echo "No unattended-upgrades logs"
      register: unattended_upgrades
      changed_when: false

    - name: Display unattended-upgrades activity
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Unattended-Upgrades Activity (can lead to reboots):
          ================================================================================
          {{ unattended_upgrades.stdout }}
          ================================================================================

    - name: Check system reboot history
      ansible.builtin.shell: |
        last reboot | head -10 || echo "No reboot history available"
      register: reboot_history
      changed_when: false

    - name: Display reboot history
      ansible.builtin.debug:
        msg: |
          ================================================================================
          System Reboot History:
          ================================================================================
          {{ reboot_history.stdout }}
          ================================================================================

    - name: Check Docker Compose processes that might affect Traefik
      ansible.builtin.shell: |
        ps aux | grep -E "docker.*compose.*traefik|docker-compose.*traefik" | grep -v grep || echo "No docker compose processes for Traefik found"
      register: docker_compose_processes
      changed_when: false

    - name: Display Docker Compose processes
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Compose Processes for Traefik:
          ================================================================================
          {{ docker_compose_processes.stdout }}
          ================================================================================

    - name: Check all user crontabs (not just root/deploy)
      ansible.builtin.shell: |
        for user in $(cut -f1 -d: /etc/passwd); do
          crontab -u "$user" -l 2>/dev/null | grep -q "traefik\|docker.*compose.*traefik\|docker.*restart.*traefik" && echo "=== User: $user ===" && crontab -u "$user" -l 2>/dev/null | grep -E "traefik|docker.*compose.*traefik|docker.*restart.*traefik" || true
        done || echo "No user crontabs with Traefik commands found"
      register: all_user_crontabs
      changed_when: false

    - name: Display all user crontabs with Traefik commands
      ansible.builtin.debug:
        msg: |
          ================================================================================
          All User Crontabs with Traefik Commands:
          ================================================================================
          {{ all_user_crontabs.stdout }}
          ================================================================================

    - name: Check for Gitea Workflows that might restart Traefik
      ansible.builtin.shell: |
        find /home/deploy -type f -path "*/.gitea/workflows/*.yml" -o -path "*/.github/workflows/*.yml" 2>/dev/null | xargs grep -l "traefik\|restart.*traefik\|docker.*compose.*traefik" 2>/dev/null | head -10 || echo "No Gitea/GitHub workflows found that restart Traefik"
      register: gitea_workflows
      changed_when: false

    - name: Display Gitea Workflows that might restart Traefik
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea/GitHub Workflows That Could Restart Traefik:
          ================================================================================
          {{ gitea_workflows.stdout }}
          ================================================================================

    - name: Check for custom systemd services in /etc/systemd/system/
      ansible.builtin.shell: |
        find /etc/systemd/system -type f -name "*.service" -o -name "*.timer" 2>/dev/null | xargs grep -l "traefik\|docker.*compose.*traefik\|docker.*restart.*traefik" 2>/dev/null | head -10 || echo "No custom systemd services/timers found for Traefik"
      register: custom_systemd_services
      changed_when: false

    - name: Display custom systemd services
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Custom Systemd Services/Timers for Traefik:
          ================================================================================
          {{ custom_systemd_services.stdout }}
          ================================================================================

    - name: Check for at jobs (scheduled tasks)
      ansible.builtin.shell: |
        atq 2>/dev/null | while read line; do
          job_id=$(echo "$line" | awk '{print $1}')
          at -c "$job_id" 2>/dev/null | grep -q "traefik\|docker.*compose.*traefik\|docker.*restart.*traefik" && echo "=== Job ID: $job_id ===" && at -c "$job_id" 2>/dev/null | grep -E "traefik|docker.*compose.*traefik|docker.*restart.*traefik" || true
        done || echo "No at jobs found or atq not available"
      register: at_jobs
      changed_when: false

    - name: Display at jobs
      ansible.builtin.debug:
        msg: |
          ================================================================================
          at Jobs (scheduled tasks) Affecting Traefik:
          ================================================================================
          {{ at_jobs.stdout }}
          ================================================================================

    - name: Check for Docker Compose watch mode
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik && docker compose ps --format json 2>/dev/null | jq -r '.[] | select(.Service=="traefik") | .State' || echo "Could not check Docker Compose watch mode"
      register: docker_compose_watch
      changed_when: false

    - name: Check if Docker Compose watch is enabled
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik && docker compose config 2>/dev/null | grep -i "watch\|x-develop" || echo "No watch mode configured"
      register: docker_compose_watch_config
      changed_when: false

    - name: Display Docker Compose watch mode
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Compose Watch Mode:
          ================================================================================
          Watch Config: {{ docker_compose_watch_config.stdout }}
          ================================================================================

    - name: Check Ansible traefik_auto_restart setting
      ansible.builtin.shell: |
        grep -r "traefik_auto_restart" /home/deploy/deployment/ansible/roles/traefik/defaults/ /home/deploy/deployment/ansible/inventory/ 2>/dev/null | head -10 || echo "traefik_auto_restart not found in Ansible config"
      register: ansible_auto_restart
      changed_when: false

    - name: Display Ansible traefik_auto_restart setting
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Ansible traefik_auto_restart Setting:
          ================================================================================
          {{ ansible_auto_restart.stdout }}
          ================================================================================

    - name: Check Port 80/443 configuration
      ansible.builtin.shell: |
        echo "=== Port 80 ==="
        netstat -tlnp 2>/dev/null | grep ":80 " || ss -tlnp 2>/dev/null | grep ":80 " || echo "Could not check port 80"
        echo ""
        echo "=== Port 443 ==="
        netstat -tlnp 2>/dev/null | grep ":443 " || ss -tlnp 2>/dev/null | grep ":443 " || echo "Could not check port 443"
        echo ""
        echo "=== Docker Port Mappings for Traefik ==="
        docker inspect traefik --format '{{ '{{' }}json .HostConfig.PortBindings{{ '}}' }}' 2>/dev/null | jq '.' || echo "Could not get Docker port mappings"
      register: port_config
      changed_when: false

    - name: Display Port configuration
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Port Configuration (80/443):
          ================================================================================
          {{ port_config.stdout }}
          ================================================================================

    - name: Check if other services are blocking ports 80/443
      ansible.builtin.shell: |
        echo "=== Services listening on port 80 ==="
        lsof -i :80 2>/dev/null || fuser 80/tcp 2>/dev/null || echo "Could not check port 80"
        echo ""
        echo "=== Services listening on port 443 ==="
        lsof -i :443 2>/dev/null || fuser 443/tcp 2>/dev/null || echo "Could not check port 443"
      register: port_blockers
      changed_when: false

    - name: Display port blockers
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Services That Could Be Blocking Ports 80/443:
          ================================================================================
          {{ port_blockers.stdout }}
          ================================================================================

    - name: Check Traefik network configuration
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}json .NetworkSettings{{ '}}' }}' 2>/dev/null | jq '.Networks' || echo "Could not get Traefik network configuration"
      register: traefik_network
      changed_when: false

    - name: Display Traefik network configuration
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Network Configuration:
          ================================================================================
          {{ traefik_network.stdout }}
          ================================================================================

    - name: Summary - Most likely causes
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Possible Causes of the Traefik Restarts:
          ================================================================================

          Review the output above for:

          1. Systemd timers: can stop containers (e.g. unattended-upgrades)
          2. Cron jobs: recurring scripts that stop Traefik (all user crontabs were checked)
          3. Docker events: show who/what stops the container
          4. Monitoring scripts: auto-heal scripts that restart on errors
          5. Unattended upgrades: can lead to reboots
          6. Reboot history: system reboots stop all containers
          7. Gitea workflows: can restart Traefik via Ansible
          8. Custom systemd services: custom units that manage Traefik
          9. at jobs: scheduled tasks that stop Traefik
          10. Docker Compose watch mode: automatic restarts on file changes
          11. Ansible traefik_auto_restart: automatic restarts after config deployment
          12. Port configuration: ports 80/443 must point at Traefik

          Next steps:
          - Check the Docker events for recurring patterns (the bucketing sketch above can help)
          - Check all user crontabs for recurring Traefik commands
          - Check whether monitoring scripts are too aggressive
          - Check whether unattended-upgrades leads to reboots (see the sketch after this file)
          - Check whether traefik_auto_restart causes frequent restarts
          - Verify the port configuration (80/443)
          ================================================================================
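
On Debian/Ubuntu hosts, unattended-upgrades signals a pending reboot through /var/run/reboot-required; checking that file complements the journal excerpt gathered above. A minimal sketch:

    - name: Check for a pending reboot requested by unattended-upgrades (sketch)
      ansible.builtin.shell: |
        if [ -f /var/run/reboot-required ]; then
          cat /var/run/reboot-required
          cat /var/run/reboot-required.pkgs 2>/dev/null || true
        else
          echo "No reboot required"
        fi
      register: reboot_required
      changed_when: false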
403
deployment/ansible/playbooks/diagnose/gitea.yml
Normal file
@@ -0,0 +1,403 @@
---
# Consolidated Gitea Diagnosis Playbook
# Consolidates: diagnose-gitea-timeouts.yml, diagnose-gitea-timeout-deep.yml,
#               diagnose-gitea-timeout-live.yml, diagnose-gitea-timeouts-complete.yml,
#               comprehensive-gitea-diagnosis.yml
#
# Usage:
#   # Basic diagnosis (default)
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml
#
#   # Deep diagnosis (includes resource checks, multiple connection tests)
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags deep
#
#   # Live diagnosis (monitors during request)
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags live
#
#   # Complete diagnosis (all checks)
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml --tags complete

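# Note: Ansible skips untagged tasks whenever --tags is given, so the "basic"
# checks below (whose results the summary also references) only run alongside
# --tags deep/live/complete if they carry the built-in `always` tag, e.g.:
#
#   - name: Check Gitea container status
#     ...
#     tags:
#       - always
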
- name: Diagnose Gitea Issues
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"
    gitea_container_name: "gitea"
    traefik_container_name: "traefik"

  tasks:
    # ========================================
    # BASIC DIAGNOSIS (always runs)
    # ========================================
    - name: Display diagnostic plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA DIAGNOSIS
          ================================================================================

          Running diagnosis with tags: {{ ansible_run_tags | default(['all']) }}

          Basic checks (always):
          - Container status
          - Health endpoints
          - Network connectivity
          - Service discovery

          Deep checks (--tags deep):
          - Resource usage
          - Multiple connection tests
          - Log analysis

          Live checks (--tags live):
          - Real-time monitoring during request

          Complete checks (--tags complete):
          - All checks including app.ini, ServersTransport, etc.

          ================================================================================

    - name: Check Gitea container status
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps {{ gitea_container_name }}
      register: gitea_status
      changed_when: false

    - name: Check Traefik container status
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps {{ traefik_container_name }}
      register: traefik_status
      changed_when: false

    - name: Check Gitea health endpoint (direct from container)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T {{ gitea_container_name }} curl -f http://localhost:3000/api/healthz 2>&1 || echo "HEALTH_CHECK_FAILED"
      register: gitea_health_direct
      changed_when: false
      failed_when: false

    - name: Check Gitea health endpoint (via Traefik)
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_traefik
      failed_when: false
      changed_when: false

    - name: Check if Gitea is in traefik-public network
      ansible.builtin.shell: |
        docker network inspect traefik-public --format '{{ '{{' }}range .Containers{{ '}}' }}{{ '{{' }}.Name{{ '}}' }} {{ '{{' }}end{{ '}}' }}' 2>/dev/null | grep -q {{ gitea_container_name }} && echo "YES" || echo "NO"
      register: gitea_in_traefik_network
      changed_when: false
      failed_when: false
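    # Note on the {{ '{{' }} ... {{ '}}' }} escaping used above: the braces that
    # docker's --format flag expects would otherwise be templated by Jinja2, so
    # they are emitted as literal strings. Run directly on the host, the same
    # check is simply:
    #   docker network inspect traefik-public --format '{{range .Containers}}{{.Name}} {{end}}'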

    - name: Test connection from Traefik to Gitea
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T {{ traefik_container_name }} wget -qO- --timeout=5 http://{{ gitea_container_name }}:3000/api/healthz 2>&1 || echo "CONNECTION_FAILED"
      register: traefik_gitea_connection
      changed_when: false
      failed_when: false
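    # wget is used here because the stock Traefik image typically ships busybox
    # wget but no curl; the complete-level DNS check further down installs curl
    # via apk before using it for the same reason.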

    - name: Check Traefik service discovery for Gitea
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        # Query the local Traefik API; the traefik binary has no "show providers" subcommand
        docker compose exec -T {{ traefik_container_name }} wget -qO- http://localhost:8080/api/http/routers 2>/dev/null | grep -i "gitea" || echo "NOT_FOUND"
      register: traefik_gitea_service
      changed_when: false
      failed_when: false

    # ========================================
    # DEEP DIAGNOSIS (--tags deep)
    # ========================================
    - name: Check Gitea container resources (CPU/Memory)
      ansible.builtin.shell: |
        docker stats {{ gitea_container_name }} --no-stream --format 'CPU: {{ '{{' }}.CPUPerc{{ '}}' }} | Memory: {{ '{{' }}.MemUsage{{ '}}' }}' 2>/dev/null || echo "Could not get stats"
      register: gitea_resources
      changed_when: false
      failed_when: false
      tags:
        - deep
        - complete

    - name: Check Traefik container resources (CPU/Memory)
      ansible.builtin.shell: |
        docker stats {{ traefik_container_name }} --no-stream --format 'CPU: {{ '{{' }}.CPUPerc{{ '}}' }} | Memory: {{ '{{' }}.MemUsage{{ '}}' }}' 2>/dev/null || echo "Could not get stats"
      register: traefik_resources
      changed_when: false
      failed_when: false
      tags:
        - deep
        - complete

    - name: Test Gitea direct connection (multiple attempts)
      ansible.builtin.shell: |
        # plain enumeration instead of {1..5}: the shell module runs POSIX sh,
        # which has no brace expansion
        for i in 1 2 3 4 5; do
          echo "=== Attempt $i ==="
          cd {{ gitea_stack_path }}
          timeout 5 docker compose exec -T {{ gitea_container_name }} curl -f http://localhost:3000/api/healthz 2>&1 || echo "FAILED"
          sleep 1
        done
      register: gitea_direct_tests
      changed_when: false
      tags:
        - deep
        - complete

    - name: Test Gitea via Traefik (multiple attempts)
      ansible.builtin.shell: |
        for i in 1 2 3 4 5; do
          echo "=== Attempt $i ==="
          timeout 10 curl -k -s -o /dev/null -w "%{http_code}" {{ gitea_url }}/api/healthz 2>&1 || echo "TIMEOUT"
          sleep 2
        done
      register: gitea_traefik_tests
      changed_when: false
      tags:
        - deep
        - complete

    - name: Check Gitea logs for errors/timeouts
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose logs {{ gitea_container_name }} --tail=50 2>&1 | grep -iE "error|timeout|failed|panic|fatal" | tail -20 || echo "No errors in recent logs"
      register: gitea_errors
      changed_when: false
      failed_when: false
      tags:
        - deep
        - complete

    - name: Check Traefik logs for Gitea-related errors
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs {{ traefik_container_name }} --tail=50 2>&1 | grep -iE "gitea|git\.michaelschiemer\.de|timeout|error" | tail -20 || echo "No Gitea-related errors in Traefik logs"
      register: traefik_gitea_errors
      changed_when: false
      failed_when: false
      tags:
        - deep
        - complete

    # ========================================
    # COMPLETE DIAGNOSIS (--tags complete)
    # ========================================
    - name: Test Gitea internal port (127.0.0.1:3000)
      ansible.builtin.shell: |
        docker exec {{ gitea_container_name }} curl -sS -I http://127.0.0.1:3000/ 2>&1 | head -5
      register: gitea_internal_test
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Test Traefik to Gitea via Docker DNS (gitea:3000)
      ansible.builtin.shell: |
        docker exec {{ traefik_container_name }} sh -lc 'apk add --no-cache curl >/dev/null 2>&1 || true; curl -sS -I http://gitea:3000/ 2>&1' | head -10
      register: traefik_gitea_dns_test
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check Traefik logs for 504 errors
      ansible.builtin.shell: |
        docker logs {{ traefik_container_name }} --tail=100 2>&1 | grep -i "504\|timeout" | tail -20 || echo "No 504/timeout errors found"
      register: traefik_504_logs
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check Gitea Traefik labels
      ansible.builtin.shell: |
        docker inspect {{ gitea_container_name }} --format '{{ '{{' }}json .Config.Labels{{ '}}' }}' 2>/dev/null | python3 -m json.tool | grep -E "traefik" || echo "No Traefik labels found"
      register: gitea_labels
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Verify service port is 3000
      ansible.builtin.shell: |
        docker inspect {{ gitea_container_name }} --format '{{ '{{' }}json .Config.Labels{{ '}}' }}' 2>/dev/null | python3 -c "import sys, json; labels = json.load(sys.stdin); print('server.port:', labels.get('traefik.http.services.gitea.loadbalancer.server.port', 'NOT SET'))"
      register: gitea_service_port
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check ServersTransport configuration
      ansible.builtin.shell: |
        docker inspect {{ gitea_container_name }} --format '{{ '{{' }}json .Config.Labels{{ '}}' }}' 2>/dev/null | python3 -c "
        import sys, json
        labels = json.load(sys.stdin)
        transport = labels.get('traefik.http.services.gitea.loadbalancer.serversTransport', '')
        if transport:
            print('ServersTransport:', transport)
            print('dialtimeout:', labels.get('traefik.http.serverstransports.gitea-transport.forwardingtimeouts.dialtimeout', 'NOT SET'))
            print('responseheadertimeout:', labels.get('traefik.http.serverstransports.gitea-transport.forwardingtimeouts.responseheadertimeout', 'NOT SET'))
            print('idleconntimeout:', labels.get('traefik.http.serverstransports.gitea-transport.forwardingtimeouts.idleconntimeout', 'NOT SET'))
            print('maxidleconnsperhost:', labels.get('traefik.http.serverstransports.gitea-transport.maxidleconnsperhost', 'NOT SET'))
        else:
            print('ServersTransport: NOT CONFIGURED')
        "
      register: gitea_timeout_config
      changed_when: false
      failed_when: false
      tags:
        - complete
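    # For reference, a Gitea service that satisfies the two checks above would
    # carry compose labels roughly like this (an illustrative sketch, not the
    # deployed configuration):
    #   labels:
    #     - "traefik.http.services.gitea.loadbalancer.server.port=3000"
    #     - "traefik.http.services.gitea.loadbalancer.serversTransport=gitea-transport@docker"
    #     - "traefik.http.serverstransports.gitea-transport.forwardingtimeouts.dialtimeout=30s"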

    - name: Check Gitea app.ini proxy settings
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T {{ gitea_container_name }} cat /data/gitea/conf/app.ini 2>/dev/null | grep -E "PROXY_TRUSTED_PROXIES|LOCAL_ROOT_URL|COOKIE_SECURE|SAME_SITE" || echo "Proxy settings not found in app.ini"
      register: gitea_proxy_settings
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check if Traefik can resolve Gitea hostname
      ansible.builtin.shell: |
        docker exec {{ traefik_container_name }} getent hosts {{ gitea_container_name }} || echo "DNS resolution failed"
      register: traefik_dns_resolution
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check Docker networks for Gitea and Traefik
      ansible.builtin.shell: |
        docker inspect {{ gitea_container_name }} --format '{{ '{{' }}json .NetworkSettings.Networks{{ '}}' }}' | python3 -c "import sys, json; data=json.load(sys.stdin); print('Gitea networks:', list(data.keys()))"
        docker inspect {{ traefik_container_name }} --format '{{ '{{' }}json .NetworkSettings.Networks{{ '}}' }}' | python3 -c "import sys, json; data=json.load(sys.stdin); print('Traefik networks:', list(data.keys()))"
      register: docker_networks_check
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Test long-running endpoint from external
      ansible.builtin.uri:
        url: "{{ gitea_url }}/user/events"
        method: GET
        status_code: [200, 504]
        validate_certs: false
        timeout: 60
      register: long_running_endpoint_test
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check Redis connection from Gitea
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T {{ gitea_container_name }} sh -c "redis-cli -h redis -a {{ vault_gitea_redis_password | default('gitea_redis_password') }} ping 2>&1" || echo "REDIS_CONNECTION_FAILED"
      register: gitea_redis_connection
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Check PostgreSQL connection from Gitea
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T {{ gitea_container_name }} sh -c "pg_isready -h postgres -p 5432 -U gitea 2>&1" || echo "POSTGRES_CONNECTION_FAILED"
      register: gitea_postgres_connection
      changed_when: false
      failed_when: false
      tags:
        - complete
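    # Caveat: the two dependency checks above assume redis-cli and pg_isready
    # exist inside the Gitea container; if the image does not ship them, the
    # *_CONNECTION_FAILED fallback fires even when the services are healthy.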

    # ========================================
    # SUMMARY
    # ========================================
    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA DIAGNOSIS SUMMARY
          ================================================================================

          Container Status:
          - Gitea: {{ gitea_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          - Traefik: {{ traefik_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}

          Health Checks:
          - Gitea (direct): {% if 'HEALTH_CHECK_FAILED' not in gitea_health_direct.stdout %}✅{% else %}❌{% endif %}
          - Gitea (via Traefik): {% if gitea_health_traefik.status == 200 %}✅{% else %}❌ (Status: {{ gitea_health_traefik.status | default('TIMEOUT') }}){% endif %}

          Network:
          - Gitea in traefik-public: {% if gitea_in_traefik_network.stdout == 'YES' %}✅{% else %}❌{% endif %}
          - Traefik → Gitea: {% if 'CONNECTION_FAILED' not in traefik_gitea_connection.stdout %}✅{% else %}❌{% endif %}

          Service Discovery:
          - Traefik finds Gitea: {% if 'NOT_FOUND' not in traefik_gitea_service.stdout %}✅{% else %}❌{% endif %}

          {% if 'deep' in ansible_run_tags or 'complete' in ansible_run_tags %}
          Resources:
          - Gitea: {{ gitea_resources.stdout | default('N/A') }}
          - Traefik: {{ traefik_resources.stdout | default('N/A') }}

          Connection Tests:
          - Direct (5 attempts): {{ gitea_direct_tests.stdout | default('N/A') }}
          - Via Traefik (5 attempts): {{ gitea_traefik_tests.stdout | default('N/A') }}

          Error Logs:
          - Gitea: {{ gitea_errors.stdout | default('No errors') }}
          - Traefik: {{ traefik_gitea_errors.stdout | default('No errors') }}
          {% endif %}

          {% if 'complete' in ansible_run_tags %}
          Configuration:
          - Service Port: {{ gitea_service_port.stdout | default('N/A') }}
          - ServersTransport: {{ gitea_timeout_config.stdout | default('N/A') }}
          - Proxy Settings: {{ gitea_proxy_settings.stdout | default('N/A') }}
          - DNS Resolution: {{ traefik_dns_resolution.stdout | default('N/A') }}
          - Networks: {{ docker_networks_check.stdout | default('N/A') }}

          Long-Running Endpoint:
          - Status: {{ long_running_endpoint_test.status | default('N/A') }}

          Dependencies:
          - Redis: {% if 'REDIS_CONNECTION_FAILED' not in gitea_redis_connection.stdout %}✅{% else %}❌{% endif %}
          - PostgreSQL: {% if 'POSTGRES_CONNECTION_FAILED' not in gitea_postgres_connection.stdout %}✅{% else %}❌{% endif %}
          {% endif %}

          ================================================================================
          RECOMMENDATIONS
          ================================================================================

          {% if gitea_health_traefik.status != 200 %}
          ❌ Gitea is not reachable via Traefik
          → Run: ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags restart
          {% endif %}

          {% if gitea_in_traefik_network.stdout != 'YES' %}
          ❌ Gitea is not in traefik-public network
          → Restart Gitea container to update network membership
          {% endif %}

          {% if 'CONNECTION_FAILED' in traefik_gitea_connection.stdout %}
          ❌ Traefik cannot reach Gitea
          → Restart both containers
          {% endif %}

          {% if 'NOT_FOUND' in traefik_gitea_service.stdout %}
          ❌ Gitea not found in Traefik service discovery
          → Restart Traefik to refresh service discovery
          {% endif %}

          ================================================================================

229
deployment/ansible/playbooks/diagnose/traefik.yml
Normal file
@@ -0,0 +1,229 @@
---
# Consolidated Traefik Diagnosis Playbook
# Consolidates: diagnose-traefik-restarts.yml, find-traefik-restart-source.yml,
#               monitor-traefik-restarts.yml, monitor-traefik-continuously.yml,
#               verify-traefik-fix.yml
#
# Usage:
#   # Basic diagnosis (default)
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml
#
#   # Find restart source
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags restart-source
#
#   # Monitor restarts
#   ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags monitor

- name: Diagnose Traefik Issues
  hosts: production
  gather_facts: yes
  become: yes
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    traefik_container_name: "traefik"
    # Plain defaults: a var that references itself, e.g.
    # monitor_duration_seconds: "{{ monitor_duration_seconds | default(120) }}",
    # recurses in Ansible. Override per run with -e instead.
    monitor_duration_seconds: 120
    monitor_lookback_hours: 24
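  # Example override (a sketch; values are illustrative):
  #   ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml \
  #     --tags monitor -e "monitor_duration_seconds=300 monitor_lookback_hours=48"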
  tasks:
    - name: Display diagnostic plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK DIAGNOSIS
          ================================================================================

          Running diagnosis with tags: {{ ansible_run_tags | default(['all']) }}

          Basic checks (always):
          - Container status
          - Restart count
          - Recent logs

          Restart source (--tags restart-source):
          - Find source of restart loops
          - Check cronjobs, systemd, scripts

          Monitor (--tags monitor):
          - Monitor for restarts over time

          ================================================================================

    # ========================================
    # BASIC DIAGNOSIS (always runs)
    # ========================================
    - name: Check Traefik container status
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps {{ traefik_container_name }}
      register: traefik_status
      changed_when: false

    - name: Check Traefik container restart count
      ansible.builtin.shell: |
        docker inspect {{ traefik_container_name }} --format '{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "0"
      register: traefik_restart_count
      changed_when: false

    - name: Check Traefik container start time
      ansible.builtin.shell: |
        docker inspect {{ traefik_container_name }} --format '{{ '{{' }}.State.StartedAt{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: traefik_started_at
      changed_when: false

    - name: Check Traefik logs for recent restarts
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs {{ traefik_container_name }} --since 2h 2>&1 | grep -iE "stopping server gracefully|I have to go|restart|shutdown" | tail -20 || echo "No restart messages in last 2 hours"
      register: traefik_restart_logs
      changed_when: false
      failed_when: false

    - name: Check Traefik logs for errors
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs {{ traefik_container_name }} --tail=100 2>&1 | grep -iE "error|warn|fail" | tail -20 || echo "No errors in recent logs"
      register: traefik_error_logs
      changed_when: false
      failed_when: false

    # ========================================
    # RESTART SOURCE DIAGNOSIS (--tags restart-source)
    # ========================================
    - name: Check all user crontabs for Traefik/Docker commands
      ansible.builtin.shell: |
        for user in $(cut -f1 -d: /etc/passwd); do
          crontab -u "$user" -l 2>/dev/null | grep -qE "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" && echo "=== User: $user ===" && crontab -u "$user" -l 2>/dev/null | grep -E "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" || true
        done || echo "No user crontabs with Traefik commands found"
      register: all_user_crontabs
      changed_when: false
      tags:
        - restart-source

    - name: Check system-wide cron directories
      ansible.builtin.shell: |
        for dir in /etc/cron.d /etc/cron.daily /etc/cron.hourly /etc/cron.weekly /etc/cron.monthly; do
          if [ -d "$dir" ]; then
            echo "=== $dir ==="
            grep -rE "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" "$dir" 2>/dev/null || echo "No matches"
          fi
        done
      register: system_cron_dirs
      changed_when: false
      tags:
        - restart-source

    - name: Check systemd timers and services
      ansible.builtin.shell: |
        echo "=== Active Timers ==="
        systemctl list-timers --all --no-pager | grep -E "traefik|docker.*compose" || echo "No Traefik-related timers"
        echo ""
        echo "=== Custom Services ==="
        systemctl list-units --type=service --all | grep -E "traefik|docker.*compose" || echo "No Traefik-related services"
      register: systemd_services
      changed_when: false
      tags:
        - restart-source

    - name: Check for scripts in deployment directory that restart Traefik
      ansible.builtin.shell: |
        find /home/deploy/deployment -type f \( -name "*.sh" -o -name "*.yml" -o -name "*.yaml" \) -exec grep -lE "traefik.*restart|docker.*compose.*traefik.*restart|docker.*compose.*traefik.*down|docker.*compose.*traefik.*stop" {} \; 2>/dev/null | head -30
      register: deployment_scripts
      changed_when: false
      tags:
        - restart-source

    - name: Check Ansible roles for traefik_auto_restart or restart tasks
      ansible.builtin.shell: |
        grep -rE "traefik_auto_restart|traefik.*restart|docker.*compose.*traefik.*restart" /home/deploy/deployment/ansible/roles/ 2>/dev/null | grep -v ".git" | head -20 || echo "No auto-restart settings found"
      register: ansible_auto_restart
      changed_when: false
      tags:
        - restart-source

    - name: Check Docker events for Traefik (last 24 hours)
      ansible.builtin.shell: |
        # --until now lets the stream terminate on its own; without it the
        # timeout kills the command and the || fallback fires even when events
        # were printed
        timeout 5 docker events --since 24h --until now --filter container={{ traefik_container_name }} --filter event=die --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }}" 2>/dev/null | tail -20 || echo "No Traefik die events found"
      register: docker_events_traefik
      changed_when: false
      failed_when: false
      tags:
        - restart-source

    # ========================================
    # MONITOR (--tags monitor)
    # ========================================
    - name: Check Traefik logs for stop messages (lookback period)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs {{ traefik_container_name }} --since {{ monitor_lookback_hours }}h 2>&1 | grep -E "I have to go|Stopping server gracefully" | tail -20 || echo "No stop messages found"
      register: traefik_stop_messages
      changed_when: false
      tags:
        - monitor

    - name: Count stop messages
      ansible.builtin.set_fact:
        stop_count: "{{ traefik_stop_messages.stdout | regex_findall('I have to go|Stopping server gracefully') | length }}"
      tags:
        - monitor

    - name: Check system reboot history
      ansible.builtin.shell: |
        last reboot | head -5 || echo "No reboots found"
      register: reboots
      changed_when: false
      tags:
        - monitor

    # ========================================
    # SUMMARY
    # ========================================
    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK DIAGNOSIS SUMMARY
          ================================================================================

          Container Status:
          - Status: {{ traefik_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          - Restart Count: {{ traefik_restart_count.stdout }}
          - Started At: {{ traefik_started_at.stdout }}

          Recent Logs:
          - Restart Messages (last 2h): {{ traefik_restart_logs.stdout | default('None') }}
          - Errors (last 100 lines): {{ traefik_error_logs.stdout | default('None') }}

          {% if 'restart-source' in ansible_run_tags %}
          Restart Source Analysis:
          - User Crontabs: {{ all_user_crontabs.stdout | default('None found') }}
          - System Cron: {{ system_cron_dirs.stdout | default('None found') }}
          - Systemd Services/Timers: {{ systemd_services.stdout | default('None found') }}
          - Deployment Scripts: {{ deployment_scripts.stdout | default('None found') }}
          - Ansible Auto-Restart: {{ ansible_auto_restart.stdout | default('None found') }}
          - Docker Events: {{ docker_events_traefik.stdout | default('None found') }}
          {% endif %}

          {% if 'monitor' in ansible_run_tags %}
          Monitoring (last {{ monitor_lookback_hours }} hours):
          - Stop Messages: {{ stop_count | default(0) }}
          - System Reboots: {{ reboots.stdout | default('None') }}
          {% endif %}

          ================================================================================
          RECOMMENDATIONS
          ================================================================================

          {% if 'stopping server gracefully' in traefik_restart_logs.stdout | lower or 'I have to go' in traefik_restart_logs.stdout %}
          ❌ PROBLEM: Traefik is being stopped regularly!
          → Run with --tags restart-source to find the source
          {% endif %}

          {% if (traefik_restart_count.stdout | int) > 5 %}
          ⚠️ WARNING: High restart count ({{ traefik_restart_count.stdout }})
          → Check restart source: ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags restart-source
          {% endif %}

          ================================================================================

deployment/ansible/playbooks/disable-traefik-auto-restarts.yml
@@ -1,136 +0,0 @@
---
# Disable Traefik Auto-Restarts
# Disables automatic restarts after config deployment and removes cronjobs/systemd timers
- name: Disable Traefik Auto-Restarts
  hosts: production
  gather_facts: yes
  become: yes

  tasks:
    - name: Check current traefik_auto_restart setting in Ansible defaults
      ansible.builtin.shell: |
        grep -r "traefik_auto_restart" /home/deploy/deployment/ansible/roles/traefik/defaults/main.yml 2>/dev/null || echo "Setting not found"
      register: current_auto_restart_setting
      changed_when: false

    - name: Display current traefik_auto_restart setting
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Current traefik_auto_restart setting:
          ================================================================================
          {{ current_auto_restart_setting.stdout }}
          ================================================================================

    - name: Check for cronjobs that restart Traefik
      ansible.builtin.shell: |
        for user in $(cut -f1 -d: /etc/passwd); do
          crontab -u "$user" -l 2>/dev/null | grep -q "traefik\|docker.*compose.*traefik.*restart" && echo "=== User: $user ===" && crontab -u "$user" -l 2>/dev/null | grep -E "traefik|docker.*compose.*traefik.*restart" || true
        done || echo "No cronjobs found that restart Traefik"
      register: traefik_cronjobs
      changed_when: false

    - name: Display Traefik cronjobs
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Cronjobs that restart Traefik:
          ================================================================================
          {{ traefik_cronjobs.stdout }}
          ================================================================================

    - name: Check for systemd timers that restart Traefik
      ansible.builtin.shell: |
        find /etc/systemd/system -type f -name "*.timer" 2>/dev/null | xargs grep -l "traefik\|docker.*compose.*traefik.*restart" 2>/dev/null | head -10 || echo "No systemd timers found for Traefik"
      register: traefik_timers
      changed_when: false

    - name: Display Traefik systemd timers
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Systemd timers that restart Traefik:
          ================================================================================
          {{ traefik_timers.stdout }}
          ================================================================================

    - name: Check for systemd services that restart Traefik
      ansible.builtin.shell: |
        find /etc/systemd/system -type f -name "*.service" 2>/dev/null | xargs grep -l "traefik\|docker.*compose.*traefik.*restart" 2>/dev/null | head -10 || echo "No systemd services found for Traefik"
      register: traefik_services
      changed_when: false

    - name: Display Traefik systemd services
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Systemd services that restart Traefik:
          ================================================================================
          {{ traefik_services.stdout }}
          ================================================================================

    - name: Summary - Found auto-restart mechanisms
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Auto-restart mechanisms found:
          ================================================================================

          Ansible traefik_auto_restart: {{ current_auto_restart_setting.stdout }}

          {% if traefik_cronjobs.stdout and 'No cronjobs' not in traefik_cronjobs.stdout %}
          ⚠️ Cronjobs found:
          {{ traefik_cronjobs.stdout }}

          Manual deactivation required:
          - Remove the cronjob entries manually
          - Or use: crontab -e
          {% endif %}

          {% if traefik_timers.stdout and 'No systemd timers' not in traefik_timers.stdout %}
          ⚠️ Systemd timers found:
          {{ traefik_timers.stdout }}

          Manual deactivation required:
          - systemctl stop <timer-name>
          - systemctl disable <timer-name>
          {% endif %}

          {% if traefik_services.stdout and 'No systemd services' not in traefik_services.stdout %}
          ⚠️ Systemd services found:
          {{ traefik_services.stdout }}

          Manual deactivation required:
          - systemctl stop <service-name>
          - systemctl disable <service-name>
          {% endif %}

          {% if 'No cronjobs' in traefik_cronjobs.stdout and 'No systemd timers' in traefik_timers.stdout and 'No systemd services' in traefik_services.stdout %}
          ✅ No automatic restart mechanisms found (apart from Ansible traefik_auto_restart)
          {% endif %}

          Recommendation:
          - Set traefik_auto_restart: false in group_vars or the inventory
          - Or override during config deployment: -e "traefik_auto_restart=false"
          ================================================================================

    - name: Note - Manual steps required
      ansible.builtin.debug:
        msg: |
          ================================================================================
          NOTE - Manual steps required:
          ================================================================================

          This playbook only reports the auto-restart mechanisms it finds.

          To disable traefik_auto_restart:

          1. Add to group_vars/production/vars.yml or the inventory:
             traefik_auto_restart: false

          2. Or override on every config deployment:
             ansible-playbook ... -e "traefik_auto_restart=false"

          3. For cronjobs/systemd: see above for manual deactivation

          ================================================================================
deployment/ansible/playbooks/ensure-gitea-traefik-discovery.yml
@@ -1,90 +0,0 @@
---
# Ensure Gitea is Discovered by Traefik
# This playbook ensures that Traefik properly discovers Gitea after restarts
- name: Ensure Gitea is Discovered by Traefik
  hosts: production
  gather_facts: no
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    max_wait_seconds: 60
    check_interval: 5

  tasks:
    - name: Check if Gitea container is running
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps gitea | grep -q "Up" && echo "RUNNING" || echo "NOT_RUNNING"
      register: gitea_status
      changed_when: false

    - name: Start Gitea if not running
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose up -d gitea
      when: gitea_status.stdout == "NOT_RUNNING"
      register: gitea_start

    - name: Wait for Gitea to be ready
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      when: gitea_start.changed | default(false) | bool

    - name: Check if Traefik can see Gitea container
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik sh -c 'wget -qO- http://localhost:8080/api/http/routers 2>&1 | python3 -m json.tool 2>&1 | grep -qi gitea && echo "FOUND" || echo "NOT_FOUND"'
      register: traefik_gitea_check
      changed_when: false
      failed_when: false
      retries: "{{ (max_wait_seconds | int) // (check_interval | int) }}"
      delay: "{{ check_interval }}"
      until: traefik_gitea_check.stdout == "FOUND"

    - name: Restart Traefik if Gitea not found
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose restart traefik
      when: traefik_gitea_check.stdout == "NOT_FOUND"
      register: traefik_restart

    - name: Wait for Traefik to be ready after restart
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      when: traefik_restart.changed | default(false) | bool

    - name: Verify Gitea is reachable via Traefik
      ansible.builtin.uri:
        url: "https://{{ gitea_domain }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_check
      retries: 5
      delay: 2
      until: gitea_health_check.status == 200
      failed_when: false

    - name: Display result
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA TRAEFIK DISCOVERY - RESULT
          ================================================================================

          Gitea Status: {{ gitea_status.stdout }}
          Traefik Discovery: {{ traefik_gitea_check.stdout }}
          Gitea Health Check: {{ 'OK' if (gitea_health_check.status | default(0) == 200) else 'FAILED' }}

          {% if gitea_health_check.status | default(0) == 200 %}
          ✅ Gitea is reachable via Traefik
          {% else %}
          ❌ Gitea is not reachable via Traefik
          {% endif %}

          ================================================================================
deployment/ansible/playbooks/find-ansible-automation-source.yml
@@ -1,246 +0,0 @@
---
# Find Ansible Automation Source
# Finds the source of the external Ansible automation that keeps restarting Traefik
- name: Find Ansible Automation Source
  hosts: production
  gather_facts: yes
  become: yes

  tasks:
    - name: Check for running Ansible processes
      ansible.builtin.shell: |
        ps aux | grep -E "ansible|ansible-playbook|ansible-pull" | grep -v grep || echo "No Ansible processes found"
      register: ansible_processes
      changed_when: false

    - name: Check for ansible-pull processes
      ansible.builtin.shell: |
        ps aux | grep ansible-pull | grep -v grep || echo "No ansible-pull processes found"
      register: ansible_pull_processes
      changed_when: false

    - name: Check systemd timers for ansible-pull
      ansible.builtin.shell: |
        systemctl list-timers --all --no-pager | grep -i ansible || echo "No ansible timers found"
      register: ansible_timers
      changed_when: false

    - name: Check for ansible-pull cronjobs
      ansible.builtin.shell: |
        for user in $(cut -f1 -d: /etc/passwd); do
          crontab -u "$user" -l 2>/dev/null | grep -q "ansible-pull\|ansible.*playbook" && echo "=== User: $user ===" && crontab -u "$user" -l 2>/dev/null | grep -E "ansible-pull|ansible.*playbook" || true
        done || echo "No ansible-pull cronjobs found"
      register: ansible_cronjobs
      changed_when: false

    - name: Check system-wide cron for ansible
      ansible.builtin.shell: |
        for dir in /etc/cron.d /etc/cron.daily /etc/cron.hourly /etc/cron.weekly /etc/cron.monthly; do
          if [ -d "$dir" ]; then
            grep -rE "ansible-pull|ansible.*playbook" "$dir" 2>/dev/null && echo "=== Found in $dir ===" || true
          fi
        done || echo "No ansible in system cron"
      register: ansible_system_cron
      changed_when: false

    - name: Check journalctl for ansible-ansible processes
      ansible.builtin.shell: |
        journalctl --since "24 hours ago" --no-pager | grep -iE "ansible-ansible|ansible-playbook|ansible-pull" | tail -50 || echo "No ansible processes in journalctl"
      register: ansible_journal
      changed_when: false

    - name: Check for ansible-pull configuration files
      ansible.builtin.shell: |
        find /home -name "*ansible-pull*" -o -name "*ansible*.yml" -path "*/ansible-pull/*" 2>/dev/null | head -20 || echo "No ansible-pull config files found"
      register: ansible_pull_configs
      changed_when: false

    - name: Check for running docker compose commands related to Traefik
      ansible.builtin.shell: |
        ps aux | grep -E "docker.*compose.*traefik|docker.*restart.*traefik" | grep -v grep || echo "No docker compose traefik commands running"
      register: docker_traefik_commands
      changed_when: false

    - name: Check Docker events for Traefik kill events (last hour)
      ansible.builtin.shell: |
        docker events --since 1h --until now --filter container=traefik --filter event=die --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.signal{{ '}}' }}" 2>/dev/null | tail -20 || echo "No Traefik die events in last hour"
      register: traefik_kill_events
      changed_when: false
      failed_when: false

    - name: Check journalctl for docker compose traefik commands
      ansible.builtin.shell: |
        journalctl --since "24 hours ago" --no-pager | grep -iE "docker.*compose.*traefik|docker.*restart.*traefik" | tail -30 || echo "No docker compose traefik commands in journalctl"
      register: docker_traefik_journal
      changed_when: false

    - name: Check for CI/CD scripts that might run Ansible
      ansible.builtin.shell: |
        find /home/deploy -type f \( -name "*.sh" -o -name "*.yml" -o -name "*.yaml" \) -exec grep -lE "ansible.*playbook.*traefik|docker.*compose.*traefik.*restart" {} \; 2>/dev/null | head -20 || echo "No CI/CD scripts found"
      register: cicd_scripts
      changed_when: false

    - name: Check for Gitea Workflows that run Ansible
      ansible.builtin.shell: |
        find /home/deploy -type f -path "*/.gitea/workflows/*.yml" -o -path "*/.github/workflows/*.yml" 2>/dev/null | xargs grep -lE "ansible.*playbook.*traefik|docker.*compose.*traefik" 2>/dev/null | head -10 || echo "No Gitea workflows found"
      register: gitea_workflows
      changed_when: false

    - name: Check for monitoring/healthcheck scripts
      ansible.builtin.shell: |
        find /home/deploy -type f -name "*monitor*" -o -name "*health*" 2>/dev/null | xargs grep -lE "traefik.*restart|docker.*compose.*traefik" 2>/dev/null | head -10 || echo "No monitoring scripts found"
      register: monitoring_scripts
      changed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          ANSIBLE AUTOMATION SOURCE DIAGNOSIS:
          ================================================================================

          Running Ansible processes:
          {{ ansible_processes.stdout }}

          ansible-pull processes:
          {{ ansible_pull_processes.stdout }}

          Systemd timers for Ansible:
          {{ ansible_timers.stdout }}

          Cronjobs for Ansible:
          {{ ansible_cronjobs.stdout }}

          System cron for Ansible:
          {{ ansible_system_cron.stdout }}

          Ansible processes in journalctl (last 24h):
          {{ ansible_journal.stdout }}

          ansible-pull configuration files:
          {{ ansible_pull_configs.stdout }}

          Running Docker Compose Traefik commands:
          {{ docker_traefik_commands.stdout }}

          Traefik kill events (last hour):
          {{ traefik_kill_events.stdout }}

          Docker Compose Traefik commands in journalctl:
          {{ docker_traefik_journal.stdout }}

          CI/CD scripts that restart Traefik:
          {{ cicd_scripts.stdout }}

          Gitea workflows that restart Traefik:
          {{ gitea_workflows.stdout }}

          Monitoring scripts that restart Traefik:
          {{ monitoring_scripts.stdout }}

          ================================================================================
          ANALYSIS:
          ================================================================================

          {% if 'No Ansible processes found' not in ansible_processes.stdout %}
          ⚠️ ACTIVE ANSIBLE PROCESSES FOUND:
          {{ ansible_processes.stdout }}

          → These processes could be restarting Traefik regularly
          → Inspect their command lines to identify the playbook
          {% endif %}

          {% if 'No ansible-pull processes found' not in ansible_pull_processes.stdout %}
          ❌ ANSIBLE-PULL IS RUNNING:
          {{ ansible_pull_processes.stdout }}

          → ansible-pull executes playbooks on a schedule
          → This is most likely the source of the Traefik restarts
          {% endif %}

          {% if 'No ansible timers found' not in ansible_timers.stdout %}
          ❌ ANSIBLE TIMER FOUND:
          {{ ansible_timers.stdout }}

          → A systemd timer runs Ansible on a schedule
          → Disable it with: systemctl disable <timer-name>
          {% endif %}

          {% if 'No ansible-pull cronjobs found' not in ansible_cronjobs.stdout %}
          ❌ ANSIBLE CRONJOB FOUND:
          {{ ansible_cronjobs.stdout }}

          → A cronjob runs Ansible on a schedule
          → Remove or comment out the cronjob entry
          {% endif %}

          {% if cicd_scripts.stdout and 'No CI/CD scripts found' not in cicd_scripts.stdout %}
          ⚠️ CI/CD SCRIPTS FOUND:
          {{ cicd_scripts.stdout }}

          → These scripts could be restarting Traefik regularly
          → Review these files and remove/comment out Traefik restart commands
          {% endif %}

          {% if gitea_workflows.stdout and 'No Gitea workflows found' not in gitea_workflows.stdout %}
          ⚠️ GITEA WORKFLOWS FOUND:
          {{ gitea_workflows.stdout }}

          → These workflows could be restarting Traefik regularly
          → Review these workflows and remove/comment out Traefik restart steps
          {% endif %}

          {% if monitoring_scripts.stdout and 'No monitoring scripts found' not in monitoring_scripts.stdout %}
          ⚠️ MONITORING SCRIPTS FOUND:
          {{ monitoring_scripts.stdout }}

          → These scripts could be restarting Traefik regularly
          → Review these scripts and remove/comment out Traefik restart commands
          {% endif %}

          ================================================================================
          SOLUTION:
          ================================================================================

          {% if 'No Ansible processes found' in ansible_processes.stdout and 'No ansible-pull processes found' in ansible_pull_processes.stdout and 'No ansible timers found' in ansible_timers.stdout and 'No ansible-pull cronjobs found' in ansible_cronjobs.stdout %}
          ℹ️ No active Ansible automation found

          Possible causes:
          1. Ansible processes only run intermittently
          2. An external CI/CD pipeline runs Ansible
          3. Manual Ansible invocations from outside

          Next steps:
          1. Watch Docker events in real time: docker events --filter container=traefik
          2. Watch Ansible processes: watch -n 1 'ps aux | grep ansible'
          3. Check whether external CI/CD pipelines run Ansible
          {% else %}

          IMMEDIATE ACTION:

          {% if 'No ansible-pull processes found' not in ansible_pull_processes.stdout %}
          1. ❌ Stop ansible-pull:
             pkill -f ansible-pull
          {% endif %}

          {% if 'No ansible timers found' not in ansible_timers.stdout %}
          2. ❌ Disable the Ansible timer:
             systemctl stop <timer-name>
             systemctl disable <timer-name>
          {% endif %}

          {% if 'No ansible-pull cronjobs found' not in ansible_cronjobs.stdout %}
          3. ❌ Remove the Ansible cronjobs:
             crontab -u <user> -e
             (comment out or remove the Ansible lines)
          {% endif %}

          LONG-TERM SOLUTION:

          1. Review the scripts/workflows found and remove Traefik restart commands
          2. If health checks are needed, use longer intervals (e.g. 5 minutes instead of 30 seconds)
          3. Restart Traefik only on real failures, not preventively
          {% endif %}

          ================================================================================
deployment/ansible/playbooks/find-traefik-restart-source.yml
@@ -1,328 +0,0 @@
---
# Find Source of Traefik Restarts
# Comprehensive diagnosis to find the source of the regular Traefik restarts
- name: Find Source of Traefik Restarts
  hosts: production
  gather_facts: yes
  become: yes
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    monitor_duration_seconds: 120  # 2 minutes of monitoring (can be increased)

  tasks:
    - name: Check Traefik container restart count
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "0"
      register: traefik_restart_count
      changed_when: false

    - name: Check Traefik container start time
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.StartedAt{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: traefik_started_at
      changed_when: false

    - name: Analyze Traefik logs for "Stopping server gracefully" messages
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik 2>&1 | grep -i "stopping server gracefully\|I have to go" | tail -20
      register: traefik_stop_messages
      changed_when: false
      failed_when: false

    - name: Extract timestamps from stop messages
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik 2>&1 | grep -i "stopping server gracefully\|I have to go" | tail -20 | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}' | sort | uniq
      register: stop_timestamps
      changed_when: false
      failed_when: false

    - name: Check Docker daemon logs for Traefik stop events
      ansible.builtin.shell: |
        journalctl -u docker.service --since "24 hours ago" --no-pager | grep -iE "traefik.*stop|traefik.*kill|traefik.*die|container.*traefik.*stopped" | tail -30 || echo "No Traefik stop events in Docker daemon logs"
      register: docker_daemon_logs
      changed_when: false
      failed_when: false

    - name: Check Docker events for Traefik (last 24 hours)
      ansible.builtin.shell: |
        docker events --since 24h --until now --filter container=traefik --filter event=die --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.name{{ '}}' }}" 2>/dev/null | tail -20 || echo "No Traefik die events found"
      register: docker_events_traefik
      changed_when: false
      failed_when: false

    - name: Check all user crontabs for Traefik/Docker commands
      ansible.builtin.shell: |
        for user in $(cut -f1 -d: /etc/passwd); do
          crontab -u "$user" -l 2>/dev/null | grep -qE "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" && echo "=== User: $user ===" && crontab -u "$user" -l 2>/dev/null | grep -E "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" || true
        done || echo "No user crontabs with Traefik commands found"
      register: all_user_crontabs
      changed_when: false

    - name: Check system-wide cron directories
      ansible.builtin.shell: |
        for dir in /etc/cron.d /etc/cron.daily /etc/cron.hourly /etc/cron.weekly /etc/cron.monthly; do
          if [ -d "$dir" ]; then
            echo "=== $dir ==="
            grep -rE "traefik|docker.*compose.*traefik|docker.*stop.*traefik|docker.*restart.*traefik|docker.*down.*traefik" "$dir" 2>/dev/null || echo "No matches"
          fi
        done
      register: system_cron_dirs
      changed_when: false

    - name: Check systemd timers and services
      ansible.builtin.shell: |
        echo "=== Active Timers ==="
        systemctl list-timers --all --no-pager | grep -E "traefik|docker.*compose" || echo "No Traefik-related timers"
        echo ""
        echo "=== Custom Services ==="
        systemctl list-units --type=service --all | grep -E "traefik|docker.*compose" || echo "No Traefik-related services"
      register: systemd_services
      changed_when: false

    - name: Check for scripts in deployment directory that restart Traefik
      ansible.builtin.shell: |
        find /home/deploy/deployment -type f \( -name "*.sh" -o -name "*.yml" -o -name "*.yaml" \) -exec grep -lE "traefik.*restart|docker.*compose.*traefik.*restart|docker.*compose.*traefik.*down|docker.*compose.*traefik.*stop" {} \; 2>/dev/null | head -30
      register: deployment_scripts
      changed_when: false

    - name: Check Ansible roles for traefik_auto_restart or restart tasks
      ansible.builtin.shell: |
        grep -rE "traefik_auto_restart|traefik.*restart|docker.*compose.*traefik.*restart" /home/deploy/deployment/ansible/roles/ 2>/dev/null | grep -v ".git" | head -20 || echo "No auto-restart settings found"
      register: ansible_auto_restart
      changed_when: false

    - name: Check Docker Compose watch mode
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps traefik 2>/dev/null | grep -q "traefik" && echo "running" || echo "not_running"
      register: docker_compose_watch
      changed_when: false
      failed_when: false

    - name: Check if Docker Compose is running in watch mode
      ansible.builtin.shell: |
        ps aux | grep -E "docker.*compose.*watch|docker.*compose.*--watch" | grep -v grep || echo "No Docker Compose watch mode detected"
      register: watch_mode_process
      changed_when: false

    - name: Check for monitoring/watchdog scripts
      ansible.builtin.shell: |
        find /home/deploy -type f -name "*monitor*" -o -name "*watchdog*" -o -name "*health*" 2>/dev/null | xargs grep -lE "traefik|docker.*compose.*traefik" 2>/dev/null | head -10 || echo "No monitoring scripts found"
      register: monitoring_scripts
      changed_when: false

    - name: Check Gitea Workflows for Traefik restarts
      ansible.builtin.shell: |
        find /home/deploy -type f -path "*/.gitea/workflows/*.yml" -o -path "*/.github/workflows/*.yml" 2>/dev/null | xargs grep -lE "traefik.*restart|docker.*compose.*traefik.*restart" 2>/dev/null | head -10 || echo "No Gitea workflows found that restart Traefik"
      register: gitea_workflows
      changed_when: false

    - name: Monitor Docker events in real time ({{ monitor_duration_seconds }} seconds)
      ansible.builtin.shell: |
        timeout {{ monitor_duration_seconds }} docker events --filter container=traefik --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.name{{ '}}' }}" 2>&1 || echo "Monitoring completed or timeout"
      register: docker_events_realtime
      changed_when: false
      failed_when: false
      async: "{{ monitor_duration_seconds + 10 }}"
      poll: 0

    - name: Wait for monitoring to complete
      ansible.builtin.async_status:
        jid: "{{ docker_events_realtime.ansible_job_id }}"
      register: monitoring_result
      until: monitoring_result.finished
      retries: "{{ (monitor_duration_seconds / 10) | int + 5 }}"
      delay: 10
      failed_when: false
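    # The pair above is the standard fire-and-forget pattern: `async` with
    # `poll: 0` starts the long-running command in the background, and
    # `async_status` polls the job id until it finishes (or the retries run out).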

    - name: Check system reboot history
      ansible.builtin.shell: |
        last reboot --since "24 hours ago" 2>/dev/null | head -10 || echo "No reboots in last 24 hours"
      register: reboot_history
      changed_when: false
      failed_when: false

    - name: Check for at jobs
      ansible.builtin.shell: |
        atq 2>/dev/null | while read line; do
          job_id=$(echo "$line" | awk '{print $1}')
          at -c "$job_id" 2>/dev/null | grep -qE "traefik|docker.*compose.*traefik" && echo "=== Job ID: $job_id ===" && at -c "$job_id" 2>/dev/null | grep -E "traefik|docker.*compose.*traefik" || true
        done || echo "No at jobs found or atq not available"
      register: at_jobs
      changed_when: false

    - name: Check Docker daemon configuration for auto-restart
      ansible.builtin.shell: |
        cat /etc/docker/daemon.json 2>/dev/null | grep -iE "restart|live-restore" || echo "No restart settings in daemon.json"
      register: docker_daemon_config
      changed_when: false
      failed_when: false

    - name: Check if Traefik has restart policy
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose config | grep -A 5 "traefik:" | grep -E "restart|restart_policy" || echo "No explicit restart policy found"
      register: traefik_restart_policy
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK RESTART SOURCE DIAGNOSIS - SUMMARY:
          ================================================================================

          Traefik status:
          - Restart Count: {{ traefik_restart_count.stdout }}
          - Started At: {{ traefik_started_at.stdout }}
          - Stop messages found: {{ traefik_stop_messages.stdout_lines | length }} (last 20)

          Stop timestamps (last 20):
          {% if stop_timestamps.stdout %}
          {{ stop_timestamps.stdout }}
          {% else %}
          No stop timestamps found
          {% endif %}

          Docker events (last 24h):
          {% if docker_events_traefik.stdout and 'No Traefik die events' not in docker_events_traefik.stdout %}
          {{ docker_events_traefik.stdout }}
          {% else %}
          No Traefik die events in the last 24 hours
          {% endif %}

          Docker daemon logs:
          {% if docker_daemon_logs.stdout and 'No Traefik stop events' not in docker_daemon_logs.stdout %}
          {{ docker_daemon_logs.stdout }}
          {% else %}
          No Traefik stop events in the Docker daemon logs
          {% endif %}

          Sources found:
          {% if all_user_crontabs.stdout and 'No user crontabs' not in all_user_crontabs.stdout %}
          1. ❌ CRONJOBS (user):
          {{ all_user_crontabs.stdout }}
          {% endif %}

          {% if system_cron_dirs.stdout and 'No matches' not in system_cron_dirs.stdout %}
          2. ❌ SYSTEM CRON:
          {{ system_cron_dirs.stdout }}
          {% endif %}

          {% if systemd_services.stdout and 'No Traefik-related' not in systemd_services.stdout %}
          3. ❌ SYSTEMD TIMERS/SERVICES:
          {{ systemd_services.stdout }}
          {% endif %}

          {% if deployment_scripts.stdout and 'No' not in deployment_scripts.stdout %}
          4. ⚠️ DEPLOYMENT SCRIPTS:
          {{ deployment_scripts.stdout }}
          {% endif %}

          {% if ansible_auto_restart.stdout and 'No auto-restart' not in ansible_auto_restart.stdout %}
          5. ⚠️ ANSIBLE AUTO-RESTART:
          {{ ansible_auto_restart.stdout }}
          {% endif %}

          {% if gitea_workflows.stdout and 'No Gitea workflows' not in gitea_workflows.stdout %}
          6. ⚠️ GITEA WORKFLOWS:
          {{ gitea_workflows.stdout }}
          {% endif %}

          {% if monitoring_scripts.stdout and 'No monitoring scripts' not in monitoring_scripts.stdout %}
          7. ⚠️ MONITORING SCRIPTS:
          {{ monitoring_scripts.stdout }}
          {% endif %}

          {% if at_jobs.stdout and 'No at jobs' not in at_jobs.stdout %}
          8. ❌ AT JOBS:
          {{ at_jobs.stdout }}
          {% endif %}

          {% if docker_compose_watch.stdout and 'Could not check' not in docker_compose_watch.stdout %}
          9. ⚠️ DOCKER COMPOSE WATCH:
          {{ docker_compose_watch.stdout }}
          {% endif %}

          {% if watch_mode_process.stdout and 'No Docker Compose watch' not in watch_mode_process.stdout %}
          10. ❌ DOCKER COMPOSE WATCH MODE (PROCESS):
          {{ watch_mode_process.stdout }}
          {% endif %}

          {% if reboot_history.stdout and 'No reboots' not in reboot_history.stdout %}
          11. ⚠️ SYSTEM REBOOTS:
          {{ reboot_history.stdout }}
          {% endif %}

          Real-time monitoring ({{ monitor_duration_seconds }} seconds):
          {% if monitoring_result.finished and monitoring_result.ansible_job_id %}
          {{ monitoring_result.stdout | default('No events during monitoring') }}
          {% else %}
          Monitoring is still running or was interrupted
          {% endif %}

          ================================================================================
          NEXT STEPS:
          ================================================================================

          {% if all_user_crontabs.stdout and 'No user crontabs' not in all_user_crontabs.stdout %}
          1. ❌ DISABLE CRONJOBS:
          - Review the cronjobs found: {{ all_user_crontabs.stdout }}
          - Remove or comment out the corresponding entries
          {% endif %}

          {% if system_cron_dirs.stdout and 'No matches' not in system_cron_dirs.stdout %}
|
||||
2. ❌ SYSTEM CRON DEAKTIVIEREN:
|
||||
- Prüfe gefundene System-Cronjobs: {{ system_cron_dirs.stdout }}
|
||||
- Entferne oder benenne die Dateien um
|
||||
{% endif %}
|
||||
|
||||
{% if systemd_services.stdout and 'No Traefik-related' not in systemd_services.stdout %}
|
||||
3. ❌ SYSTEMD TIMERS/SERVICES DEAKTIVIEREN:
|
||||
- Prüfe gefundene Services/Timers: {{ systemd_services.stdout }}
|
||||
- Deaktiviere mit: systemctl disable <service>
|
||||
{% endif %}
|
||||
|
||||
{% if deployment_scripts.stdout and 'No' not in deployment_scripts.stdout %}
|
||||
4. ⚠️ DEPLOYMENT SCRIPTS PRÜFEN:
|
||||
- Prüfe gefundene Scripts: {{ deployment_scripts.stdout }}
|
||||
- Entferne oder kommentiere Traefik-Restart-Befehle
|
||||
{% endif %}
|
||||
|
||||
{% if ansible_auto_restart.stdout and 'No auto-restart' not in ansible_auto_restart.stdout %}
|
||||
5. ⚠️ ANSIBLE AUTO-RESTART PRÜFEN:
|
||||
- Prüfe gefundene Einstellungen: {{ ansible_auto_restart.stdout }}
|
||||
- Setze traefik_auto_restart: false in group_vars
|
||||
{% endif %}
|
||||
|
||||
{% if not all_user_crontabs.stdout or 'No user crontabs' in all_user_crontabs.stdout %}
|
||||
{% if not system_cron_dirs.stdout or 'No matches' in system_cron_dirs.stdout %}
|
||||
{% if not systemd_services.stdout or 'No Traefik-related' in systemd_services.stdout %}
|
||||
{% if not deployment_scripts.stdout or 'No' in deployment_scripts.stdout %}
|
||||
{% if not ansible_auto_restart.stdout or 'No auto-restart' in ansible_auto_restart.stdout %}
|
||||
|
||||
⚠️ KEINE AUTOMATISCHEN RESTART-MECHANISMEN GEFUNDEN!
|
||||
|
||||
Mögliche Ursachen:
|
||||
1. Externer Prozess (nicht über Cron/Systemd)
|
||||
2. Docker-Service-Restarts (systemctl restart docker)
|
||||
3. Host-Reboots
|
||||
4. Manuelle Restarts (von außen)
|
||||
5. Monitoring-Service (Portainer, Watchtower, etc.)
|
||||
|
||||
Nächste Schritte:
|
||||
1. Führe 'docker events --filter container=traefik' manuell aus und beobachte
|
||||
2. Prüfe journalctl -u docker.service für Docker-Service-Restarts
|
||||
3. Prüfe ob Portainer oder andere Monitoring-Tools laufen
|
||||
4. Prüfe ob Watchtower oder andere Auto-Update-Tools installiert sind
|
||||
{% endif %}
|
||||
{% endif %}
|
||||
{% endif %}
|
||||
{% endif %}
|
||||
{% endif %}
|
||||
|
||||
================================================================================
|
||||
|
||||
@@ -1,175 +0,0 @@
---
# Fix Gitea Complete - Disables the runner and repairs service discovery
# Fixes Gitea timeouts by: 1) disabling the runner, 2) repairing service discovery
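# Usage (illustrative): this file had no usage note; the invocation below mirrors
# the pattern of the other playbooks in this repo, so treat the inventory and
# vault paths as assumptions.
#   ansible-playbook -i inventory/production.yml playbooks/fix-gitea-complete.yml \
#     --vault-password-file secrets/.vault_pass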
- name: Fix Gitea Complete
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_runner_path: "{{ stacks_base_path }}/../gitea-runner"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Check Gitea Runner status
      ansible.builtin.shell: |
        cd {{ gitea_runner_path }}
        docker compose ps gitea-runner 2>/dev/null || echo "Runner not found"
      register: runner_status
      changed_when: false
      failed_when: false

    - name: Display Gitea Runner status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Runner Status (Before):
          ================================================================================
          {{ runner_status.stdout }}
          ================================================================================

    - name: Stop Gitea Runner to reduce load
      ansible.builtin.shell: |
        cd {{ gitea_runner_path }}
        docker compose stop gitea-runner
      register: runner_stop
      changed_when: runner_stop.rc == 0
      failed_when: false
      when: runner_status.rc == 0

    - name: Check Gitea container status before restart
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps gitea
      register: gitea_status_before
      changed_when: false

    - name: Check Traefik container status before restart
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps traefik
      register: traefik_status_before
      changed_when: false

    - name: Restart Gitea container
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose restart gitea
      register: gitea_restart
      changed_when: gitea_restart.rc == 0

    - name: Wait for Gitea to be ready (direct check)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        for i in {1..30}; do
          if docker compose exec -T gitea curl -f http://localhost:3000/api/healthz >/dev/null 2>&1; then
            echo "Gitea is ready"
            exit 0
          fi
          sleep 2
        done
        echo "Gitea not ready after 60 seconds"
        exit 1
      register: gitea_ready
      changed_when: false
      failed_when: false

    - name: Restart Traefik to refresh service discovery
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose restart traefik
      register: traefik_restart
      changed_when: traefik_restart.rc == 0
      when: traefik_auto_restart | default(false) | bool

    - name: Wait for Traefik to be ready
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      changed_when: false
      when: traefik_restart.changed | default(false) | bool

    - name: Wait for Gitea to be reachable via Traefik (with retries)
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_via_traefik
      until: gitea_health_via_traefik.status == 200
      retries: 15
      delay: 2
      changed_when: false
      failed_when: false
      when: (traefik_restart.changed | default(false) | bool) or (gitea_restart.changed | default(false) | bool)

    - name: Check if Gitea is in Traefik service discovery
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik traefik show providers docker 2>/dev/null | grep -i "gitea" || echo "NOT_FOUND"
      register: traefik_gitea_service_check
      changed_when: false
      failed_when: false
      when: (traefik_restart.changed | default(false) | bool) or (gitea_restart.changed | default(false) | bool)

    - name: Final status check
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_status
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Gitea Complete Fix:
          ================================================================================

          Actions:
          - Gitea Runner: {% if runner_stop.changed | default(false) %}✅ Stopped{% else %}ℹ️ Was not active or not found{% endif %}
          - Gitea restart: {% if gitea_restart.changed %}✅ Performed{% else %}ℹ️ Not needed{% endif %}
          - Traefik restart: {% if traefik_restart.changed %}✅ Performed{% else %}ℹ️ Not needed{% endif %}

          Gitea ready check:
          - Direct: {% if gitea_ready.rc == 0 %}✅ Ready{% else %}❌ Not ready{% endif %}

          Final status:
          - Gitea via Traefik: {% if final_status.status == 200 %}✅ Reachable (Status: 200){% else %}❌ Not reachable (Status: {{ final_status.status | default('TIMEOUT') }}){% endif %}
          - Traefik service discovery: {% if 'NOT_FOUND' not in traefik_gitea_service_check.stdout %}✅ Gitea found{% else %}❌ Gitea not found{% endif %}

          {% if final_status.status == 200 and 'NOT_FOUND' not in traefik_gitea_service_check.stdout %}
          ✅ SUCCESS: Gitea is now reachable via Traefik!
          URL: {{ gitea_url }}

          Next steps:
          1. Test Gitea in the browser: {{ gitea_url }}
          2. Once everything runs stably, you can re-enable the runner:
             cd {{ gitea_runner_path }} && docker compose up -d gitea-runner
          3. Watch whether the runner overloads Gitea again
          {% else %}
          ⚠️ PROBLEM: Gitea is not fully reachable yet

          Possible causes:
          {% if final_status.status != 200 %}
          - Gitea does not respond via Traefik (Status: {{ final_status.status | default('TIMEOUT') }})
          {% endif %}
          {% if 'NOT_FOUND' in traefik_gitea_service_check.stdout %}
          - Traefik service discovery has not picked up Gitea yet
          {% endif %}

          Next steps:
          1. Wait 1-2 minutes and test again: curl -k {{ gitea_url }}/api/healthz
          2. Check the Traefik logs: cd {{ traefik_stack_path }} && docker compose logs traefik --tail=50
          3. Check the Gitea logs: cd {{ gitea_stack_path }} && docker compose logs gitea --tail=50
          4. Check service discovery: cd {{ traefik_stack_path }} && docker compose exec -T traefik traefik show providers docker
          {% endif %}

          ================================================================================
@@ -1,195 +0,0 @@
---
# Fix Gitea SSL and Routing Issues
# Checks the SSL certificate and service discovery and fixes routing problems
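# Usage (illustrative; the invocation mirrors the other playbooks in this repo,
# so treat the inventory and vault paths as assumptions):
#   ansible-playbook -i inventory/production.yml playbooks/fix-gitea-ssl-routing.yml \
#     --vault-password-file secrets/.vault_pass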
- name: Fix Gitea SSL and Routing
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"
    gitea_url_http: "http://{{ gitea_domain }}"

  tasks:
    - name: Check Gitea container status
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps gitea
      register: gitea_status
      changed_when: false

    - name: Check Traefik container status
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps traefik
      register: traefik_status
      changed_when: false

    - name: Check if Gitea is in traefik-public network
      ansible.builtin.shell: |
        docker network inspect traefik-public --format '{{ '{{' }}range .Containers{{ '}}' }}{{ '{{' }}.Name{{ '}}' }} {{ '{{' }}end{{ '}}' }}' 2>/dev/null | grep -q gitea && echo "YES" || echo "NO"
      register: gitea_in_network
      changed_when: false

    - name: Test direct connection from Traefik to Gitea (by service name)
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik wget -qO- --timeout=5 http://gitea:3000/api/healthz 2>&1 || echo "CONNECTION_FAILED"
      register: traefik_gitea_direct
      changed_when: false
      failed_when: false

    - name: Check Traefik logs for SSL/ACME errors
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --tail=100 2>&1 | grep -iE "acme|certificate|git\.michaelschiemer\.de|ssl|tls" | tail -20 || echo "No SSL/ACME errors found"
      register: traefik_ssl_errors
      changed_when: false
      failed_when: false

    - name: Check if SSL certificate exists for git.michaelschiemer.de
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T traefik cat /acme.json 2>/dev/null | grep -q "git.michaelschiemer.de" && echo "YES" || echo "NO"
      register: ssl_cert_exists
      changed_when: false
      failed_when: false

    - name: Test Gitea via HTTP (port 80, should redirect or show error)
      ansible.builtin.uri:
        url: "{{ gitea_url_http }}/api/healthz"
        method: GET
        status_code: [200, 301, 302, 404, 502, 503, 504]
        validate_certs: false
        timeout: 10
      register: gitea_http_test
      changed_when: false
      failed_when: false

    - name: Test Gitea via HTTPS
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200, 301, 302, 404, 502, 503, 504]
        validate_certs: false
        timeout: 10
      register: gitea_https_test
      changed_when: false
      failed_when: false

    - name: Display diagnostic information
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA SSL/ROUTING DIAGNOSIS:
          ================================================================================

          Container status:
          - Gitea: {{ gitea_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          - Traefik: {{ traefik_status.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}

          Network:
          - Gitea in traefik-public: {% if gitea_in_network.stdout == 'YES' %}✅{% else %}❌{% endif %}
          - Traefik → Gitea (direct): {% if 'CONNECTION_FAILED' not in traefik_gitea_direct.stdout %}✅{% else %}❌{% endif %}

          SSL/Certificate:
          - Certificate in acme.json: {% if ssl_cert_exists.stdout == 'YES' %}✅{% else %}❌{% endif %}

          Connectivity:
          - HTTP (port 80): Status {{ gitea_http_test.status | default('TIMEOUT') }}
          - HTTPS (port 443): Status {{ gitea_https_test.status | default('TIMEOUT') }}

          Traefik SSL/ACME errors:
          {{ traefik_ssl_errors.stdout }}

          ================================================================================

    - name: Restart Gitea if not in network or connection failed
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose restart gitea
      register: gitea_restart
      changed_when: gitea_restart.rc == 0
      when: gitea_in_network.stdout != 'YES' or 'CONNECTION_FAILED' in traefik_gitea_direct.stdout

    - name: Wait for Gitea to be ready after restart
      ansible.builtin.pause:
        seconds: 30
      when: gitea_restart.changed | default(false)

    - name: Restart Traefik to refresh service discovery and SSL
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose restart traefik
      register: traefik_restart
      changed_when: traefik_restart.rc == 0
      when: >
        (traefik_auto_restart | default(false) | bool)
        and (gitea_restart.changed | default(false) or gitea_https_test.status | default(0) != 200)

    - name: Wait for Traefik to be ready after restart
      ansible.builtin.pause:
        seconds: 15
      when: traefik_restart.changed | default(false)

    - name: Wait for Gitea to be reachable via HTTPS (with retries)
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_gitea_test
      until: final_gitea_test.status == 200
      retries: 20
      delay: 3
      changed_when: false
      failed_when: false
      when: traefik_restart.changed | default(false) or gitea_restart.changed | default(false)

    - name: Final status check
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_status
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Gitea SSL/Routing Fix:
          ================================================================================

          Actions:
          - Gitea restart: {% if gitea_restart.changed | default(false) %}✅ Performed{% else %}ℹ️ Not needed{% endif %}
          - Traefik restart: {% if traefik_restart.changed | default(false) %}✅ Performed{% else %}ℹ️ Not needed{% endif %}

          Final status:
          - Gitea via HTTPS: {% if final_status.status == 200 %}✅ Reachable{% else %}❌ Not reachable (Status: {{ final_status.status | default('TIMEOUT') }}){% endif %}

          {% if final_status.status == 200 %}
          ✅ Gitea is now reachable via Traefik!
          URL: {{ gitea_url }}
          {% else %}
          ⚠️ Gitea is not reachable yet

          Possible causes:
          1. The SSL certificate is still being issued (ACME challenge in progress)
          2. Traefik service discovery needs more time
          3. Network problem between Traefik and Gitea

          Next steps:
          1. Wait 2-5 minutes and test again: curl -k {{ gitea_url }}/api/healthz
          2. Check the Traefik logs: cd {{ traefik_stack_path }} && docker compose logs traefik --tail=50
          3. Check the Gitea logs: cd {{ gitea_stack_path }} && docker compose logs gitea --tail=50
          4. Check the network: docker network inspect traefik-public | grep -A 5 gitea
          {% endif %}

          ================================================================================
@@ -1,159 +0,0 @@
---
# Fix Gitea Timeouts
# Restarts Gitea and Traefik to fix timeout problems
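# Usage (illustrative; the invocation mirrors the other playbooks in this repo,
# so treat the inventory and vault paths as assumptions):
#   ansible-playbook -i inventory/production.yml playbooks/fix-gitea-timeouts.yml \
#     --vault-password-file secrets/.vault_pass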
- name: Fix Gitea Timeouts
  hosts: production
  gather_facts: yes
  become: no

  tasks:
    - name: Check Gitea container status before restart
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose ps gitea
      register: gitea_status_before
      changed_when: false

    - name: Display Gitea status before restart
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Status (Before Restart):
          ================================================================================
          {{ gitea_status_before.stdout }}
          ================================================================================

    - name: Check Traefik container status before restart
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose ps traefik
      register: traefik_status_before
      changed_when: false

    - name: Display Traefik status before restart
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik Status (Before Restart):
          ================================================================================
          {{ traefik_status_before.stdout }}
          ================================================================================

    - name: Restart Gitea container
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose restart gitea
      register: gitea_restart
      changed_when: gitea_restart.rc == 0

    - name: Wait for Gitea to be ready
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_after_restart
      until: gitea_health_after_restart.status == 200
      retries: 30
      delay: 2
      changed_when: false
      failed_when: false

    - name: Display Gitea health after restart
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Gitea Health After Restart:
          ================================================================================
          {% if gitea_health_after_restart.status == 200 %}
          ✅ Gitea is healthy after restart
          {% else %}
          ⚠️ Gitea health check failed after restart (Status: {{ gitea_health_after_restart.status | default('TIMEOUT') }})
          {% endif %}
          ================================================================================

    - name: Restart Traefik to refresh service discovery
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose restart traefik
      register: traefik_restart
      changed_when: traefik_restart.rc == 0
      when: traefik_auto_restart | default(false) | bool

    - name: Wait for Traefik to be ready
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      changed_when: false
      when: traefik_restart.changed | default(false) | bool

    - name: Wait for Gitea to be reachable via Traefik
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_via_traefik
      until: gitea_health_via_traefik.status == 200
      retries: 30
      delay: 2
      changed_when: false
      failed_when: false
      when: (traefik_restart.changed | default(false) | bool) or (gitea_restart.changed | default(false) | bool)

    - name: Check final Gitea container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/gitea
        docker compose ps gitea
      register: gitea_status_after
      changed_when: false

    - name: Check final Traefik container status
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose ps traefik
      register: traefik_status_after
      changed_when: false

    - name: Test Gitea access via Traefik
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_gitea_test
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Gitea Timeout Fix:
          ================================================================================

          Gitea restart: {% if gitea_restart.changed %}✅ Performed{% else %}ℹ️ Not needed{% endif %}
          Traefik restart: {% if traefik_restart.changed %}✅ Performed{% else %}ℹ️ Not needed{% endif %}

          Final status:
          - Gitea: {{ gitea_status_after.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          - Traefik: {{ traefik_status_after.stdout | regex_replace('.*(Up|Down|Restarting).*', '\\1') | default('UNKNOWN') }}
          - Gitea via Traefik: {% if final_gitea_test.status == 200 %}✅ Reachable{% else %}❌ Not reachable (Status: {{ final_gitea_test.status | default('TIMEOUT') }}){% endif %}

          {% if final_gitea_test.status == 200 %}
          ✅ Gitea is now reachable via Traefik!
          URL: https://git.michaelschiemer.de
          {% else %}
          ⚠️ Gitea is not reachable via Traefik yet

          Next steps:
          1. Check the Gitea logs: cd /home/deploy/deployment/stacks/gitea && docker compose logs gitea --tail=50
          2. Check the Traefik logs: cd /home/deploy/deployment/stacks/traefik && docker compose logs traefik --tail=50
          3. Check the network: docker network inspect traefik-public | grep -A 5 gitea
          4. Run diagnose-gitea-timeouts.yml for a detailed diagnosis
          {% endif %}

          ================================================================================
@@ -1,94 +0,0 @@
---
# Ansible Playbook: Fix Gitea-Traefik Connection Issues
# Purpose: Ensure Traefik can reliably reach Gitea by restarting both services
# Usage:
#   ansible-playbook -i inventory/production.yml playbooks/fix-gitea-traefik-connection.yml \
#     --vault-password-file secrets/.vault_pass

- name: Fix Gitea-Traefik Connection
  hosts: production
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Get current Gitea container IP
      shell: |
        docker inspect gitea | grep -A 10 'traefik-public' | grep IPAddress | head -1 | awk '{print $2}' | tr -d '",'
      register: gitea_ip
      changed_when: false
      failed_when: false

    - name: Display Gitea IP
      debug:
        msg: "Gitea container IP in traefik-public network: {{ gitea_ip.stdout }}"

    - name: Test direct connection to Gitea from Traefik container
      shell: |
        docker compose -f {{ traefik_stack_path }}/docker-compose.yml exec -T traefik wget -qO- http://{{ gitea_ip.stdout }}:3000/api/healthz 2>&1 | head -3
      register: traefik_gitea_test
      changed_when: false
      failed_when: false

    - name: Display Traefik-Gitea connection test result
      debug:
        msg: "{{ traefik_gitea_test.stdout }}"

    - name: Restart Gitea container to refresh IP
      shell: |
        docker compose -f {{ gitea_stack_path }}/docker-compose.yml restart gitea
      when: traefik_gitea_test.rc != 0

    - name: Wait for Gitea to be ready
      uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health
      until: gitea_health.status == 200
      retries: 30
      delay: 2
      changed_when: false
      when: traefik_gitea_test.rc != 0

    - name: Restart Traefik to refresh service discovery
      shell: |
        docker compose -f {{ traefik_stack_path }}/docker-compose.yml restart traefik
      when: >
        traefik_gitea_test.rc != 0
        and (traefik_auto_restart | default(false) | bool)
      register: traefik_restart
      changed_when: traefik_restart.rc == 0

    - name: Wait for Traefik to be ready
      pause:
        seconds: 10
      when: traefik_restart.changed | default(false) | bool

    - name: Test Gitea via Traefik
      uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_test
      changed_when: false
      when: traefik_restart.changed | default(false) | bool

    - name: Display result
      debug:
        msg: |
          Gitea-Traefik connection test:
          - Direct connection: {{ 'OK' if traefik_gitea_test.rc == 0 else 'FAILED' }}
          - Via Traefik: {{ 'OK' if (final_test.status | default(0) == 200) else 'FAILED' if (traefik_restart.changed | default(false) | bool) else 'SKIPPED (no restart)' }}

          {% if traefik_restart.changed | default(false) | bool %}
          Traefik has been restarted to refresh service discovery.
          {% elif traefik_gitea_test.rc != 0 %}
          Note: Traefik restart was skipped (traefik_auto_restart=false). Direct connection test failed.
          {% endif %}
@@ -0,0 +1,198 @@
---
# Backup Before Redeploy
# Creates a comprehensive backup of Gitea data, SSL certificates, and configurations
# before redeploying the Traefik and Gitea stacks
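# Usage (illustrative; this playbook's own path is not visible in this diff, so
# the path below is an assumption, as are the inventory and vault flags):
#   ansible-playbook -i inventory/production.yml playbooks/maintenance/backup-before-redeploy.yml \
#     --vault-password-file secrets/.vault_pass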
- name: Backup Before Redeploy
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    backup_base_path: "{{ backups_path | default('/home/deploy/backups') }}"
    backup_name: "redeploy-backup-{{ ansible_date_time.epoch }}"

  tasks:
    - name: Display backup plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          BACKUP BEFORE REDEPLOY
          ================================================================================

          This playbook will back up:
          1. Gitea data (volumes)
          2. SSL certificates (acme.json)
          3. Gitea configuration (app.ini)
          4. Traefik configuration
          5. PostgreSQL data (if applicable)

          Backup location: {{ backup_base_path }}/{{ backup_name }}

          ================================================================================

    - name: Ensure backup directory exists
      ansible.builtin.file:
        path: "{{ backup_base_path }}/{{ backup_name }}"
        state: directory
        mode: '0755'
      become: yes

    - name: Create backup timestamp file
      ansible.builtin.copy:
        content: |
          Backup created: {{ ansible_date_time.iso8601 }}
          Backup name: {{ backup_name }}
          Purpose: Before Traefik/Gitea redeploy
        dest: "{{ backup_base_path }}/{{ backup_name }}/backup-info.txt"
        mode: '0644'
      become: yes

    # ========================================
    # Backup Gitea Data
    # ========================================
    - name: Check Gitea volumes
      ansible.builtin.shell: |
        docker volume ls --filter name=gitea --format "{{ '{{' }}.Name{{ '}}' }}"
      register: gitea_volumes
      changed_when: false
      failed_when: false

    - name: Backup Gitea volumes
      ansible.builtin.shell: |
        for volume in {{ gitea_volumes.stdout_lines | join(' ') }}; do
          if [ -n "$volume" ]; then
            echo "Backing up volume: $volume"
            docker run --rm \
              -v "$volume:/source:ro" \
              -v "{{ backup_base_path }}/{{ backup_name }}:/backup" \
              alpine tar czf "/backup/gitea-volume-${volume}.tar.gz" -C /source .
          fi
        done
      when: gitea_volumes.stdout_lines | length > 0
      register: gitea_volumes_backup
      changed_when: gitea_volumes_backup.rc == 0

    # ========================================
    # Backup SSL Certificates
    # ========================================
    - name: Check if acme.json exists
      ansible.builtin.stat:
        path: "{{ traefik_stack_path }}/acme.json"
      register: acme_json_stat

    - name: Backup acme.json
      ansible.builtin.copy:
        src: "{{ traefik_stack_path }}/acme.json"
        dest: "{{ backup_base_path }}/{{ backup_name }}/acme.json"
        remote_src: yes
        mode: '0600'
      when: acme_json_stat.stat.exists
      register: acme_backup

    # ========================================
    # Backup Gitea Configuration
    # ========================================
    - name: Backup Gitea app.ini
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T gitea cat /data/gitea/conf/app.ini > "{{ backup_base_path }}/{{ backup_name }}/gitea-app.ini" 2>/dev/null || echo "Could not read app.ini"
      register: gitea_app_ini_backup
      changed_when: false
      failed_when: false

    - name: Backup Gitea docker-compose.yml
      ansible.builtin.copy:
        src: "{{ gitea_stack_path }}/docker-compose.yml"
        dest: "{{ backup_base_path }}/{{ backup_name }}/gitea-docker-compose.yml"
        remote_src: yes
        mode: '0644'
      register: gitea_compose_backup

    # ========================================
    # Backup Traefik Configuration
    # ========================================
    - name: Backup Traefik configuration files
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        tar czf "{{ backup_base_path }}/{{ backup_name }}/traefik-config.tar.gz" \
          traefik.yml \
          docker-compose.yml \
          dynamic/ 2>/dev/null || echo "Some files may be missing"
      register: traefik_config_backup
      changed_when: traefik_config_backup.rc == 0
      failed_when: false

    # ========================================
    # Backup PostgreSQL Data (if applicable)
    # ========================================
    - name: Check if PostgreSQL stack exists
      ansible.builtin.stat:
        path: "{{ stacks_base_path }}/postgresql/docker-compose.yml"
      register: postgres_compose_exists

    - name: Backup PostgreSQL database (if running)
      ansible.builtin.shell: |
        cd {{ stacks_base_path }}/postgresql
        if docker compose ps postgres | grep -q "Up"; then
          docker compose exec -T postgres pg_dumpall -U postgres | gzip > "{{ backup_base_path }}/{{ backup_name }}/postgresql-all-{{ ansible_date_time.epoch }}.sql.gz"
          echo "PostgreSQL backup created"
        else
          echo "PostgreSQL not running, skipping backup"
        fi
      when: postgres_compose_exists.stat.exists
      register: postgres_backup
      changed_when: false
      failed_when: false
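    # Note: the matching restore path for this dump is the rollback playbook
    # (playbooks/maintenance/rollback-redeploy.yml), which feeds the dump back
    # into psql via docker compose exec.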
    # ========================================
    # Verify Backup
    # ========================================
    - name: List backup contents
      ansible.builtin.shell: |
        ls -lh "{{ backup_base_path }}/{{ backup_name }}/"
      register: backup_contents
      changed_when: false

    - name: Calculate backup size
      ansible.builtin.shell: |
        du -sh "{{ backup_base_path }}/{{ backup_name }}" | awk '{print $1}'
      register: backup_size
      changed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          BACKUP SUMMARY
          ================================================================================

          Backup location: {{ backup_base_path }}/{{ backup_name }}
          Backup size: {{ backup_size.stdout }}

          Backed up:
          - Gitea volumes: {% if gitea_volumes_backup.changed %}✅{% else %}ℹ️ No volumes found{% endif %}
          - SSL certificates (acme.json): {% if acme_backup.changed | default(false) %}✅{% else %}ℹ️ Not found{% endif %}
          - Gitea app.ini: {% if gitea_app_ini_backup.rc == 0 %}✅{% else %}⚠️ Could not read{% endif %}
          - Gitea docker-compose.yml: {% if gitea_compose_backup.changed | default(false) %}✅{% else %}ℹ️ Not found{% endif %}
          - Traefik configuration: {% if traefik_config_backup.rc == 0 %}✅{% else %}⚠️ Some files may be missing{% endif %}
          - PostgreSQL data: {% if postgres_backup.rc == 0 and 'created' in postgres_backup.stdout %}✅{% else %}ℹ️ Not running or not found{% endif %}

          Backup contents:
          {{ backup_contents.stdout }}

          ================================================================================
          NEXT STEPS
          ================================================================================

          Backup completed successfully. You can now proceed with the redeploy:

          ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
            --vault-password-file secrets/.vault_pass \
            -e "backup_name={{ backup_name }}"

          ================================================================================
255
deployment/ansible/playbooks/maintenance/rollback-redeploy.yml
Normal file
@@ -0,0 +1,255 @@
---
# Rollback Redeploy
# Restores Traefik and Gitea from the backup created before the redeploy
#
# Usage:
#   ansible-playbook -i inventory/production.yml playbooks/maintenance/rollback-redeploy.yml \
#     --vault-password-file secrets/.vault_pass \
#     -e "backup_name=redeploy-backup-1234567890"

- name: Rollback Redeploy
  hosts: production
  gather_facts: yes
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    backup_base_path: "{{ backups_path | default('/home/deploy/backups') }}"
    # backup_name is expected from the command line (-e "backup_name=...");
    # defining it here as "{{ backup_name | default('') }}" would be a
    # self-referencing template and fail with a recursive loop error.

  tasks:
    - name: Validate backup name
      ansible.builtin.fail:
        msg: "backup_name is required. Use: -e 'backup_name=redeploy-backup-1234567890'"
      when: (backup_name | default('')) == ''

    - name: Check if backup directory exists
      ansible.builtin.stat:
        path: "{{ backup_base_path }}/{{ backup_name }}"
      register: backup_dir_stat

    - name: Fail if backup not found
      ansible.builtin.fail:
        msg: "Backup directory not found: {{ backup_base_path }}/{{ backup_name }}"
      when: not backup_dir_stat.stat.exists

    - name: Display rollback plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          ROLLBACK REDEPLOY
          ================================================================================

          This playbook will restore from backup: {{ backup_base_path }}/{{ backup_name }}

          Steps:
          1. Stop Traefik and Gitea stacks
          2. Restore Gitea volumes
          3. Restore SSL certificates (acme.json)
          4. Restore Gitea configuration (app.ini)
          5. Restore Traefik configuration
          6. Restore PostgreSQL data (if applicable)
          7. Restart stacks
          8. Verify

          ⚠️ WARNING: This will overwrite the current state!

          ================================================================================

    # ========================================
    # 1. STOP STACKS
    # ========================================
    - name: Stop Traefik stack
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose down
      register: traefik_stop
      changed_when: traefik_stop.rc == 0
      failed_when: false

    - name: Stop Gitea stack
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose down
      register: gitea_stop
      changed_when: gitea_stop.rc == 0
      failed_when: false

    # ========================================
    # 2. RESTORE GITEA VOLUMES
    # ========================================
    - name: List Gitea volume backups
      ansible.builtin.shell: |
        ls -1 "{{ backup_base_path }}/{{ backup_name }}/gitea-volume-"*.tar.gz 2>/dev/null || echo ""
      register: gitea_volume_backups
      changed_when: false

    - name: Restore Gitea volumes
      ansible.builtin.shell: |
        for backup_file in {{ backup_base_path }}/{{ backup_name }}/gitea-volume-*.tar.gz; do
          if [ -f "$backup_file" ]; then
            volume_name=$(basename "$backup_file" .tar.gz | sed 's/gitea-volume-//')
            echo "Restoring volume: $volume_name"
            docker volume create "$volume_name" 2>/dev/null || true
            docker run --rm \
              -v "$volume_name:/target" \
              -v "{{ backup_base_path }}/{{ backup_name }}:/backup:ro" \
              alpine sh -c "cd /target && tar xzf /backup/$(basename $backup_file)"
          fi
        done
      when: gitea_volume_backups.stdout != ""
      register: gitea_volumes_restore
      changed_when: gitea_volumes_restore.rc == 0
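    # Sanity check (illustrative): after this task, `docker volume ls --filter name=gitea`
    # should list the recreated volumes again, using the same filter the backup
    # playbook uses to discover them.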
    # ========================================
    # 3. RESTORE SSL CERTIFICATES
    # ========================================
    - name: Restore acme.json
      ansible.builtin.copy:
        src: "{{ backup_base_path }}/{{ backup_name }}/acme.json"
        dest: "{{ traefik_stack_path }}/acme.json"
        remote_src: yes
        mode: '0600'
      register: acme_restore
      failed_when: false

    # ========================================
    # 4. RESTORE CONFIGURATIONS
    # ========================================
    - name: Restore Gitea docker-compose.yml
      ansible.builtin.copy:
        src: "{{ backup_base_path }}/{{ backup_name }}/gitea-docker-compose.yml"
        dest: "{{ gitea_stack_path }}/docker-compose.yml"
        remote_src: yes
        mode: '0644'
      register: gitea_compose_restore
      failed_when: false

    - name: Restore Traefik configuration
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        tar xzf "{{ backup_base_path }}/{{ backup_name }}/traefik-config.tar.gz" 2>/dev/null || echo "Some files may be missing"
      register: traefik_config_restore
      changed_when: traefik_config_restore.rc == 0
      failed_when: false

    # ========================================
    # 5. RESTORE POSTGRESQL DATA
    # ========================================
    - name: Find PostgreSQL backup
      ansible.builtin.shell: |
        ls -1 "{{ backup_base_path }}/{{ backup_name }}/postgresql-all-"*.sql.gz 2>/dev/null | head -1 || echo ""
      register: postgres_backup_file
      changed_when: false

    - name: Restore PostgreSQL database
      ansible.builtin.shell: |
        cd {{ stacks_base_path }}/postgresql
        if docker compose ps postgres | grep -q "Up"; then
          gunzip -c "{{ postgres_backup_file.stdout }}" | docker compose exec -T postgres psql -U postgres
          echo "PostgreSQL restored"
        else
          echo "PostgreSQL not running, skipping restore"
        fi
      when: postgres_backup_file.stdout != ""
      register: postgres_restore
      changed_when: false
      failed_when: false

    # ========================================
    # 6. RESTART STACKS
    # ========================================
    - name: Deploy Traefik stack
      community.docker.docker_compose_v2:
        project_src: "{{ traefik_stack_path }}"
        state: present
        pull: always
      register: traefik_deploy

    - name: Wait for Traefik to be ready
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps traefik | grep -Eiq "Up|running"
      register: traefik_ready
      changed_when: false
      until: traefik_ready.rc == 0
      retries: 12
      delay: 5
      failed_when: traefik_ready.rc != 0

    - name: Deploy Gitea stack
      community.docker.docker_compose_v2:
        project_src: "{{ gitea_stack_path }}"
        state: present
        pull: always
      register: gitea_deploy

    - name: Restore Gitea app.ini
      ansible.builtin.shell: |
        if [ -f "{{ backup_base_path }}/{{ backup_name }}/gitea-app.ini" ]; then
          cd {{ gitea_stack_path }}
          docker compose exec -T gitea sh -c "cat > /data/gitea/conf/app.ini" < "{{ backup_base_path }}/{{ backup_name }}/gitea-app.ini"
          docker compose restart gitea
          echo "app.ini restored and Gitea restarted"
        else
          echo "No app.ini backup found"
        fi
      register: gitea_app_ini_restore
      changed_when: false
      failed_when: false

    - name: Wait for Gitea to be ready
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps gitea | grep -Eiq "Up|running"
      register: gitea_ready
      changed_when: false
      until: gitea_ready.rc == 0
      retries: 12
      delay: 5
      failed_when: gitea_ready.rc != 0

    # ========================================
    # 7. VERIFY
    # ========================================
    - name: Wait for Gitea to be healthy
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T gitea curl -f http://localhost:3000/api/healthz 2>&1 | grep -q "status.*pass" && echo "HEALTHY" || echo "NOT_HEALTHY"
      register: gitea_health
      changed_when: false
      until: gitea_health.stdout == "HEALTHY"
      retries: 30
      delay: 2
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          ROLLBACK SUMMARY
          ================================================================================

          Restored from backup: {{ backup_base_path }}/{{ backup_name }}

          Restored:
          - Gitea volumes: {% if gitea_volumes_restore.changed %}✅{% else %}ℹ️ No volumes to restore{% endif %}
          - SSL certificates: {% if acme_restore.changed %}✅{% else %}ℹ️ Not found{% endif %}
          - Gitea docker-compose.yml: {% if gitea_compose_restore.changed %}✅{% else %}ℹ️ Not found{% endif %}
          - Traefik configuration: {% if traefik_config_restore.rc == 0 %}✅{% else %}⚠️ Some files may be missing{% endif %}
          - PostgreSQL data: {% if postgres_restore.rc == 0 and 'restored' in postgres_restore.stdout %}✅{% else %}ℹ️ Not restored{% endif %}
          - Gitea app.ini: {% if gitea_app_ini_restore.rc == 0 and 'restored' in gitea_app_ini_restore.stdout %}✅{% else %}ℹ️ Not found{% endif %}

          Status:
          - Traefik: {% if traefik_ready.rc == 0 %}✅ Running{% else %}❌ Not running{% endif %}
          - Gitea: {% if gitea_ready.rc == 0 %}✅ Running{% else %}❌ Not running{% endif %}
          - Gitea Health: {% if gitea_health.stdout == 'HEALTHY' %}✅ Healthy{% else %}❌ Not healthy{% endif %}

          Next steps:
          1. Test Gitea: curl -k https://{{ gitea_domain }}/api/healthz
          2. Check the logs if there are issues: cd {{ gitea_stack_path }} && docker compose logs gitea --tail=50

          ================================================================================
294
deployment/ansible/playbooks/manage/gitea.yml
Normal file
@@ -0,0 +1,294 @@
---
# Consolidated Gitea Management Playbook
# Consolidates: fix-gitea-timeouts.yml, fix-gitea-traefik-connection.yml,
#               fix-gitea-ssl-routing.yml, fix-gitea-servers-transport.yml,
#               fix-gitea-complete.yml, restart-gitea-complete.yml,
#               restart-gitea-with-cache.yml
#
# Usage:
#   # Restart Gitea
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags restart
#
#   # Fix timeouts (restart Gitea and Traefik)
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags fix-timeouts
#
#   # Fix SSL/routing issues
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags fix-ssl
#
#   # Complete fix (runner stop + restart + service discovery)
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags complete

- name: Manage Gitea
  hosts: production
  gather_facts: yes
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_runner_path: "{{ stacks_base_path }}/../gitea-runner"
    gitea_url: "https://{{ gitea_domain }}"
    gitea_container_name: "gitea"
    traefik_container_name: "traefik"

  tasks:
    - name: Display management plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA MANAGEMENT
          ================================================================================

          Running management tasks with tags: {{ ansible_run_tags | default(['all']) }}

          Available actions:
          - restart: Restart Gitea container
          - fix-timeouts: Restart Gitea and Traefik to fix timeouts
          - fix-ssl: Fix SSL/routing issues
          - fix-servers-transport: Update ServersTransport configuration
          - complete: Complete fix (stop runner, restart services, verify)

          ================================================================================

    # ========================================
    # COMPLETE FIX (--tags complete)
    # ========================================
    - name: Check Gitea Runner status
      ansible.builtin.shell: |
        cd {{ gitea_runner_path }}
        docker compose ps gitea-runner 2>/dev/null || echo "Runner not found"
      register: runner_status
      changed_when: false
      failed_when: false
      tags:
        - complete

    - name: Stop Gitea Runner to reduce load
      ansible.builtin.shell: |
        cd {{ gitea_runner_path }}
        docker compose stop gitea-runner
      register: runner_stop
      changed_when: runner_stop.rc == 0
      failed_when: false
      when: runner_status.rc == 0
      tags:
        - complete

    # ========================================
    # RESTART GITEA (--tags restart, fix-timeouts, complete)
    # ========================================
    - name: Check Gitea container status before restart
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps {{ gitea_container_name }}
      register: gitea_status_before
      changed_when: false
      tags:
        - restart
        - fix-timeouts
        - complete

    - name: Restart Gitea container
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose restart {{ gitea_container_name }}
      register: gitea_restart
      changed_when: gitea_restart.rc == 0
      tags:
        - restart
        - fix-timeouts
        - complete

    - name: Wait for Gitea to be ready (direct check)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        for i in {1..30}; do
          if docker compose exec -T {{ gitea_container_name }} curl -f http://localhost:3000/api/healthz >/dev/null 2>&1; then
            echo "Gitea is ready"
            exit 0
          fi
          sleep 2
        done
        echo "Gitea not ready after 60 seconds"
        exit 1
      register: gitea_ready
      changed_when: false
      failed_when: false
      tags:
        - restart
        - fix-timeouts
        - complete

    # ========================================
    # RESTART TRAEFIK (--tags fix-timeouts, complete)
    # ========================================
    - name: Check Traefik container status before restart
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps {{ traefik_container_name }}
      register: traefik_status_before
      changed_when: false
      tags:
        - fix-timeouts
        - complete

    - name: Restart Traefik to refresh service discovery
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose restart {{ traefik_container_name }}
      register: traefik_restart
      changed_when: traefik_restart.rc == 0
      when: traefik_auto_restart | default(false) | bool
      tags:
        - fix-timeouts
        - complete

    - name: Wait for Traefik to be ready
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      changed_when: false
      when: traefik_restart.changed | default(false) | bool
      tags:
        - fix-timeouts
        - complete

    # ========================================
    # FIX SERVERS TRANSPORT (--tags fix-servers-transport)
    # ========================================
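    # Illustration (not taken from this repo's compose files): with Traefik v2,
    # a service is pointed at a custom ServersTransport via a label of this shape:
    #   traefik.http.services.gitea.loadbalancer.serverstransport=gitea-transport@file
    # The sync below is what actually ships the updated labels/config to the host.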
    - name: Sync Gitea stack configuration
      ansible.builtin.synchronize:
        src: "{{ playbook_dir }}/../../stacks/gitea/"
        dest: "{{ gitea_stack_path }}/"
        delete: no
        recursive: yes
        rsync_opts:
          - "--chmod=D755,F644"
          - "--exclude=.git"
          - "--exclude=*.log"
          - "--exclude=data/"
          - "--exclude=volumes/"
      tags:
        - fix-servers-transport

    - name: Restart Gitea container to apply new labels
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose up -d --force-recreate {{ gitea_container_name }}
      register: gitea_restart_transport
      changed_when: gitea_restart_transport.rc == 0
      tags:
        - fix-servers-transport

    # ========================================
    # VERIFICATION (--tags fix-timeouts, fix-ssl, complete)
    # ========================================
    - name: Wait for Gitea to be reachable via Traefik (with retries)
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_via_traefik
      until: gitea_health_via_traefik.status == 200
      retries: 15
      delay: 2
      changed_when: false
      failed_when: false
      when: (traefik_restart.changed | default(false) | bool) or (gitea_restart.changed | default(false) | bool)
      tags:
        - fix-timeouts
        - fix-ssl
        - complete

    - name: Check if Gitea is in Traefik service discovery
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T {{ traefik_container_name }} traefik show providers docker 2>/dev/null | grep -i "gitea" || echo "NOT_FOUND"
      register: traefik_gitea_service_check
      changed_when: false
      failed_when: false
      when: (traefik_restart.changed | default(false) | bool) or (gitea_restart.changed | default(false) | bool)
      tags:
        - fix-timeouts
        - fix-ssl
        - complete

    - name: Final status check
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_status
      changed_when: false
      failed_when: false
      tags:
        - fix-timeouts
        - fix-ssl
        - complete

    # ========================================
    # SUMMARY
    # ========================================
    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA MANAGEMENT SUMMARY
          ================================================================================

          Actions performed:
          {% if 'complete' in ansible_run_tags %}
          - Gitea Runner: {% if runner_stop.changed | default(false) %}✅ Stopped{% else %}ℹ️ Not active or not found{% endif %}
          {% endif %}
          {% if 'restart' in ansible_run_tags or 'fix-timeouts' in ansible_run_tags or 'complete' in ansible_run_tags %}
          - Gitea Restart: {% if gitea_restart.changed %}✅ Performed{% else %}ℹ️ Not needed{% endif %}
          - Gitea Ready: {% if gitea_ready.rc == 0 %}✅ Ready{% else %}❌ Not ready{% endif %}
          {% endif %}
          {% if 'fix-timeouts' in ansible_run_tags or 'complete' in ansible_run_tags %}
          - Traefik Restart: {% if traefik_restart.changed %}✅ Performed{% else %}ℹ️ Not needed (traefik_auto_restart=false){% endif %}
          {% endif %}
          {% if 'fix-servers-transport' in ansible_run_tags %}
          - ServersTransport Update: {% if gitea_restart_transport.changed %}✅ Applied{% else %}ℹ️ Not needed{% endif %}
          {% endif %}

          Final Status:
          {% if 'fix-timeouts' in ansible_run_tags or 'fix-ssl' in ansible_run_tags or 'complete' in ansible_run_tags %}
          - Gitea via Traefik: {% if final_status.status == 200 %}✅ Reachable (Status: 200){% else %}❌ Not reachable (Status: {{ final_status.status | default('TIMEOUT') }}){% endif %}
          - Traefik Service Discovery: {% if 'NOT_FOUND' not in traefik_gitea_service_check.stdout %}✅ Gitea found{% else %}❌ Gitea not found{% endif %}
          {% endif %}

          {% if final_status.status == 200 and 'NOT_FOUND' not in traefik_gitea_service_check.stdout %}
          ✅ SUCCESS: Gitea is now reachable via Traefik!
          URL: {{ gitea_url }}

          Next steps:
          1. Test Gitea in browser: {{ gitea_url }}
          {% if 'complete' in ansible_run_tags %}
          2. If everything is stable, you can reactivate the runner:
             cd {{ gitea_runner_path }} && docker compose up -d gitea-runner
          3. Monitor if the runner overloads Gitea again
          {% endif %}
          {% else %}
          ⚠️ PROBLEM: Gitea is not fully reachable

          Possible causes:
          {% if final_status.status != 200 %}
          - Gitea does not respond via Traefik (Status: {{ final_status.status | default('TIMEOUT') }})
          {% endif %}
          {% if 'NOT_FOUND' in traefik_gitea_service_check.stdout %}
|
||||
- Traefik Service Discovery has not recognized Gitea yet
|
||||
{% endif %}
|
||||
|
||||
Next steps:
|
||||
1. Wait 1-2 minutes and test again: curl -k {{ gitea_url }}/api/healthz
|
||||
2. Check Traefik logs: cd {{ traefik_stack_path }} && docker compose logs {{ traefik_container_name }} --tail=50
|
||||
3. Check Gitea logs: cd {{ gitea_stack_path }} && docker compose logs {{ gitea_container_name }} --tail=50
|
||||
4. Run diagnosis: ansible-playbook -i inventory/production.yml playbooks/diagnose/gitea.yml
|
||||
{% endif %}
|
||||
|
||||
================================================================================
|
||||
|
||||
|
||||
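# Example invocations (a sketch; the tag names are the ones defined in this
# playbook, and the inventory path follows the convention used by the other
# playbooks in this commit):
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags restart
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags fix-timeouts
#   ansible-playbook -i inventory/production.yml playbooks/manage/gitea.yml --tags complete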
162
deployment/ansible/playbooks/manage/traefik.yml
Normal file
@@ -0,0 +1,162 @@
---
# Consolidated Traefik Management Playbook
# Consolidates: stabilize-traefik.yml, disable-traefik-auto-restarts.yml
#
# Usage:
#   # Stabilize Traefik (fix acme.json, ensure running, monitor)
#   ansible-playbook -i inventory/production.yml playbooks/manage/traefik.yml --tags stabilize
#
#   # Disable auto-restarts
#   ansible-playbook -i inventory/production.yml playbooks/manage/traefik.yml --tags disable-auto-restart

- name: Manage Traefik
  hosts: production
  gather_facts: yes
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    traefik_container_name: "traefik"
    # A plain default instead of the self-referencing
    # "{{ traefik_stabilize_wait_minutes | default(10) }}", which Ansible
    # rejects as a recursive template; override with -e traefik_stabilize_wait_minutes=N.
    traefik_stabilize_wait_minutes: 10
    traefik_stabilize_check_interval: 60

  tasks:
    - name: Display management plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK MANAGEMENT
          ================================================================================

          Running management tasks with tags: {{ ansible_run_tags | default(['all']) }}

          Available actions:
          - stabilize: Fix acme.json, ensure running, monitor stability
          - disable-auto-restart: Check and document auto-restart mechanisms

          ================================================================================

    # ========================================
    # STABILIZE (--tags stabilize)
    # ========================================
    - name: Fix acme.json permissions
      ansible.builtin.file:
        path: "{{ traefik_stack_path }}/acme.json"
        state: file
        mode: '0600'
        owner: "{{ ansible_user | default('deploy') }}"
        group: "{{ ansible_user | default('deploy') }}"
      register: acme_permissions_fixed
      tags:
        - stabilize
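    # Traefik refuses to load an acme.json whose permissions are looser than
    # 0600 (it logs an error asking for mode 600), so fixing the mode here is
    # a prerequisite for ACME certificate storage.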

    - name: Ensure Traefik container is running
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose up -d {{ traefik_container_name }}
      register: traefik_start
      changed_when: traefik_start.rc == 0
      tags:
        - stabilize

    - name: Wait for Traefik to be ready
      ansible.builtin.wait_for:
        timeout: 30
        delay: 2
      changed_when: false
      tags:
        - stabilize

    - name: Monitor Traefik stability
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps {{ traefik_container_name }} --format "{{ '{{' }}.State{{ '}}' }}" | head -1 || echo "UNKNOWN"
      register: traefik_state_check
      changed_when: false
      until: traefik_state_check.stdout == "running"
      retries: "{{ (traefik_stabilize_wait_minutes | int * 60 / traefik_stabilize_check_interval) | int }}"
      delay: "{{ traefik_stabilize_check_interval }}"
      tags:
        - stabilize
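    # The retry budget works out to wait_minutes * 60 / check_interval polls;
    # with the defaults above that is 10 * 60 / 60 = 10 checks, one per minute.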

    - name: Check Traefik logs for restarts during monitoring
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs {{ traefik_container_name }} --since "{{ traefik_stabilize_wait_minutes }}m" 2>&1 | grep -iE "stopping server gracefully|I have to go" | wc -l
      register: restarts_during_monitoring
      changed_when: false
      tags:
        - stabilize
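    # "Stopping server gracefully" and "I have to go..." are the lines Traefik
    # logs on shutdown, so counting them approximates the number of restarts
    # that happened inside the monitoring window.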

    # ========================================
    # DISABLE AUTO-RESTART (--tags disable-auto-restart)
    # ========================================
    - name: Check Ansible traefik_auto_restart setting
      ansible.builtin.shell: |
        grep -r "traefik_auto_restart" /home/deploy/deployment/ansible/inventory/group_vars/ 2>/dev/null | head -5 || echo "No traefik_auto_restart setting found"
      register: ansible_auto_restart_setting
      changed_when: false
      tags:
        - disable-auto-restart

    - name: Check for cronjobs that restart Traefik
      ansible.builtin.shell: |
        (crontab -l 2>/dev/null || true) | grep -E "traefik|docker.*compose.*restart.*traefik|docker.*stop.*traefik" || echo "No cronjobs found"
      register: traefik_cronjobs
      changed_when: false
      tags:
        - disable-auto-restart

    - name: Check systemd timers for Traefik
      ansible.builtin.shell: |
        systemctl list-timers --all --no-pager | grep -E "traefik|docker.*compose.*traefik" || echo "No Traefik-related timers"
      register: traefik_timers
      changed_when: false
      tags:
        - disable-auto-restart

    # ========================================
    # SUMMARY
    # ========================================
    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK MANAGEMENT SUMMARY
          ================================================================================

          {% if 'stabilize' in ansible_run_tags %}
          Stabilization:
          - acme.json permissions: {% if acme_permissions_fixed.changed %}✅ Fixed{% else %}ℹ️ Already correct{% endif %}
          - Traefik started: {% if traefik_start.changed %}✅ Started{% else %}ℹ️ Already running{% endif %}
          - Stability monitoring: {{ traefik_stabilize_wait_minutes }} minutes
          - Restarts during monitoring: {{ restarts_during_monitoring.stdout | default('0') }}

          {% if (restarts_during_monitoring.stdout | default('0') | int) == 0 %}
          ✅ Traefik ran stably during the monitoring period!
          {% else %}
          ⚠️ {{ restarts_during_monitoring.stdout }} restarts detected during monitoring
          → Run diagnosis: ansible-playbook -i inventory/production.yml playbooks/diagnose/traefik.yml --tags restart-source
          {% endif %}
          {% endif %}

          {% if 'disable-auto-restart' in ansible_run_tags %}
          Auto-Restart Analysis:
          - Ansible setting: {{ ansible_auto_restart_setting.stdout | default('Not found') }}
          - Cronjobs: {{ traefik_cronjobs.stdout | default('None found') }}
          - Systemd timers: {{ traefik_timers.stdout | default('None found') }}

          Recommendations:
          {% if ansible_auto_restart_setting.stdout | default('') is search('traefik_auto_restart.*true') %}
          - Set traefik_auto_restart: false in group_vars
          {% endif %}
          {% if 'No cronjobs' not in traefik_cronjobs.stdout %}
          - Remove or disable cronjobs that restart Traefik
          {% endif %}
          {% if 'No Traefik-related timers' not in traefik_timers.stdout %}
          - Disable systemd timers that restart Traefik
          {% endif %}
          {% endif %}

          ================================================================================
@@ -1,141 +0,0 @@
---
# Monitor Traefik Continuously
# Watches Traefik logs and Docker events in real time to find the source of restarts
- name: Monitor Traefik Continuously
  hosts: production
  gather_facts: yes
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    monitor_duration_minutes: 30  # Default: 30 minutes, can be overridden

  tasks:
    - name: Display monitoring information
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK CONTINUOUS MONITORING
          ================================================================================

          Monitoring duration: {{ monitor_duration_minutes }} minutes

          Monitors:
          1. Traefik logs for "Stopping server gracefully" / "I have to go"
          2. Docker events for the Traefik container
          3. Docker daemon logs for container stops

          Starting monitoring...
          ================================================================================

    - name: Get initial Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }} {{ '{{' }}.State.StartedAt{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: initial_status
      changed_when: false

    - name: Start monitoring Traefik logs in background
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        timeout {{ monitor_duration_minutes * 60 }} docker compose logs -f traefik 2>&1 | grep --line-buffered -iE "stopping server gracefully|I have to go" | while read line; do
          echo "[$(date '+%Y-%m-%d %H:%M:%S')] $line"
        done > /tmp/traefik_monitor_$$.log 2>&1 &
        echo $!
      register: log_monitor_pid
      changed_when: false
      async: "{{ monitor_duration_minutes * 60 + 60 }}"
      poll: 0

    - name: Start monitoring Docker events in background
      ansible.builtin.shell: |
        timeout {{ monitor_duration_minutes * 60 }} docker events --filter container=traefik --filter event=die --format "[{{ '{{' }}.Time{{ '}}' }}] {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.name{{ '}}' }}" 2>&1 | tee /tmp/traefik_docker_events_$$.log &
        echo $!
      register: docker_events_pid
      changed_when: false
      async: "{{ monitor_duration_minutes * 60 + 60 }}"
      poll: 0

    - name: Wait for monitoring period
      ansible.builtin.pause:
        minutes: "{{ monitor_duration_minutes }}"

    - name: Stop log monitoring
      ansible.builtin.shell: |
        pkill -f "docker compose logs.*traefik" || true
        sleep 2
      changed_when: false
      failed_when: false

    - name: Stop Docker events monitoring
      ansible.builtin.shell: |
        pkill -f "docker events.*traefik" || true
        sleep 2
      changed_when: false
      failed_when: false

    - name: Read Traefik log monitoring results
      ansible.builtin.slurp:
        src: "{{ item }}"
      register: log_results
      changed_when: false
      failed_when: false
      loop: "{{ log_monitor_pid.stdout_lines | map('regex_replace', '^.*', '/tmp/traefik_monitor_' + ansible_date_time.epoch + '.log') | list }}"

    - name: Read Docker events monitoring results
      ansible.builtin.slurp:
        src: "{{ item }}"
      register: docker_events_results
      changed_when: false
      failed_when: false
      loop: "{{ docker_events_pid.stdout_lines | map('regex_replace', '^.*', '/tmp/traefik_docker_events_' + ansible_date_time.epoch + '.log') | list }}"

    - name: Get final Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }} {{ '{{' }}.State.StartedAt{{ '}}' }} {{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: final_status
      changed_when: false

    - name: Check Traefik logs for stop messages during monitoring
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since {{ monitor_duration_minutes }}m 2>&1 | grep -iE "stopping server gracefully|I have to go" || echo "No stop messages found"
      register: traefik_stop_messages
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          MONITORING SUMMARY ({{ monitor_duration_minutes }} minutes):
          ================================================================================

          Initial status: {{ initial_status.stdout }}
          Final status: {{ final_status.stdout }}

          Traefik stop messages during monitoring:
          {% if traefik_stop_messages.stdout and 'No stop messages' not in traefik_stop_messages.stdout %}
          ❌ STOP MESSAGES FOUND:
          {{ traefik_stop_messages.stdout }}

          ⚠️ PROBLEM CONFIRMED: Traefik was stopped during the monitoring period!

          Next steps:
          1. Check the Docker events log: /tmp/traefik_docker_events_*.log
          2. Check the Traefik log monitor: /tmp/traefik_monitor_*.log
          3. Check who executed the stop command:
             - journalctl -u docker.service --since "{{ monitor_duration_minutes }} minutes ago"
             - docker events --since "{{ monitor_duration_minutes }} minutes ago" --filter container=traefik
          {% else %}
          ✅ NO STOP MESSAGES FOUND

          Traefik ran stably during the {{ monitor_duration_minutes }}-minute monitoring period.

          {% if initial_status.stdout != final_status.stdout %}
          ⚠️ Status changed:
          - Before: {{ initial_status.stdout }}
          - After: {{ final_status.stdout }}
          {% endif %}
          {% endif %}

          ================================================================================
@@ -1,150 +0,0 @@
---
# Monitor Traefik for Unexpected Restarts
# Watches Traefik logs for "I have to go..." messages and identifies the cause
- name: Monitor Traefik Restarts
  hosts: production
  gather_facts: yes
  become: no

  vars:
    monitor_lookback_hours: "{{ monitor_lookback_hours | default(24) }}"

  tasks:
    - name: Check Traefik logs for "I have to go..." messages
      ansible.builtin.shell: |
        cd /home/deploy/deployment/stacks/traefik
        docker compose logs traefik --since {{ monitor_lookback_hours }}h 2>&1 | grep -E "I have to go|Stopping server gracefully" | tail -20 || echo "No stop messages found"
      register: traefik_stop_messages
      changed_when: false

    - name: Display Traefik stop messages
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik stop messages (last {{ monitor_lookback_hours }} hours):
          ================================================================================
          {{ traefik_stop_messages.stdout }}
          ================================================================================

    - name: Check Traefik container restart count
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "0"
      register: traefik_restart_count
      changed_when: false

    - name: Check Traefik container start time
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.StartedAt{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: traefik_started_at
      changed_when: false

    - name: Check Docker events for Traefik stops
      ansible.builtin.shell: |
        timeout 5 docker events --since {{ monitor_lookback_hours }}h --filter container=traefik --filter event=die --format "{{ '{{' }}.Time{{ '}}' }} {{ '{{' }}.Action{{ '}}' }} {{ '{{' }}.Actor.Attributes.name{{ '}}' }}" 2>/dev/null | tail -20 || echo "No stop events found or docker events not available"
      register: traefik_stop_events
      changed_when: false

    - name: Display Traefik stop events
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker stop events for Traefik (last {{ monitor_lookback_hours }} hours):
          ================================================================================
          {{ traefik_stop_events.stdout }}
          ================================================================================

    - name: Check for manual docker compose commands in history
      ansible.builtin.shell: |
        history | grep -E "docker.*compose.*traefik.*(restart|stop|down|up)" | tail -10 || echo "No manual docker compose commands found in history"
      register: manual_commands
      changed_when: false
      failed_when: false

    - name: Display manual docker compose commands
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Manual docker compose commands (from history):
          ================================================================================
          {{ manual_commands.stdout }}
          ================================================================================

    - name: Check systemd docker service status
      ansible.builtin.shell: |
        systemctl status docker.service --no-pager -l | head -20 || echo "Could not check docker service status"
      register: docker_service_status
      changed_when: false
      failed_when: false

    - name: Display Docker service status
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Docker Service Status:
          ================================================================================
          {{ docker_service_status.stdout }}
          ================================================================================

    - name: Check for system reboots
      ansible.builtin.shell: |
        last reboot --since "{{ monitor_lookback_hours }} hours ago" 2>/dev/null | head -5 || echo "No reboots in the last {{ monitor_lookback_hours }} hours"
      register: reboots
      changed_when: false
      failed_when: false

    - name: Display reboot history
      ansible.builtin.debug:
        msg: |
          ================================================================================
          System reboots (last {{ monitor_lookback_hours }} hours):
          ================================================================================
          {{ reboots.stdout }}
          ================================================================================

    - name: Analyze stop message timestamps
      ansible.builtin.set_fact:
        stop_timestamps: "{{ traefik_stop_messages.stdout | regex_findall('\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}') }}"

    - name: Count stop messages
      ansible.builtin.set_fact:
        stop_count: "{{ stop_timestamps | length | int }}"

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Traefik restart monitoring:
          ================================================================================

          Monitoring window: last {{ monitor_lookback_hours }} hours

          Traefik status:
          - Restart count: {{ traefik_restart_count.stdout }}
          - Started at: {{ traefik_started_at.stdout }}
          - Stop messages found: {{ stop_count | default(0) }}

          {% if (stop_count | default(0) | int) > 0 %}
          ⚠️ {{ stop_count }} stop messages found:
          {{ traefik_stop_messages.stdout }}

          Possible causes:
          {% if reboots.stdout and 'No reboots' not in reboots.stdout %}
          1. System reboots: {{ reboots.stdout }}
          {% endif %}
          {% if traefik_stop_events.stdout and 'No stop events' not in traefik_stop_events.stdout %}
          2. Docker stop events: {{ traefik_stop_events.stdout }}
          {% endif %}
          {% if manual_commands.stdout and 'No manual' not in manual_commands.stdout %}
          3. Manual commands: {{ manual_commands.stdout }}
          {% endif %}

          Next steps:
          - Check whether the stop messages line up with our manual restarts
          - Check whether system reboots are the cause
          - Check the Docker service logs for automatic stops
          {% else %}
          ✅ No stop messages in the last {{ monitor_lookback_hours }} hours
          Traefik is running stably!
          {% endif %}

          ================================================================================
@@ -1,95 +0,0 @@
---
# Restart Gitea Complete - Stops and restarts Gitea so that all configuration changes take effect
- name: Restart Gitea Complete
  hosts: production
  gather_facts: no
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Check current Gitea environment variables
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T gitea env | grep -E 'GITEA__database__' | sort || echo "Could not read environment variables"
      register: gitea_env_before
      changed_when: false
      failed_when: false

    - name: Display current environment variables
      ansible.builtin.debug:
        msg: |
          Current Gitea Database Environment Variables:
          {{ gitea_env_before.stdout }}

    - name: Stop Gitea container completely
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose stop gitea
      register: gitea_stop
      changed_when: gitea_stop.rc == 0

    - name: Wait for Gitea to stop
      ansible.builtin.pause:
        seconds: 5

    - name: Start Gitea container
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose up -d gitea
      register: gitea_start
      changed_when: gitea_start.rc == 0

    - name: Wait for Gitea to be ready
      ansible.builtin.wait_for:
        timeout: 60
        delay: 5

    - name: Check Gitea health after restart
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        validate_certs: false
        timeout: 10
      register: gitea_health_after
      changed_when: false
      failed_when: false
      retries: 5
      delay: 5

    - name: Check environment variables after restart
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T gitea env | grep -E 'GITEA__database__' | sort || echo "Could not read environment variables"
      register: gitea_env_after
      changed_when: false
      failed_when: false

    - name: Display restart results
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA COMPLETE RESTART - RESULTS
          ================================================================================

          Gitea Health After Restart:
          - Status: {{ gitea_health_after.status | default('TIMEOUT') }}
          {% if gitea_health_after.status | default(0) == 200 %}
          ✅ Gitea is healthy after restart
          {% else %}
          ❌ Gitea health check failed (Status: {{ gitea_health_after.status | default('TIMEOUT') }})
          {% endif %}

          Environment Variables After Restart:
          {{ gitea_env_after.stdout }}

          {% if 'MAX_OPEN_CONNS' in gitea_env_after.stdout %}
          ✅ Connection pool settings are present
          {% else %}
          ⚠️ Connection pool settings NOT found in environment variables
          → Check docker-compose.yml configuration
          {% endif %}

          ================================================================================
@@ -1,57 +0,0 @@
---
# Ansible Playbook: Restart Gitea with Redis Cache Enabled
# Purpose: Restart Gitea container to apply new cache configuration from docker-compose.yml
# Usage:
#   ansible-playbook -i inventory/production.yml playbooks/restart-gitea-with-cache.yml

- name: Restart Gitea with Redis Cache Enabled
  hosts: production
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Verify Gitea container exists
      shell: |
        docker compose -f {{ gitea_stack_path }}/docker-compose.yml ps gitea | grep -q "gitea"
      register: gitea_exists
      changed_when: false
      failed_when: false

    - name: Fail if Gitea container does not exist
      fail:
        msg: "Gitea container does not exist. Please deploy Gitea stack first."
      when: gitea_exists.rc != 0

    - name: Recreate Gitea container with new cache configuration
      shell: |
        cd {{ gitea_stack_path }} && \
        docker compose up -d --force-recreate gitea
      register: gitea_recreated

    - name: Wait for Gitea to be ready after restart
      uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_health_after_restart
      until: gitea_health_after_restart.status == 200
      retries: 30
      delay: 5
      changed_when: false

    - name: Display success message
      debug:
        msg: |
          Gitea has been restarted successfully with Redis cache enabled!

          Cache configuration:
          - ENABLED: true
          - ADAPTER: redis
          - HOST: redis:6379
          - DB: 0

          Gitea should now use Redis for caching, improving performance.
210
deployment/ansible/playbooks/setup/REDEPLOY_GUIDE.md
Normal file
@@ -0,0 +1,210 @@
# Traefik/Gitea Redeploy Guide

This guide explains how to perform a clean redeployment of the Traefik and Gitea stacks.

## Overview

A clean redeploy:
- Stops and removes containers (preserves volumes and SSL certificates)
- Syncs latest configurations
- Redeploys stacks with fresh containers
- Restores configurations
- Verifies service discovery

**Expected downtime**: ~2-5 minutes
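
Condensed into one terminal session, the sequence covered by Steps 1-4 below looks like this (a sketch that simply strings together the commands from this guide):

```bash
cd deployment/ansible

# Step 1: backup (note the backup name printed in the output)
ansible-playbook -i inventory/production.yml \
  playbooks/maintenance/backup-before-redeploy.yml \
  --vault-password-file secrets/.vault_pass

# Step 3: redeploy
ansible-playbook -i inventory/production.yml \
  playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass

# Step 4: verify
curl -k https://git.michaelschiemer.de/api/healthz
```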

## Prerequisites

- Ansible installed locally
- SSH access to production server
- Vault password file: `deployment/ansible/secrets/.vault_pass`

## Step-by-Step Guide

### Step 1: Backup

**Automatic backup (recommended):**
```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml \
  playbooks/maintenance/backup-before-redeploy.yml \
  --vault-password-file secrets/.vault_pass
```

**Manual backup:**
```bash
# On server
cd /home/deploy/deployment/stacks
docker compose -f gitea/docker-compose.yml exec gitea cat /data/gitea/conf/app.ini > /tmp/gitea-app.ini.backup
cp traefik/acme.json /tmp/acme.json.backup
```

### Step 2: Verify Backup

Check backup contents:
```bash
# Backup location will be shown in output
ls -lh /home/deploy/backups/redeploy-backup-*/
```

Verify:
- `acme.json` exists
- `gitea-app.ini` exists
- `gitea-volume-*.tar.gz` exists (if volumes were backed up)
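
Beyond checking that the files exist, a quick integrity check catches truncated archives early (a sketch; substitute your actual backup directory for the placeholder name):

```bash
BACKUP=/home/deploy/backups/redeploy-backup-1234567890  # your backup directory

# Listing an archive fails loudly if the tarball is truncated or corrupt
for f in "$BACKUP"/gitea-volume-*.tar.gz; do
  tar -tzf "$f" > /dev/null && echo "$f OK"
done

# acme.json should be non-empty
test -s "$BACKUP/acme.json" && echo "acme.json OK"
```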

### Step 3: Redeploy

**With automatic backup:**
```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml \
  playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass
```

**With existing backup:**
```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml \
  playbooks/setup/redeploy-traefik-gitea-clean.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890" \
  -e "skip_backup=true"
```

### Step 4: Verify Deployment

**Check Gitea accessibility:**
```bash
curl -k https://git.michaelschiemer.de/api/healthz
```

**Check Traefik service discovery:**
```bash
# On server
cd /home/deploy/deployment/stacks/traefik
docker compose exec traefik traefik show providers docker | grep -i gitea
```

**Check container status:**
```bash
# On server
docker ps | grep -E "traefik|gitea"
```

### Step 5: Troubleshooting

**If Gitea is not reachable:**

1. Check Gitea logs:
   ```bash
   cd /home/deploy/deployment/stacks/gitea
   docker compose logs gitea --tail=50
   ```

2. Check Traefik logs:
   ```bash
   cd /home/deploy/deployment/stacks/traefik
   docker compose logs traefik --tail=50
   ```

3. Check service discovery:
   ```bash
   cd /home/deploy/deployment/stacks/traefik
   docker compose exec traefik traefik show providers docker
   ```

4. Run diagnosis:
   ```bash
   cd deployment/ansible
   ansible-playbook -i inventory/production.yml \
     playbooks/diagnose/gitea.yml \
     --vault-password-file secrets/.vault_pass
   ```

**If there are SSL certificate issues:**

1. Check acme.json permissions:
   ```bash
   ls -l /home/deploy/deployment/stacks/traefik/acme.json
   # Should be: -rw------- (600)
   ```

2. Check Traefik ACME logs:
   ```bash
   cd /home/deploy/deployment/stacks/traefik
   docker compose logs traefik | grep -i acme
   ```

## Rollback Procedure

If something goes wrong, roll back to the backup:

```bash
cd deployment/ansible
ansible-playbook -i inventory/production.yml \
  playbooks/maintenance/rollback-redeploy.yml \
  --vault-password-file secrets/.vault_pass \
  -e "backup_name=redeploy-backup-1234567890"
```

Replace `redeploy-backup-1234567890` with the actual backup name from Step 1.

## What Gets Preserved

- ✅ Gitea data (volumes)
- ✅ SSL certificates (acme.json)
- ✅ Gitea configuration (app.ini)
- ✅ Traefik configuration
- ✅ PostgreSQL data (if applicable)

## What Gets Recreated

- 🔄 Traefik container
- 🔄 Gitea container
- 🔄 Service discovery

## Common Issues

### Issue: Gitea returns 404 after redeploy

**Solution:**
1. Wait 1-2 minutes for service discovery
2. Restart Traefik: `cd /home/deploy/deployment/stacks/traefik && docker compose restart traefik`
3. Check if Gitea is in the traefik-public network: `docker network inspect traefik-public | grep gitea`

### Issue: SSL certificate errors

**Solution:**
1. Verify acme.json permissions: `chmod 600 /home/deploy/deployment/stacks/traefik/acme.json`
2. Check Traefik logs for ACME errors
3. Wait 5-10 minutes for certificate renewal

### Issue: Gitea configuration lost

**Solution:**
1. Restore from backup: `playbooks/maintenance/rollback-redeploy.yml`
2. Or manually restore app.ini:
   ```bash
   cd /home/deploy/deployment/stacks/gitea
   docker compose exec gitea sh -c "cat > /data/gitea/conf/app.ini" < /path/to/backup/gitea-app.ini
   docker compose restart gitea
   ```

## Best Practices

1. **Always backup before redeploy** - Use the automatic backup
2. **Test in staging first** - If available
3. **Monitor during deployment** - Watch logs in a separate terminal (see the sketch after this list)
4. **Have rollback ready** - Know the backup name before starting
5. **Verify after deployment** - Check all services are accessible

## Related Playbooks

- `playbooks/maintenance/backup-before-redeploy.yml` - Create backup
- `playbooks/setup/redeploy-traefik-gitea-clean.yml` - Perform redeploy
- `playbooks/maintenance/rollback-redeploy.yml` - Rollback from backup
- `playbooks/diagnose/gitea.yml` - Diagnose Gitea issues
- `playbooks/diagnose/traefik.yml` - Diagnose Traefik issues
@@ -0,0 +1,321 @@
---
# Clean Redeploy Traefik and Gitea Stacks
# Complete redeployment with backup, container recreation, and verification
#
# Usage:
#   # With automatic backup
#   ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
#     --vault-password-file secrets/.vault_pass
#
#   # With existing backup
#   ansible-playbook -i inventory/production.yml playbooks/setup/redeploy-traefik-gitea-clean.yml \
#     --vault-password-file secrets/.vault_pass \
#     -e "backup_name=redeploy-backup-1234567890" \
#     -e "skip_backup=true"

- name: Clean Redeploy Traefik and Gitea
  hosts: production
  gather_facts: yes
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    gitea_url: "https://{{ gitea_domain }}"
    traefik_container_name: "traefik"
    gitea_container_name: "gitea"
    backup_base_path: "{{ backups_path | default('/home/deploy/backups') }}"
    # Plain defaults instead of self-referencing templates such as
    # "{{ skip_backup | default(false) | bool }}", which Ansible rejects as a
    # recursive template; both can still be overridden with -e on the CLI.
    skip_backup: false
    backup_name: ""

  tasks:
    # ========================================
    # 1. BACKUP (unless skipped)
    # ========================================
    - name: Set backup name fact
      ansible.builtin.set_fact:
        # default(..., true) also replaces the empty string set in vars above
        actual_backup_name: "{{ backup_name | default('redeploy-backup-' + ansible_date_time.epoch, true) }}"
      when: not (skip_backup | bool)

    - name: Display backup note
      ansible.builtin.debug:
        msg: |
          ⚠️ NOTE: Backup should be run separately before redeploy:
          ansible-playbook -i inventory/production.yml playbooks/maintenance/backup-before-redeploy.yml \
            --vault-password-file secrets/.vault_pass \
            -e "backup_name={{ actual_backup_name }}"

          Or use an existing backup with: -e "backup_name=redeploy-backup-XXXXX" -e "skip_backup=true"
      when: not (skip_backup | bool)

    - name: Display redeployment plan
      ansible.builtin.debug:
        msg: |
          ================================================================================
          CLEAN REDEPLOY TRAEFIK AND GITEA
          ================================================================================

          This playbook will:
          1. ✅ Backup ({% if skip_backup | bool %}SKIPPED{% else %}run separately - see note above{% endif %})
          2. ✅ Stop and remove Traefik containers (keeps acme.json)
          3. ✅ Stop and remove Gitea containers (keeps volumes/data)
          4. ✅ Sync latest stack configurations
          5. ✅ Redeploy Traefik stack
          6. ✅ Redeploy Gitea stack
          7. ✅ Restore Gitea configuration (app.ini)
          8. ✅ Verify service discovery
          9. ✅ Test Gitea accessibility

          ⚠️ IMPORTANT:
          - SSL certificates (acme.json) will be preserved
          - Gitea data (volumes) will be preserved
          - Only containers will be recreated
          - Expected downtime: ~2-5 minutes
          {% if not (skip_backup | bool) %}
          - Backup location: {{ backup_base_path }}/{{ actual_backup_name }}
          {% endif %}

          ================================================================================

    # ========================================
    # 2. STOP AND REMOVE CONTAINERS
    # ========================================
    - name: Stop Traefik stack
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose down
      register: traefik_stop
      changed_when: traefik_stop.rc == 0
      failed_when: false

    - name: Remove Traefik containers (if any remain)
      ansible.builtin.shell: |
        docker ps -a --filter "name={{ traefik_container_name }}" --format "{{ '{{' }}.ID{{ '}}' }}" | xargs -r docker rm -f 2>/dev/null || true
      register: traefik_remove
      changed_when: traefik_remove.rc == 0
      failed_when: false

    - name: Stop Gitea stack (preserves volumes)
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose down
      register: gitea_stop
      changed_when: gitea_stop.rc == 0
      failed_when: false

    - name: Remove Gitea containers (if any remain, volumes are preserved)
      ansible.builtin.shell: |
        docker ps -a --filter "name={{ gitea_container_name }}" --format "{{ '{{' }}.ID{{ '}}' }}" | xargs -r docker rm -f 2>/dev/null || true
      register: gitea_remove
      changed_when: gitea_remove.rc == 0
      failed_when: false

    # ========================================
    # 3. SYNC CONFIGURATIONS
    # ========================================
    - name: Get stacks directory path
      ansible.builtin.set_fact:
        stacks_source_path: "{{ playbook_dir | dirname | dirname | dirname }}/stacks"
      delegate_to: localhost
      run_once: true

    - name: Sync stacks directory to production server
      ansible.builtin.synchronize:
        src: "{{ stacks_source_path }}/"
        dest: "{{ stacks_base_path }}/"
        delete: no
        recursive: yes
        rsync_opts:
          - "--chmod=D755,F644"
          - "--exclude=.git"
          - "--exclude=*.log"
          - "--exclude=data/"
          - "--exclude=volumes/"
          - "--exclude=acme.json"  # Preserve SSL certificates
          - "--exclude=*.key"
          - "--exclude=*.pem"

    # ========================================
    # 4. ENSURE ACME.JSON EXISTS
    # ========================================
    - name: Check if acme.json exists
      ansible.builtin.stat:
        path: "{{ traefik_stack_path }}/acme.json"
      register: acme_json_stat

    - name: Ensure acme.json exists and has correct permissions
      ansible.builtin.file:
        path: "{{ traefik_stack_path }}/acme.json"
        state: touch
        mode: '0600'
        owner: "{{ ansible_user }}"
        group: "{{ ansible_user }}"
      become: yes
      register: acme_json_ensure
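    # Note: state: touch leaves existing file contents intact and only updates
    # timestamps and mode, but it reports "changed" on every run, which is
    # acceptable for a one-off redeploy playbook.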

    # ========================================
    # 5. REDEPLOY TRAEFIK
    # ========================================
    - name: Deploy Traefik stack
      community.docker.docker_compose_v2:
        project_src: "{{ traefik_stack_path }}"
        state: present
        pull: always
      register: traefik_deploy

    - name: Wait for Traefik to be ready
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose ps {{ traefik_container_name }} | grep -Eiq "Up|running"
      register: traefik_ready
      changed_when: false
      until: traefik_ready.rc == 0
      retries: 12
      delay: 5
      failed_when: traefik_ready.rc != 0

    # ========================================
    # 6. REDEPLOY GITEA
    # ========================================
    - name: Deploy Gitea stack
      community.docker.docker_compose_v2:
        project_src: "{{ gitea_stack_path }}"
        state: present
        pull: always
      register: gitea_deploy

    - name: Wait for Gitea to be ready
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose ps {{ gitea_container_name }} | grep -Eiq "Up|running"
      register: gitea_ready
      changed_when: false
      until: gitea_ready.rc == 0
      retries: 12
      delay: 5
      failed_when: gitea_ready.rc != 0

    - name: Wait for Gitea to be healthy
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose exec -T {{ gitea_container_name }} curl -f http://localhost:3000/api/healthz 2>&1 | grep -q "status.*pass" && echo "HEALTHY" || echo "NOT_HEALTHY"
      register: gitea_health
      changed_when: false
      until: gitea_health.stdout == "HEALTHY"
      retries: 30
      delay: 2
      failed_when: false
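    # The grep for "status.*pass" matches the JSON body of Gitea's /api/healthz
    # endpoint, which reports "status": "pass" once its internal checks succeed.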

    # ========================================
    # 7. RESTORE GITEA CONFIGURATION
    # ========================================
    - name: Restore Gitea app.ini from backup
      ansible.builtin.shell: |
        if [ -f "{{ backup_base_path }}/{{ actual_backup_name }}/gitea-app.ini" ]; then
          cd {{ gitea_stack_path }}
          docker compose exec -T {{ gitea_container_name }} sh -c "cat > /data/gitea/conf/app.ini" < "{{ backup_base_path }}/{{ actual_backup_name }}/gitea-app.ini"
          docker compose restart {{ gitea_container_name }}
          echo "app.ini restored and Gitea restarted"
        else
          echo "No app.ini backup found, using default configuration"
        fi
      register: gitea_app_ini_restore
      changed_when: false
      failed_when: false
      when: not (skip_backup | bool)

    # ========================================
    # 8. VERIFY SERVICE DISCOVERY
    # ========================================
    - name: Wait for service discovery (Traefik needs time to discover Gitea)
      ansible.builtin.pause:
        seconds: 15

    - name: Check if Gitea is in traefik-public network
      ansible.builtin.shell: |
        docker network inspect traefik-public --format '{{ '{{' }}range .Containers{{ '}}' }}{{ '{{' }}.Name{{ '}}' }} {{ '{{' }}end{{ '}}' }}' 2>/dev/null | grep -q {{ gitea_container_name }} && echo "YES" || echo "NO"
      register: gitea_in_network
      changed_when: false

    - name: Test direct connection from Traefik to Gitea
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose exec -T {{ traefik_container_name }} wget -qO- --timeout=5 http://{{ gitea_container_name }}:3000/api/healthz 2>&1 | head -5 || echo "CONNECTION_FAILED"
      register: traefik_gitea_direct
      changed_when: false
      failed_when: false

    # ========================================
    # 9. FINAL VERIFICATION
    # ========================================
    - name: Test Gitea via HTTPS (with retries)
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_https_test
      until: gitea_https_test.status == 200
      retries: 20
      delay: 3
      changed_when: false
      failed_when: false

    - name: Check SSL certificate status
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        if [ -f acme.json ] && [ -s acme.json ]; then
          echo "SSL certificates: PRESENT"
        else
          echo "SSL certificates: MISSING or EMPTY"
        fi
      register: ssl_status
      changed_when: false

    - name: Final status summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          REDEPLOYMENT SUMMARY
          ================================================================================

          Traefik:
          - Status: {{ (traefik_ready.rc == 0) | ternary('Up', 'Down') }}
          - SSL Certificates: {{ ssl_status.stdout }}

          Gitea:
          - Status: {{ (gitea_ready.rc == 0) | ternary('Up', 'Down') }}
          - Health: {% if gitea_health.stdout == 'HEALTHY' %}✅ Healthy{% else %}❌ Not Healthy{% endif %}
          - Configuration: {% if 'restored' in gitea_app_ini_restore.stdout | default('') %}✅ Restored{% else %}ℹ️ Using default{% endif %}

          Service Discovery:
          - Gitea in network: {% if gitea_in_network.stdout == 'YES' %}✅{% else %}❌{% endif %}
          - Direct connection: {% if 'CONNECTION_FAILED' not in traefik_gitea_direct.stdout %}✅{% else %}❌{% endif %}

          Gitea Accessibility:
          {% if gitea_https_test.status == 200 %}
          ✅ Gitea is reachable via HTTPS (Status: 200)
          URL: {{ gitea_url }}
          {% else %}
          ❌ Gitea is NOT reachable via HTTPS (Status: {{ gitea_https_test.status | default('TIMEOUT') }})

          Possible causes:
          1. SSL certificate is still being generated (wait 2-5 minutes)
          2. Service discovery needs more time (wait 1-2 minutes)
          3. Network configuration issue

          Next steps:
          - Wait 2-5 minutes and test again: curl -k {{ gitea_url }}/api/healthz
          - Check Traefik logs: cd {{ traefik_stack_path }} && docker compose logs {{ traefik_container_name }} --tail=50
          - Check Gitea logs: cd {{ gitea_stack_path }} && docker compose logs {{ gitea_container_name }} --tail=50
          {% endif %}

          {% if not (skip_backup | bool) %}
          Backup location: {{ backup_base_path }}/{{ actual_backup_name }}
          To rollback: ansible-playbook -i inventory/production.yml playbooks/maintenance/rollback-redeploy.yml \
            --vault-password-file secrets/.vault_pass \
            -e "backup_name={{ actual_backup_name }}"
          {% endif %}

          ================================================================================
@@ -1,236 +0,0 @@
|
||||
---
|
||||
# Stabilize Traefik
|
||||
# Stellt sicher, dass Traefik stabil läuft, acme.json korrekt ist und ACME-Challenges durchlaufen
|
||||
- name: Stabilize Traefik
|
||||
hosts: production
|
||||
gather_facts: yes
|
||||
become: no
|
||||
|
||||
vars:
|
||||
traefik_stabilize_wait_minutes: "{{ traefik_stabilize_wait_minutes | default(10) }}"
|
||||
traefik_stabilize_check_interval: 60 # Check every 60 seconds
|
||||
|
||||
tasks:
|
||||
- name: Check if Traefik stack directory exists
|
||||
ansible.builtin.stat:
|
||||
path: "{{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}"
|
||||
register: traefik_stack_exists
|
||||
|
||||
- name: Fail if Traefik stack directory does not exist
|
||||
ansible.builtin.fail:
|
||||
msg: "Traefik stack directory not found at {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}"
|
||||
when: not traefik_stack_exists.stat.exists
|
||||
|
||||
- name: Fix acme.json permissions first
|
||||
ansible.builtin.file:
|
||||
path: "{{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}/acme.json"
|
||||
state: file
|
||||
mode: '0600'
|
||||
owner: "{{ ansible_user | default('deploy') }}"
|
||||
group: "{{ ansible_user | default('deploy') }}"
|
||||
ignore_errors: yes
|
||||
|
||||
- name: Ensure Traefik container is running
|
||||
ansible.builtin.shell: |
|
||||
cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
|
||||
docker compose up -d traefik
|
||||
register: traefik_start
|
||||
changed_when: traefik_start.rc == 0
|
||||
|
||||
- name: Wait for Traefik to be ready
|
||||
ansible.builtin.wait_for:
|
||||
timeout: 30
|
||||
delay: 2
|
||||
changed_when: false
|
||||
|
||||
- name: Check Traefik container status
|
||||
ansible.builtin.shell: |
|
||||
cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
|
||||
docker compose ps traefik
|
||||
register: traefik_status
|
||||
changed_when: false
|
||||
|
||||
- name: Display Traefik status
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
Traefik Container Status:
|
||||
================================================================================
|
||||
{{ traefik_status.stdout }}
|
||||
================================================================================
|
||||
|
||||
- name: Check Traefik health
|
||||
ansible.builtin.shell: |
|
||||
cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
|
||||
docker compose exec -T traefik traefik healthcheck --ping 2>&1 || echo "HEALTH_CHECK_FAILED"
|
||||
register: traefik_health
|
||||
changed_when: false
|
||||
failed_when: false
|
||||
|
||||
- name: Display Traefik health check
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
Traefik Health Check:
|
||||
================================================================================
|
||||
{% if 'HEALTH_CHECK_FAILED' not in traefik_health.stdout %}
|
||||
✅ Traefik is healthy
|
||||
{% else %}
|
||||
⚠️ Traefik health check failed: {{ traefik_health.stdout }}
|
||||
{% endif %}
|
||||
================================================================================
|
||||
|
||||
- name: Verify acme.json permissions
|
||||
ansible.builtin.stat:
|
||||
path: "{{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}/acme.json"
|
||||
register: acme_json_stat
|
||||
|
||||
- name: Fix acme.json permissions if needed
|
||||
ansible.builtin.file:
|
||||
path: "{{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}/acme.json"
|
||||
mode: '0600'
|
||||
owner: "{{ ansible_user | default('deploy') }}"
|
||||
group: "{{ ansible_user | default('deploy') }}"
|
||||
when: acme_json_stat.stat.mode | string | regex_replace('^0o?', '') != '0600'
|
||||
|
||||
- name: Display acme.json status
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
acme.json Status:
|
||||
================================================================================
|
||||
Path: {{ acme_json_stat.stat.path }}
|
||||
Mode: {{ acme_json_stat.stat.mode | string | regex_replace('^0o?', '') }}
|
||||
{% if acme_json_stat.stat.mode | string | regex_replace('^0o?', '') == '0600' %}
|
||||
✅ acme.json has correct permissions (600)
|
||||
{% else %}
|
||||
⚠️ acme.json permissions need to be fixed
|
||||
{% endif %}
|
||||
================================================================================
|
||||
|
||||
- name: Check Port 80/443 configuration
|
||||
ansible.builtin.shell: |
|
||||
echo "=== Port 80 ==="
|
||||
ss -tlnp 2>/dev/null | grep ":80 " || netstat -tlnp 2>/dev/null | grep ":80 " || echo "Could not check port 80"
|
||||
echo ""
|
||||
echo "=== Port 443 ==="
|
||||
ss -tlnp 2>/dev/null | grep ":443 " || netstat -tlnp 2>/dev/null | grep ":443 " || echo "Could not check port 443"
|
||||
register: port_config_check
|
||||
changed_when: false
|
||||
|
||||
- name: Display Port configuration
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
Port-Konfiguration (80/443):
|
||||
================================================================================
|
||||
{{ port_config_check.stdout }}
|
||||
================================================================================
|
||||
|
||||
- name: Get initial Traefik restart count
|
||||
ansible.builtin.shell: |
|
||||
docker inspect traefik --format '{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "0"
|
||||
register: initial_restart_count
|
||||
changed_when: false
|
||||
|
||||
- name: Display initial restart count
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
Initial Traefik Restart Count: {{ initial_restart_count.stdout }}
|
||||
================================================================================
|
||||
|
||||
- name: Wait for ACME challenges to complete
|
||||
ansible.builtin.debug:
|
||||
msg: |
|
||||
================================================================================
|
||||
Warte auf ACME-Challenge-Abschluss...
|
||||
================================================================================
|
||||
Warte {{ traefik_stabilize_wait_minutes }} Minuten und prüfe alle {{ traefik_stabilize_check_interval }} Sekunden
|
||||
ob Traefik stabil läuft und keine Restarts auftreten.
|
||||
================================================================================
|
||||
|
||||
- name: Monitor Traefik stability
|
||||
ansible.builtin.shell: |
|
||||
cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
|
||||
docker compose ps traefik --format "{{ '{{' }}.State{{ '}}' }}" | head -1 || echo "UNKNOWN"
|
||||
register: traefik_state_check
|
||||
changed_when: false
|
||||
until: traefik_state_check.stdout == "running"
|
||||
retries: "{{ (traefik_stabilize_wait_minutes | int * 60 / traefik_stabilize_check_interval) | int }}"
|
||||
delay: "{{ traefik_stabilize_check_interval }}"
|
||||
|
||||
    - name: Get final Traefik restart count
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "0"
      register: final_restart_count
      changed_when: false

    - name: Check for Traefik restarts during monitoring
      ansible.builtin.set_fact:
        traefik_restarted: "{{ (final_restart_count.stdout | int) > (initial_restart_count.stdout | int) }}"

    - name: Check Traefik logs for ACME errors
      ansible.builtin.shell: |
        cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
        docker compose logs traefik --since {{ traefik_stabilize_wait_minutes }}m 2>&1 | grep -i "acme\|challenge\|certificate" | tail -20 || echo "No ACME-related messages in logs"
      register: traefik_acme_logs
      changed_when: false

    - name: Display Traefik ACME logs
      ansible.builtin.debug:
        msg: |
          ================================================================================
          Traefik ACME logs (last {{ traefik_stabilize_wait_minutes }} minutes):
          ================================================================================
          {{ traefik_acme_logs.stdout }}
          ================================================================================

    - name: Final status check
      ansible.builtin.shell: |
        cd {{ traefik_stack_path | default('/home/deploy/deployment/stacks/traefik') }}
        docker compose ps traefik || echo "Could not get final status"
      register: final_status
      changed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          SUMMARY - Traefik stabilization:
          ================================================================================

          Initial Restart Count: {{ initial_restart_count.stdout }}
          Final Restart Count: {{ final_restart_count.stdout }}

          {% if traefik_restarted %}
          ⚠️ WARNING: Traefik restarted during monitoring!
          Restart count increased from {{ initial_restart_count.stdout }} to {{ final_restart_count.stdout }}

          Next steps:
          - Run diagnose/traefik.yml (tag: restart-source) to find the cause
          - Check Docker events and logs for restart reasons
          {% else %}
          ✅ Traefik ran stably during monitoring ({{ traefik_stabilize_wait_minutes }} minutes)
          No restarts occurred.
          {% endif %}

          Final Status: {{ final_status.stdout }}

          {% if acme_json_stat.stat.mode | string | regex_replace('^0o?', '') == '0600' %}
          ✅ acme.json has correct permissions
          {% else %}
          ⚠️ acme.json permissions need to be fixed
          {% endif %}

          Important:
          - Traefik must run stably (no frequent restarts)
          - Ports 80/443 must point to Traefik
          - acme.json must be writable
          - ACME challenges need 5-10 minutes to complete

          Next steps:
          - Check Traefik logs regularly for ACME errors
          - Make sure no auto-restart mechanisms are active
          - Keep monitoring Traefik for another {{ traefik_stabilize_wait_minutes }} minutes
          ================================================================================
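    # Not part of the playbook above: a hedged sketch showing how the
    # traefik_restarted fact could fail the run instead of only warning.
    - name: Fail if Traefik restarted during the monitoring window (sketch)
      ansible.builtin.assert:
        that:
          - not (traefik_restarted | bool)
        fail_msg: >-
          Traefik restart count rose from {{ initial_restart_count.stdout }}
          to {{ final_restart_count.stdout }} during monitoring.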
@@ -1,73 +0,0 @@
---
# Test Gitea After Connection Pool Fix
- name: Test Gitea After Connection Pool Fix
  hosts: production
  gather_facts: no
  become: no
  vars:
    gitea_stack_path: "{{ stacks_base_path }}/gitea"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Test Gitea health endpoint
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        validate_certs: false
        timeout: 35
      register: gitea_test
      changed_when: false

    - name: Check Gitea logs for connection pool messages
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose logs gitea --tail 100 | grep -iE "timeout.*authentication|connection.*pool|MAX_OPEN_CONNS|database.*pool" | tail -20 || echo "No connection pool messages found"
      register: gitea_logs_check
      changed_when: false
      failed_when: false

    - name: Check Postgres logs for authentication timeouts
      ansible.builtin.shell: |
        cd {{ gitea_stack_path }}
        docker compose logs postgres --tail 50 | grep -iE "timeout.*authentication|authentication.*timeout" | tail -10 || echo "No authentication timeout messages found"
      register: postgres_logs_check
      changed_when: false
      failed_when: false

    - name: Display test results
      ansible.builtin.debug:
        msg: |
          ================================================================================
          GITEA CONNECTION POOL FIX - TEST RESULTS
          ================================================================================

          Health Check Result:
          - Status: {{ gitea_test.status | default('TIMEOUT') }}
          - Response Time: {{ gitea_test.elapsed | default('N/A') }}s
          {% if gitea_test.status | default(0) == 200 %}
          ✅ Gitea is reachable
          {% else %}
          ❌ Gitea returned status {{ gitea_test.status | default('TIMEOUT') }}
          {% endif %}

          Gitea Logs (Connection Pool):
          {{ gitea_logs_check.stdout }}

          Postgres Logs (Authentication Timeouts):
          {{ postgres_logs_check.stdout }}

          ================================================================================
          INTERPRETATION:
          ================================================================================

          {% set pg = postgres_logs_check.stdout | lower %}
          {% set gt = gitea_logs_check.stdout | lower %}
          {% if ('timeout' in pg and 'no authentication timeout' not in pg) or ('timeout' in gt and 'no connection pool' not in gt) %}
          ⚠️ Authentication timeout messages still present
          → Connection pool settings may need further tuning
          → Consider increasing MAX_OPEN_CONNS or authentication_timeout
          {% else %}
          ✅ No authentication timeout messages found
          → Connection pool fix appears to be working
          {% endif %}

          ================================================================================
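# Hedged companion to the test above: the connection pool fix it verifies is
# assumed to set the Gitea [database] pool keys roughly like this. The values
# and the app.ini path are illustrative, not taken from this repository.
- name: Tune Gitea database connection pool (sketch, assumed values and path)
  ansible.builtin.blockinfile:
    path: "{{ gitea_stack_path }}/gitea/conf/app.ini"
    insertafter: '^\[database\]'
    marker: "; {mark} ANSIBLE MANAGED - db pool"
    block: |
      MAX_OPEN_CONNS = 50
      MAX_IDLE_CONNS = 10
      CONN_MAX_LIFETIME = 30m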
@@ -1,82 +0,0 @@
---
# Ansible Playbook: Update Gitea Traefik Service with Current IP
#
# ⚠️ DEPRECATED: This playbook is no longer needed since Traefik runs in bridge network mode.
# Service discovery via Docker labels works reliably in bridge mode, so manual IP updates
# are not required. This playbook is kept for reference only.
#
# Purpose: Update Traefik dynamic config with current Gitea container IP
# Usage:
#   ansible-playbook -i inventory/production.yml playbooks/update-gitea-traefik-service.yml \
#     --vault-password-file secrets/.vault_pass

- name: Update Gitea Traefik Service with Current IP
  hosts: production
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    gitea_url: "https://{{ gitea_domain }}"

  tasks:
    - name: Warn that this playbook is deprecated
      ansible.builtin.fail:
        msg: |
          ⚠️ This playbook is DEPRECATED and should not be used.
          Traefik service discovery via Docker labels works reliably in bridge mode.
          If you really need to run this, set traefik_auto_restart=true explicitly.
      when: not (traefik_auto_restart | default(false) | bool)

    - name: Get current Gitea container IP in traefik-public network
      ansible.builtin.shell: |
        docker inspect gitea | grep -A 10 'traefik-public' | grep IPAddress | head -1 | awk '{print $2}' | tr -d '",'
      register: gitea_ip
      changed_when: false

    - name: Display Gitea IP
      ansible.builtin.debug:
        msg: "Gitea container IP: {{ gitea_ip.stdout }}"

    - name: Create Gitea service configuration with current IP
      ansible.builtin.copy:
        dest: "{{ traefik_stack_path }}/dynamic/gitea-service.yml"
        content: |
          http:
            services:
              gitea:
                loadBalancer:
                  servers:
                    - url: http://{{ gitea_ip.stdout }}:3000
        mode: '0644'

    - name: Restart Traefik to load new configuration
      ansible.builtin.shell: |
        docker compose -f {{ traefik_stack_path }}/docker-compose.yml restart traefik
      when: traefik_auto_restart | default(false) | bool
      register: traefik_restart
      changed_when: traefik_restart.rc == 0

    - name: Wait for Traefik to be ready
      ansible.builtin.pause:
        seconds: 10
      when: traefik_restart.changed | default(false) | bool

    - name: Test Gitea via Traefik
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: final_test
      retries: 5
      delay: 2
      changed_when: false

    - name: Display result
      ansible.builtin.debug:
        msg: |
          Gitea-Traefik connection:
          - Gitea IP: {{ gitea_ip.stdout }}
          - Via Traefik: {{ 'OK' if final_test.status == 200 else 'FAILED' }}

          Note: This is a temporary fix. The IP will need to be updated if the container restarts.
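# For reference, the label-based replacement that makes the IP pinning above
# unnecessary (it appears in the compose diff further below): with the Docker
# provider, Traefik resolves the container address itself from a single label.
#   - "traefik.http.services.gitea.loadbalancer.server.port=3000"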
@@ -1,143 +0,0 @@
---
# Verify Traefik Restart Loop Fix
# Checks whether the changes (traefik_auto_restart: false) fix the restart loops
- name: Verify Traefik Restart Loop Fix
  hosts: production
  gather_facts: yes
  become: no
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    monitor_duration_minutes: 10  # 10 minutes of monitoring

  tasks:
    - name: Display current configuration
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK RESTART LOOP FIX - VERIFICATION:
          ================================================================================

          Current configuration:
          - traefik_auto_restart: {{ traefik_auto_restart | default('NOT SET') }}
          - traefik_ssl_restart: {{ traefik_ssl_restart | default('NOT SET') }}
          - gitea_auto_restart: {{ gitea_auto_restart | default('NOT SET') }}

          Expected behavior:
          - Traefik should NOT restart automatically after config deployment
          - Traefik should NOT restart automatically during SSL setup
          - Gitea should NOT restart automatically on healthcheck failures

          Monitoring: {{ monitor_duration_minutes }} minutes
          ================================================================================

    - name: Get initial Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: initial_traefik_status
      changed_when: false

    - name: Get initial Gitea status
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: initial_gitea_status
      changed_when: false

    - name: Check Traefik logs for recent restarts
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since 1h 2>&1 | grep -iE "stopping server gracefully|I have to go" | wc -l
      register: recent_restarts
      changed_when: false

    - name: Wait for monitoring period
      ansible.builtin.pause:
        minutes: "{{ monitor_duration_minutes }}"

    - name: Get final Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: final_traefik_status
      changed_when: false

    - name: Get final Gitea status
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: final_gitea_status
      changed_when: false

    - name: Check Traefik logs for restarts during monitoring
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since {{ monitor_duration_minutes }}m 2>&1 | grep -iE "stopping server gracefully|I have to go" || echo "No restarts found"
      register: restarts_during_monitoring
      changed_when: false
      failed_when: false

    - name: Test Gitea accessibility (multiple attempts)
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_test
      until: gitea_test.status == 200
      retries: 5
      delay: 2
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          VERIFICATION SUMMARY:
          ================================================================================

          Initial Status:
          - Traefik: {{ initial_traefik_status.stdout }}
          - Gitea: {{ initial_gitea_status.stdout }}

          Final Status:
          - Traefik: {{ final_traefik_status.stdout }}
          - Gitea: {{ final_gitea_status.stdout }}

          Restarts during monitoring ({{ monitor_duration_minutes }} minutes):
          {% if restarts_during_monitoring.stdout and 'No restarts' not in restarts_during_monitoring.stdout %}
          ❌ RESTARTS FOUND:
          {{ restarts_during_monitoring.stdout }}

          ⚠️ PROBLEM: Traefik was stopped during monitoring!
          → The changes have not yet fully fixed the problem
          → Check whether external Ansible playbooks are still running
          → Check whether other automation is stopping Traefik
          {% else %}
          ✅ NO RESTARTS FOUND

          Traefik ran stably during the {{ monitor_duration_minutes }}-minute monitoring window!
          → The changes appear to be working
          {% endif %}

          Gitea Accessibility:
          {% if gitea_test.status == 200 %}
          ✅ Gitea is reachable (status 200)
          {% else %}
          ❌ Gitea is not reachable (status {{ gitea_test.status | default('TIMEOUT') }})
          {% endif %}

          ================================================================================
          NEXT STEPS:
          ================================================================================

          {% if restarts_during_monitoring.stdout and 'No restarts' not in restarts_during_monitoring.stdout %}
          1. ❌ Check for external Ansible playbooks that may still be running
          2. ❌ Check CI/CD pipelines that might restart Traefik
          3. ❌ Run 'find-ansible-automation-source.yml' again
          {% else %}
          1. ✅ Traefik is running stably - no more automatic restarts
          2. ✅ Keep monitoring Traefik for another 1-2 hours to be sure
          3. ✅ Test Gitea in the browser: https://git.michaelschiemer.de
          {% endif %}

          ================================================================================
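# The fix this playbook verifies boils down to pinning the auto-restart toggles
# off in the inventory. A minimal sketch of the corresponding group_vars entries
# (file location assumed; the variable names are taken from the playbook above):
traefik_auto_restart: false
traefik_ssl_restart: false
gitea_auto_restart: false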
@@ -23,6 +23,10 @@ DOMAIN = {{ gitea_domain }}
HTTP_ADDR = 0.0.0.0
HTTP_PORT = 3000
ROOT_URL = https://{{ gitea_domain }}/
# LOCAL_ROOT_URL for internal access (Runner/Webhooks)
LOCAL_ROOT_URL = http://gitea:3000/
# Trust Traefik proxy (Docker network: 172.18.0.0/16)
PROXY_TRUSTED_PROXIES = 172.18.0.0/16,::1,127.0.0.1
DISABLE_SSH = false
START_SSH_SERVER = false
SSH_DOMAIN = {{ gitea_domain }}
@@ -68,7 +72,11 @@ HOST = redis://:{{ redis_password }}@redis:6379/0?pool_size=100&idle_timeout=180
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[session]
PROVIDER = redis
PROVIDER_CONFIG = network=tcp,addr=redis:6379,password={{ redis_password }},db=0,pool_size=100,idle_timeout=180
# PROVIDER_CONFIG must be a Redis connection string (as per Gitea documentation)
# Format: redis://:password@host:port/db?pool_size=100&idle_timeout=180s
# Using same format as cache HOST and queue CONN_STR for consistency
PROVIDER_CONFIG = redis://:{{ redis_password }}@redis:6379/0?pool_size=100&idle_timeout=180s
SAME_SITE = lax

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Queue Configuration (Redis)
@@ -82,6 +90,8 @@ CONN_STR = redis://:{{ redis_password }}@redis:6379/0
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[security]
INSTALL_LOCK = true
# Cookie security (only if ROOT_URL is https)
COOKIE_SECURE = true

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Service Configuration
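# Judging by the hunk header (-68,7 +72,11), the [session] change above appears
# to add the redis:// line next to the old "network=tcp,..." line, leaving two
# PROVIDER_CONFIG keys in app.ini. A hedged one-off check for such duplicates
# (the app.ini path and gitea_stack_path variable are assumptions):
- name: Detect duplicate PROVIDER_CONFIG keys in app.ini (sketch, assumed path)
  ansible.builtin.shell: |
    grep -c '^PROVIDER_CONFIG' {{ gitea_stack_path }}/gitea/conf/app.ini
  register: provider_config_count
  changed_when: false
  failed_when: provider_config_count.stdout | int > 1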
@@ -218,3 +218,4 @@ ansible-playbook -i inventory/production.yml \
@@ -37,8 +37,16 @@ services:
      - "traefik.http.routers.gitea.priority=100"
      # Service configuration (Docker provider uses port, not url)
      - "traefik.http.services.gitea.loadbalancer.server.port=3000"
      # Middleware chain (removed temporarily to test if it causes issues)
      # - "traefik.http.routers.gitea.middlewares=security-headers-global@file,gzip-compression@file"
      # ServersTransport for longer timeouts (prevents 504 for SSE/long-polling like /user/events)
      # Temporarily removed to test if this is causing the service discovery issue
      # - "traefik.http.services.gitea.loadbalancer.serversTransport=gitea-transport@docker"
      # - "traefik.http.serverstransports.gitea-transport.forwardingtimeouts.dialtimeout=10s"
      # - "traefik.http.serverstransports.gitea-transport.forwardingtimeouts.responseheadertimeout=120s"
      # - "traefik.http.serverstransports.gitea-transport.forwardingtimeouts.idleconntimeout=180s"
      # - "traefik.http.serverstransports.gitea-transport.maxidleconnsperhost=100"
      # X-Forwarded-Proto header (helps with redirects/cookies)
      - "traefik.http.middlewares.gitea-headers.headers.customrequestheaders.X-Forwarded-Proto=https"
      - "traefik.http.routers.gitea.middlewares=gitea-headers@docker"
      # Explicitly reference the service (like MinIO does)
      - "traefik.http.routers.gitea.service=gitea"
    healthcheck:
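    # A typical compose healthcheck for Gitea might look like the following
    # (illustrative values only, not taken from this diff):
    #   test: ["CMD", "curl", "-fsS", "http://localhost:3000/api/healthz"]
    #   interval: 30s
    #   timeout: 10s
    #   retries: 5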