- Set traefik_auto_restart: false in group_vars to prevent automatic restarts after config deployment
- Set traefik_ssl_restart: false to prevent automatic restarts during SSL certificate setup
- Set gitea_auto_restart: false to prevent automatic restarts when healthcheck fails
- Modify traefik/tasks/ssl.yml to only restart if explicitly requested or acme.json was created
- Modify traefik/tasks/config.yml to respect traefik_auto_restart flag
- Modify gitea/tasks/restart.yml to respect gitea_auto_restart flag
- Add verify-traefik-fix.yml playbook to monitor Traefik stability

This fixes the issue where Traefik was restarting every minute due to automatic restart mechanisms triggered by config deployments and health checks. The restart loops caused 504 Gateway Timeouts for Gitea and other services.

Fixes: Traefik restart loop causing service unavailability
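
As a rough sketch of what the flag changes described in this commit message could look like: the three variable names come from the message itself, while the file locations in the comments, the task name, and the restart command are assumptions for illustration and not the repository's actual contents.

```yaml
# group_vars (exact file path is not given in the commit message)
traefik_auto_restart: false
traefik_ssl_restart: false
gitea_auto_restart: false
---
# traefik/tasks/config.yml (hypothetical excerpt, not the real task)
- name: Restart Traefik after config deployment (only when enabled)
  ansible.builtin.command:
    cmd: docker compose restart traefik
    # Assumes traefik_stack_path is defined wherever this task runs
    chdir: "{{ traefik_stack_path }}"
  # Skipped unless the flag is explicitly set to a true value
  when: traefik_auto_restart | default(false) | bool
```

With the flag left at false the restart task is skipped on every run; a one-off restart could still be requested by passing `-e traefik_auto_restart=true` on the command line.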
144 lines
6.3 KiB
YAML
---
# Verify Traefik Restart Loop Fix
# Checks whether the changes (traefik_auto_restart: false) fix the restart loops
- name: Verify Traefik Restart Loop Fix
  hosts: production
  gather_facts: true
  become: false
  vars:
    traefik_stack_path: "{{ stacks_base_path }}/traefik"
    monitor_duration_minutes: 10  # monitor for 10 minutes

  tasks:
    - name: Display current configuration
      ansible.builtin.debug:
        msg: |
          ================================================================================
          TRAEFIK RESTART LOOP FIX - VERIFICATION:
          ================================================================================

          Current configuration:
          - traefik_auto_restart: {{ traefik_auto_restart | default('NOT SET') }}
          - traefik_ssl_restart: {{ traefik_ssl_restart | default('NOT SET') }}
          - gitea_auto_restart: {{ gitea_auto_restart | default('NOT SET') }}

          Expected behavior:
          - Traefik should NOT restart automatically after a config deployment
          - Traefik should NOT restart automatically during SSL setup
          - Gitea should NOT restart automatically when the healthcheck fails

          Monitoring: {{ monitor_duration_minutes }} minutes
          ================================================================================

    - name: Get initial Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: initial_traefik_status
      changed_when: false

    - name: Get initial Gitea status
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: initial_gitea_status
      changed_when: false

    - name: Check Traefik logs for recent restarts
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since 1h 2>&1 | grep -iE "stopping server gracefully|I have to go" | wc -l
      register: recent_restarts
      changed_when: false

    - name: Wait for monitoring period
      ansible.builtin.pause:
        minutes: "{{ monitor_duration_minutes }}"

    - name: Get final Traefik status
      ansible.builtin.shell: |
        docker inspect traefik --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: final_traefik_status
      changed_when: false

    - name: Get final Gitea status
      ansible.builtin.shell: |
        docker inspect gitea --format '{{ '{{' }}.State.Status{{ '}}' }}|{{ '{{' }}.State.StartedAt{{ '}}' }}|{{ '{{' }}.RestartCount{{ '}}' }}' 2>/dev/null || echo "UNKNOWN"
      register: final_gitea_status
      changed_when: false

    - name: Check Traefik logs for restarts during monitoring
      # The fallback string "Keine Restarts gefunden" ("no restarts found") is a
      # sentinel that the Summary task matches on, so it is kept verbatim.
      ansible.builtin.shell: |
        cd {{ traefik_stack_path }}
        docker compose logs traefik --since {{ monitor_duration_minutes }}m 2>&1 | grep -iE "stopping server gracefully|I have to go" || echo "Keine Restarts gefunden"
      register: restarts_during_monitoring
      changed_when: false
      failed_when: false

    - name: Test Gitea accessibility (multiple attempts)
      ansible.builtin.uri:
        url: "https://git.michaelschiemer.de/api/healthz"
        method: GET
        status_code: [200]
        validate_certs: false
        timeout: 10
      register: gitea_test
      until: gitea_test.status == 200
      retries: 5
      delay: 2
      changed_when: false
      failed_when: false

    - name: Summary
      ansible.builtin.debug:
        msg: |
          ================================================================================
          VERIFICATION SUMMARY:
          ================================================================================

          Initial Status:
          - Traefik: {{ initial_traefik_status.stdout }}
          - Gitea: {{ initial_gitea_status.stdout }}

          Final Status:
          - Traefik: {{ final_traefik_status.stdout }}
          - Gitea: {{ final_gitea_status.stdout }}

          Restarts during monitoring ({{ monitor_duration_minutes }} minutes):
          {% if restarts_during_monitoring.stdout and 'Keine Restarts' not in restarts_during_monitoring.stdout %}
          ❌ RESTARTS FOUND:
          {{ restarts_during_monitoring.stdout }}

          ⚠️ PROBLEM: Traefik was stopped during monitoring!
          → The changes have not fully fixed the problem yet
          → Check whether external Ansible playbooks are still running
          → Check whether other automation is stopping Traefik
          {% else %}
          ✅ NO RESTARTS FOUND

          Traefik ran stably throughout the {{ monitor_duration_minutes }}-minute monitoring period!
          → The changes appear to be working
          {% endif %}

          Gitea Accessibility:
          {% if gitea_test.status == 200 %}
          ✅ Gitea is reachable (status: 200)
          {% else %}
          ❌ Gitea is not reachable (status: {{ gitea_test.status | default('TIMEOUT') }})
          {% endif %}

          ================================================================================
          NEXT STEPS:
          ================================================================================

          {% if restarts_during_monitoring.stdout and 'Keine Restarts' not in restarts_during_monitoring.stdout %}
          1. ❌ Check for external Ansible playbooks that might still be running
          2. ❌ Check for CI/CD pipelines that might restart Traefik
          3. ❌ Run 'find-ansible-automation-source.yml' again
          {% else %}
          1. ✅ Traefik is running stably - no more automatic restarts
          2. ✅ Keep monitoring Traefik for another 1-2 hours to be sure
          3. ✅ Test Gitea in the browser: https://git.michaelschiemer.de
          {% endif %}

          ================================================================================
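
    # A possible follow-up, sketched under assumptions and not part of the original
    # playbook: fail the play when Traefik's container state changed while monitoring,
    # so the result can be picked up by automation rather than only read from the
    # summary. It assumes that an unchanged "Status|StartedAt|RestartCount" string
    # from docker inspect means the container was not restarted in between.
    - name: Assert Traefik was not restarted during monitoring (hypothetical addition)
      ansible.builtin.assert:
        that:
          # The inspect task falls back to "UNKNOWN" when the container is missing
          - initial_traefik_status.stdout != "UNKNOWN"
          # Same status, start time, and restart count before and after the pause
          - initial_traefik_status.stdout == final_traefik_status.stdout
        fail_msg: "Traefik state changed during monitoring: {{ initial_traefik_status.stdout }} -> {{ final_traefik_status.stdout }}"
        success_msg: "Traefik state unchanged during monitoring"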