# Legacy Deployment Architecture Analysis

**Created**: 2025-01-24

**Status**: Archived - System being redesigned

## Executive Summary

This document analyzes the existing deployment architecture that led to the decision to rebuild from scratch.

## Discovered Issues

### 1. Docker Swarm vs Docker Compose Confusion

**Problem**: System designed for Docker Swarm but running with Docker Compose

- Stack files reference Swarm features (secrets, configs)
- Docker Swarm not initialized on target server
- Local development uses Docker Compose
- Production deployment unclear which to use

**Impact**: Container startup failures, service discovery issues

### 2. Distributed Stack Files

**Current Structure**:

```
deployment/stacks/
├── traefik/              # Reverse proxy
├── postgresql-production/
├── postgresql-staging/
├── gitea/                # Git server
├── redis/
├── minio/
├── monitoring/
├── registry/
└── semaphore/
```

**Problems**:

- No clear dependency graph between stacks
- Unclear startup order
- Volume mounts across stacks
- Network configuration scattered

### 3. Ansible Deployment Confusion

**Ansible Usage**:

- Server provisioning (install-docker.yml)
- Application deployment (sync-application-code.yml)
- Container recreation (recreate-containers-with-env.yml)
- Stack synchronization (sync-stacks.yml)

**Problem**: Ansible used for BOTH provisioning AND deployment

- Should only provision servers
- Deployment should be via CI/CD
- Creates unclear responsibilities

### 4. Environment-Specific Issues

**Environments Identified**:

- `local` - Developer machines (Docker Compose)
- `staging` - Hetzner server (unclear Docker Compose vs Swarm)
- `production` - Hetzner server (unclear Docker Compose vs Swarm)

**Problems**:

- No unified docker-compose files per environment
- Environment variables scattered (.env, secrets, Ansible vars)
- SSL certificates managed differently per environment

### 5. Specific Container Failures

**postgres-production-backup**:

- Container doesn't exist (was in restart loop)
- Volume mounts not accessible: `/scripts/backup-entrypoint.sh`
- Exit code 255 (file not found)
- Restart policy causing loop

**Root Causes**:

- Relative volume paths in docker-compose.yml
- Container running from different working directory
- Stack not properly initialized

### 6. Network Architecture Unclear

**Networks Found**:

- `traefik-public` (external)
- `app-internal` (external, for PostgreSQL)
- `backend`, `cache`, `postgres-production-internal`

**Problems**:

- Which stacks share which networks?
- How do services discover each other?
- Traefik routing configuration scattered

## Architecture Diagram (Current State)

```
┌─────────────────────────────────────────────────────────────┐
│  Server (Docker Compose? Docker Swarm? Unclear)             │
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   Traefik    │───▶│     App      │───▶│  PostgreSQL  │   │
│  │    Stack     │    │    Stack     │    │    Stack     │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│         │                   │                   │           │
│         │                   │                   │           │
│  ┌──────▼──────┐    ┌───────▼────┐   ┌──────────▼─────┐     │
│  │    Gitea    │    │   Redis    │   │     MinIO      │     │
│  │    Stack    │    │   Stack    │   │     Stack      │     │
│  └─────────────┘    └────────────┘   └────────────────┘     │
│                                                             │
│  Networks: traefik-public, app-internal, backend, cache     │
│  Volumes: Relative paths, absolute paths, mixed             │
│  Secrets: Docker secrets (Swarm), .env files, Ansible vars  │
└─────────────────────────────────────────────────────────────┘
        ▲
        │ Deployment via?
        │ - docker-compose up?
        │ - docker stack deploy?
        │ - Ansible playbooks?
        │ UNCLEAR
        │
    ┌───┴────────────────────────────────────────────────┐
    │  Developer Machine / CI/CD (Gitea)                 │
    │  - Ansible playbooks in deployment/ansible/        │
    │  - Stack files in deployment/stacks/               │
    │  - Application code in src/                        │
    └────────────────────────────────────────────────────┘
```

## Decision Rationale: Rebuild vs Repair

### Why Rebuild?

1. **Architectural Clarity**: Current system mixes concepts (Swarm/Compose, provisioning/deployment)
2. **Environment Separation**: Clean separation of local/staging/prod configurations
3. **CI/CD Integration**: Design for Gitea Actions from start
4. **Maintainability**: Single source of truth per environment
5. **Debugging Difficulty**: Current issues are symptoms of architectural problems

### What to Keep?

- ✅ Traefik configuration (reverse proxy setup is solid)
- ✅ PostgreSQL backup scripts (logic is good, just needs proper mounting)
- ✅ SSL certificate generation (Let's Encrypt integration works)
- ✅ Ansible server provisioning playbooks (keep for initial setup)

### What to Redesign?

- ❌ Stack organization (too fragmented)
- ❌ Deployment method (unclear Ansible vs CI/CD)
- ❌ Environment configuration (scattered variables)
- ❌ Volume mount strategy (relative paths causing issues)
- ❌ Network architecture (unclear dependencies)

## Lessons Learned

1. **Consistency is Key**: Choose Docker Compose OR Docker Swarm, not both
2. **Environment Files**: One docker-compose.{env}.yml per environment
3. **Ansible Scope**: Only for server provisioning, NOT deployment
4. **CI/CD First**: Gitea Actions should handle deployment
5. **Volume Paths**: Always use absolute paths or named volumes
6. **Network Clarity**: Explicit network definitions, clear service discovery

## Next Steps

See `deployment/NEW_ARCHITECTURE.md` for the redesigned system.

## Archive Contents

This `deployment/legacy/` directory contains:

- Original stack files (archived)
- Ansible playbooks (reference only)
- This analysis document

**DO NOT USE THESE FILES FOR NEW DEPLOYMENTS**
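
### Appendix: Checking for Relative Volume Paths

The volume-path lesson in this analysis (use absolute paths or named volumes) could be enforced with a small check before deployment. The sketch below is illustrative only and is not part of the legacy tooling. It assumes the Compose file has already been parsed into a Python dict, assumes POSIX-style host paths, and treats a short-syntax source starting with `.` or `~` as a relative bind mount. The function name `relative_bind_mounts`, the host path `./scripts/backup-entrypoint.sh`, and the `backup-data` volume are hypothetical; only the service name and the container path `/scripts/backup-entrypoint.sh` come from the analysis above.

```python
def relative_bind_mounts(compose: dict) -> list[tuple[str, str]]:
    """Return (service, source) pairs whose bind-mount source is not absolute.

    Simplified sketch: handles the short "source:target[:mode]" syntax and
    the long mapping syntax, and ignores Windows-style paths.
    """
    findings = []
    for name, service in (compose.get("services") or {}).items():
        for entry in service.get("volumes") or []:
            if isinstance(entry, dict):
                # Long syntax: only entries of type "bind" reference host paths.
                if entry.get("type") != "bind":
                    continue
                source = entry.get("source", "")
                is_relative = not source.startswith("/")
            else:
                # Short syntax: a lone target is an anonymous volume.
                parts = entry.split(":")
                if len(parts) < 2:
                    continue
                source = parts[0]
                # Assumption: "." or "~" prefixes mark host paths that depend
                # on the working directory or user; other names are volumes.
                is_relative = source.startswith((".", "~"))
            if is_relative:
                findings.append((name, source))
    return findings


# Hypothetical reconstruction of the failing backup service.
example = {
    "services": {
        "postgres-production-backup": {
            "volumes": [
                "./scripts/backup-entrypoint.sh:/scripts/backup-entrypoint.sh:ro",
                "backup-data:/backups",
            ]
        }
    },
    "volumes": {"backup-data": {}},
}

print(relative_bind_mounts(example))
# [('postgres-production-backup', './scripts/backup-entrypoint.sh')]
```

A check like this could run as an early CI step so that a relative mount fails the pipeline instead of surfacing later as a container restart loop.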