# GitLab + Harbor + Docker Swarm: Automated Deployment Solution

**Version:** 1.0  
**Created:** January 2026  
**Status:** Implementation Ready  
**Target audience:** DevOps Team, Development Team

---

## Executive Summary

This document describes a practical solution for automating the deployment process in the existing infrastructure.

**Current situation:**
- ✅ GitLab is already installed
- ✅ Harbor Registry is already running
- ✅ Docker Swarm runs several containers
- ✅ Four environments: Development → Sandbox → Testing → Production
- ❌ Manual deployment via bash scripts
- ❌ No code review process
- ❌ No automatic rollback
- ❌ Prebuilt images are pulled from Harbor with no visibility into what gets deployed

**Proposed solution:**
- GitLab CI/CD pipelines for automated deployment
- GitOps approach: Git as the source of truth for deployments
- Automated deployment across environments with approval gates
- One-click rollback capability
- Deployment history and audit trail
- Health checks and automatic rollback on failure

**Expected results:**
- 🚀 Deployment time: from 30-60 minutes down to 5-10 minutes
- 🔒 Human errors: reduced by 90%
- 📊 Full visibility: who deployed what, and when
- ⚡ Rollback: from 1-2 hours down to 2-3 minutes
- ✅ Compliance: complete audit trail

---

## Contents

1. [Solution Architecture](#1-solution-architecture)
2. [GitLab CI/CD Pipeline Implementation](#2-gitlab-cicd-pipeline-implementation)
3. [Docker Stack Management](#3-docker-stack-management)
4. [Environment Management Strategy](#4-environment-management-strategy)
5. [Rollback Strategy](#5-rollback-strategy)
6. [Monitoring & Health Checks](#6-monitoring--health-checks)
7. [Implementation Roadmap](#7-implementation-roadmap)
8. [Best Practices & Tips](#8-best-practices--tips)
9. [Success Metrics](#9-success-metrics)
10. [Conclusion & Next Steps](#10-conclusion--next-steps)

---

## 1. Solution Architecture

### 1.1 Current State Architecture

```
┌────────────────────────────────────────────────────────────┐
│                   Current Manual Process                   │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Developer → Build Image → Push to Harbor                  │
│          ↓                                                 │
│  Notify DevOps Team                                        │
│          ↓                                                 │
│  DevOps manually runs bash scripts:                        │
│                                                            │
│  1. SSH to Swarm manager                                   │
│  2. docker service update --image harbor/app:new-tag app   │
│  3. Check logs manually                                    │
│  4. Hope everything works                                  │
│  5. Repeat for each environment (4x)                       │
│                                                            │
│  Problems:                                                 │
│  • Time consuming (30-60 min per environment)              │
│  • Error prone (typos, wrong tags)                         │
│  • No rollback plan                                        │
│  • No audit trail                                          │
│  • No validation before deployment                         │
└────────────────────────────────────────────────────────────┘
```

### 1.2 Target Automated Architecture

```
┌────────────────────────────────────────────────────────────┐
│              Automated GitOps-Based Solution               │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Developer pushes image tag change to Git                  │
│          ↓                                                 │
│  GitLab CI/CD Pipeline automatically:                      │
│          ↓                                                 │
│  ┌──────────────────────────────────────────────────┐      │
│  │  1. Validate docker-compose.yml syntax           │      │
│  │  2. Check image exists in Harbor                 │      │
│  │  3. Deploy to Development (automatic)            │      │
│  │  4. Run health checks                            │      │
│  │  5. Wait for manual approval → Sandbox           │      │
│  │  6. Deploy to Sandbox                            │      │
│  │  7. Wait for manual approval → Testing           │      │
│  │  8. Deploy to Testing                            │      │
│  │  9. Wait for manual approval → Production        │      │
│  │  10. Deploy to Production                        │      │
│  │  11. Monitor deployment success                  │      │
│  │  12. Auto-rollback if health checks fail         │      │
│  └──────────────────────────────────────────────────┘      │
│                                                            │
│  Benefits:                                                 │
│  ✅ 5-10 minutes per environment                           │
│  ✅ Fewer human errors (target: <2%)                       │
│  ✅ Automatic rollback on failure                          │
│  ✅ Complete audit trail in Git                            │
│  ✅ Pre-deployment validation                              │
└────────────────────────────────────────────────────────────┘
```

### 1.3 Git Repository Structure

```
deployment-configs/              # New GitLab repository
├── README.md
├── .gitlab-ci.yml               # CI/CD pipeline definition
│
├── environments/
│   ├── development/
│   │   ├── docker-compose.yml
│   │   ├── .env
│   │   └── healthcheck.sh
│   │
│   ├── sandbox/
│   │   ├── docker-compose.yml
│   │   ├── .env
│   │   └── healthcheck.sh
│   │
│   ├── testing/
│   │   ├── docker-compose.yml
│   │   ├── .env
│   │   └── healthcheck.sh
│   │
│   └── production/
│       ├── docker-compose.yml
│       ├── .env
│       └── healthcheck.sh
│
├── scripts/
│   ├── deploy.sh                # Deployment script
│   ├── rollback.sh              # Rollback script
│   ├── healthcheck.sh           # Health validation
│   └── validate-compose.sh      # Pre-deployment validation
│
└── docs/
    ├── deployment-guide.md
    └── rollback-procedure.md
```

---

## 2. GitLab CI/CD Pipeline Implementation

### 2.1 Complete .gitlab-ci.yml

```yaml
# .gitlab-ci.yml - Complete automated deployment pipeline

variables:
  DOCKER_HOST: "tcp://docker-swarm-manager:2376"
  DOCKER_TLS_VERIFY: "1"
  HARBOR_REGISTRY: "harbor.company.com"
  # Swarm connection details (stored in GitLab CI/CD variables)
  # SWARM_DEV_HOST, SWARM_SANDBOX_HOST, SWARM_TEST_HOST, SWARM_PROD_HOST
  # SWARM_SSH_KEY (SSH private key for authentication)

stages:
  - validate
  - deploy-dev
  - deploy-sandbox
  - deploy-testing
  - deploy-production
  - rollback

#════════════════════════════════════════
# Stage 1: VALIDATION
#════════════════════════════════════════
validate:syntax:
  stage: validate
  image: docker:24-cli
  script:
    - echo "Validating docker-compose files..."
    - |
      for env in development sandbox testing production; do
        echo "Checking $env environment..."
        # Uses the Compose v2 plugin; --env-file supplies the variables the compose file interpolates
        if docker compose --env-file environments/$env/.env -f environments/$env/docker-compose.yml config > /dev/null; then
          echo "✅ $env: Syntax OK"
        else
          echo "❌ $env: Syntax ERROR"
          exit 1
        fi
      done
  only:
    - branches
  tags:
    - docker

validate:images:
  stage: validate
  image: docker:24-cli
  before_script:
    # Pass the password on stdin so it does not appear in the process list
    - echo "$HARBOR_PASSWORD" | docker login -u "$HARBOR_USER" --password-stdin "$HARBOR_REGISTRY"
  script:
    - echo "Checking if images exist in Harbor..."
    - |
      for env in development sandbox testing production; do
        echo "Checking images for $env..."
        # Extract image references from the rendered config so that
        # ${HARBOR_REGISTRY} and the *_VERSION variables are already expanded
        images=$(docker compose --env-file environments/$env/.env -f environments/$env/docker-compose.yml config | awk '$1 == "image:" {print $2}')
        for image in $images; do
          echo "Pulling $image to verify existence..."
          if docker pull "$image"; then
            echo "✅ Image exists: $image"
          else
            echo "❌ Image NOT found: $image"
            exit 1
          fi
        done
      done
  only:
    - branches
  tags:
    - docker

#════════════════════════════════════════
# Stage 2: DEPLOY TO DEVELOPMENT (Automatic)
#════════════════════════════════════════
deploy:development:
  stage: deploy-dev
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client bash docker-cli
    - eval $(ssh-agent -s)
    - echo "$SWARM_SSH_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H $SWARM_DEV_HOST >> ~/.ssh/known_hosts
  script:
    - echo "🚀 Deploying to DEVELOPMENT environment..."
    # Copy files to swarm manager
    - ssh root@$SWARM_DEV_HOST "mkdir -p /tmp/deploy"
    - scp -r environments/development root@$SWARM_DEV_HOST:/tmp/deploy/
    - scp scripts/deploy.sh root@$SWARM_DEV_HOST:/tmp/deploy/
    # Execute deployment
    - |
      ssh root@$SWARM_DEV_HOST bash << 'EOF'
      set -e
      cd /tmp/deploy/development
      # Load environment variables and export them, so that
      # `docker stack deploy` can substitute them in the compose file
      set -a; source .env; set +a
      # Deploy stack
      docker stack deploy -c docker-compose.yml --with-registry-auth app-stack
      # Wait for services to stabilize
      echo "Waiting for services to start..."
      sleep 30
      # Check service status
      docker stack services app-stack
      # Run health checks (healthcheck.sh is copied along with the environment directory)
      bash ./healthcheck.sh
      EOF
    - echo "✅ Deployment to DEVELOPMENT completed"
  environment:
    name: development
    url: https://dev.company.com
  only:
    - main
    - develop
  tags:
    - deployment

#════════════════════════════════════════
# Stage 3: DEPLOY TO SANDBOX (Manual Approval Required)
#════════════════════════════════════════
deploy:sandbox:
  stage: deploy-sandbox
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client bash docker-cli
    - eval $(ssh-agent -s)
    - echo "$SWARM_SSH_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H $SWARM_SANDBOX_HOST >> ~/.ssh/known_hosts
  script:
    - echo "🚀 Deploying to SANDBOX environment..."
    - ssh root@$SWARM_SANDBOX_HOST "mkdir -p /tmp/deploy"
    - scp -r environments/sandbox root@$SWARM_SANDBOX_HOST:/tmp/deploy/
    - |
      ssh root@$SWARM_SANDBOX_HOST bash << 'EOF'
      set -e
      cd /tmp/deploy/sandbox
      set -a; source .env; set +a
      docker stack deploy -c docker-compose.yml --with-registry-auth app-stack
      sleep 30
      docker stack services app-stack
      bash ./healthcheck.sh
      EOF
    - echo "✅ Deployment to SANDBOX completed"
  environment:
    name: sandbox
    url: https://sandbox.company.com
  when: manual            # ⚠️ Requires manual approval
  allow_failure: false    # Blocks later stages until this job has run
  only:
    - main
  tags:
    - deployment

#════════════════════════════════════════
# Stage 4: DEPLOY TO TESTING (Manual Approval Required)
#════════════════════════════════════════
deploy:testing:
  stage: deploy-testing
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client bash docker-cli
    - eval $(ssh-agent -s)
    - echo "$SWARM_SSH_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H $SWARM_TEST_HOST >> ~/.ssh/known_hosts
  script:
    - echo "🚀 Deploying to TESTING environment..."
    - ssh root@$SWARM_TEST_HOST "mkdir -p /tmp/deploy"
    - scp -r environments/testing root@$SWARM_TEST_HOST:/tmp/deploy/
    - |
      ssh root@$SWARM_TEST_HOST bash << 'EOF'
      set -e
      cd /tmp/deploy/testing
      set -a; source .env; set +a
      docker stack deploy -c docker-compose.yml --with-registry-auth app-stack
      sleep 30
      docker stack services app-stack
      bash ./healthcheck.sh
      EOF
    - echo "✅ Deployment to TESTING completed"
  environment:
    name: testing
    url: https://testing.company.com
  when: manual            # ⚠️ Requires manual approval
  allow_failure: false    # Blocks later stages until this job has run
  only:
    - main
  tags:
    - deployment

#════════════════════════════════════════
# Stage 5: DEPLOY TO PRODUCTION (Manual Approval Required)
#════════════════════════════════════════
deploy:production:
  stage: deploy-production
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client bash docker-cli
    - eval $(ssh-agent -s)
    - echo "$SWARM_SSH_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H $SWARM_PROD_HOST >> ~/.ssh/known_hosts
  script:
    - echo "🚀 Deploying to PRODUCTION environment..."
    # Backup current deployment
    - |
      ssh root@$SWARM_PROD_HOST bash << 'EOF'
      set -e
      echo "Creating backup of current deployment..."
      # Compute the timestamp once so both commands use the same directory
      BACKUP_DIR=/backup/deployments/$(date +%Y%m%d-%H%M%S)
      mkdir -p "$BACKUP_DIR"
      docker stack services app-stack --format "{{.Name}} {{.Image}}" > "$BACKUP_DIR/services.txt"
      echo "Backup created"
      EOF
    # Deploy new version
    - ssh root@$SWARM_PROD_HOST "mkdir -p /tmp/deploy"
    - scp -r environments/production root@$SWARM_PROD_HOST:/tmp/deploy/
    - |
      ssh root@$SWARM_PROD_HOST bash << 'EOF'
      set -e
      cd /tmp/deploy/production
      set -a; source .env; set +a
      echo "Starting production deployment..."
      docker stack deploy -c docker-compose.yml --with-registry-auth app-stack
      echo "Waiting for services to stabilize..."
      sleep 60
      echo "Checking service health..."
      docker stack services app-stack
      # Run comprehensive health checks
      if bash ./healthcheck.sh; then
        echo "✅ Health checks PASSED"
      else
        echo "❌ Health checks FAILED - consider rollback"
        exit 1
      fi
      EOF
    - echo "✅ Deployment to PRODUCTION completed successfully"
  environment:
    name: production
    url: https://app.company.com
  when: manual            # ⚠️ Requires manual approval + confirmation
  allow_failure: false
  only:
    - main
  tags:
    - deployment

#════════════════════════════════════════
# ROLLBACK JOBS (Manual Trigger)
#════════════════════════════════════════
rollback:production:
  stage: rollback
  image: alpine:latest
  needs: []               # No dependencies, so the job stays playable regardless of earlier stages
  before_script:
    - apk add --no-cache openssh-client bash docker-cli git
    - eval $(ssh-agent -s)
    - echo "$SWARM_SSH_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan -H $SWARM_PROD_HOST >> ~/.ssh/known_hosts
  script:
    - echo "🔄 Rolling back PRODUCTION to previous version..."
    # Get previous Git commit
    - PREVIOUS_COMMIT=$(git rev-parse HEAD~1)
    # Quoted as a whole because the command contains ": "
    - 'echo "Rolling back to commit: $PREVIOUS_COMMIT"'
    # Checkout previous version
    - git checkout $PREVIOUS_COMMIT -- environments/production/
    # Deploy previous version
    - ssh root@$SWARM_PROD_HOST "mkdir -p /tmp/rollback"
    - scp -r environments/production root@$SWARM_PROD_HOST:/tmp/rollback/
    - |
      ssh root@$SWARM_PROD_HOST bash << 'EOF'
      set -e
      cd /tmp/rollback/production
      set -a; source .env; set +a
      echo "Rolling back to previous version..."
      docker stack deploy -c docker-compose.yml --with-registry-auth app-stack
      sleep 30
      echo "Verifying rollback..."
      docker stack services app-stack
      bash ./healthcheck.sh
      EOF
    - echo "✅ Rollback completed"
  environment:
    name: production
  when: manual
  only:
    - main
  tags:
    - deployment
```

---

## 3. Docker Stack Management

### 3.1 Example docker-compose.yml Structure

```yaml
# environments/production/docker-compose.yml
version: '3.8'

services:
  #══════════════════════════════════════
  # Frontend Application
  #══════════════════════════════════════
  frontend:
    image: ${HARBOR_REGISTRY}/company/frontend:${FRONTEND_VERSION}
    networks:
      - app-network
    ports:
      - "80:80"
      - "443:443"
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
        monitor: 30s
      rollback_config:
        parallelism: 1
        delay: 5s
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
      placement:
        constraints:
          - node.role == worker
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  #══════════════════════════════════════
  # Backend API
  #══════════════════════════════════════
  api:
    image: ${HARBOR_REGISTRY}/company/api:${API_VERSION}
    networks:
      - app-network
      - db-network
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
      # Path to the mounted secret, as defined in .env; the secret value itself stays out of the environment
      - JWT_SECRET_FILE=${JWT_SECRET_FILE}
    secrets:
      - db_password
      - jwt_secret
    deploy:
      replicas: 5
      update_config:
        parallelism: 2
        delay: 10s
        failure_action: rollback
        monitor: 45s
      rollback_config:
        parallelism: 2
        delay: 5s
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
      placement:
        constraints:
          - node.role == worker
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  #══════════════════════════════════════
  # Worker Service
  #══════════════════════════════════════
  worker:
    image: ${HARBOR_REGISTRY}/company/worker:${WORKER_VERSION}
    networks:
      - app-network
      - db-network
    environment:
      - REDIS_URL=${REDIS_URL}
      - QUEUE_NAME=jobs
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
      restart_policy:
        condition: any
        delay: 10s
        max_attempts: 3
      placement:
        constraints:
          - node.role == worker
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  #══════════════════════════════════════
  # Cache (Redis)
  #══════════════════════════════════════
  redis:
    image: redis:7-alpine
    networks:
      - app-network
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == worker
      restart_policy:
        condition: any
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

#════════════════════════════════════════
# Networks
#════════════════════════════════════════
networks:
  app-network:
    driver: overlay
    attachable: true
  db-network:
    driver: overlay
    internal: true

#════════════════════════════════════════
# Secrets
#════════════════════════════════════════
secrets:
  db_password:
    external: true
  jwt_secret:
    external: true
```

### 3.2 Environment Variables (.env files)

```bash
# environments/production/.env

# Harbor Registry
HARBOR_REGISTRY=harbor.company.com

# Application Versions (THIS IS WHAT YOU UPDATE!)
FRONTEND_VERSION=v2.1.5
API_VERSION=v3.2.1
WORKER_VERSION=v1.8.3

# Database Configuration
DATABASE_URL=postgresql://user@db-prod:5432/appdb

# Redis Configuration
REDIS_URL=redis://redis:6379

# Application Configuration
JWT_SECRET_FILE=/run/secrets/jwt_secret
LOG_LEVEL=info
ENVIRONMENT=production
```

### 3.3 Health Check Script

```bash
#!/bin/bash
# environments/production/healthcheck.sh

set -e

echo "═══════════════════════════════════════════════"
echo "Running Health Checks for Production"
echo "═══════════════════════════════════════════════"

STACK_NAME="app-stack"
FAILED=0

# Check if all services are running
echo ""
echo "1️⃣ Checking service status..."
# Read name and replica count together; process substitution keeps FAILED in this shell
while read -r service replicas _; do
  echo "   $service: $replicas"
  # "0/N" means no running replicas; the anchored match does not flag "10/10"
  if [[ "$replicas" == 0/* ]]; then
    echo "   ❌ Service $service has NO running replicas!"
    FAILED=1
  fi
done < <(docker stack services "$STACK_NAME" --format "{{.Name}} {{.Replicas}}")

# Check frontend health endpoint (port 80 is published through the Swarm routing mesh)
echo ""
echo "2️⃣ Checking Frontend health endpoint..."
if curl -sf http://localhost/health > /dev/null; then
  echo "   ✅ Frontend health check PASSED"
else
  echo "   ❌ Frontend health check FAILED"
  FAILED=1
fi

# Check API health endpoint
# The compose file does not publish port 3000, so the check runs from a
# throwaway container attached to the (attachable) overlay network
echo ""
echo "3️⃣ Checking API health endpoint..."
if docker run --rm --network "${STACK_NAME}_app-network" alpine:latest \
     wget -q -O /dev/null http://api:3000/health; then
  echo "   ✅ API health check PASSED"
else
  echo "   ❌ API health check FAILED"
  FAILED=1
fi

# Check Redis connectivity
# Redis is placed on worker nodes, so `docker exec` on the manager cannot reach it
echo ""
echo "4️⃣ Checking Redis connectivity..."
if docker run --rm --network "${STACK_NAME}_app-network" redis:7-alpine \
     redis-cli -h redis ping | grep -q PONG; then
  echo "   ✅ Redis connectivity PASSED"
else
  echo "   ❌ Redis connectivity FAILED"
  FAILED=1
fi

# Check for recent errors in logs
# `docker service logs` takes a service name, so iterate over the stack's services
echo ""
echo "5️⃣ Checking recent logs for errors..."
ERROR_COUNT=0
for service in $(docker stack services "$STACK_NAME" --format "{{.Name}}"); do
  COUNT=$(docker service logs --since 5m "$service" 2>&1 | grep -ciE "error|fatal|panic" || true)
  ERROR_COUNT=$((ERROR_COUNT + COUNT))
done
if [ "$ERROR_COUNT" -gt 10 ]; then
  echo "   ⚠️ Found $ERROR_COUNT errors in last 5 minutes"
  FAILED=1
else
  echo "   ✅ Error count acceptable: $ERROR_COUNT"
fi

echo ""
echo "═══════════════════════════════════════════════"
if [ $FAILED -eq 0 ]; then
  echo "✅ ALL HEALTH CHECKS PASSED"
  echo "═══════════════════════════════════════════════"
  exit 0
else
  echo "❌ HEALTH CHECKS FAILED"
  echo "═══════════════════════════════════════════════"
  exit 1
fi
```

---

## 4. Environment Management Strategy

### 4.1 Promotion Flow

```
┌────────────────────────────────────────────────────────┐
│               Environment Promotion Flow               │
└────────────────────────────────────────────────────────┘

Developer updates image version in Git
        ↓
Development (Automatic)
├─ Deploy immediately
├─ Run health checks
└─ ✅ If successful → enable Sandbox deployment
        ↓ (Manual approval required)
Sandbox (Manual Trigger)
├─ QA team tests features
├─ Run integration tests
└─ ✅ If approved → enable Testing deployment
        ↓ (Manual approval required)
Testing (Manual Trigger)
├─ Full regression testing
├─ Performance testing
└─ ✅ If approved → enable Production deployment
        ↓ (Manual approval required + confirmation)
Production (Manual Trigger)
├─ Backup current state
├─ Deploy as a rolling update
├─ Run comprehensive health checks
└─ ✅ Monitor or 🔄 Rollback if issues
```

### 4.2 Deployment Approval Matrix

| Environment | Approval Required | Who Can Approve | Rollback Strategy |
|-------------|-------------------|-----------------|-------------------|
| **Development** | ❌ No (Automatic) | N/A | Automatic on health check failure |
| **Sandbox** | ✅ Yes (Manual) | Any Developer | Manual via GitLab UI |
| **Testing** | ✅ Yes (Manual) | QA Lead, DevOps Lead | Manual via GitLab UI |
| **Production** | ✅ Yes (Manual + Confirmation) | DevOps Lead, CTO | Automatic on failure + Manual option |

### 4.3 Change Management Workflow

```bash
# Example: Updating application version

# 1. Developer receives new image from Harbor
#    New image available: harbor.company.com/company/api:v3.2.2

# 2. Developer creates feature branch
git checkout -b update-api-v3.2.2

# 3. Update version in Development environment
#    Edit environments/development/.env and set:
#    API_VERSION=v3.2.2

# 4. Commit and push
git add environments/development/.env
git commit -m "feat: update API to v3.2.2 in development"
git push origin update-api-v3.2.2

# 5. Create Merge Request in GitLab
#    - Title: "Update API to v3.2.2"
#    - Description: "New features: X, Y, Z. Bug fixes: A, B"
#    - Assign to: DevOps team for review

# 6. After MR approval and merge to main:
#    - GitLab CI automatically deploys to Development
#    - Monitor deployment
#    - If successful, update environments/sandbox/.env the same way
#      and manually trigger Sandbox deployment

# 7. QA tests in Sandbox
#    - If approved, update Testing environment
#    - Repeat process

# 8. Production deployment
#    - Update production/.env with new version
#    - Create MR with detailed change log
#    - Require approvals from: DevOps Lead + CTO
#    - Schedule deployment window
#    - Execute manual deployment
#    - Monitor closely
```

---

## 5. Rollback Strategy

### 5.1 Automatic Rollback (Health Check Failure)

```yaml
# In docker-compose.yml - automatic rollback on failure
services:
  api:
    deploy:
      update_config:
        failure_action: rollback   # ← Automatic rollback!
        monitor: 60s               # Monitor each updated task for 60 seconds
      rollback_config:
        parallelism: 2             # Roll back 2 at a time
        delay: 5s                  # 5s between rollbacks
```

**How it works:**
1. New version deploys
2. Docker Swarm monitors each updated task for 60 seconds (`monitor`), using the container health check
3. If a task fails within that window → Automatic rollback to previous version
4. Previous version restored within 2-3 minutes

### 5.2 Manual Rollback via GitLab

**Option A: Rollback via Git History**

```bash
# GitLab Pipeline: rollback:production job

# 1. Identify previous working version
git log --oneline environments/production/.env

# 2. Checkout previous commit (the pipeline job uses HEAD~1)
git checkout HEAD~1 -- environments/production/

# 3. Pipeline redeploys previous version

# 4. Verify health checks
```

**Option B: Rollback via GitLab UI**

```
GitLab → Deployments → Environments → Production
        ↓
Click "Rollback" button
        ↓
Select previous successful deployment
        ↓
Confirm rollback
        ↓
GitLab re-runs the deployment job of the selected deployment
```

### 5.3 Emergency Rollback Procedure

```bash
#!/bin/bash
# scripts/emergency-rollback.sh
# FOR EMERGENCY USE ONLY - bypasses GitLab pipeline
# Run directly on Swarm manager node

set -euo pipefail

STACK_NAME="app-stack"
BACKUP_DIR="/backup/deployments"

echo "🚨 EMERGENCY ROLLBACK INITIATED for $STACK_NAME"

# Find last backup (directories are named by timestamp)
LAST_BACKUP=$(ls -td "$BACKUP_DIR"/* | head -1)
echo "Rolling back to: $LAST_BACKUP"

# Restore previous image versions
# services.txt stores full service names (e.g. app-stack_api), so no prefix is added
while read -r SERVICE IMAGE; do
  echo "Rolling back $SERVICE to $IMAGE"
  docker service update --with-registry-auth --image "$IMAGE" "$SERVICE"
done < "$LAST_BACKUP/services.txt"

echo "✅ Emergency rollback completed"
echo "⚠️ Remember to update Git repository to match!"
```

---

## 6. Monitoring & Health Checks

### 6.1 Service-Level Health Checks

```yaml
# In docker-compose.yml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
  interval: 30s       # Check every 30 seconds
  timeout: 10s        # Request timeout
  retries: 3          # Fail after 3 attempts
  start_period: 60s   # Grace period for startup
```

### 6.2 Stack-Level Monitoring

```bash
#!/bin/bash
# scripts/monitor-deployment.sh

STACK_NAME="app-stack"

while true; do
  clear
  echo "═══════════════════════════════════════════════"
  echo "Stack: $STACK_NAME - $(date)"
  echo "═══════════════════════════════════════════════"

  # Show service status
  docker stack services "$STACK_NAME"

  echo ""
  echo "Recent logs (last 10 lines per service):"
  # `docker service logs` takes a service name, not a stack name
  for service in $(docker stack services "$STACK_NAME" --format "{{.Name}}"); do
    docker service logs --tail 10 "$service" 2>&1
  done

  sleep 10
done
```

### 6.3 Notification Integration

```yaml
# Add to .gitlab-ci.yml
# Requires curl (and a configured mail client) in the job image, e.g. `apk add --no-cache curl`
after_script:
  - |
    if [ "$CI_JOB_STATUS" = "success" ]; then
      MESSAGE="✅ Deployment to $CI_ENVIRONMENT_NAME successful"
    else
      MESSAGE="❌ Deployment to $CI_ENVIRONMENT_NAME FAILED"
    fi
    # Send to Slack
    curl -X POST -H 'Content-type: application/json' \
      --data "{\"text\":\"$MESSAGE\nPipeline: $CI_PIPELINE_URL\"}" \
      "$SLACK_WEBHOOK_URL"
    # Send email (if SMTP configured)
    echo "$MESSAGE" | mail -s "Deployment Notification" devops@company.com
```

---

## 7. Implementation Roadmap

### Phase 1: Preparation (Week 1)

**Day 1-2: Repository Setup**
- [ ] Create `deployment-configs` repository in GitLab
- [ ] Create directory structure (environments/, scripts/)
- [ ] Add current docker-compose.yml to each environment
- [ ] Create .env files with current versions
- [ ] Commit initial structure

**Day 3-4: GitLab Configuration**
- [ ] Configure GitLab CI/CD variables:
  - `SWARM_DEV_HOST`, `SWARM_SANDBOX_HOST`, `SWARM_TEST_HOST`, `SWARM_PROD_HOST`
  - `SWARM_SSH_KEY` (SSH private key)
  - `HARBOR_USER`, `HARBOR_PASSWORD`
  - `SLACK_WEBHOOK_URL` (optional)
- [ ] Create SSH keys for GitLab Runner → Swarm access
- [ ] Test SSH connectivity from GitLab to each Swarm environment

**Day 5: Scripts Development**
- [ ] Create deploy.sh script
- [ ] Create healthcheck.sh script
- [ ] Create rollback.sh script
- [ ] Test scripts manually on Development environment

### Phase 2: Pipeline Implementation (Week 2)

**Day 1-2: Basic Pipeline**
- [ ] Create .gitlab-ci.yml with validation stage only
- [ ] Test syntax validation
- [ ] Test image validation

**Day 3: Development Deployment**
- [ ] Add deploy:development job
- [ ] Test automatic deployment to Development
- [ ] Verify health checks work

**Day 4: Sandbox & Testing**
- [ ] Add deploy:sandbox job (manual)
- [ ] Add deploy:testing job (manual)
- [ ] Test manual approval workflow

**Day 5: Production Deployment**
- [ ] Add deploy:production job (manual + confirmation)
- [ ] Add backup before deployment
- [ ] Test during a low-traffic window that respects the deployment windows in section 8.2

### Phase 3: Rollback Implementation (Week 3)

**Day 1-2: Automatic Rollback**
- [ ] Configure Docker Swarm automatic rollback
- [ ] Test by deploying broken version
- [ ] Verify automatic recovery

**Day 3-4: Manual Rollback**
- [ ] Implement rollback:production job
- [ ] Test Git-based rollback
- [ ] Document rollback procedure

**Day 5: Emergency Procedures**
- [ ] Create emergency-rollback.sh script
- [ ] Test emergency rollback
- [ ] Document for on-call team

### Phase 4: Monitoring & Optimization (Week 4)

**Day 1-2: Monitoring**
- [ ] Set up deployment notifications (Slack/Email)
- [ ] Configure Prometheus metrics collection
- [ ] Create Grafana dashboards for deployments

**Day 3-4: Documentation**
- [ ] Write deployment guide for developers
- [ ] Write operations runbook
- [ ] Create troubleshooting guide
- [ ] Record demo video

**Day 5: Team Training**
- [ ] Train developers on new workflow
- [ ] Train QA team on approval process
- [ ] Train DevOps team on monitoring/rollback
- [ ] Conduct Q&A session

---

## 8. Best Practices & Tips

### 8.1 Version Management

**✅ DO:**

```bash
# Use semantic versioning
API_VERSION=v3.2.1                   # ← Good: Clear, semantic version

# Include Git commit hash for traceability
API_VERSION=v3.2.1-abc123ef

# Use immutable tags
IMAGE=harbor.company.com/app:v1.2.3  # ← Good: Specific version
```

**❌ DON'T:**

```bash
# Avoid mutable tags
API_VERSION=latest       # ← Bad: Can change unexpectedly

# Avoid ambiguous versions
API_VERSION=production   # ← Bad: What version is this?
```

### 8.2 Deployment Timing

**Recommended deployment windows:**
- **Development:** Anytime (automatic)
- **Sandbox:** Business hours (9am-5pm)
- **Testing:** Business hours (requires QA)
- **Production:**
  - Normal changes: Tuesday-Thursday, 10am-2pm
  - Critical fixes: Anytime with proper approval
  - Avoid: Monday mornings, Friday afternoons, weekends

### 8.3 Communication

**Before Production deployment:**

```
Slack announcement template:

📢 Production Deployment Scheduled
🗓 Date: January 15, 2026
⏰ Time: 11:00 AM (EST)
⏱ Duration: ~15 minutes

📝 Changes:
- API v3.2.1 → v3.2.2 (bug fixes)
- Frontend v2.1.5 → v2.1.6 (UI improvements)

🔗 Release Notes: [link]
🔗 Rollback Plan: [link]

Please report any issues to #devops-alerts
```

### 8.4 Security Considerations

```yaml
# Store sensitive data as Docker secrets
secrets:
  db_password:
    external: true   # ← Created outside compose file
  api_key:
    external: true

# Never commit secrets to Git!
# Use GitLab CI/CD variables for:
# - SSH keys
# - API tokens
# - Passwords
# - Certificates
```

### 8.5 Troubleshooting Common Issues

**Issue 1: Pipeline fails with "SSH connection refused"**

```bash
# Solution: Verify SSH key in GitLab CI/CD variables
# Test manually:
ssh -i ~/.ssh/gitlab_rsa root@swarm-manager
```

**Issue 2: Image pull fails from Harbor**

```bash
# Solution: Check registry credentials
echo "$HARBOR_PASSWORD" | docker login harbor.company.com -u "$HARBOR_USER" --password-stdin

# Verify image exists:
docker pull harbor.company.com/company/api:v3.2.1
```

**Issue 3: Health checks fail after deployment**

```bash
# Debug: Check service logs
docker service logs app-stack_api --tail 100

# Check service status
docker service ps app-stack_api

# Manual health check (only reachable on localhost if the port is published)
curl http://localhost:3000/health
```

**Issue 4: Deployment stuck "pending"**

```bash
# Check swarm node status
docker node ls

# Check resource availability
docker node inspect swarm-worker-1 | grep Resources -A 10

# Check for failed tasks
docker service ps app-stack_api --no-trunc
```

---

## 9. Success Metrics

### 9.1 Key Performance Indicators

**Before Automation:**
- 📊 Deployment frequency: 1-2 per week
- ⏱ Average deployment time: 30-60 minutes per environment
- 🐛 Deployment errors: ~20% (typos, wrong tags)
- 🔄 Rollback time: 1-2 hours (manual)
- 📝 Audit trail: Partial (chat logs, manual notes)

**After Automation (Target):**
- 📊 Deployment frequency: 5-10 per week
- ⏱ Average deployment time: 5-10 minutes per environment
- 🐛 Deployment errors: <2% (automated validation)
- 🔄 Rollback time: 2-3 minutes (automatic)
- 📝 Audit trail: Complete (Git history + GitLab logs)

### 9.2 Success Criteria

**Week 4 Evaluation:**
- [ ] All 4 environments deployed via GitLab CI/CD
- [ ] Zero manual SSH deployments
- [ ] At least 5 successful Production deployments
- [ ] At least 1 successful rollback test
- [ ] Team can deploy without DevOps assistance
- [ ] Complete audit trail for all deployments
- [ ] Average deployment time < 15 minutes

---

## 10. Conclusion & Next Steps

### Current State

- ❌ Manual bash script deployments
- ❌ No audit trail
- ❌ Error-prone process
- ❌ Slow rollbacks

### Target State (After Implementation)

- ✅ Automated GitLab CI/CD pipelines
- ✅ Complete Git-based audit trail
- ✅ Validated deployments with health checks
- ✅ Automatic rollbacks in 2-3 minutes
- ✅ Self-service for developers

### Immediate Next Steps

1. **This Week:**
   - Create GitLab repository structure
   - Configure CI/CD variables
   - Test SSH connectivity
2. **Next Week:**
   - Implement basic pipeline
   - Test Development deployments
   - Add validation stages
3. **Week 3-4:**
   - Roll out to all environments
   - Implement rollback procedures
   - Train team

### Resources Needed

- **Time Investment:** 2-4 weeks (1 DevOps engineer)
- **Infrastructure:** GitLab Runner (existing OK)
- **Training:** 2-3 hours team training session
- **Documentation:** Deployment guide + runbooks

### Support & Questions

For implementation assistance:
- 📧 Email: devops@company.com
- 💬 Slack: #devops-automation
- 📖 Documentation: https://gitlab.company.com/deployment-configs

---

**Document Version:** 1.0  
**Last Updated:** January 2026  
**Status:** Ready for Implementation  
**Author:** DevOps Team  
**Review Date:** After Phase 2 completion
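
**Supplement to section 3.3 (replica check).** The health check in section 3.3 only flags services with zero running replicas. A stricter variant could require that every desired replica is running. The sketch below is one way to do that. The function name `replicas_converged` is hypothetical, and it assumes the `{{.Replicas}}` column has the form `running/desired`, optionally followed by a note such as `(max 1 per node)`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts).
# Returns 0 when all desired replicas are running, 1 when they are not,
# and 2 when the input is not in "running/desired" form.
replicas_converged() {
  local replicas running desired
  replicas="${1%% *}"            # drop a trailing note such as " (max 1 per node)"
  case "$replicas" in
    */*) ;;                      # expected shape: running/desired
    *) return 2 ;;
  esac
  running="${replicas%%/*}"
  desired="${replicas##*/}"
  # Both parts must be non-empty and purely numeric
  case "$running" in ''|*[!0-9]*) return 2 ;; esac
  case "$desired" in ''|*[!0-9]*) return 2 ;; esac
  # A service scaled to 0 is treated as not converged
  [ "$desired" -gt 0 ] && [ "$running" -eq "$desired" ]
}
```

Inside the service loop of `healthcheck.sh`, a call such as `replicas_converged "$REPLICAS" || FAILED=1` would then fail the check for partially started services (for example `2/5`) as well as for fully stopped ones.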