diff --git a/docs/gitops-cicd/02-architecture.md b/docs/gitops-cicd/02-architecture.md
deleted file mode 100644
index 75eb32e..0000000
--- a/docs/gitops-cicd/02-architecture.md
+++ /dev/null
@@ -1,855 +0,0 @@
-# FinTech GitOps CI/CD - Solution Architecture
-
-**Version:** 1.0  
-**Date:** January 2026  
-**Target audience:** Architects, DevOps, Infrastructure, Security
-
----
-
-## Contents
-
-1. [Overall Architecture](#1-overall-architecture)
-2. [Network Architecture](#2-network-architecture)
-3. [Zones and Their Purpose](#3-zones-and-their-purpose)
-4. [Data Flows](#4-data-flows)
-5. [High Availability and Scaling](#5-high-availability-and-scaling)
-6. [Disaster Recovery](#6-disaster-recovery)
-
----
-
-## 1. Overall Architecture
-
-### 1.1 Design Principles
-
-**Defense in Depth:**
-Layered protection with isolation at every level:
-- Network segmentation via VLANs
-- Firewalls between all zones
-- Application-level authentication and authorization
-- Encryption at rest and in transit
-- Audit logging at every level
-
-**Least Privilege:**
-Each component gets only the permissions it needs:
-- Service accounts with restricted permissions
-- Network access only to required endpoints
-- Time-bound credentials where possible
-- Regular rotation of secrets
-
-**Immutable Infrastructure:**
-Infrastructure as code; changes go through Git only:
-- No manual changes on servers
-- All changes are version controlled
-- Reproducible deployments
-- Easy rollback via Git history
-
-**Observability:**
-Full visibility into all processes:
-- Centralized logging
-- Metrics from all components
-- Distributed tracing for requests
-- Audit trail for compliance
-
-### 1.2 Logical Layers
-
-**Presentation Layer (User Interface):**
-- Portainer UI for visual Swarm management
-- Grafana for dashboards and metrics
-- Jenkins Blue Ocean for CI/CD visualization
-- Ollama web interface for AI interaction
-- Gitea web UI for repository management
-
-**API Layer:**
-- Docker Swarm API for cluster management
-- Harbor API for registry operations
-- Gitea API for Git operations
-- Jenkins API for triggering builds
-- Prometheus API for metrics
-- MCP Server API for AI integration
-
-**Service Layer (Business Logic):**
-- GitOps Operator - automatic synchronization
-- Jenkins pipelines - CI/CD logic
-- Harbor webhooks - notifications about new images
-- AlertManager - alert handling rules
-- AI models - request processing
-
-**Data Layer:**
-- PostgreSQL - relational data
-- Git repositories - code and configuration
-- Harbor storage - Docker images
-- Prometheus TSDB - metric time series
-- Loki - logs
-- Vector DB - embeddings for AI
-
-**Infrastructure Layer:**
-- Docker Swarm - orchestration platform
-- Overlay networks - service communication
-- Shared storage - persistent data
-- Backup systems - disaster recovery
-
----
-
-## 2. Network Architecture
-
-### 2.1 VLAN Segmentation
-
-**VLAN 10 - Management & CI/CD Zone:**
-- Subnet: 10.10.10.0/24
-- Gateway: 10.10.10.1
-- Components: Gitea, Jenkins, Harbor, GitOps Operator, Portainer
-- Access: Only via VPN with MFA
-- Isolation: Strict firewall at the zone boundary
-
-**VLAN 20 - Docker Swarm Cluster Zone:**
-- Subnet: 10.20.0.0/16 (sized for a large number of containers)
-- Manager subnet: 10.20.1.0/24
-- Worker subnet: 10.20.2.0/23
-- Gateway: 10.20.0.1
-- Components: Swarm managers, workers, overlay networks
-- Access: Only from the Management and Monitoring zones
-- Isolation: Encrypted overlay network inside the zone
-
-**VLAN 30 - AI & Analytics Zone:**
-- Subnet: 10.30.10.0/24
-- Gateway: 10.30.10.1
-- Components: Ollama, MCP Server, Vector Database
-- Access: Read-only access to data sources
-- Isolation: Cannot initiate changes in other zones
-
-**VLAN 40 - Monitoring & Logging Zone:**
-- Subnet: 10.40.10.0/24
-- Gateway: 10.40.10.1
-- Components: Prometheus, Grafana, Loki, AlertManager
-- Access: Read-only metrics collection
-- Isolation: Cannot manage components
-
-**VLAN 50 - Data & Database Zone:**
-- Subnet: 10.50.0.0/16
-- Infrastructure DB subnet: 10.50.10.0/24
-- Application DB subnet: 10.50.20.0/23
-- Storage subnet: 10.50.30.0/24
-- Gateway: 10.50.0.1
-- Components: PostgreSQL, application databases, shared storage
-- Access: Tightly controlled, encrypted connections only
-- Isolation: Strictest of all zones; every connection is audited
-
-**VLAN 60 - Backup & DR Zone:**
-- Subnet: 10.60.10.0/24
-- Gateway: 10.60.10.1
-- Components: Backup server, long-term storage
-- Access: Write-only for backup agents, read for recovery
-- Isolation: Offline storage, air-gapped where possible
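Editor's sketch (not part of the deleted file): the child subnets in the VLAN plan above can be sanity-checked with Python's standard `ipaddress` module. The `ZONES` layout and the function names are illustrative assumptions; only the CIDR values come from the plan.

```python
import ipaddress

# CIDRs copied from the VLAN plan; the dictionary layout is hypothetical.
ZONES = {
    "swarm": ("10.20.0.0/16", ["10.20.1.0/24", "10.20.2.0/23"]),
    "data": ("10.50.0.0/16", ["10.50.10.0/24", "10.50.20.0/23", "10.50.30.0/24"]),
}


def check_zone(parent_cidr, child_cidrs):
    """Return a list of problems: children outside the parent or overlapping each other."""
    parent = ipaddress.ip_network(parent_cidr)
    children = [ipaddress.ip_network(c) for c in child_cidrs]
    problems = []
    for child in children:
        if not child.subnet_of(parent):
            problems.append(f"{child} is outside {parent}")
    for i, a in enumerate(children):
        for b in children[i + 1:]:
            if a.overlaps(b):
                problems.append(f"{a} overlaps {b}")
    return problems


def check_plan(zones):
    """Run check_zone for every zone and return problems keyed by zone name."""
    return {name: check_zone(parent, kids) for name, (parent, kids) in zones.items()}
```

For the subnets listed above, `check_plan(ZONES)` reports no problems for either zone.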
-
-### 2.2 Firewall Rules
-
-**Principle:** Deny all traffic by default; allow only what is explicitly needed
-
-**Management VLAN → Swarm VLAN:**
-```
-Source: 10.10.10.40 (GitOps Operator)
-Destination: 10.20.1.0/24 (Swarm Managers)
-Ports: 2377/tcp (cluster management)
-Action: ALLOW
-Logging: YES
-
-Source: 10.10.10.50 (Portainer)
-Destination: 10.20.1.0/24 (Swarm Managers)
-Ports: 2376/tcp (Docker API over TLS)
-Action: ALLOW
-Logging: YES
-
-All other traffic: DENY
-```
-
-**Swarm VLAN → Harbor (Management VLAN):**
-```
-Source: 10.20.0.0/16 (All Swarm nodes)
-Destination: 10.10.10.30 (Harbor)
-Ports: 443/tcp, 5000/tcp (HTTPS, Docker registry)
-Protocol: TLS 1.3 with mutual authentication
-Action: ALLOW
-Logging: YES
-
-All other traffic: DENY
-```
-
-**AI VLAN → Data Sources:**
-```
-Source: 10.30.10.20 (MCP Server)
-Destination: Multiple (via MCP connectors)
-Ports: Varies (SSH 22, HTTPS 443, PostgreSQL 5432, etc.)
-Access: READ-ONLY
-Authentication: Service account per destination
-Logging: ALL QUERIES LOGGED
-Action: ALLOW with rate limiting
-
-Write operations: DENY
-```
-
-**Monitoring VLAN → All Zones:**
-```
-Source: 10.40.10.10 (Prometheus)
-Destination: ALL VLANs
-Ports: Metrics endpoints (typically 9090-9999)
-Access: READ-ONLY metrics scraping
-Action: ALLOW
-Logging: NO (too verbose, metrics only)
-
-Any non-metrics ports: DENY
-```
-
-**Data VLAN → Backup VLAN:**
-```
-Source: 10.50.0.0/16 (All databases)
-Destination: 10.60.10.10 (Backup server)
-Ports: Backup protocol specific
-Direction: ONE-WAY (source → backup only)
-Action: ALLOW
-Logging: YES
-Encryption: MANDATORY
-
-Reverse direction: DENY (except for restore procedures)
-```
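Editor's sketch (not part of the deleted file): the "deny all, allow explicitly" principle can be modelled as a lookup over an allow list, where a flow passes only if some rule matches it. The `Rule` class and `is_allowed` function are hypothetical names; the two sample rules reuse addresses and ports from the rule blocks above.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str        # CIDR of the allowed source
    destination: str   # CIDR of the allowed destination
    port: int
    description: str


# Two of the allow rules above, expressed as data.
ALLOW_RULES = [
    Rule("10.10.10.40/32", "10.20.1.0/24", 2377, "GitOps Operator -> Swarm managers"),
    Rule("10.20.0.0/16", "10.10.10.30/32", 443, "Swarm nodes -> Harbor"),
]


def is_allowed(src_ip, dst_ip, port, rules=ALLOW_RULES):
    """Default deny: a flow passes only if an explicit rule matches it."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.source)
                and dst in ipaddress.ip_network(rule.destination)
                and port == rule.port):
            return True
    return False
```

With these rules, Jenkins (10.10.10.20) reaching a Swarm manager on 2377/tcp is rejected because no rule names it as a source, which is the behaviour the principle asks for.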
-
-### 2.3 External Connectivity
-
-**VPN Gateway:**
-- Public IP for VPN connections
-- Multi-factor authentication is mandatory
-- Certificate-based authentication + one-time password
-- Split tunneling is prohibited (all traffic goes through the VPN)
-- Session timeout: 8 hours
-- Idle timeout: 30 minutes
-- Disconnect after 3 failed MFA attempts
-
-**Jump Host/Bastion:**
-- Single entry point after the VPN
-- Session recording for audit
-- No direct access to production systems; access only through the jump host
-- Authorized keys are managed centrally
-- Automatic logout after 15 minutes idle
-- Audit log of all commands
-
-**Authorized users:**
-- Developers: Access to Gitea, Jenkins, Portainer (read-only for production)
-- DevOps: Full access to all management systems
-- Security team: Read-only audit access to everything
-- Managers: Grafana and reporting dashboards only
-
----
-
-## 3. Zones and Their Purpose
-
-### 3.1 Management & CI/CD Zone
-
-**Purpose:**
-Centralized management of code, CI/CD processes, and the container registry.
-
-**Criticality:** HIGH - downtime blocks deployment of new versions
-
-**Components:**
-
-**Gitea (10.10.10.10):**
-- Role: Single source of truth for all code and configuration
-- Interactions: Accepts pushes from developers, sends webhooks to Jenkins
-- Dependencies: PostgreSQL (VLAN 50), shared storage for Git LFS
-- SLA: 99.9% uptime
-
-**Jenkins (10.10.10.20):**
-- Role: CI automation; builds and tests applications
-- Interactions: Receives webhooks from Gitea, pushes images to Harbor, updates Git
-- Dependencies: Gitea, Harbor, Docker build agents
-- SLA: 99.5% uptime (can run in degraded mode)
-
-**Harbor (10.10.10.30):**
-- Role: Enterprise container registry with security scanning
-- Interactions: Accepts pushes from Jenkins, serves pulls to Swarm nodes
-- Dependencies: PostgreSQL, object storage for images
-- SLA: 99.9% uptime (critical for image pulls)
-
-**GitOps Operator (10.10.10.40):**
-- Role: Automatic Git → Swarm synchronization
-- Interactions: Monitors Gitea, applies changes to Swarm via the API
-- Dependencies: Gitea, Docker Swarm API
-- SLA: 99.9% uptime
-
-**Portainer (10.10.10.50):**
-- Role: Web UI for managing and monitoring Swarm
-- Interactions: Connects to Swarm managers via the Docker API
-- Dependencies: Docker Swarm API, PostgreSQL for its own database
-- SLA: 99% uptime (not critical; a CLI alternative exists)
-
-**Redundancy:**
-- Gitea: Master-slave replication, automated failover
-- Jenkins: Standby instance в warm mode
-- Harbor: Geo-replication на secondary site
-- GitOps Operator: Active-passive pair
-- Portainer: Standby instance
-
-### 3.2 Docker Swarm Cluster Zone
-
-**Purpose:**
-Running production workloads with high availability and load balancing.
-
-**Criticality:** CRITICAL - directly affects business services
-
-**Swarm Manager Nodes (10.20.1.1-3):**
-- Count: 3 for quorum (an odd number is recommended)
-- Role: Cluster orchestration, scheduling, API endpoint
-- Raft consensus: At least 2 of the 3 managers must be alive for the cluster to operate
-- Workload: Do NOT run application containers (infrastructure only)
-- CPU: 4 vCPU each
-- RAM: 8 GB each
-- Disk: 200 GB SSD each
-- Network: 10 Gbps for Raft communication
-
-**Swarm Worker Nodes (10.20.2.1-N):**
-- Count: Depends on workload; at least 3 for redundancy
-- Role: Running application containers
-- Constraints: Nodes can be labeled for specific workloads
-- CPU: 8-16 vCPU each
-- RAM: 32-64 GB each
-- Disk: 500 GB SSD each
-- Network: 10 Gbps for overlay network performance
-
-**Overlay Networks:**
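-- IPsec encryption of application traffic, enabled per network with the `encrypted` option (Swarm management traffic is encrypted by default)
-- Service discovery via DNS
-- Load balancing via the routing mesh
-- Isolation between stacks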
-
-**Secrets Management:**
-- Docker Swarm secrets encrypted at rest
-- Rotation via stack update
-- Mounted as files in containers
-- Audit log of secret access
-
-**Redundancy:**
-- Manager nodes: (N-1)/2 failure tolerance (3 nodes tolerate 1 failure)
-- Worker nodes: Application replicas are spread across different nodes
-- Persistent data: Replicated storage (GlusterFS or NFS with HA)
-- Network: Bonded interfaces for redundancy
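Editor's sketch (not part of the deleted file): the manager-count arithmetic behind Raft quorum is short enough to write down. A cluster of N managers needs a majority of N // 2 + 1 to make progress, so it tolerates N minus that majority failures. The function names are illustrative.

```python
def raft_quorum(managers: int) -> int:
    """Smallest majority of `managers` nodes."""
    if managers < 1:
        raise ValueError("need at least one manager")
    return managers // 2 + 1


def failure_tolerance(managers: int) -> int:
    """How many managers can be lost while a majority remains."""
    return managers - raft_quorum(managers)
```

Three managers give a quorum of 2 and tolerate 1 failure. Four managers still tolerate only 1, which is why an odd count is recommended; five tolerate 2.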
-
-### 3.3 AI & Analytics Zone
-
-**Purpose:**
-AI-powered assistance based on analysis of internal data sources.
-
-**Criticality:** MEDIUM - a convenience, not critical for operations
-
-**Ollama Server (10.30.10.10):**
-- Role: Running AI models locally on in-house hardware
-- Models: Llama 3.3 70B, Qwen 2.5 Coder, DeepSeek, and others
-- Interactions: Receives user requests and context from the MCP Server
-- Requirements: GPU highly recommended for performance
-- CPU: 16 vCPU (or fewer if a GPU is available)
-- RAM: 64 GB (models need a lot of memory)
-- GPU: NVIDIA A100 40GB or 2x RTX 4090 24GB (optional but recommended)
-- Disk: 2 TB NVMe SSD (each model takes 10-100 GB)
-- Network: 10 Gbps for fast responses
-
-**MCP Server (10.30.10.20):**
-- Role: Integrating AI with data sources (Gitea, Swarm, DBs, logs)
-- Connectors: Modular plugins, one per source
-- Interactions: Read-only queries to data sources, passes context to Ollama
-- Security: Service account per connector, all queries audited
-- CPU: 8 vCPU
-- RAM: 16 GB
-- Disk: 100 GB SSD
-- Network: 1 Gbps
-
-**Vector Database (10.30.10.30):**
-- Role: Storing documentation embeddings for semantic search
-- Technology: Qdrant or Milvus
-- Size: Depends on the amount of documentation
-- CPU: 4 vCPU
-- RAM: 16 GB (depends on index size)
-- Disk: 500 GB SSD
-- Network: 1 Gbps
-
-**Data Flow:**
-User → Ollama → MCP Server → (in parallel):
-- Gitea MCP Connector → Gitea (documentation, code)
-- Swarm MCP Connector → Docker API (status, logs)
-- Database MCP Connector → PostgreSQL (metadata)
-- Prometheus MCP Connector → Metrics
-- Loki MCP Connector → Logs
-→ Aggregated context → Ollama → Response to the user
-
-**Redundancy:**
-- Ollama: Standby instance (warm standby)
-- MCP Server: Active-passive pair
-- Vector DB: Replicated for HA
-
-### 3.4 Monitoring & Logging Zone
-
-**Purpose:**
-Infrastructure observability for proactive monitoring and troubleshooting.
-
-**Criticality:** HIGH - required for detecting problems
-
-**Prometheus (10.40.10.10):**
-- Role: Collecting and storing time-series metrics
-- Scrape targets: All infrastructure components
-- Retention: 30 days in Prometheus, long-term in Thanos/VictoriaMetrics
-- CPU: 8 vCPU
-- RAM: 32 GB
-- Disk: 2 TB HDD (time-series data)
-- Network: 1 Gbps
-
-**Grafana (10.40.10.20):**
-- Role: Visualizing metrics and logs
-- Dashboards: Preconfigured for each component
-- Alerting: Visual alert editor
-- CPU: 4 vCPU
-- RAM: 8 GB
-- Disk: 100 GB SSD
-- Network: 1 Gbps
-
-**Loki (10.40.10.30):**
-- Role: Centralized log storage
-- Agents: Promtail on every node
-- Retention: 90 days
-- CPU: 8 vCPU
-- RAM: 16 GB
-- Disk: 5 TB HDD (logs)
-- Network: 1 Gbps
-
-**AlertManager (10.40.10.40):**
-- Role: Processing and routing alerts
-- Integrations: Slack, Email, PagerDuty, Telegram
-- Deduplication: Grouping of similar alerts
-- CPU: 2 vCPU
-- RAM: 4 GB
-- Disk: 50 GB SSD
-- Network: 1 Gbps
-
-**Redundancy:**
-- Prometheus: Federated setup, multiple instances
-- Grafana: Load balanced instances
-- Loki: Distributed deployment
-- AlertManager: Clustered for HA
-
-### 3.5 Data & Database Zone
-
-**Purpose:**
-Storing persistent data for infrastructure and applications.
-
-**Criticality:** CRITICAL - data loss is unacceptable
-
-**Infrastructure PostgreSQL Cluster (10.50.10.10-11):**
-- Role: Databases for Gitea, Harbor, Portainer
-- Topology: Master-slave with automatic failover
-- Backup: Continuous WAL archiving + daily full backup
-- Encryption: At rest (LUKS) and in transit (TLS)
-- CPU: 8 vCPU per instance
-- RAM: 16 GB per instance
-- Disk: 500 GB SSD per instance
-- Network: 10 Gbps
-
-**Application Databases (10.50.20.x):**
-- Role: Databases for business applications
-- Technologies: Depends on the applications (PostgreSQL, MySQL, MongoDB)
-- Isolation: Each application in its own database/schema
-- Backup: Application-specific strategy
-- Resources: Depends on workload
-
-**Shared Storage (10.50.30.1-3):**
-- Role: Persistent volumes for Swarm services
-- Technology: GlusterFS (replicated) or NFS with HA
-- Replication: 3x for fault tolerance
-- Snapshots: Hourly, retention 7 days
-- Capacity: 10 TB (grows as needed)
-- Network: 10 Gbps for I/O performance
-
-**Redundancy:**
-- PostgreSQL: Synchronous replication, automatic failover
-- Shared Storage: Distributed replication (GlusterFS 3-way)
-- Backups: Multiple copies in different locations
-
-### 3.6 Backup & DR Zone
-
-**Purpose:**
-Protection against data loss and fast recovery after disasters.
-
-**Criticality:** CRITICAL for long-term business resilience
-
-**Backup Server (10.60.10.10):**
-- Role: Receiving and storing backups
-- Technology: Bacula or Bareos (enterprise backup solution)
-- Scheduling: Automated on a schedule + on-demand
-- Encryption: All backups encrypted at rest
-- CPU: 4 vCPU
-- RAM: 8 GB
-- Disk: 20 TB HDD (RAID 10)
-- Network: 10 Gbps for fast backups
-
-**Backup Strategy:**
-
-**Hourly Incremental:**
-- Git repositories (changes only)
-- Retention: 48 hours
-
-**Daily Full:**
-- Databases (full dump)
-- Docker Swarm configs
-- Important logs
-- Retention: 30 days
-
-**Weekly Full:**
-- Full snapshot of the entire infrastructure
-- VM images, configs, data
-- Retention: 12 weeks
-
-**Monthly Archives:**
-- Long-term compliance storage
-- Retention: 7 years (regulatory requirement)
-
-**DR Site (optional, in a separate data center):**
-- Role: Geographic redundancy
-- Replication: Asynchronous from the primary site
-- RTO (Recovery Time Objective): 4 hours
-- RPO (Recovery Point Objective): 15 minutes
-- Testing: Quarterly DR drills
-
----
-
-## 4. Data Flows
-
-### 4.1 Development Workflow
-
-**Developer commits code:**
-```
-Developer Workstation
-↓ (SSH over VPN)
-Gitea (VLAN 10)
-↓ (Webhook HTTPS + signature verification)
-Jenkins (VLAN 10)
-↓ (git clone through SSH)
-Gitea
-```
-
-**CI Pipeline execution:**
-```
-Jenkins
-↓ (build application)
-Build Agent (ephemeral container/VM)
-↓ (run tests)
-Test results → Archived in Jenkins
-↓ (build Docker image)
-Docker build agent
-↓ (security scan with Trivy)
-Vulnerability report
-↓ (docker push over TLS + creds)
-Harbor (VLAN 10)
-```
-
-**Update GitOps repo:**
-```
-Jenkins
-↓ (update image tag in the compose file)
-Gitea GitOps repository
-↓ (commit + push)
-Gitea
-```
-
-### 4.2 CD Workflow
-
-**GitOps sync:**
-```
-GitOps Operator (VLAN 10)
-↓ (poll Git repository every 30 sec)
-Gitea
-↓ (detect changes)
-GitOps Operator
-↓ (docker stack deploy via Swarm API)
-Swarm Managers (VLAN 20)
-```
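Editor's sketch (not part of the deleted file): the poll, detect, deploy loop above can be outlined as follows. How the operator actually reads the Git head and triggers `docker stack deploy` is not specified in the document, so both are injected as callables; `sync_once`, `run`, `get_head` and `deploy` are hypothetical names. The 30-second default matches the polling interval in the flow.

```python
import time
from typing import Callable, Optional


def sync_once(get_head: Callable[[], str],
              deploy: Callable[[str], None],
              last_applied: Optional[str]) -> str:
    """Deploy if the repository head differs from the last applied revision.

    Returns the revision that is applied after this call.
    """
    head = get_head()
    if head != last_applied:
        deploy(head)
    return head


def run(get_head, deploy, interval_seconds=30.0, max_cycles=None, sleep=time.sleep):
    """Poll forever (or for max_cycles iterations), sleeping between polls."""
    last_applied = None  # nothing applied yet, so the first poll always deploys
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        last_applied = sync_once(get_head, deploy, last_applied)
        cycles += 1
        sleep(interval_seconds)
    return last_applied
```

In this sketch an exception from `deploy` propagates and stops the loop; a real operator would more likely log the failure and retry on the next poll.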
-
-**Swarm orchestration:**
-```
-Swarm Manager
-↓ (schedule tasks on workers)
-Swarm Scheduler
-↓ (pull image from Harbor)
-Worker Nodes ↔ Harbor (VLAN 10)
-↓ (start containers)
-Application Running
-```
-
-**Service update (rolling):**
-```
-Swarm Manager
-↓ (stop 1 task of N)
-Worker Node A
-↓ (start new task with the new image)
-Worker Node B
-↓ (verify health check)
-Health Check (5 consecutive passes required)
-↓ (proceed to next task)
-Repeat until all tasks updated
-```
-
-### 4.3 AI Interaction Flow
-
-**User query:**
-```
-User (via Web UI or API)
-↓ (HTTPS request)
-Ollama Server (VLAN 30)
-↓ (request context via MCP protocol)
-MCP Server (VLAN 30)
-```
-
-**MCP gathers context (parallel):**
-```
-MCP Server
-├→ Gitea MCP Connector → Gitea API (docs, code)
-├→ Swarm MCP Connector → Docker API (logs, metrics)
-├→ Database MCP Connector → PostgreSQL (metadata)
-├→ Prometheus MCP Connector → Prometheus API (metrics)
-└→ Loki MCP Connector → Loki API (logs)
-↓ (all responses aggregated)
-MCP Server
-↓ (full context sent to AI)
-Ollama Server
-↓ (generate response)
-User
-```
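Editor's sketch (not part of the deleted file): the parallel fan-out to connectors and the aggregation step can be illustrated with `asyncio.gather`. The connector interface (an async callable that takes the query) and the name `gather_context` are assumptions; the document does not describe the MCP Server's real API.

```python
import asyncio


async def gather_context(connectors, query, timeout=5.0):
    """Query all connectors concurrently; collect results and errors per source."""

    async def call(name, fetch):
        try:
            return name, await asyncio.wait_for(fetch(query), timeout), None
        except Exception as exc:  # one failing source must not sink the whole request
            return name, None, repr(exc)

    results = await asyncio.gather(*(call(n, f) for n, f in connectors.items()))
    context = {name: data for name, data, err in results if err is None}
    errors = {name: err for name, data, err in results if err is not None}
    return context, errors
```

Catching failures per connector means a slow or unavailable source such as Loki only removes its own part of the context instead of failing the whole user query.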
-
-**AI response with action:**
-```
-AI determines action needed
-↓ (if requires change)
-AI suggests change to user
-↓ (user approves)
-Change committed to Git
-↓ (normal GitOps flow)
-Applied to infrastructure
-```
-
-### 4.4 Monitoring Data Flow
-
-**Metrics collection:**
-```
-All Infrastructure Components
-↓ (expose metrics endpoints)
-Prometheus Exporters
-↓ (scrape every 15 seconds)
-Prometheus (VLAN 40)
-↓ (evaluate alert rules)
-AlertManager (VLAN 40)
-↓ (route notifications)
-Slack/Email/PagerDuty
-```
-
-**Logs collection:**
-```
-All Containers
-↓ (stdout/stderr)
-Docker logging driver
-↓ (forward)
-Promtail Agent (on every node)
-↓ (push)
-Loki (VLAN 40)
-↓ (index and store)
-Loki Storage
-↓ (query)
-Grafana or CLI
-```
-
-**Audit logs:**
-```
-All Infrastructure Actions
-├→ Gitea (Git operations)
-├→ Docker Swarm (API calls)
-├→ Harbor (image push/pull)
-├→ Jenkins (builds)
-└→ SSH sessions (bastion)
-↓ (forward)
-Centralized Syslog
-↓ (store)
-Long-term Audit Storage (7 years)
-```
-
----
-
-## 5. High Availability and Scaling
-
-### 5.1 HA Strategy
-
-**Tier 1 - Critical (99.99% uptime):**
-- Docker Swarm (application platform)
-- Harbor (cannot deploy without it)
-- Shared Storage (persistent data)
-- Strategy: Active-Active where possible, N+1 redundancy
-
-**Tier 2 - Important (99.9% uptime):**
-- Gitea (code access)
-- GitOps Operator (CD automation)
-- Databases (infrastructure metadata)
-- Strategy: Active-Passive with automatic failover
-
-**Tier 3 - Nice to have (99% uptime):**
-- Jenkins (can wait for restore)
-- Portainer (CLI alternative exists)
-- Monitoring (short downtime acceptable)
-- Strategy: Warm standby, manual failover
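Editor's sketch (not part of the deleted file): the uptime targets of the three tiers translate into yearly downtime budgets with simple arithmetic. The function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_budget_minutes(availability_percent: float) -> float:
    """Allowed downtime per year, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)
```

Under that assumption, 99.99% allows about 52.6 minutes of downtime per year, 99.9% about 8.76 hours, and 99% about 3.65 days.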
-
-### 5.2 Scaling Points
-
-**Vertical Scaling (adding resources):**
-- Databases: More RAM for cache
-- Ollama: Adding GPUs for speed
-- Harbor storage: More disk for images
-- Limit: Hardware limitations
-
-**Horizontal Scaling (adding instances):**
-- Swarm Workers: Add nodes for capacity
-- Jenkins Agents: Dynamic scaling on demand
-- Prometheus: Federation for distributed scraping
-- MCP Connectors: Independent instances per source
-
-**Data Scaling:**
-- PostgreSQL: Read replicas for read-heavy workloads
-- Harbor: Geo-replication for distributed teams
-- Loki: Sharding by time
-- Git: Repository sharding (rarely needed)
-
-### 5.3 Capacity Planning
-
-**Metrics to track:**
-- CPU utilization (target <70% average)
-- Memory utilization (target <80%)
-- Disk usage (alert at 80%, critical at 90%)
-- Network bandwidth (baseline + trend analysis)
-- IOPS (SSD wear, performance degradation)
-
-**Growth projections:**
-- Applications: 20% growth per year
-- Code repositories: 30% growth per year (cumulative)
-- Logs: 50% growth per year (more verbose logging)
-- Metrics retention: Linear in the number of services
-
-**Scaling triggers:**
-- Add a Swarm worker when CPU >80% sustained
-- Upgrade the database when query latency >100ms p95
-- Expand storage when >75% used
-- Add Jenkins agents when queue >5 builds
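Editor's sketch (not part of the deleted file): the four scaling triggers can be expressed as plain threshold checks over a metrics snapshot. The `Snapshot` fields and `scaling_actions` are hypothetical names; only the thresholds come from the list above, and how "sustained" CPU is measured is left to the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    cpu_percent_sustained: float   # caller decides the averaging window
    db_latency_p95_ms: float
    storage_used_percent: float
    jenkins_queue_length: int


def scaling_actions(s: Snapshot) -> List[str]:
    """Return the scaling actions whose trigger thresholds are exceeded."""
    actions = []
    if s.cpu_percent_sustained > 80:
        actions.append("add Swarm worker")
    if s.db_latency_p95_ms > 100:
        actions.append("upgrade database")
    if s.storage_used_percent > 75:
        actions.append("expand storage")
    if s.jenkins_queue_length > 5:
        actions.append("add Jenkins agents")
    return actions
```

A snapshot with 85% sustained CPU and 78% storage use, and everything else under its threshold, yields `["add Swarm worker", "expand storage"]`.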
-
----
-
-## 6. Disaster Recovery
-
-### 6.1 RTO and RPO Targets
-
-**Recovery Time Objective (RTO):**
-- Tier 1 services: 1 hour
-- Tier 2 services: 4 hours
-- Tier 3 services: 24 hours
-- Full infrastructure: 8 hours
-
-**Recovery Point Objective (RPO):**
-- Databases: 15 minutes (via WAL shipping)
-- Git repositories: 1 hour (hourly backup)
-- Docker images: 0 (replicated to DR)
-- Configs: 0 (in Git)
-- Logs: 1 hour (buffered before ingestion)
-
-### 6.2 DR Scenarios
-
-**Scenario 1: Single server failure**
-- Detection: Automated monitoring
-- Response: Automatic failover to redundant instance
-- Recovery time: <5 minutes
-- Data loss: None (active-active or sync replication)
-
-**Scenario 2: Network partition**
-- Detection: Raft consensus loss, monitoring alerts
-- Response: Manual investigation, possible split-brain resolution
-- Recovery time: 30 minutes
-- Data loss: Possible if write to minority partition
-
-**Scenario 3: Data center failure**
-- Detection: Total loss of connectivity
-- Response: Failover to DR site
-- Recovery time: 4 hours (RTO)
-- Data loss: Up to 15 minutes (RPO)
-
-**Scenario 4: Ransomware/Corruption**
-- Detection: File integrity monitoring, unusual encryption activity
-- Response: Isolate affected systems, restore from clean backup
-- Recovery time: 8 hours (full rebuild)
-- Data loss: Up to last clean backup (potentially hours)
-
-**Scenario 5: Human error (accidental delete)**
-- Detection: Git history, audit logs
-- Response: Restore from backup or Git revert
-- Recovery time: 1-2 hours
-- Data loss: None (everything in version control)
-
-### 6.3 Recovery Procedures
-
-**Database Recovery:**
-- Stop application access
-- Restore base backup
-- Apply WAL logs up to the target point in time
-- Verify data integrity
-- Resume application access
-
-**Git Repository Recovery:**
-- Clone from the DR site or restore from backup
-- Verify commit history integrity
-- Restore hooks and configurations
-- Test push/pull operations
-- Notify team of recovery
-
-**Docker Swarm Recovery:**
-- Deploy manager nodes from backup configs
-- Join worker nodes
-- Restore network and volume configs
-- Deploy stacks from Git
-- Verify service health
-
-**Full Site Recovery:**
-- Deploy infrastructure from Terraform/IaC
-- Restore databases from backup
-- Clone Git repositories
-- Deploy Docker Swarm
-- Apply all stacks from GitOps
-- Verify end-to-end functionality
-- Switch DNS to DR site
-- Notify stakeholders
-
-### 6.4 Testing DR
-
-**Monthly:**
-- Restore test on separate infrastructure
-- Verify backup integrity
-- Test recovery procedures
-
-**Quarterly:**
-- Full DR drill with failover to the DR site
-- Measure actual RTO/RPO
-- Update procedures based on findings
-
-**Annually:**
-- Tabletop exercise with all stakeholders
-- Test business continuity plans
-- Update documentation and train staff on changes
-
----
-
-**Next documents:**
-- **03-security-compliance.md** - Detailed security requirements
-- **04-component-specifications.md** - Technical specifications of components
-- **05-development-environment.md** - Dev environment for testing
-
----
-
-**Approval:**
-- Enterprise Architect: _______________
-- Security Architect: _______________
-- Infrastructure Lead: _______________
-- Date: _______________
\ No newline at end of file