docs: add comprehensive development environment guide
docs/gitops-cicd/05-development-environment.md
# FinTech GitOps CI/CD - Development Environment

**Version:** 1.0
**Date:** January 2026
**Target audience:** DevOps Team, Infrastructure, Development Team

---

## Contents

1. [Purpose of the Dev Environment](#1-purpose-of-the-dev-environment)
2. [Dev Environment Architecture](#2-dev-environment-architecture)
3. [Technical Requirements](#3-technical-requirements)
4. [Deployment Plan](#4-deployment-plan)
5. [Testing and Validation](#5-testing-and-validation)
6. [Transition to Production](#6-transition-to-production)

---

## 1. Purpose of the Dev Environment

### 1.1 Why a Separate Dev Environment Is Needed

**Safety:**
- Test new components without risk to production
- Experiment with configurations
- Train the team in a safe environment
- Validate security policies before production

**Integration checks:**
- Test CI/CD pipelines
- Validate GitOps workflows
- Verify backup/restore procedures
- Test disaster recovery scenarios

**Development and debugging:**
- Develop applications in a production-like environment
- Debug problems without impact on production
- Performance testing and tuning
- Capacity planning and load testing

**Team training:**
- Hands-on training on real infrastructure
- Troubleshooting practice
- Learning new tools
- Onboarding new employees

### 1.2 Differences from Production

**Scale:**
- Fewer resources (~40% of production)
- Fewer replicas per service
- Shorter data retention periods
- Simplified HA (full redundancy is not required)

**Data:**
- Synthetic/mock data (NOT production data)
- Anonymized copies of production data where necessary
- Smaller dataset sizes
- Shorter retention

**Availability:**
- SLAs are not critical (downtime for maintenance is acceptable)
- Can be shut down outside working hours
- Scheduled maintenance windows need no approval

**Security:**
- Less strict access controls (more people have access)
- Simplified authentication (MFA can be skipped for the dev team)
- Relaxed network policies (for easier debugging)
- BUT: core security practices still apply

---

## 2. Dev Environment Architecture

### 2.1 Network Layout

**Separate VLAN from Production:**

```
Dev Environment VLAN: 10.100.0.0/16

Zones (subnets):
├── Management Zone: 10.100.10.0/24
│   ├── Gitea Dev: 10.100.10.10
│   ├── Jenkins Dev: 10.100.10.20
│   ├── Harbor Dev: 10.100.10.30
│   ├── GitOps Operator Dev: 10.100.10.40
│   └── Portainer Dev: 10.100.10.50
│
├── Swarm Cluster Zone: 10.100.20.0/24
│   ├── Manager: 10.100.20.1
│   └── Workers: 10.100.20.2-4 (3 workers)
│
├── AI Zone: 10.100.30.0/24
│   ├── Ollama Dev: 10.100.30.10
│   └── MCP Server Dev: 10.100.30.20
│
├── Monitoring Zone: 10.100.40.0/24
│   ├── Prometheus Dev: 10.100.40.10
│   ├── Grafana Dev: 10.100.40.20
│   └── Loki Dev: 10.100.40.30
│
└── Data Zone: 10.100.50.0/24
    ├── PostgreSQL: 10.100.50.10
    └── Storage: 10.100.50.20
```

**Access:**
- Access through the same VPN as production (with separate subnet routing)
- Or a dedicated Dev VPN (optional)
- A jump host is optional (direct access is acceptable for dev team convenience)
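
An address plan like the one in section 2.1 can be checked mechanically before any firewall rule is written. The following is a minimal sketch using Python's standard `ipaddress` module. The addresses are copied from the layout, while the `ZONES` dictionary and the `validate_layout` helper are illustrative names that are not part of the original design.

```python
import ipaddress

# Addresses copied from the network layout; the data structure itself is
# only an illustration and not part of the original design.
DEV_NETWORK = ipaddress.ip_network("10.100.0.0/16")
ZONES = {
    "management": ("10.100.10.0/24", ["10.100.10.10", "10.100.10.20",
                                      "10.100.10.30", "10.100.10.40",
                                      "10.100.10.50"]),
    "swarm": ("10.100.20.0/24", ["10.100.20.1", "10.100.20.2",
                                 "10.100.20.3", "10.100.20.4"]),
    "ai": ("10.100.30.0/24", ["10.100.30.10", "10.100.30.20"]),
    "monitoring": ("10.100.40.0/24", ["10.100.40.10", "10.100.40.20",
                                      "10.100.40.30"]),
    "data": ("10.100.50.0/24", ["10.100.50.10", "10.100.50.20"]),
}


def validate_layout(network, zones):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    subnets = {name: ipaddress.ip_network(cidr) for name, (cidr, _) in zones.items()}
    for name, subnet in subnets.items():
        # Every zone must sit inside the dev network.
        if not subnet.subnet_of(network):
            problems.append(f"{name}: {subnet} is outside {network}")
        # Every host must sit inside its own zone.
        for host in zones[name][1]:
            if ipaddress.ip_address(host) not in subnet:
                problems.append(f"{name}: host {host} is outside {subnet}")
    # No two zones may overlap.
    names = sorted(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


if __name__ == "__main__":
    issues = validate_layout(DEV_NETWORK, ZONES)
    print("layout OK" if not issues else "\n".join(issues))
```

Running the same check in CI whenever the address plan changes would catch a mistyped subnet before it reaches the firewall configuration.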

### 2.2 Simplified Architecture

**Single-manager Swarm (simplification):**
- 1 manager node instead of 3 (quorum is not needed in dev)
- 3 worker nodes (enough for testing HA behaviors)
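
The trade-off behind the single manager can be made explicit with majority arithmetic. Swarm managers coordinate through Raft consensus, which needs a majority of managers to be reachable. The sketch below only illustrates that arithmetic; the function names are illustrative and it does not talk to a cluster.

```python
def raft_quorum(managers: int) -> int:
    """Smallest majority of `managers` voting nodes."""
    if managers < 1:
        raise ValueError("a Swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_manager_failures(managers: int) -> int:
    """How many managers can be lost while a majority still remains."""
    return managers - raft_quorum(managers)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"managers={n} quorum={raft_quorum(n)} "
              f"tolerated failures={tolerated_manager_failures(n)}")
```

With one manager the cluster tolerates zero manager failures, which is the accepted risk in dev. Worker failures do not affect quorum, so the three workers remain useful for exercising service rescheduling.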

**No full redundancy:**
- Single instance of each infrastructure service
- No automated failover (manual recovery is acceptable)
- Simplified backup (daily instead of hourly)

**Shared infrastructure where possible:**
- One PostgreSQL server for all dev databases
- Shared storage (single NFS server)
- Combined monitoring (everything in one Grafana)

---

## 3. Technical Requirements

### 3.1 Server Infrastructure

**Option A: Separate VMs (recommended)**

| Component | Qty | vCPU per VM | RAM per VM | Storage per VM | Total Resources |
|-----------|-----|-------------|------------|----------------|-----------------|
| Gitea | 1 | 4 | 8 GB | 200 GB | 4 vCPU, 8 GB, 200 GB |
| Jenkins | 1 | 8 | 16 GB | 500 GB | 8 vCPU, 16 GB, 500 GB |
| Harbor | 1 | 4 | 8 GB | 2 TB | 4 vCPU, 8 GB, 2 TB |
| Swarm Manager | 1 | 4 | 8 GB | 100 GB | 4 vCPU, 8 GB, 100 GB |
| Swarm Workers | 3 | 8 | 16 GB | 200 GB | 24 vCPU, 48 GB, 600 GB |
| GitOps/Portainer | 1 | 2 | 4 GB | 50 GB | 2 vCPU, 4 GB, 50 GB |
| Ollama | 1 | 8 | 32 GB | 500 GB | 8 vCPU, 32 GB, 500 GB |
| MCP Server | 1 | 4 | 8 GB | 50 GB | 4 vCPU, 8 GB, 50 GB |
| Monitoring | 1 | 8 | 16 GB | 1 TB | 8 vCPU, 16 GB, 1 TB |
| PostgreSQL | 1 | 4 | 8 GB | 200 GB | 4 vCPU, 8 GB, 200 GB |
| Storage/Backup | 1 | 2 | 8 GB | 5 TB | 2 vCPU, 8 GB, 5 TB |
| **TOTAL** | **13 VMs** | **72 vCPU** | **164 GB** | **~10 TB** | - |

**Option B: Single powerful server (budget option)**

If the budget is limited, everything can be deployed on one powerful server:

| Component | Specification |
|-----------|--------------|
| **CPU** | 80 vCPU |
| **RAM** | 256 GB |
| **Disk 1** | 2 TB NVMe SSD (OS, apps, databases) |
| **Disk 2** | 10 TB HDD RAID 10 (storage, backups) |
| **Network** | 2x 10 Gbps (bonded) |

All components run as VMs on this single host (using KVM/Proxmox).

**Pros:** Lower cost, simpler management
**Cons:** Single point of failure (acceptable for dev), limited scale
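
The sizing arithmetic in section 3.1 can be re-derived from the per-VM figures, which also shows how much room the single-server option leaves. The sketch below copies the rows of the first table and the CPU and RAM capacity of the second. It assumes 1 TB = 1000 GB and no overcommit, and it ignores hypervisor overhead. `VmSpec`, `totals` and `headroom` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    name: str
    qty: int
    vcpu: int        # per VM
    ram_gb: int      # per VM
    storage_gb: int  # per VM


# Per-VM figures copied from the separate-VMs table (1 TB taken as 1000 GB).
OPTION_A = [
    VmSpec("Gitea", 1, 4, 8, 200),
    VmSpec("Jenkins", 1, 8, 16, 500),
    VmSpec("Harbor", 1, 4, 8, 2000),
    VmSpec("Swarm Manager", 1, 4, 8, 100),
    VmSpec("Swarm Workers", 3, 8, 16, 200),
    VmSpec("GitOps/Portainer", 1, 2, 4, 50),
    VmSpec("Ollama", 1, 8, 32, 500),
    VmSpec("MCP Server", 1, 4, 8, 50),
    VmSpec("Monitoring", 1, 8, 16, 1000),
    VmSpec("PostgreSQL", 1, 4, 8, 200),
    VmSpec("Storage/Backup", 1, 2, 8, 5000),
]

# Single-server capacity from the second table.
HOST_VCPU, HOST_RAM_GB = 80, 256


def totals(specs):
    """Sum quantity-weighted resources across all rows."""
    return {
        "vms": sum(s.qty for s in specs),
        "vcpu": sum(s.qty * s.vcpu for s in specs),
        "ram_gb": sum(s.qty * s.ram_gb for s in specs),
        "storage_gb": sum(s.qty * s.storage_gb for s in specs),
    }


def headroom(specs, host_vcpu=HOST_VCPU, host_ram_gb=HOST_RAM_GB):
    """Unallocated (vCPU, RAM in GB) if every VM runs on the single host."""
    t = totals(specs)
    return host_vcpu - t["vcpu"], host_ram_gb - t["ram_gb"]


if __name__ == "__main__":
    print(totals(OPTION_A))
    print(headroom(OPTION_A))
```

With the rows as listed, the sum comes to 13 VMs, 72 vCPU, 164 GB RAM and 10,200 GB of storage, which leaves 8 vCPU and 92 GB RAM unallocated on the 80 vCPU / 256 GB host.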

### 3.2 Network Infrastructure

**Minimum requirements:**
- 1 Gbps switch with VLAN support
- Firewall with routing between VLANs (can be virtual/software)
- VPN gateway (shared with production or dedicated)

**Recommended:**
- 10 Gbps switch for better performance
- Separate internet connection (so dev experiments do not affect production traffic)

### 3.3 Storage Infrastructure

**Local storage:**
- Fast SSD for OS and applications
- HDD for Harbor images and backups

**Shared storage:**
- A simple NFS server is sufficient (GlusterFS replication is not needed in dev)
- 5 TB capacity

---

## 4. Deployment Plan

### 4.1 Phase 1: Base Infrastructure (Week 1)

**Day 1-2: Network Setup**
- Configure VLANs
- Set up firewall rules
- Configure VPN access
- Create DNS entries for dev services

**Day 3-4: Server Provisioning**
- Deploy VMs or prepare physical servers
- Install OS (Ubuntu 22.04 LTS)
- Basic hardening
- Network configuration

**Day 5: Base Services**
- PostgreSQL installation and setup
- NFS storage setup
- Monitoring agents deployment

### 4.2 Phase 2: Core Services (Week 2)

**Day 1-2: Source Control**
- Deploy Gitea
- Configure PostgreSQL database
- Set up LDAP integration (if used)
- Create initial repository structure
- Import existing docs, if any

**Day 3-4: CI/CD Foundation**
- Deploy Jenkins
- Install essential plugins
- Configure Gitea webhook integration
- Set up first sample pipeline
- Test build process

**Day 5: Container Registry**
- Deploy Harbor
- Configure storage backend
- Enable vulnerability scanning
- Set up replication (if there is a secondary Harbor)
- Test image push/pull

### 4.3 Phase 3: Orchestration (Week 3)

**Day 1-2: Docker Swarm Setup**
- Initialize Swarm on the manager node
- Join worker nodes
- Configure overlay networks
- Set up secrets management
- Deploy test stack
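
The Swarm setup steps map onto a handful of `docker` CLI invocations. The sketch below only builds the command lines in order and does not execute anything. The manager address comes from the network layout and 2377 is the default Swarm management port. The network, secret, stack and file names are placeholders chosen for illustration.

```python
import shlex

MANAGER_IP = "10.100.20.1"  # Manager address from the network layout
SWARM_PORT = 2377           # Default Swarm cluster-management port


def swarm_bootstrap_commands(join_token, network="dev-overlay",
                             secret_name="db_password",
                             secret_file="db_password.txt",
                             stack_file="test-stack.yml", stack_name="test"):
    """Return the docker CLI invocations for the Swarm setup steps, in order.

    All keyword defaults are placeholder names, not values from this guide.
    """
    return [
        # On the manager node: create the cluster.
        ["docker", "swarm", "init", "--advertise-addr", MANAGER_IP],
        # On each worker: join using the token printed by
        # `docker swarm join-token worker` on the manager.
        ["docker", "swarm", "join", "--token", join_token,
         f"{MANAGER_IP}:{SWARM_PORT}"],
        # Back on the manager: overlay network, a secret, and a test stack.
        ["docker", "network", "create", "--driver", "overlay",
         "--attachable", network],
        ["docker", "secret", "create", secret_name, secret_file],
        ["docker", "stack", "deploy", "--compose-file", stack_file, stack_name],
    ]


if __name__ == "__main__":
    for cmd in swarm_bootstrap_commands("<worker-join-token>"):
        print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review before running and to feed into `subprocess.run` later, once the placeholder names are replaced with real ones.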

**Day 3: GitOps Automation**
- Deploy GitOps Operator
- Configure Git polling
- Test automated deployment
- Verify rollback functionality
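
The polling, deployment and rollback behavior to be verified on Day 3 can be summarized as one reconcile cycle. The guide does not specify how the GitOps Operator is implemented, so the following is a hypothetical sketch of the general pattern. `reconcile_once` and its callable parameters are illustrative names, not the operator's API.

```python
from typing import Callable, Optional


def reconcile_once(
    get_remote_head: Callable[[], str],
    deploy: Callable[[str], bool],
    last_good: Optional[str],
) -> Optional[str]:
    """One polling cycle: deploy the remote head if it changed.

    Returns the revision to treat as deployed afterwards. If deploying a
    new revision fails, the previous good revision is re-deployed
    (rollback) and returned.
    """
    head = get_remote_head()
    if head == last_good:
        return last_good      # Nothing new in Git.
    if deploy(head):
        return head           # New revision is now live.
    if last_good is not None:
        deploy(last_good)     # Roll back to the last known-good revision.
    return last_good
```

In a real setup `get_remote_head` might wrap `git ls-remote` and `deploy` might wrap `docker stack deploy`. Passing them in as callables keeps the cycle testable without a cluster.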

**Day 4: Management UI**
- Deploy Portainer
- Connect to Swarm
- Configure RBAC
- Create user accounts
- Deploy through the UI (test)

**Day 5: Integration Testing**
- End-to-end CI/CD test
- Git commit → build → push → deploy
- Verify monitoring
- Test rollback

### 4.4 Phase 4: AI Infrastructure (Week 4)

**Day 1-2: AI Server**
- Deploy Ollama server
- Download AI models (Llama 3, Qwen, etc.)
- Test inference
- Performance tuning

**Day 3-4: MCP Server**
- Deploy MCP Server
- Configure connectors (Gitea, Swarm, DB)
- Test data access
- Integration with Ollama

**Day 5: AI Integration Testing**
- End-to-end AI workflow test
- Query documentation through AI
- Analyze logs through AI
- Generate code examples

### 4.5 Phase 5: Monitoring & Documentation (Week 5)

**Day 1-2: Monitoring Stack**
- Deploy Prometheus
- Deploy Grafana
- Deploy Loki
- Configure dashboards
- Set up alerting rules

**Day 3-4: Documentation**
- Create detailed runbooks
- Document all procedures
- Record configuration details
- Create architecture diagrams
- Write troubleshooting guides

**Day 5: Team Training**
- Walkthrough of all components
- Hands-on exercises
- Q&A session
- Access provisioning

---

## 5. Testing and Validation

### 5.1 Functional Testing

**Git Operations:**
- Clone repositories
- Push commits
- Create Pull Requests
- Merge workflows
- Webhook triggers

**CI Pipeline:**
- Build applications (multiple languages)
- Run tests (unit, integration)
- Security scanning
- Docker image creation
- Push to Harbor

**CD Process:**
- Automated deployment
- Manual deployment through Portainer
- Service scaling
- Rolling updates
- Rollback operations

**Monitoring:**
- Metrics collection
- Log aggregation
- Alert triggering
- Dashboard visualization

**AI Capabilities:**
- Query documentation
- Analyze logs
- Code generation
- Troubleshooting assistance

### 5.2 Performance Testing

**Load Testing:**
- Multiple concurrent builds in Jenkins
- High-frequency deployments
- Large image pushes to Harbor
- Monitoring system under load

**Capacity Planning:**
- Resource utilization measurement
- Identify bottlenecks
- Determine scaling needs for production

### 5.3 Security Testing

**Vulnerability Scanning:**
- Container images
- Infrastructure components
- Dependencies

**Penetration Testing:**
- Network security
- Access controls
- Authentication mechanisms

**Compliance Validation:**
- Audit logging working
- Data encryption verified
- Access controls enforced

### 5.4 Disaster Recovery Testing

**Backup/Restore:**
- Database backup and restore
- Git repository backup and restore
- Configuration backup
- Full system restore
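
A restore test is only meaningful if the restored data is compared against the source. For file-level backups such as configuration directories, one possible check is a per-file SHA-256 comparison, sketched below with illustrative function names. It is not suited to database restores, where the restored files are not expected to be byte-identical and query-level checks are more appropriate.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path relative to `root` to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def restore_differences(source: Path, restored: Path) -> list:
    """Return relative paths that are missing, extra, or differ after a restore."""
    before, after = tree_digests(source), tree_digests(restored)
    return sorted(
        key for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )
```

An empty list from `restore_differences` means every file came back intact. Any other result lists exactly which paths to investigate.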

**Failover Scenarios:**
- Service failures
- Node failures
- Network partitions
- Data corruption

---

## 6. Transition to Production

### 6.1 Lessons Learned from Dev

**Document:**
- All problems encountered
- Solutions and workarounds
- Performance bottlenecks
- Configuration optimizations
- Team feedback

**Updates for Production:**
- Refined architecture
- Optimized configurations
- Improved procedures
- Better sizing estimates
- Updated documentation

### 6.2 Production Readiness Checklist

**Infrastructure:**
- [ ] All servers provisioned according to specs
- [ ] Network configured with proper segmentation
- [ ] Firewall rules implemented and tested
- [ ] VPN access configured
- [ ] Monitoring fully deployed

**Services:**
- [ ] All components deployed
- [ ] High availability configured
- [ ] Backup systems operational
- [ ] Disaster recovery tested
- [ ] Security hardening completed

**Processes:**
- [ ] CI/CD pipelines validated
- [ ] GitOps workflows tested
- [ ] Incident response procedures documented
- [ ] Escalation paths defined
- [ ] On-call rotation established

**Security:**
- [ ] Vulnerability scans completed
- [ ] Penetration testing passed
- [ ] Compliance requirements met
- [ ] Audit logging verified
- [ ] Access controls implemented

**Documentation:**
- [ ] Architecture documented
- [ ] Runbooks created
- [ ] Troubleshooting guides written
- [ ] Contact lists updated
- [ ] Training materials prepared

**Team:**
- [ ] Training completed
- [ ] Access provisioned
- [ ] Roles and responsibilities defined
- [ ] Communication channels established
- [ ] Support procedures understood
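
Because the checklist uses Markdown task-list syntax, its completion state can be read by a script, for example as a gate before sign-off. This is a minimal sketch with illustrative function names. It assumes items are written as `- [ ]` or `- [x]` at the start of a line.

```python
import re

# Matches task-list items such as "- [ ] text" or "- [x] text".
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def open_items(markdown: str) -> list:
    """Return the text of every unchecked task-list item."""
    items = []
    for line in markdown.splitlines():
        match = TASK_RE.match(line)
        if match and match.group(1) == " ":
            items.append(match.group(2).strip())
    return items


def ready_for_production(markdown: str) -> bool:
    """True only when at least one task exists and none are unchecked."""
    has_tasks = any(TASK_RE.match(line) for line in markdown.splitlines())
    return has_tasks and not open_items(markdown)
```

Called on the text of this file, `open_items` lists what still blocks the transition, and `ready_for_production` returns `True` once every box is ticked.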

### 6.3 Migration Strategy

**Phased Approach:**

**Phase 1: Pilot (1-2 weeks)**
- Migrate 1-2 non-critical applications
- Test the full workflow in production
- Gather feedback
- Refine processes

**Phase 2: Gradual Migration (1-2 months)**
- Migrate applications in batches
- 3-5 applications per week
- Monitor closely
- Address issues quickly

**Phase 3: Full Production (ongoing)**
- All new applications use GitOps
- Legacy applications migrated over time
- Continuous improvement
- Regular reviews

**Rollback Plan:**
- Keep the legacy deployment process operational in parallel
- Document rollback procedures
- Test rollback scenarios
- Define clear decision criteria for rollback

---

**Success Criteria:**

The dev environment is considered successful when:
1. All components are deployed and operational
2. The end-to-end CI/CD workflow works
3. The team is trained and comfortable with the tools
4. Documentation is complete and accurate
5. The production deployment plan is validated

---

**Sign-off:**
- DevOps Lead: _______________
- Development Lead: _______________
- Infrastructure Lead: _______________
- Date: _______________