FinTech GitOps CI/CD - Development Environment

Version: 1.0
Date: January 2026
Target audience: DevOps Team, Infrastructure, Development Team

1. Purpose of the Dev Environment

1.1 Why a Separate Dev Environment Is Needed

Safety:

Testing new components without risk to production
Experimenting with configurations
Training the team in a safe environment
Validating security policies before production

Integration checks:

Testing CI/CD pipelines
Validating GitOps workflows
Verifying backup/restore procedures
Testing disaster recovery scenarios

Development and debugging:

Developing applications in a production-like environment
Debugging issues without impact on production
Performance testing and tuning
Capacity planning and load testing

Team training:

Hands-on training on real infrastructure
Troubleshooting practice
Learning new tools
Onboarding new employees

1.2 Differences from Production

Scale:

Fewer resources (~40% of production)
Fewer replicas per service
Shorter data retention periods
Simplified HA (full redundancy is not required)

Data:

Synthetic/mock data (NOT production data)
Anonymized copies of production data where necessary
Smaller dataset sizes
Shorter retention

Availability:

SLAs are not critical (downtime for maintenance is acceptable)
Can be shut down outside working hours
Maintenance windows can be scheduled without prior approval

Security:

Less strict access controls (more people have access)
Simplified authentication (MFA may be skipped for the dev team)
Relaxed network policies (for easier debugging)
BUT: core security practices still apply

2. Dev Environment Architecture

2.1 Network Layout

Separate VLAN from production:

Dev Environment VLAN: 10.100.0.0/16

Zones (subnets):
├── Management Zone: 10.100.10.0/24
│   ├── Gitea Dev: 10.100.10.10
│   ├── Jenkins Dev: 10.100.10.20
│   ├── Harbor Dev: 10.100.10.30
│   ├── GitOps Operator Dev: 10.100.10.40
│   └── Portainer Dev: 10.100.10.50
│
├── Swarm Cluster Zone: 10.100.20.0/24
│   ├── Manager: 10.100.20.1
│   └── Workers: 10.100.20.2-4 (3 workers)
│
├── AI Zone: 10.100.30.0/24
│   ├── Ollama Dev: 10.100.30.10
│   └── MCP Server Dev: 10.100.30.20
│
├── Monitoring Zone: 10.100.40.0/24
│   ├── Prometheus Dev: 10.100.40.10
│   ├── Grafana Dev: 10.100.40.20
│   └── Loki Dev: 10.100.40.30
│
└── Data Zone: 10.100.50.0/24
    ├── PostgreSQL: 10.100.50.10
    └── Storage: 10.100.50.20
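An address plan like the one above can be checked mechanically before any VLAN is configured. The following is a minimal sketch using Python's standard `ipaddress` module; the addresses are copied from the layout, while the dictionary structure and the function name `validate_layout` are illustrative assumptions, not part of the project.

```python
import ipaddress

# Addresses copied from the layout above; the data structure is illustrative.
# The worker range 10.100.20.2-4 is expanded into three individual hosts.
DEV_VLAN = ipaddress.ip_network("10.100.0.0/16")
ZONES = {
    "management": ("10.100.10.0/24", ["10.100.10.10", "10.100.10.20", "10.100.10.30",
                                      "10.100.10.40", "10.100.10.50"]),
    "swarm": ("10.100.20.0/24", ["10.100.20.1", "10.100.20.2", "10.100.20.3", "10.100.20.4"]),
    "ai": ("10.100.30.0/24", ["10.100.30.10", "10.100.30.20"]),
    "monitoring": ("10.100.40.0/24", ["10.100.40.10", "10.100.40.20", "10.100.40.30"]),
    "data": ("10.100.50.0/24", ["10.100.50.10", "10.100.50.20"]),
}


def validate_layout(vlan, zones):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    subnets = {name: ipaddress.ip_network(cidr) for name, (cidr, _) in zones.items()}

    # Every zone must sit inside the dev address space.
    for name, net in subnets.items():
        if not net.subnet_of(vlan):
            problems.append(f"{name}: {net} is outside {vlan}")

    # No two zones may overlap.
    names = sorted(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")

    # Every host must belong to the subnet of its own zone.
    for name, (_, hosts) in zones.items():
        for host in hosts:
            if ipaddress.ip_address(host) not in subnets[name]:
                problems.append(f"{name}: host {host} is outside {subnets[name]}")
    return problems


print(validate_layout(DEV_VLAN, ZONES))
```

Running the check again whenever a host or zone is added keeps the documented plan and the firewall configuration from drifting apart.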

Access:

Access through the same VPN as production (but with separate subnet routing)
Or a dedicated Dev VPN (optional)
A jump host is optional (direct access is allowed for the dev team's convenience)

2.2 Simplified Architecture

Single-manager Swarm (simplification):

1 manager node instead of 3 (dev does not need a fault-tolerant manager quorum)
3 worker nodes (enough for testing HA behaviors)
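Swarm managers keep cluster state with Raft, so the trade-off behind a single manager can be made concrete with quorum arithmetic. This is a minimal sketch; the function names are illustrative.

```python
def raft_quorum(managers: int) -> int:
    """Smallest number of managers that forms a majority."""
    if managers < 1:
        raise ValueError("a Swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_manager_failures(managers: int) -> int:
    """How many managers can fail while a majority is still reachable."""
    return managers - raft_quorum(managers)


for count in (1, 3, 5):
    print(count, raft_quorum(count), tolerated_manager_failures(count))
```

With one manager the tolerated failure count is zero: losing that node stops cluster management until it is restored, which the dev environment accepts in exchange for two fewer VMs. With three managers, one may fail.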

No full redundancy:

A single instance of each infrastructure service
No automated failover (manual recovery is acceptable)
Simplified backup (daily instead of hourly)

Shared infrastructure where possible:

One PostgreSQL server for all dev databases
Shared storage (single NFS server)
Combined monitoring (everything in one Grafana)

3. Technical Requirements

3.1 Server Infrastructure

Option A: Separate VMs (recommended)

Component	Qty	vCPU (per VM)	RAM (per VM)	Storage (per VM)	Total Resources
Gitea	1	4	8 GB	200 GB	4 vCPU, 8 GB, 200 GB
Jenkins	1	8	16 GB	500 GB	8 vCPU, 16 GB, 500 GB
Harbor	1	4	8 GB	2 TB	4 vCPU, 8 GB, 2 TB
Swarm Manager	1	4	8 GB	100 GB	4 vCPU, 8 GB, 100 GB
Swarm Workers	3	8	16 GB	200 GB	24 vCPU, 48 GB, 600 GB
GitOps/Portainer	1	2	4 GB	50 GB	2 vCPU, 4 GB, 50 GB
Ollama	1	8	32 GB	500 GB	8 vCPU, 32 GB, 500 GB
MCP Server	1	4	8 GB	50 GB	4 vCPU, 8 GB, 50 GB
Monitoring	1	8	16 GB	1 TB	8 vCPU, 16 GB, 1 TB
PostgreSQL	1	4	8 GB	200 GB	4 vCPU, 8 GB, 200 GB
Storage/Backup	1	2	8 GB	5 TB	2 vCPU, 8 GB, 5 TB
TOTAL	13 VMs	72 vCPU	164 GB	~10 TB	-
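The totals row can be re-derived from the per-component rows, which is useful whenever a row changes. This is a minimal sketch; the tuple layout and the function name `totals` are illustrative, and 1 TB is taken as 1000 GB.

```python
# (component, quantity, vCPU per VM, RAM GB per VM, storage GB per VM),
# copied from the table above; 1 TB is treated as 1000 GB.
VMS = [
    ("Gitea", 1, 4, 8, 200),
    ("Jenkins", 1, 8, 16, 500),
    ("Harbor", 1, 4, 8, 2000),
    ("Swarm Manager", 1, 4, 8, 100),
    ("Swarm Workers", 3, 8, 16, 200),
    ("GitOps/Portainer", 1, 2, 4, 50),
    ("Ollama", 1, 8, 32, 500),
    ("MCP Server", 1, 4, 8, 50),
    ("Monitoring", 1, 8, 16, 1000),
    ("PostgreSQL", 1, 4, 8, 200),
    ("Storage/Backup", 1, 2, 8, 5000),
]


def totals(rows):
    """Sum quantities and per-VM resources multiplied by quantity."""
    return {
        "vms": sum(qty for _, qty, _, _, _ in rows),
        "vcpu": sum(qty * cpu for _, qty, cpu, _, _ in rows),
        "ram_gb": sum(qty * ram for _, qty, _, ram, _ in rows),
        "storage_gb": sum(qty * disk for _, qty, _, _, disk in rows),
    }


print(totals(VMS))
```

With the rows as listed, this yields 13 VMs, 72 vCPU, 164 GB of RAM, and 10,200 GB (about 10 TB) of storage.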

Option B: Single powerful server (budget option)

If the budget is limited, everything can be deployed on a single powerful server:

Component	Specification
CPU	80 vCPU
RAM	256 GB
Disk 1	2 TB NVMe SSD (OS, apps, databases)
Disk 2	10 TB HDD RAID 10 (storage, backups)
Network	2x 10 Gbps (bonded)

All components run as VMs on this single host (using KVM/Proxmox).

Pros: Lower costs, simpler management
Cons: Single point of failure (acceptable for dev), limited scale

3.2 Network Infrastructure

Minimum requirements:

1 Gbps switch with VLAN support
Firewall with routing between VLANs (can be virtual/software)
VPN gateway (shared with production or dedicated)

Recommended:

10 Gbps switch for better performance
Separate internet connection (so that dev experiments do not affect production traffic)

3.3 Storage Infrastructure

Local storage:

Fast SSD for the OS and applications
HDD for Harbor images and backups

Shared storage:

A simple NFS server is sufficient (GlusterFS replication is not needed in dev)
5 TB capacity

4. Deployment Plan

4.1 Phase 1: Base Infrastructure (Week 1)

Day 1-2: Network Setup

Configure VLANs
Set up firewall rules
Configure VPN access
Create DNS entries for dev services

Day 3-4: Server Provisioning

Deploy VMs or prepare physical servers
Install OS (Ubuntu 22.04 LTS)
Basic hardening
Network configuration

Day 5: Base Services

PostgreSQL installation and setup
NFS storage setup
Monitoring agents deployment

4.2 Phase 2: Core Services (Week 2)

Day 1-2: Source Control

Deploy Gitea
Configure PostgreSQL database
Set up LDAP integration (if used)
Create the initial repository structure
Import existing docs, if any

Day 3-4: CI/CD Foundation

Deploy Jenkins
Install essential plugins
Configure Gitea webhook integration
Set up a first sample pipeline
Test build process

Day 5: Container Registry

Deploy Harbor
Configure storage backend
Enable vulnerability scanning
Set up replication (if a secondary Harbor exists)
Test image push/pull

4.3 Phase 3: Orchestration (Week 3)

Day 1-2: Docker Swarm Setup

Initialize Swarm on the manager node
Join worker nodes
Configure overlay networks
Set up secrets management
Deploy test stack
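The steps above map onto a short sequence of Docker CLI calls. The sketch below only builds that sequence as data and executes nothing, so it can be reviewed or fed to whatever remote-execution tool the team prefers. The manager address comes from section 2.1; the network, secret, stack, and file names are hypothetical placeholders, and `<WORKER_TOKEN>` stands for the value printed by `docker swarm join-token -q worker`.

```python
def swarm_bootstrap_plan(manager_ip, worker_count,
                         network="dev-overlay",          # hypothetical name
                         secret="db_password",           # hypothetical name
                         secret_file="db_password.txt",  # hypothetical file
                         stack="test-stack",             # hypothetical name
                         compose_file="test-stack.yml"): # hypothetical file
    """Return ordered (host, argv) pairs for bootstrapping a dev Swarm.

    Nothing is executed; the caller decides how to run each command.
    """
    plan = [
        # Initialize the Swarm on the manager node.
        ("manager", ["docker", "swarm", "init", "--advertise-addr", manager_ip]),
        # Print the token that workers need in order to join.
        ("manager", ["docker", "swarm", "join-token", "-q", "worker"]),
    ]
    # Join each worker through the manager's default Swarm port, 2377.
    for index in range(1, worker_count + 1):
        plan.append((f"worker-{index}",
                     ["docker", "swarm", "join", "--token", "<WORKER_TOKEN>",
                      f"{manager_ip}:2377"]))
    plan += [
        # Overlay network shared by services in the test stack.
        ("manager", ["docker", "network", "create", "--driver", "overlay",
                     "--attachable", network]),
        # Store a secret in the Swarm instead of in the compose file.
        ("manager", ["docker", "secret", "create", secret, secret_file]),
        # Deploy the test stack.
        ("manager", ["docker", "stack", "deploy", "-c", compose_file, stack]),
    ]
    return plan


for host, argv in swarm_bootstrap_plan("10.100.20.1", 3):
    print(f"[{host}] {' '.join(argv)}")
```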

Day 3: GitOps Automation

Deploy GitOps Operator
Configure Git polling
Test automated deployment
Verify rollback functionality
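The guide does not specify how the GitOps Operator works internally, so the following is only a sketch of the general polling pattern the steps above describe: compare the latest commit in Git with what is running, deploy on a difference, and fall back to the last known-good commit if the deployment fails. The function `reconcile` and its `deploy` callback are hypothetical.

```python
from typing import Callable, Optional


def reconcile(remote_head: str,
              deployed: Optional[str],
              deploy: Callable[[str], bool]) -> Optional[str]:
    """Run one polling cycle and return the commit that is running afterwards.

    remote_head: latest commit hash on the tracked branch.
    deployed:    commit hash currently running, or None if nothing is deployed.
    deploy:      callback that deploys a commit and returns True on success.
    """
    if remote_head == deployed:
        return deployed  # Git has not changed; nothing to do.
    if deploy(remote_head):
        return remote_head
    # Deployment failed: roll back by redeploying the last known-good commit.
    if deployed is not None:
        deploy(deployed)
    return deployed
```

A real operator would call this on a timer, obtain `remote_head` from the Git server, and implement `deploy` with something like `docker stack deploy`. Testing the rollback path then amounts to pushing a commit that is known to fail.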

Day 4: Management UI

Deploy Portainer
Connect to Swarm
Configure RBAC
Create user accounts
Deploy via the UI (test)

Day 5: Integration Testing

End-to-end CI/CD test
Git commit → build → push → deploy
Verify monitoring
Test rollback

4.4 Phase 4: AI Infrastructure (Week 4)

Day 1-2: AI Server

Deploy Ollama server
Download AI models (Llama 3, Qwen, etc.)
Test inference
Performance tuning

Day 3-4: MCP Server

Deploy MCP Server
Configure connectors (Gitea, Swarm, DB)
Test data access
Integration with Ollama

Day 5: AI Integration Testing

End-to-end AI workflow test
Query documentation via AI
Analyze logs via AI
Generate code examples

4.5 Phase 5: Monitoring & Documentation (Week 5)

Day 1-2: Monitoring Stack

Deploy Prometheus
Deploy Grafana
Deploy Loki
Configure dashboards
Set up alerting rules
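As one possible starting point for the Prometheus deployment, a static scrape configuration can be generated from the address plan in section 2.1. The sketch assumes that node_exporter runs on each listed host on its default port 9100, which this guide does not state; the job name and the `env` label are illustrative.

```python
import json

# A few hosts from section 2.1. Assumption: each runs node_exporter on port 9100.
DEV_HOSTS = ["10.100.10.10", "10.100.10.20", "10.100.10.30", "10.100.20.1",
             "10.100.40.10", "10.100.50.10"]


def scrape_config(job_name, hosts, port=9100, interval="30s"):
    """Build a minimal Prometheus configuration with one static scrape job."""
    return {
        "global": {"scrape_interval": interval},
        "scrape_configs": [
            {
                "job_name": job_name,
                "static_configs": [
                    {
                        "targets": [f"{ip}:{port}" for ip in hosts],
                        "labels": {"env": "dev"},
                    }
                ],
            }
        ],
    }


# Emitted as JSON, which YAML parsers generally accept; convert if preferred.
print(json.dumps(scrape_config("dev-nodes", DEV_HOSTS), indent=2))
```

Generating the target list from one source keeps the monitoring configuration aligned with the documented network layout.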

Day 3-4: Documentation

Create detailed runbooks
Document all procedures
Record configuration details
Create architecture diagrams
Write troubleshooting guides

Day 5: Team Training

Walkthrough of all components
Hands-on exercises
Q&A session
Access provisioning

5. Testing and Validation

5.1 Functional Testing

Git Operations:

Clone repositories
Push commits
Create Pull Requests
Merge workflows
Webhook triggers
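When webhook triggers are tested against a custom receiver rather than the Jenkins plugin, it helps to verify the payload signature. Gitea signs the request body with HMAC-SHA256 using the webhook secret and sends the hex digest in the `X-Gitea-Signature` header. The sketch below shows that check; the function names and the sample secret are illustrative.

```python
import hashlib
import hmac


def gitea_signature(secret: str, payload: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body, keyed with the webhook secret."""
    return hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def is_valid_webhook(secret: str, payload: bytes, header_signature: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    expected = gitea_signature(secret, payload)
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_signature.encode("utf-8"))


# Example with a made-up secret and body.
body = b'{"ref": "refs/heads/main"}'
signature = gitea_signature("dev-webhook-secret", body)
print(is_valid_webhook("dev-webhook-secret", body, signature))
```

The signature must be computed over the raw bytes of the body, before any JSON parsing or re-serialization.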

CI Pipeline:

Build applications (multiple languages)
Run tests (unit, integration)
Security scanning
Docker image creation
Push to Harbor

CD Process:

Automated deployment
Manual deployment via Portainer
Service scaling
Rolling updates
Rollback operations

Monitoring:

Metrics collection
Log aggregation
Alert triggering
Dashboard visualization

AI Capabilities:

Query documentation
Analyze logs
Code generation
Troubleshooting assistance

5.2 Performance Testing

Load Testing:

Multiple concurrent builds in Jenkins
High-frequency deployments
Large image pushes to Harbor
Monitoring system under load

Capacity Planning:

Resource utilization measurement
Identify bottlenecks
Determine scaling needs for production
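The capacity-planning steps reduce to simple arithmetic once peak usage has been measured. The sketch below flags resources whose utilization crosses a threshold and estimates the capacity needed to keep a given headroom. The 80% threshold, the 70% target utilization, and the sample measurements are assumptions for illustration, not values from this guide.

```python
import math


def find_bottlenecks(peak_usage, capacity, threshold=0.8):
    """Return {resource: utilization} for resources at or above the threshold."""
    flagged = {}
    for resource, used in peak_usage.items():
        utilization = used / capacity[resource]
        if utilization >= threshold:
            flagged[resource] = round(utilization, 2)
    return flagged


def required_capacity(peak, target_utilization=0.7):
    """Capacity needed so that the measured peak stays at the target utilization."""
    return math.ceil(peak / target_utilization)


# Hypothetical peak measurements against the Option A totals.
peak = {"vcpu": 65, "ram_gb": 80}
capacity = {"vcpu": 72, "ram_gb": 164}
print(find_bottlenecks(peak, capacity))
print(required_capacity(peak["vcpu"]))
```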

5.3 Security Testing

Vulnerability Scanning:

Container images
Infrastructure components
Dependencies

Penetration Testing:

Network security
Access controls
Authentication mechanisms

Compliance Validation:

Audit logging working
Data encryption verified
Access controls enforced

5.4 Disaster Recovery Testing

Backup/Restore:

Database backup and restore
Git repository backup and restore
Configuration backup
Full system restore
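A restore test is only meaningful if the restored data is compared with the original. For file-based backups (Git repositories, configuration), one simple approach is to compare SHA-256 digests of both directory trees. This is a minimal sketch with illustrative function names; it reads each file fully into memory, so very large files would need chunked hashing.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def restore_differences(original, restored):
    """Return relative paths that are missing, extra, or changed after a restore."""
    before = tree_digest(original)
    after = tree_digest(restored)
    return sorted(p for p in before.keys() | after.keys()
                  if before.get(p) != after.get(p))
```

An empty result from `restore_differences` means every file came back byte-for-byte identical. Database restores need a different check, such as comparing row counts or application-level queries.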

Failover Scenarios:

Service failures
Node failures
Network partitions
Data corruption

6. Transition to Production

6.1 Lessons Learned from Dev

To document:

All problems encountered
Solutions and workarounds
Performance bottlenecks
Configuration optimizations
Team feedback

Updates for Production:

Refined architecture
Optimized configurations
Improved procedures
Better sizing estimates
Updated documentation

6.2 Production Readiness Checklist

Infrastructure:

All servers provisioned according to specs
Network configured with proper segmentation
Firewall rules implemented and tested
VPN access configured
Monitoring fully deployed

Services:

All components deployed
High availability configured
Backup systems operational
Disaster recovery tested
Security hardening completed

Processes:

CI/CD pipelines validated
GitOps workflows tested
Incident response procedures documented
Escalation paths defined
On-call rotation established