ADR-005: Option C Hybrid Architecture
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-03-11 |
| Decision Makers | CA (lead), PO, MEE, IE — 4-agent consensus |
Context
xOps = CloudOps + FinOps + DevOps. BC1 runs its AI workloads on two ECS Fargate services (ADR-004). However, enterprise requirements expected over 2026-2030 call for capabilities that an ECS-only platform cannot provide:
- Local-first + hybrid-cloud: Docker + K3D locally, ECS + K3S in production
- IoT / Edge: K3S runs on ARM64 devices, Raspberry Pi, industrial gateways
- On-prem: Regulated industries (FSI, Energy) require on-premises compute
- Multi-cloud: Crossplane on K3S provisions Azure/GCP resources from a single control plane
- Air-gapped: K3S installs offline; ECS depends on connectivity to the AWS control plane
The DevOps domain (ArgoCD, Vault, Atlantis, Crossplane) has fundamentally different operational characteristics from AI services (Open WebUI, FastAPI+CrewAI). Mixing them on one compute platform creates coupling without benefit.
Existing IaC: 161 files at DevOps-Terraform/tf-k3s — 85% ready for DevOps GitOps platform.
Decision
Two parallel streams with independent failure domains:
| | Stream 1: ECS Fargate | Stream 2: K3S GitOps |
|---|---|---|
| Domain | CloudOps + FinOps (AI Services) | DevOps (GitOps Platform) |
| Services | Open WebUI, FastAPI+CrewAI | ArgoCD, Vault HA, Atlantis, Crossplane |
| Cost | $180/mo (BC1) | $0 on-prem / $120-190/mo cloud VMs |
| Agent | infrastructure-engineer | kubernetes-engineer |
| Local | docker-compose | K3D |
| Prod | ECS Graviton4 ARM64 | K3S 3-node HA |
| IaC | terraform-aws modules (M1-M4) | DevOps-Terraform/tf-k3s (161 files) |
Stream 2 is activated reactively, only when a trigger listed under Activation Triggers fires. BC1 starts with Stream 1 only.
Consequences
Gains
- Hybrid-cloud: On-prem, IoT, edge, multi-cloud — all addressed by K3S
- GitOps platform: ArgoCD + Atlantis = production-grade IaC review and deployment
- ADLC Principle IV compliance: Hybrid Deployment (LocalStack + K3D + AWS)
- Independent failure domains: ECS AI services unaffected by K3S operations
- Existing IaC: 161 files at tf-k3s, 85% ready — minimal new work
Losses
- Two compute planes: Additional monitoring and operational surface
- Complexity: Kubernetes knowledge required for Stream 2
- Cost: $120-190/mo for cloud K3S VMs (on-prem = $0)
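The cost impact of running both streams can be worked through with the figures from the Decision table. This is a minimal sketch that assumes the two stream costs are simply additive, with no shared or data-transfer charges; the constant and function names are illustrative and do not come from the existing IaC.

```python
# Monthly cost figures taken from the Decision table (USD).
STREAM1_ECS = 180            # Stream 1: ECS Fargate (BC1)
STREAM2_ONPREM = (0, 0)      # Stream 2: K3S on-prem (low, high)
STREAM2_CLOUD = (120, 190)   # Stream 2: K3S on cloud VMs (low, high)


def combined_monthly_cost(stream2_range):
    """Return the (low, high) monthly total with both streams running.

    Assumption: costs are additive; nothing is shared between streams.
    """
    low, high = stream2_range
    return STREAM1_ECS + low, STREAM1_ECS + high


print(combined_monthly_cost(STREAM2_ONPREM))  # (180, 180)
print(combined_monthly_cost(STREAM2_CLOUD))   # (300, 370)
```

Under that assumption, activating Stream 2 on cloud VMs raises the monthly total from $180 to between $300 and $370, while on-prem activation leaves it at $180.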
Alternatives Considered
| Option | Weighted Score | Verdict |
|---|---|---|
| A: ECS only | 83.3 | Covers BC1 CloudOps+FinOps. Cannot address on-prem/IoT/multi-cloud. |
| B: K3S only | 71.1 | Over-engineered for AI services. Missing managed ECS benefits. |
| C: Hybrid (winner) | 87.1 | ECS for AI (managed, simple) + K3S for DevOps (flexible, portable). Best of both. |
Well-Architected Scores
| Pillar | ECS Only | K3S Only | Hybrid |
|---|---|---|---|
| Operational Excellence | 92 | 65 | 90 |
| Security | 90 | 58 | 88 |
| Reliability | 88 | 62 | 90 |
| Performance | 85 | 60 | 92 |
| Cost Optimisation | 90 | 55 | 88 |
| Sustainability | 80 | 65 | 92 |
| Average | 88.5 | 60.3 | 89.8 |
| With 2026-2030 trend | — | — | 91.2 |
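This record does not state how the Average row is derived. As a cross-check, the sketch below computes a plain unweighted mean of the six pillar scores per option. The unweighted formula is an assumption, and the names are illustrative.

```python
# Pillar scores copied from the table above, in row order:
# Operational Excellence, Security, Reliability, Performance,
# Cost Optimisation, Sustainability.
PILLAR_SCORES = {
    "ECS Only": [92, 90, 88, 85, 90, 80],
    "K3S Only": [65, 58, 62, 60, 55, 65],
    "Hybrid": [90, 88, 90, 92, 88, 92],
}


def unweighted_means(scores):
    """Arithmetic mean per option, rounded to one decimal place.

    Assumption: every pillar carries equal weight.
    """
    return {
        option: round(sum(values) / len(values), 1)
        for option, values in scores.items()
    }


print(unweighted_means(PILLAR_SCORES))
# {'ECS Only': 87.5, 'K3S Only': 60.8, 'Hybrid': 90.0}
```

These equal-weight results are close to, but not identical with, the Average row (88.5, 60.3, 89.8), so the row may apply pillar weights that are not recorded here. The ranking of the three options is the same either way.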
Agent Scores
| Agent | Score | Key Rationale |
|---|---|---|
| PO | 76.25% | BC1 Stream 2 adds no immediate customer value; justified as BC2+ readiness |
| CA | 91.2% | Architecturally sound; independent failure domains; tf-k3s 85% ready |
| MEE | 93.0% | kubernetes-engineer agent + K3D/K3S commands already in ADLC framework |
| IE | 87.8% | 161-file IaC exists; K3S operational model well-understood |
| Consensus | 87.1% | Architecture agreement: 100% (all agents approve Option C) |
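The Consensus figure is consistent with an unweighted mean of the four agent scores. The sketch below reproduces it on that assumption. The record does not state the formula, and the names are illustrative.

```python
# Agent scores copied from the table above (percent).
AGENT_SCORES = {"PO": 76.25, "CA": 91.2, "MEE": 93.0, "IE": 87.8}


def consensus(scores):
    """Unweighted mean of agent scores, rounded to one decimal place.

    Assumption: each agent's score counts equally.
    """
    return round(sum(scores.values()) / len(scores), 1)


print(consensus(AGENT_SCORES))  # 87.1
```

The unrounded mean is 87.0625, which rounds to the 87.1% shown in the Consensus row and matches the weighted score listed for Option C under Alternatives Considered.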
Activation Triggers
| Trigger | Threshold | Action | Classification |
|---|---|---|---|
| IaC PRs | >5/week | Activate Atlantis on K3S | Service Addition |
| Team size | >3 engineers | ArgoCD + Atlantis for concurrent PR isolation | Service Addition |
| Second cloud | Azure/GCP mandate | Crossplane on K3S | Module Addition |
| On-prem/IoT | Regulatory mandate | K3S edge nodes | Architecture Change |
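The trigger table can be read as a small rule set. The following sketch evaluates it against a set of inputs. The `PlatformMetrics` type and its field names are hypothetical, and it assumes that any single fired trigger is enough to activate Stream 2, which this record does not spell out.

```python
from dataclasses import dataclass


@dataclass
class PlatformMetrics:
    """Hypothetical inputs; field names are illustrative, not from the ADR."""
    iac_prs_per_week: int
    team_size: int
    second_cloud_mandate: bool
    onprem_or_iot_mandate: bool


def fired_triggers(m: PlatformMetrics) -> list[tuple[str, str]]:
    """Return (action, classification) for every trigger whose threshold is met."""
    fired = []
    if m.iac_prs_per_week > 5:
        fired.append(("Activate Atlantis on K3S", "Service Addition"))
    if m.team_size > 3:
        fired.append(("ArgoCD + Atlantis for concurrent PR isolation", "Service Addition"))
    if m.second_cloud_mandate:
        fired.append(("Crossplane on K3S", "Module Addition"))
    if m.onprem_or_iot_mandate:
        fired.append(("K3S edge nodes", "Architecture Change"))
    return fired


def stream2_active(m: PlatformMetrics) -> bool:
    """Assumption: one fired trigger is sufficient to activate Stream 2."""
    return bool(fired_triggers(m))


# Example: a three-person team opening two IaC PRs per week stays on Stream 1.
bc1 = PlatformMetrics(iac_prs_per_week=2, team_size=3,
                      second_cloud_mandate=False, onprem_or_iot_mandate=False)
print(stream2_active(bc1))  # False
```

Both numeric thresholds are strict, so exactly 5 PRs per week or exactly 3 engineers does not fire a trigger.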
Cross-References
- ADR-004: 2 docker services (KISS/5S) — BC1 ECS baseline
- Evolution Architecture: Scaling classification + hybrid section
- Golden Paths: Stage 3B K3S Hybrid-Cloud path
- xops.jsx: HYBRID_ARCH constant, LAYERS[id=2].whyNot[] K3S entry
- Coordination: tmp/adlc-framework/coordination-logs/*-2026-03-11-docker-vs-k3s-v2.json