Enterprise GenAI Integration
xOps implements a 6-layer enterprise GenAI architecture. Each layer maps to specific ADLC agents, commands, skills, and hooks.
Architecture Overview
L1: Customer → CloudOps/FinOps/DevOps engineer
L2: Interaction → Open WebUI RAG chatbot + runbooks CLI
L3: Generative AI → CrewAI + LiteLLM + Claude API
L4: Backend Apps → IAM Identity Center + ADLC hooks + runbooks CLI
L5: Data Sources → Config Aggregator + Cost Explorer + Security Hub
L6: Infrastructure → Docker ($0) → ECS Fargate ($180/mo) → K3s hybrid
Layer 1 — Customer
Industry Pattern: User logs in, reviews options, completes task or escalates to live agent
xOps Pattern: A CloudOps/FinOps/DevOps engineer opens xOps, asks an operational question, reviews the AI analysis, then approves the remediation or escalates to a HITL (human-in-the-loop) manager
| Step | Action | System |
|---|---|---|
| Engineer opens xOps | SSO login via IAM Identity Center | Open WebUI |
| Asks operational question | Natural language: "Show me idle EC2 across all accounts" | Chat Interface |
| Reviews AI analysis | Multi-account dashboard with cost/security/inventory signals | Rich CLI + Tables |
| Requests HITL escalation | Complex remediation needs human approval (Principle I) | SNS + Slack |
| Completes task | Exports report, closes ticket, stores evidence in tmp/ | CSV/JSON/PDF |
ADLC Agent: product-owner — validates business value at the customer touchpoint
Commands: /product:pr-faq xops, /metrics:daily-standup
Layer 2 — Interaction
Industry Pattern: Chatbot initiates and guides conversation; agent escalation when AI cannot resolve
xOps Pattern: Open WebUI RAG chatbot + runbooks CLI handle 80% of queries; CrewAI crews escalate complex multi-step operations to HITL
| Step | Action | System |
|---|---|---|
| Chatbot activated | xOps RAG pipeline processes query against multi-account AWS knowledge base | Open WebUI + ChromaDB |
| Communicates options | Presents cost analysis, security findings, or inventory results with confidence scores | CrewAI Flows v2 |
| User selects action | Approve recommendation or request alternatives | Pipeline Engine |
| Chatbot responds | Executes approved action via runbooks CLI (READONLY) | runbooks PyPI |
| Pings HITL support | Escalation when action requires write access or compliance approval | SNS → Slack |
| HITL provides solution | Manager reviews evidence, approves terraform apply or remediation | ADLC Phase 3+ |
| Feedback to model | HITL correction feeds back to RAG knowledge base for continuous improvement | CrewAI Knowledge |
ADLC Agents: frontend-docs-engineer (Open WebUI), meta-engineering-expert (pipeline engine)
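The "Pings HITL support" step in the table above implies that some structured message travels from the chatbot to the reviewer. The following is a minimal sketch of what such a payload might look like. The `Escalation` class, its field names, the two reason strings, and the sample action are all hypothetical, chosen for illustration only; publishing the message to SNS or Slack is not shown.

```python
import json
from dataclasses import asdict, dataclass

# Reasons mirror the table: escalation happens when an action needs
# write access or compliance approval. The string values are assumptions.
VALID_REASONS = {"write_access", "compliance_approval"}


@dataclass(frozen=True)
class Escalation:
    """Hypothetical HITL escalation payload; field names are illustrative."""
    query: str            # the engineer's original question
    proposed_action: str  # what the chatbot would like to run
    reason: str           # one of VALID_REASONS
    evidence_path: str    # where supporting evidence was written


def build_escalation_message(esc: Escalation) -> str:
    """Serialise the escalation as JSON, ready to hand to a notifier."""
    if esc.reason not in VALID_REASONS:
        raise ValueError(f"unknown escalation reason: {esc.reason}")
    return json.dumps(asdict(esc), sort_keys=True)


message = build_escalation_message(
    Escalation(
        query="Show me idle EC2 across all accounts",
        proposed_action="stop idle instances",  # made-up example action
        reason="write_access",
        evidence_path="tmp/",
    )
)
```

Keeping the payload as plain JSON means the same message could be consumed by whichever notifier sits behind the escalation step.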
Layer 3 — Generative AI
Industry Pattern: Model receives request, checks policies, generates options, executes next steps
xOps Pattern: CrewAI orchestrates specialist crews (CloudOps/FinOps/DevSecOps) via LiteLLM gateway to Claude API with ADLC governance hooks
| Step | Action | System |
|---|---|---|
| Model receives request | CrewAI Flow receives user query + pulls AWS account context via MCP | CrewAI + LiteLLM |
| Checks compliance policy | APRA CPS 234 data sovereignty check: sensitive queries are routed to Ollama (local) rather than a cloud model | LiteLLM Router |
| Generates analysis | Multi-account cost analysis, security posture, inventory discovery | runbooks analyzers (119+) |
| Presents options | Ranked recommendations with cost impact, risk score, and compliance mapping | Claude API + Prompt Caching |
| Executes approved action | READONLY queries run autonomously; write operations require a HITL gate (Principle I) | ADLC Hooks |
ADLC Agents: meta-engineering-expert (CrewAI flows), python-engineer (analyzers), qa-engineer (accuracy validation), cloud-architect (AI architecture decisions)
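Two of the steps in the table above are simple decisions: where a request may be processed, and whether it may run without approval. A minimal sketch of both follows. It is plain Python, not the LiteLLM Router or the ADLC hook implementation; the `Request` type, the `sensitive` flag (assumed to be set by an upstream classifier that is not shown), and the backend identifiers are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    CLOUD = "claude-api"    # placeholder identifier, not a real model string
    LOCAL = "ollama-local"  # placeholder identifier


@dataclass(frozen=True)
class Request:
    text: str
    sensitive: bool  # assumed output of an upstream classifier (not shown)
    read_only: bool  # True for queries, False for anything that mutates state


def choose_backend(req: Request) -> Backend:
    """Sensitive requests stay on the local model; others may use the cloud API."""
    return Backend.LOCAL if req.sensitive else Backend.CLOUD


def may_run_autonomously(req: Request, hitl_approved: bool = False) -> bool:
    """Read-only requests run unattended; writes need explicit HITL approval."""
    return req.read_only or hitl_approved
```

For example, `may_run_autonomously(Request("stop instance", sensitive=False, read_only=False))` is `False` until a reviewer passes `hitl_approved=True`.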
Layer 4 — Backend Apps
Industry Pattern: Authentication, policy enforcement, booking workflows, agent assignment
xOps Pattern: IAM Identity Center SSO, ADLC constitutional governance (22 hooks), runbooks CLI (131 commands), MCP server orchestration (58 integrations)
| Step | Action | System |
|---|---|---|
| Authentication + authorisation | IAM Identity Center SCIM 2.0 → Open WebUI user sync + OIDC for API auth | M1 terraform-aws-iam-identity-center |
| Policy enforcement | ADLC 22 deterministic hooks: enforce-coordination, validate-bash, detect-nato-violation | .claude/hooks/scripts/ |
| Workflow management | 6-phase ADLC lifecycle: PLAN → BUILD → TEST → DEPLOY → MONITOR → OPERATE | ADLC Constitution v2.1.0 |
| Agent assignment | 15 constitutional agents in 3 tiers: 3 opus (decision) + 8 sonnet (execution) + 4 haiku (operations) | .claude/agents/ |
ADLC Agents: infrastructure-engineer (IaC lifecycle), security-compliance-engineer (APRA CPS 234)
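The lifecycle and tier figures in the table above can be written down as data, which makes the ordering and the agent count easy to check. This is a sketch only: the `Phase` enum, `next_phase`, and `AGENT_TIERS` are hypothetical names, and treating OPERATE as the terminal phase is an assumption (the source does not say whether the lifecycle loops back to PLAN).

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """The six ADLC phases, numbered in the order given in the table."""
    PLAN = 1
    BUILD = 2
    TEST = 3
    DEPLOY = 4
    MONITOR = 5
    OPERATE = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase after `current`, or None after OPERATE (assumed terminal)."""
    if current is Phase.OPERATE:
        return None
    return Phase(current.value + 1)


# Model tier -> (role, number of agents), as stated in the table.
AGENT_TIERS = {
    "opus": ("decision", 3),
    "sonnet": ("execution", 8),
    "haiku": ("operations", 4),
}
TOTAL_AGENTS = sum(count for _, count in AGENT_TIERS.values())
```

Summing the tier sizes gives the 15 agents the table states.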
Layer 5 — Data Sources
Industry Pattern: Customer ID, booking history, policy rules, agent directories
xOps Pattern: AWS account inventory (Config Aggregator), cost history (Cost Explorer FOCUS 1.2+), security findings (Security Hub), certificate inventory (ACM org-wide)
| Step | Action | System |
|---|---|---|
| Account inventory | Config Aggregator: org-wide resource discovery in seconds (P1 path) | AWS Config |
| Cost history | Cost Explorer: monthly/daily granularity, FOCUS 1.2+ normalisation, multi-account | AWS Cost Explorer |
| Security findings | Security Hub: aggregated CRITICAL/HIGH/MEDIUM across all accounts, SOC2 mapping | AWS Security Hub |
| Certificate inventory | ACM: risk-ranked expiry dashboard across all accounts | AWS ACM + IAM |
| Operational knowledge | ChromaDB RAG: runbooks docs, SOPs, incident history (xOps-S2 roadmap — S2-01) | CrewAI Knowledge |
ADLC Agents: cloud-architect (data architecture), observability-engineer (MELT telemetry instrumentation)
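The "risk-ranked expiry dashboard" row above amounts to sorting certificates by time remaining and bucketing them. A minimal sketch under stated assumptions: the `Certificate` type and `risk_rank` function are hypothetical, the 30-day and 90-day thresholds are illustrative rather than taken from xOps, and the sample records are made up (no AWS call is made).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Certificate:
    account_id: str
    domain: str
    not_after: date  # expiry date


def risk_rank(certs: list[Certificate], today: date) -> list[tuple[str, str, int, str]]:
    """Sort soonest-expiring first and attach a coarse risk label."""
    ranked = []
    for cert in sorted(certs, key=lambda c: c.not_after):
        days_left = (cert.not_after - today).days
        if days_left < 0:
            label = "EXPIRED"
        elif days_left <= 30:   # assumed threshold
            label = "HIGH"
        elif days_left <= 90:   # assumed threshold
            label = "MEDIUM"
        else:
            label = "LOW"
        ranked.append((cert.domain, cert.account_id, days_left, label))
    return ranked


# Made-up sample records with placeholder account IDs and domains.
certs = [
    Certificate("111111111111", "api.example.com", date(2025, 3, 1)),
    Certificate("222222222222", "www.example.com", date(2025, 1, 15)),
    Certificate("333333333333", "old.example.com", date(2024, 12, 25)),
]
ranked = risk_rank(certs, today=date(2025, 1, 1))
```

Passing `today` explicitly keeps the ranking reproducible, which is convenient when the result is exported as evidence.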
Layer 6 — Infrastructure
Industry Pattern: Cloud/hybrid infrastructure, model orchestration, low-latency, security governance
xOps Pattern: Docker-first ($0 local) → ECS Fargate Graviton4 ($180/mo prod), LiteLLM model orchestration, CloudFront edge delivery, ADLC + WAFv2 security governance
| Step | Action | System |
|---|---|---|
| Cloud / on-premises | LOCAL: docker-compose ($0) → PROD: ECS Fargate Graviton4 ARM64 ($180/mo) → HYBRID: K3s for on-prem sovereignty | M2 terraform-aws-ecs |
| Model orchestration | LiteLLM gateway: Claude API (BC1) → Bedrock VPC (BC2+) → Ollama local (air-gapped). Switching backends is a configuration change, not an architecture change. | LiteLLM + CrewAI |
| Low-latency delivery | CloudFront 450+ PoPs, WebSocket passthrough, ALB sticky sessions for streaming | M3 terraform-aws-web |
| Security governance | WAFv2 ATPRuleSet + ADLC 22 hooks + 65 anti-patterns + 17 rules files + APRA CPS 234 | ADLC Framework v3.7.2 |
ADLC Agents: infrastructure-engineer (deployment), cloud-architect (infra design), sre-automation-specialist (reliability)
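The point of the model orchestration row above is that the backend is selected by data rather than by code. The sketch below illustrates that idea in plain Python. It is not LiteLLM configuration: the profile keys reuse the labels from the table, while `MODEL_BACKENDS`, `resolve_backend`, and the backend identifiers are hypothetical placeholders.

```python
# Deployment profile -> backend. Profile labels come from the table;
# the backend identifiers are placeholders, not real model strings.
MODEL_BACKENDS = {
    "BC1": "claude-api",
    "BC2+": "bedrock-vpc",
    "air-gapped": "ollama-local",
}


def resolve_backend(profile: str, backends: dict[str, str] = MODEL_BACKENDS) -> str:
    """Look up the backend for a deployment profile; fail loudly on unknown ones."""
    try:
        return backends[profile]
    except KeyError:
        known = ", ".join(sorted(backends))
        raise ValueError(f"unknown profile {profile!r}; expected one of: {known}") from None


# Moving a deployment from BC1 to BC2+ changes only the profile string;
# the calling code stays the same.
current_backend = resolve_backend("BC1")
migrated_backend = resolve_backend("BC2+")
```

Because callers depend only on `resolve_backend`, adding or retargeting a profile touches the mapping and nothing else.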
Agent-to-Layer Mapping
| Layer | Agents | Why |
|---|---|---|
| Customer | product-owner | Validates business value — ensures AI serves real operator needs |
| Interaction | frontend-docs-engineer, meta-engineering-expert | frontend-docs-engineer brings frontend expertise for Open WebUI; meta-engineering-expert handles pipeline engine integration |
| Generative AI | meta-engineering-expert, python-engineer, qa-engineer, cloud-architect | meta-engineering-expert designs CrewAI flows; python-engineer implements analyzers; qa-engineer validates accuracy; cloud-architect validates AI architecture |
| Backend Apps | infrastructure-engineer, security-compliance-engineer | IaC lifecycle + APRA CPS 234 compliance at every backend touchpoint |
| Data Sources | cloud-architect, observability-engineer | cloud-architect designs data architecture; observability-engineer instruments MELT telemetry |
| Infrastructure | infrastructure-engineer, cloud-architect, sre-automation-specialist | infrastructure-engineer deploys; cloud-architect designs infra architecture; sre-automation-specialist ensures reliability |
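
The mapping above can also be read in the other direction, to see which layers a given agent touches. A minimal sketch: the layer and agent names are copied from the table, while `LAYER_AGENTS`, `layers_by_agent`, and `AGENT_LAYERS` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Layer -> agents, copied from the table above.
LAYER_AGENTS = {
    "Customer": ["product-owner"],
    "Interaction": ["frontend-docs-engineer", "meta-engineering-expert"],
    "Generative AI": [
        "meta-engineering-expert",
        "python-engineer",
        "qa-engineer",
        "cloud-architect",
    ],
    "Backend Apps": ["infrastructure-engineer", "security-compliance-engineer"],
    "Data Sources": ["cloud-architect", "observability-engineer"],
    "Infrastructure": [
        "infrastructure-engineer",
        "cloud-architect",
        "sre-automation-specialist",
    ],
}


def layers_by_agent(layer_agents: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the mapping: agent -> layers, preserving table order."""
    index: dict[str, list[str]] = defaultdict(list)
    for layer, agents in layer_agents.items():
        for agent in agents:
            index[agent].append(layer)
    return dict(index)


AGENT_LAYERS = layers_by_agent(LAYER_AGENTS)
```

Here `AGENT_LAYERS["cloud-architect"]` lists Generative AI, Data Sources, and Infrastructure, which shows that the architect role spans three of the six layers.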