Enterprise GenAI Integration

xOps implements a 6-layer enterprise GenAI architecture. Each layer maps to specific ADLC agents, commands, skills, and hooks.

Architecture Overview

L1: Customer          → CloudOps/FinOps/DevOps engineer
L2: Interaction       → Open WebUI RAG chatbot + runbooks CLI
L3: Generative AI     → CrewAI + LiteLLM + Claude API
L4: Backend Apps      → IAM Identity Center + ADLC hooks + runbooks CLI
L5: Data Sources      → Config Aggregator + Cost Explorer + Security Hub
L6: Infrastructure    → Docker ($0) → ECS Fargate ($180/mo) → K3S hybrid

Layer 1 — Customer

Industry Pattern: User logs in, reviews options, completes task or escalates to live agent

xOps Pattern: CloudOps/FinOps/DevOps engineer opens xOps, asks operational question, reviews AI analysis, approves remediation or escalates to HITL manager

Step	Action	System
Engineer opens xOps	SSO login via IAM Identity Center	Open WebUI
Asks operational question	Natural language: "Show me idle EC2 across all accounts"	Chat Interface
Reviews AI analysis	Multi-account dashboard with cost/security/inventory signals	Rich CLI + Tables
Requests HITL escalation	Complex remediation needs human approval (Principle I)	SNS + Slack
Completes task	Exports report, closes ticket, evidence in tmp/	CSV/JSON/PDF

ADLC Agent: product-owner — validates business value at the customer touchpoint

Commands: /product:pr-faq xops, /metrics:daily-standup

Layer 2 — Interaction

Industry Pattern: Chatbot initiates and guides conversation; agent escalation when AI cannot resolve

xOps Pattern: Open WebUI RAG chatbot + runbooks CLI handle 80% of queries; CrewAI crews escalate complex multi-step operations to HITL

Step	Action	System
Chatbot activated	xOps RAG pipeline processes query against multi-account AWS knowledge base	Open WebUI + ChromaDB
Communicates options	Presents cost analysis, security findings, or inventory results with confidence scores	CrewAI Flows v2
User selects action	Approve recommendation or request alternatives	Pipeline Engine
Chatbot responds	Executes approved action via runbooks CLI (READONLY)	runbooks PyPI
Pings HITL support	Escalation when action requires write access or compliance approval	SNS → Slack
HITL provides solution	Manager reviews evidence, approves terraform apply or remediation	ADLC Phase 3+
Feedback to model	HITL correction feeds back to RAG knowledge base for continuous improvement	CrewAI Knowledge

ADLC Agents: frontend-docs-engineer (Open WebUI), meta-engineering-expert (pipeline engine)
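
The escalation rule in the table above (read-only actions run directly, anything that needs write access or compliance approval goes to a human) can be sketched as a small gate. This is a minimal illustration under stated assumptions, not the xOps implementation: `Action`, `requires_hitl`, and `dispatch` are hypothetical names, and the returned strings stand in for the runbooks CLI call and the SNS → Slack notification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A proposed operation (hypothetical shape, for illustration only)."""
    name: str
    writes: bool = False                     # True if the action mutates cloud resources
    needs_compliance_approval: bool = False  # True if a compliance sign-off is required


def requires_hitl(action: Action) -> bool:
    """Escalate when the action needs write access or compliance approval."""
    return action.writes or action.needs_compliance_approval


def dispatch(action: Action) -> str:
    """Route an action: read-only work runs directly, the rest is escalated."""
    if requires_hitl(action):
        # Placeholder for the SNS -> Slack escalation described in the table.
        return f"escalated:{action.name}"
    # Placeholder for a READONLY runbooks CLI invocation.
    return f"executed:{action.name}"
```

Keeping the decision in one pure function makes the gate easy to test independently of the chatbot and of the notification channel.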

Layer 3 — Generative AI

Industry Pattern: Model receives request, checks policies, generates options, executes next steps

xOps Pattern: CrewAI orchestrates specialist crews (CloudOps/FinOps/DevSecOps) via LiteLLM gateway to Claude API with ADLC governance hooks

Step	Action	System
Model receives request	CrewAI Flow receives user query + pulls AWS account context via MCP	CrewAI + LiteLLM
Checks compliance policy	APRA CPS 234 data sovereignty check — sensitive queries route to Ollama (local), not cloud	LiteLLM Router
Generates analysis	Multi-account cost analysis, security posture, inventory discovery	runbooks analyzers (119+)
Presents options	Ranked recommendations with cost impact, risk score, and compliance mapping	Claude API + Prompt Caching
Executes approved action	READONLY queries run autonomously; write operations require a HITL gate (Principle I)	ADLC Hooks

ADLC Agents: meta-engineering-expert (CrewAI flows), python-engineer (analyzers), qa-engineer (accuracy validation), cloud-architect (AI architecture decisions)
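
The compliance check in the table above sends sensitive queries to a local model instead of a cloud one. A rough sketch of that decision follows. It assumes a simple keyword test for sensitivity, which is a stand-in and not the actual classification logic. The marker list and both model identifiers are hypothetical, and the sketch deliberately does not call the LiteLLM API.

```python
# Assumed markers of a sensitive query; a real deployment would use a proper classifier.
SENSITIVE_MARKERS = ("credential", "secret", "customer record")

LOCAL_MODEL = "ollama-local"  # hypothetical identifier for the local Ollama route
CLOUD_MODEL = "claude-cloud"  # hypothetical identifier for the cloud Claude route


def is_sensitive(query: str) -> bool:
    """Return True if the query mentions any assumed sensitive marker."""
    lowered = query.lower()
    return any(marker in lowered for marker in SENSITIVE_MARKERS)


def select_model(query: str) -> str:
    """Pick the local route for sensitive queries, the cloud route otherwise."""
    return LOCAL_MODEL if is_sensitive(query) else CLOUD_MODEL
```

For example, `select_model("Show me idle EC2 across all accounts")` picks the cloud route, while a query that mentions a credential stays local.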

Layer 4 — Backend Apps

Industry Pattern: Authentication, policy enforcement, booking workflows, agent assignment

xOps Pattern: IAM Identity Center SSO, ADLC constitutional governance (22 hooks), runbooks CLI (131 commands), MCP server orchestration (58 integrations)

Step	Action	System
Authentication + authorisation	IAM Identity Center SCIM 2.0 → Open WebUI user sync + OIDC for API auth	M1 terraform-aws-iam-identity-center
Policy enforcement	22 deterministic ADLC hooks, including enforce-coordination, validate-bash, and detect-nato-violation	.claude/hooks/scripts/
Workflow management	6-phase ADLC lifecycle: PLAN → BUILD → TEST → DEPLOY → MONITOR → OPERATE	ADLC Constitution v2.1.0
Agent assignment	15 constitutional agents in 3 tiers: 3 opus (decision) + 8 sonnet (execution) + 4 haiku (operations)	.claude/agents/

ADLC Agents: infrastructure-engineer (IaC lifecycle), security-compliance-engineer (APRA CPS 234)
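
The lifecycle and tier counts in the table above can be written down as data, which makes the arithmetic easy to check. The sketch below assumes the six phases advance strictly in the listed order, which the source implies but does not state. `Phase`, `next_phase`, `AGENT_TIERS`, and `total_agents` are illustrative names.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """The six ADLC lifecycle phases, in the order given in the table."""
    PLAN = 1
    BUILD = 2
    TEST = 3
    DEPLOY = 4
    MONITOR = 5
    OPERATE = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase after `current`, or None at the end (assumes a linear flow)."""
    phases = list(Phase)  # Enum iteration preserves definition order
    position = phases.index(current)
    if position + 1 < len(phases):
        return phases[position + 1]
    return None


# Agent counts per model tier, as stated in the table.
AGENT_TIERS = {"opus": 3, "sonnet": 8, "haiku": 4}


def total_agents() -> int:
    """Sum the per-tier counts; the table states 15 agents in total."""
    return sum(AGENT_TIERS.values())
```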

Layer 5 — Data Sources

Industry Pattern: Customer ID, booking history, policy rules, agent directories

xOps Pattern: AWS account inventory (Config Aggregator), cost history (Cost Explorer FOCUS 1.2+), security findings (Security Hub), certificate inventory (ACM org-wide)

Step	Action	System
Account inventory	Config Aggregator: org-wide resource discovery in seconds (P1 path)	AWS Config
Cost history	Cost Explorer: monthly/daily granularity, FOCUS 1.2+ normalisation, multi-account	AWS Cost Explorer
Security findings	Security Hub: aggregated CRITICAL/HIGH/MEDIUM across all accounts, SOC2 mapping	AWS Security Hub
Certificate inventory	ACM: risk-ranked expiry dashboard across all accounts	AWS ACM + IAM
Operational knowledge	ChromaDB RAG: runbooks docs, SOPs, incident history (xOps-S2 roadmap — S2-01)	CrewAI Knowledge

ADLC Agents: cloud-architect (data architecture), observability-engineer (MELT telemetry instrumentation)
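
The Security Hub row above describes findings aggregated by severity across accounts. A minimal sketch of that roll-up follows. It assumes each finding has already been fetched and reduced to a plain dictionary with `account` and `severity` keys. Both the record shape and the `summarize_findings` name are hypothetical, and no AWS API is called.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

# Severity levels named in the table, highest first.
SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM")


def summarize_findings(findings: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count findings per severity across all accounts, in a fixed order."""
    counts = Counter(finding["severity"] for finding in findings)
    # Severities outside SEVERITY_ORDER are ignored; missing ones report zero.
    return {severity: counts.get(severity, 0) for severity in SEVERITY_ORDER}


# Made-up sample records, for illustration only.
sample_findings = [
    {"account": "111111111111", "severity": "HIGH"},
    {"account": "111111111111", "severity": "CRITICAL"},
    {"account": "222222222222", "severity": "HIGH"},
    {"account": "222222222222", "severity": "LOW"},
]
```

Returning every severity in a fixed order, including zero counts, keeps dashboard rows stable between runs.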

Layer 6 — Infrastructure

Industry Pattern: Cloud/hybrid infrastructure, model orchestration, low-latency, security governance

xOps Pattern: Docker-first ($0 local) → ECS Fargate Graviton4 ($180/mo prod), LiteLLM model orchestration, CloudFront edge delivery, ADLC + WAFv2 security governance

Step	Action	System
Cloud / on-premises	LOCAL: docker-compose ($0) → PROD: ECS Fargate Graviton4 ARM64 ($180/mo) → HYBRID: K3S for on-prem sovereignty	M2 terraform-aws-ecs
Model orchestration	LiteLLM gateway: Claude API (BC1) → Bedrock VPC (BC2+) → Ollama local (air-gapped). Config change, not arch change.	LiteLLM + CrewAI
Low-latency delivery	CloudFront 450+ PoPs, WebSocket passthrough, ALB sticky sessions for streaming	M3 terraform-aws-web
Security governance	WAFv2 ATPRuleSet + ADLC 22 hooks + 65 anti-patterns + 17 rules files + APRA CPS 234	ADLC Framework v3.7.2

ADLC Agents: infrastructure-engineer (deployment), cloud-architect (infra design), sre-automation-specialist (reliability)
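
The model orchestration row calls the move between Claude API, Bedrock VPC, and Ollama a config change, not an architecture change. One way to picture that is a lookup keyed by deployment stage, as sketched below. The stage keys mirror the labels in the table (with `BC2` standing in for "BC2+"), but the backend identifiers, the `stage` config key, and `resolve_backend` are assumptions for illustration and not LiteLLM configuration syntax.

```python
from typing import Dict, Mapping

# Stage -> model backend, following the progression in the table.
BACKENDS: Dict[str, str] = {
    "BC1": "claude-api",           # hypothetical backend identifier
    "BC2": "bedrock-vpc",          # hypothetical backend identifier
    "air-gapped": "ollama-local",  # hypothetical backend identifier
}


def resolve_backend(config: Mapping[str, str]) -> str:
    """Return the backend for the configured stage; reject unknown stages."""
    stage = config.get("stage")
    if stage not in BACKENDS:
        raise ValueError(f"unknown or missing stage: {stage!r}")
    return BACKENDS[stage]
```

Callers depend only on `resolve_backend`, so switching from `{"stage": "BC1"}` to `{"stage": "air-gapped"}` changes the backend without touching the calling code.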

Agent-to-Layer Mapping

Layer	Agents	Why
Customer	product-owner	Validates business value — ensures AI serves real operator needs
Interaction	frontend-docs-engineer, meta-engineering-expert	Frontend expertise for Open WebUI + MEE for pipeline engine integration
Generative AI	meta-engineering-expert, python-engineer, qa-engineer, cloud-architect	MEE designs CrewAI flows; python-engineer implements analyzers; QA validates accuracy; CA validates AI architecture
Backend Apps	infrastructure-engineer, security-compliance-engineer	IaC lifecycle + APRA CPS 234 compliance at every backend touchpoint
Data Sources	cloud-architect, observability-engineer	CA designs data architecture; observability-engineer instruments MELT telemetry
Infrastructure	infrastructure-engineer, cloud-architect, sre-automation-specialist	Infra-engineer deploys; CA designs infra architecture; SRE ensures reliability
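
The mapping above can also be read in the other direction, to see which layers a given agent serves. The sketch below copies the table into a dictionary and inverts it on demand. `LAYER_AGENTS` and `layers_for` are illustrative names, not part of the ADLC framework.

```python
from typing import Dict, List

# Layer -> agents, transcribed from the table above.
LAYER_AGENTS: Dict[str, List[str]] = {
    "Customer": ["product-owner"],
    "Interaction": ["frontend-docs-engineer", "meta-engineering-expert"],
    "Generative AI": [
        "meta-engineering-expert",
        "python-engineer",
        "qa-engineer",
        "cloud-architect",
    ],
    "Backend Apps": ["infrastructure-engineer", "security-compliance-engineer"],
    "Data Sources": ["cloud-architect", "observability-engineer"],
    "Infrastructure": [
        "infrastructure-engineer",
        "cloud-architect",
        "sre-automation-specialist",
    ],
}


def layers_for(agent: str) -> List[str]:
    """Return every layer that lists `agent`, in layer order."""
    return [layer for layer, agents in LAYER_AGENTS.items() if agent in agents]
```

For instance, `layers_for("cloud-architect")` returns the Generative AI, Data Sources, and Infrastructure layers, which shows that this agent spans three of the six layers.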
