FinOps & Analytics Lifecycle

As much as 70% of the development efforts of an AI-based solution are composed of wrangling and harmonizing data.

AI agents build governed. Humans ship trusted. 80% autonomy, 100% accountability.

Golden Path: From Raw Cost Data to Optimized Spend


Phase 1: Collect (3 min)

Who: Enterprise team collects via READONLY profiles. HITL verifies account coverage.

What: Gather cost data from AWS + Azure with FOCUS 1.2+ normalization.

Daily Normalization: MoM comparisons use parse_billing_period_to_days() to divide by the actual number of calendar days (28/29/30/31). A hardcoded /30 divisor understates daily cost by 6.7% for a 28-day February and overstates it by 3.3% for 31-day months. Never hardcode /30.
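
A minimal sketch of daily normalization, assuming the billing period arrives as a `YYYY-MM` string. The helper `billing_period_days` is hypothetical and only stands in for `parse_billing_period_to_days()`, whose real signature is not shown on this page; the cost figures are made up.

```python
import calendar


def billing_period_days(period: str) -> int:
    """Return the calendar days in a 'YYYY-MM' billing period (hypothetical helper)."""
    year, month = (int(part) for part in period.split("-"))
    # monthrange returns (weekday of the 1st, number of days in the month)
    return calendar.monthrange(year, month)[1]


def daily_normalized_mom(prev_total: float, prev_period: str,
                         curr_total: float, curr_period: str) -> float:
    """Month-over-month change in average daily cost, as a fraction."""
    prev_daily = prev_total / billing_period_days(prev_period)
    curr_daily = curr_total / billing_period_days(curr_period)
    return (curr_daily - prev_daily) / prev_daily


# Same daily run rate in January (31 days) and February (28 days):
# the raw totals differ, but the daily-normalized change is zero.
change = daily_normalized_mom(3100.0, "2025-01", 2800.0, "2025-02")
```

Comparing the raw totals here would report a drop of roughly 9.7% even though the daily run rate did not move, which is the distortion the calendar-day divisor removes.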

Why: Data quality at the source. Portal CSV is SSOT, not API alone — prevents FINOPS_API_SSOT_MISMATCH.

Karpathy Rule 1 — Think Before Coding

Step 0 golden prompt is mandatory. Confirm persona, period, and scope BEFORE collecting data. Anti-pattern prevented: HIDDEN_ASSUMPTION.

What-if skip: Incomplete data, RBAC-scoped undercounts, wrong numbers to leadership.

How

/finops:aws-monthly     # AWS cost report with persona modes
/finops:azure-monthly   # Azure cost report with 4-way validation

Output

  • FOCUS 1.2+ cost reports per cloud provider
  • Persona-mode views: CFO, CTO, CloudOps Engineer
  • CSV + JSON export for downstream analysis

Quality Gate: All accounts visible. FOCUS 1.2+ tags present.


Phase 2: Validate (15 min)

Who: qa-engineer validates accuracy. HITL reviews deltas.

What: 4-way cross-validation: CLI vs Config Aggregator vs Cost Explorer vs Console.

Why: 99.5% accuracy gate. SELF_COMPARISON_VALIDATION prevented — must use 2+ independent sources.

Karpathy Rule 4 — Goal-Driven PDCA

Success criteria = 99.5% accuracy gate. Loop validation until met. Every number cites its measurement method.

What-if skip: Bad data in reports, false savings claims, eroded stakeholder trust.

How

/inventory:lz-cross-validate   # Multi-source cross-validation
/devtools:validate             # MCP server accuracy check

Output

  • Cross-validation report with per-source accuracy deltas
  • Independent source comparison (not same-process exports)
  • Variance analysis with root cause

Quality Gate: Cross-validation accuracy >=99.5%. Independent sources used.


Phase 3: Analyze (30 min)

Who: cloud-architect analyzes. product-owner validates business alignment.

What: Cost trends, anomaly detection, decommission candidate scoring.

Why: Analysis with measured data, not estimates. NO_ESTIMATED_COUNTS anti-pattern prevented.

Karpathy Rule 2 — Simplicity First

Report measured data only. No speculative projections, no estimated savings. If you can't cite the measurement method, remove the number.

What-if skip: Estimated numbers in reports, unvalidated savings claims.

How

/finops:analyze                  # Cost trends + anomaly detection
/finops:decommission-inventory   # Scream-test scored candidates

Output

  • MoM cost trend analysis with anomaly flags (>20% change)
  • Decommission candidates with E1-E7 / S1-S7 signals
  • Scream-test scores (0-100, >=70 flagged)

Quality Gate: Every number cites measurement method. No estimates.
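
The two thresholds in this phase (MoM change above 20%, scream-test score of 70 or more) can be sketched as below. This is an illustration only: the function names, the input shapes, and the sample values are assumptions, and the real scoring behind /finops:analyze and /finops:decommission-inventory is not shown on this page.

```python
def flag_mom_anomalies(daily_cost_by_period: dict, threshold: float = 0.20) -> list:
    """Return (period, change) pairs where |MoM change| exceeds the threshold.

    Keys are 'YYYY-MM' strings, so lexical sort order is chronological.
    """
    periods = sorted(daily_cost_by_period)
    flagged = []
    for prev, curr in zip(periods, periods[1:]):
        prev_cost = daily_cost_by_period[prev]
        if prev_cost == 0:
            continue  # no baseline to compare against; skip rather than guess
        change = (daily_cost_by_period[curr] - prev_cost) / prev_cost
        if abs(change) > threshold:
            flagged.append((curr, change))
    return flagged


def flag_decommission_candidates(scores: dict, cutoff: int = 70) -> list:
    """Return resource IDs whose scream-test score (0-100) meets the cutoff."""
    return sorted(rid for rid, score in scores.items() if score >= cutoff)


# Made-up daily-normalized costs and scores, for illustration only.
costs = {"2025-01": 100.0, "2025-02": 105.0, "2025-03": 130.0}
flagged = flag_mom_anomalies(costs)
candidates = flag_decommission_candidates({"res-a": 85, "res-b": 40, "res-c": 70})
```

Only March is flagged, because 105 to 130 is a change of about 23.8% while 100 to 105 is 5%.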


Phase 4: Report (10 min)

Who: observability-engineer generates. HITL distributes to stakeholders.

What: Persona-mode reports for CFO, CTO, CloudOps, and FinOps. Executive FinOps report.

Why: Right data for right audience. CFO needs cost totals, CTO needs trends, CloudOps needs actions.

Karpathy Rule 3 — Surgical Changes

Persona reports contain only what that audience needs. No drive-by content — CFO doesn't need API details, CloudOps doesn't need budget variance.

What-if skip: One-size-fits-all reports, executive disengagement.

How

/finops:report   # Executive FinOps report with persona modes

Output

  • 4 persona reports (CFO, CTO, CloudOps, FinOps)
  • Stakeholder email template
  • Evidence-backed claims (no unvalidated savings)

Quality Gate: Claims cite sources. HITL review takes under 5 minutes.


Phase 5: Optimize (ongoing)

Who: cloud-architect proposes. HITL approves decommission actions.

What: Rightsize, decommission unused resources, track savings attribution.

Why: Evidence-based decommission with READONLY profiles. AWS IAM prevents mutations.

Karpathy Rule 4 — Goal-Driven PDCA

Savings must be measured post-action, not estimated pre-action. Define success criteria (scream-test score >= 70) before proposing decommission.

What-if skip: Zombie resources persist, savings potential unrealized.

How

/finops:azure-rightsizing   # Azure over-provisioned resource detection
# ec2-scream-test skill     # AWS decommission feasibility scoring

Output

  • Rightsizing recommendations with cost impact
  • Decommission actions tracked with savings attribution
  • Savings evidence for FinOps reporting

Quality Gate: READONLY only. HITL approves any changes. Savings measured, not estimated.


LEAN/5S Applied to FinOps

| Principle | Application | Evidence |
|---|---|---|
| Sort | FOCUS 1.2+ normalization across AWS + Azure | /finops:aws-monthly |
| Set in Order | Persona modes: CFO / CTO / CloudOps / FinOps | --persona flag |
| Shine | Cross-validation at 99.5% accuracy gate | /inventory:lz-cross-validate |
| Standardize | Portal CSV as SSOT, not API alone | FINOPS_API_SSOT_MISMATCH prevented |
| Sustain | Monthly reports with trend tracking | /finops:report |

CxO Customer Journey Map

| Phase | CFO / Finance | CTO / Architecture | CloudOps Engineer | FinOps Practitioner |
|---|---|---|---|---|
| Trigger | Board meeting, budget cycle | Architecture review, cost spike | Alert, Health event, incident | Monthly cadence, audit request |
| Entry | /finops:report | /finops:analyze | /finops:aws-monthly | /finops:aws-monthly --persona=all |
| Touchpoints | Email summary, Vizro dashboard | Trend charts, anomaly flags | CLI output, JSON evidence | FOCUS 1.2+ CSV, full pipeline |
| Pain Point | "Numbers don't match the invoice" | "Which service is growing fastest?" | "Which account is overspending?" | "Are sources cross-validated?" |
| Delight | Board-ready in 3 min, reconciled | MoM delta with root cause | 26-account coverage, one command | 99.5% accuracy, 4-way validated |
| Exit | Forward email to board | ADR for cost optimization | Decommission ticket created | Data product published |
| Time to Value | 3 minutes | 30 minutes | 1 hour | 1 day |

CFO Journey: Board-Ready Cost Report
  1. Step 0: Golden prompt with --persona=executive — Karpathy Rule 1 (confirm scope)
  2. Collect: /finops:aws-monthly + /finops:azure-monthly — automated, READONLY profiles
  3. Report: /finops:report — generates email template with reconciled multi-cloud totals
  4. Distribute: Forward stakeholder email — under 5 min HITL review
  5. Dashboard: /dashboards:generate --source=finops --persona=cfo — Vizro drill-down

Quality gate: Claims cite sources. Daily-normalized MoM uses parse_billing_period_to_days().

CTO Journey: Architecture Cost Analysis
  1. Analyze: /finops:analyze — MoM trends with anomaly flags (>20% change)
  2. Rightsize: /finops:azure-rightsizing — over-provisioned resource detection
  3. Decommission: /finops:decommission-inventory — scream-test scored candidates (E1-E7)
  4. Decide: ADR with cost impact — measured savings, not estimated
  5. Track: /finops:report --persona=technical — CTO-optimized trend view

Quality gate: Every recommendation includes utilization data and cost impact.

CloudOps Journey: Operational Cost Optimization
  1. Discover: /inventory:discover — org-wide resource counts in under 10s
  2. Investigate: /finops:aws-monthly — per-account cost breakdown
  3. Score: /finops:decommission-inventory — activity-based decommission scoring
  4. Act: Create OPS ticket for decommission — HITL approves, CloudOps executes
  5. Validate: /inventory:lz-cross-validate — confirm savings post-action

Quality gate: READONLY only. Savings measured post-decommission, not estimated pre.

FinOps Practitioner Journey: Full Pipeline
  1. Collect: /finops:aws-monthly --persona=all + /finops:azure-monthly — full data
  2. Validate: /devtools:validate + /inventory:lz-cross-validate — 4-way cross-validation
  3. Normalize: FOCUS 1.2+ export — 28-column schema compliance check
  4. Report: All 4 persona views — CFO/CTO/CloudOps/FinOps
  5. Publish: Data product to Confluence or shared storage

Quality gate: 99.5% accuracy across all 4 validation layers. FOCUS schema complete.


Common Mistakes (Anti-Patterns)

| Mistake | Why It Fails | Fix |
|---|---|---|
| FINOPS_API_SSOT_MISMATCH | API-only data misses RBAC-inaccessible subs | Portal CSV is SSOT |
| SELF_COMPARISON_VALIDATION | Same-process JSON+CSV = trivial 0% delta | Use 2+ independent sources |
| NO_ESTIMATED_COUNTS | Estimated numbers in reports | Cite measurement method |
| ADJUSTED_METRIC_EXCLUSION | Shrinking denominator to inflate rates | Include all relevant items |
| HARDCODED_ENV_IN_PRODUCT_DOCS | AWS account IDs in product docs | Use env vars and generic terms |
| DRYRUN_OVER_READONLY | Using --dry-run when READONLY profiles exist | Real execution with READONLY |
| HARDCODED_DAILY_DIVISOR | total / 30 ignores Feb (28d), Mar (31d); creates 6.7% CFO-facing error | Use parse_billing_period_to_days() from runbooks.finops.azure.types |

Rollout Lessons Learned

Real incidents — not theory

These lessons come from production rollout failures documented in the anti-patterns catalog and memory. Each has a root cause and fix.

| Lesson | Root Cause | Fix | Anti-Pattern |
|---|---|---|---|
| Portal CSV is SSOT, not API | Azure RBAC-scoped API misses ungoverned subscriptions | Compare API vs EA CSV; document gap when API total is lower | FINOPS_API_SSOT_MISMATCH |
| Evidence is not a deliverable | JSON files produced but HITL called them "not professional" | CxO brief with exec summary, owners, timelines; publish | EVIDENCE_NOT_DELIVERABLE |
| Cross-validation needs independent sources | Same-process JSON+CSV = 0% delta (serialization test) | Use 2+ independent API calls from different data sources | SELF_COMPARISON_VALIDATION |
| Docker-first prevents failures | Host missing deps; commands fail silently | docker compose exec for all tooling; retry via container | BARE_METAL_DOCUSAURUS |
| Step 0 golden prompt is mandatory | Agent assumed persona and period without asking | AskUserQuestion before execution; confirm scope with HITL | HIDDEN_ASSUMPTION |
| READONLY means execute, not defer | Agent handed READONLY queries to HITL | Profiles provided = pre-authorized; execute immediately | READONLY_HITL_HANDOFF |
| L3 MCP caveat (no AWS CE MCP) | MCP layer unavailable as of March 2026 | L1 CLI (runbooks finops) is authoritative; MCP supplements | DRYRUN_OVER_READONLY |
| Code before evidence | 3 PDCA rounds produced evidence but 0 lines of code | Ship code first; evidence is a byproduct, not a substitute | PROCESS_WITHOUT_OUTCOME |

The Golden Prompt

"I want to [generate the monthly FinOps report for AWS and Azure] to [give the CFO board-ready cost data with validated multi-cloud totals]. Read my folder. Ask me questions using AskUserQuestion before you start."

The enterprise team reads READONLY profiles from environment, queries Cost Explorer and Azure Billing API, validates against portal CSV (SSOT), and produces FOCUS 1.2+ normalized output before presenting to HITL for distribution.

2-Way Sync

This golden path is the source of truth for FinOps process documentation. Commands in .claude/commands/finops/ reference this page. For command-level implementation detail, see the 8 command files directly. Live docs: adlc.oceansoft.io/docs/golden-paths/finops-analytics-lifecycle.


FOCUS 1.2+ Normalization

What: FinOps Open Cost and Usage Specification — the vendor-neutral schema for cloud cost data (28 required columns).

Why: AWS, Azure, and GCP each export cost data in different schemas. FOCUS normalizes so reports are comparable and tools are interoperable.

FOCUS 1.2+ Schema (28 columns)

| Category | Columns |
|---|---|
| Identity | BilledAccountId, SubAccountId, BillingAccountId |
| Time | BillingPeriodStart, BillingPeriodEnd, ChargePeriodStart, ChargePeriodEnd |
| Cost | BilledCost, EffectiveCost, ListCost, ContractedCost |
| Resource | ResourceId, ResourceName, ResourceType, Region, AvailabilityZone |
| Service | ServiceCategory, ServiceName, SubServiceName |
| Provider | ProviderName, PublisherName, InvoiceIssuerName |
| Tags | Tags (key-value map) |
| Charge | ChargeType, ChargeDescription, ChargeFrequency |
| Pricing | PricingUnit, PricingQuantity, UsageUnit, UsageQuantity |

Schema check: /finops:aws-monthly --validate-focus or /finops:azure-monthly --validate-focus

Anti-pattern prevented: FOCUS_VERSION_HALLUCINATION — grep-verify column presence before claiming compliance.
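
One way to verify column presence before claiming compliance is to check the export's header row directly. The sketch below is an assumption about how such a check could look, not the implementation behind --validate-focus; `REQUIRED_COLUMNS` is an illustrative subset of the names listed in the schema above, and `missing_focus_columns` is a hypothetical helper.

```python
import csv
import io

# Illustrative subset of the column names listed in the schema above.
REQUIRED_COLUMNS = frozenset({
    "BilledCost", "EffectiveCost", "BillingPeriodStart", "BillingPeriodEnd",
    "ServiceCategory", "ServiceName", "ResourceId", "Tags",
})


def missing_focus_columns(csv_text: str, required: frozenset = REQUIRED_COLUMNS) -> list:
    """Return the required column names absent from the CSV header, sorted."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    present = {name.strip() for name in header}
    return sorted(required - present)


# Made-up export with only three of the required columns.
sample = "BilledCost,EffectiveCost,ServiceName\n12.50,11.75,Compute\n"
missing = missing_focus_columns(sample)
compliant = not missing  # claim compliance only when nothing is missing
```

Returning the list of missing names, rather than a bare pass/fail, gives the reviewer something concrete to cite in the report.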


Multi-Cloud Flow

AWS and Azure run side by side: each cloud uses its own READONLY profile, and portal CSV is the SSOT for both. Because every profile is READONLY, execution is autonomous once profiles are provided (READONLY_HITL_HANDOFF prevented).

Multi-Cloud Profile Configuration

AWS Profiles (Multi-Account Landing Zone)

| Profile | Purpose | Scope |
|---|---|---|
| $AWS_BILLING_PROFILE | Cost Explorer (all accounts) | Org-wide billing aggregation |
| $AWS_MANAGEMENT_PROFILE | Config Aggregator, Resource Explorer | Org-wide resource inventory |
| $AWS_OPERATIONS_PROFILE | Per-account operational queries | Workload account details |

/finops:aws-monthly --profile=$AWS_BILLING_PROFILE
az account list   # MANDATORY before assuming single subscription
/finops:azure-monthly --subscription=$AZURE_SUBSCRIPTION_ID
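
Because profile names and subscription IDs come from the environment rather than from the docs, a script that wraps these commands can fail fast when a variable is unset. A minimal sketch follows; `require_env` is a hypothetical helper and the profile name in the example is made up.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a required setting from the environment, failing fast if unset."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running FinOps commands")
    return value


# An explicit mapping is passed here so the example does not depend on the
# caller's shell; in real use, omit the second argument to read os.environ.
profile = require_env("AWS_BILLING_PROFILE", {"AWS_BILLING_PROFILE": "example-billing-readonly"})
```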

Azure Configuration

| Config | Value | Purpose |
|---|---|---|
| $AZURE_SUBSCRIPTION_ID | Subscription GUID | Primary subscription scope |
| $AZURE_TENANT_ID | Tenant GUID | Tenant-level auth |
| $AZURE_MONTHLY_BUDGET_NZD | Budget threshold | Alert fires at budget x (1 + $AZURE_OVERAGE_THRESHOLD_PCT%) |
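
The alert rule in the last row reduces to one multiplication. The sketch below assumes the overage is expressed in percent (10 means 10%) and that the alert fires when spend reaches the threshold, not only when it exceeds it; both function names and the sample figures are hypothetical.

```python
def budget_alert_threshold(monthly_budget: float, overage_pct: float) -> float:
    """Spend level at which the alert fires: budget x (1 + overage_pct / 100)."""
    return monthly_budget * (1 + overage_pct / 100)


def alert_fires(actual_spend: float, monthly_budget: float, overage_pct: float) -> bool:
    """True when actual spend has reached the alert threshold (assumed inclusive)."""
    return actual_spend >= budget_alert_threshold(monthly_budget, overage_pct)


# Made-up figures: a 10,000 budget with a 25% overage allowance.
threshold = budget_alert_threshold(10_000.0, 25.0)
```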

FINOPS_API_SSOT_MISMATCH prevention: When Azure API total < portal total, document the gap with subscription reconciliation — never present API-only totals as authoritative without the RBAC-scoped caveat.

4-Way Cross-Validation Flow

1. AWS Cost Explorer API   → {$X}
2. AWS Management Console  → {$Y}
3. Azure Billing API       → {$A}
4. Azure EA Portal CSV     → {$B}

Tolerance: |X-Y| ≤ 0.5% (AWS)
           |A-B| ≤ 0.5% (Azure)
           |A-B| > 0.5% → FINOPS_API_SSOT_MISMATCH documented
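
The tolerance check can be sketched as a relative difference. One assumption is made explicit here: the page writes |X-Y| ≤ 0.5% without naming a denominator, so this sketch divides by the reference (Console or portal) total. The function name and the totals are illustrative.

```python
def cross_validate(api_total: float, reference_total: float,
                   tolerance: float = 0.005) -> tuple:
    """Return (relative_delta, within_tolerance) against the reference total.

    Assumption: the delta is measured relative to the reference (SSOT) figure.
    """
    if reference_total == 0:
        raise ValueError("reference total must be non-zero")
    delta = abs(api_total - reference_total) / abs(reference_total)
    return delta, delta <= tolerance


# Made-up totals: AWS API vs Console, then Azure API vs EA portal CSV.
aws_delta, aws_ok = cross_validate(49_900.0, 50_000.0)
azure_delta, azure_ok = cross_validate(9_700.0, 10_000.0)

# A failed Azure check is recorded as a finding, not silently accepted.
azure_finding = None if azure_ok else "FINOPS_API_SSOT_MISMATCH"
```

In this example the AWS pair differs by 0.2% and passes, while the Azure API total is 3% below the portal figure and is recorded as a mismatch to document.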

Vizro Dashboard Integration

What: Interactive multi-cloud FinOps dashboard built on Vizro (Plotly-based, Python-native). Run: /dashboards:generate --source=finops --persona=cfo

Vizro Dashboard Details (Layout, MCP, Panels)

Pipeline: Reads FOCUS 1.2+ data from /finops:aws-monthly + /finops:azure-monthly → generates persona-mode layouts → exports to tmp/runbooks/finops/dashboard-{date}.html

CFO Mode Layout

| Panel | Data Source | Chart Type |
|---|---|---|
| Total Cloud Spend | AWS CE + Azure EA | KPI card with MoM delta |
| Spend by Service | AWS + Azure aggregated | Bar chart (FOCUS ServiceCategory) |
| Budget vs Actual | $AZURE_MONTHLY_BUDGET_NZD | Gauge chart with alert threshold |
| Top 10 Cost Drivers | Both clouds | Horizontal bar, sorted by BilledCost |
| Trend (12 months) | Historical FOCUS data | Line chart with anomaly markers |
| Decommission Candidates | scream-test scores | Table with E1-E7/S1-S7 signals |

vizro-analytics MCP: generate_dashboard (create), export_html (distribute), validate_focus_schema (verify 28 columns).


Regulatory Compliance Context

Regulatory Compliance (APRA CPS 234, CPS 230, NIST CSF 2.0)

APRA CPS 234 — Cost Data Security: Billing data classified "internal". READONLY profiles enforce least-privilege. Cost Explorer responses cached in tmp/ (ephemeral). FOCUS-normalized CSVs contain no PII. All API calls use TLS 1.2+.

APRA CPS 230 — Pipeline Resilience: 48-hour tolerance for monthly reports. Portal CSV as SSOT fallback when API unavailable. 4-Way Validation v5.0.0 exceeds CPS 230 "reasonable assurance" threshold at 99.5% accuracy.

NIST CSF 2.0 — Cost Governance

| CSF Function | FinOps Mapping |
|---|---|
| GOVERN (GV) | FOCUS 1.2+ tag compliance via validate-finops-params.sh |
| IDENTIFY (ID) | Config Aggregator identifies all cost-generating resources |
| PROTECT (PR) | READONLY profiles prevent cost mutations |
| DETECT (DE) | /finops:azure-anomaly-detect detects spending anomalies |
| RESPOND (RS) | Persona-based reports enable rapid cost response |

Component Audit

Audit scope

Maps declared vs actually-invoked components across all /finops:* commands. Verified against command frontmatter requires_agents and skills fields. Last audited: April 2026.

| Component | Type | Phase | Status | Notes |
|---|---|---|---|---|
| /finops:aws-monthly v5.0.0 | Command | Collect | ACTIVE | 1,890 LOC, 17 skills, golden-prompt, 4-way validation. Without this: no automated AWS cost collection or FOCUS normalization |
| /finops:azure-monthly v5.0.0 | Command | Collect | ACTIVE | 1,206 LOC, 16 skills, EA CSV as SSOT. Without this: Azure costs invisible to CxO reporting |
| /finops:report v2.0.0 | Command | Report | ACTIVE | 1,156 LOC, multi-cloud aggregation, 4 persona modes. Without this: no consolidated multi-cloud cost view for CxO |
| /finops:decommission-inventory v1.1.0 | Command | Analyze | ACTIVE | 1,309 LOC, parallel DAG, scream-test scoring. Without this: orphaned resources accumulate silently |
| /finops:analyze v1.0.0 | Command | Analyze | LIGHTWEIGHT | 286 LOC, needs skill wiring upgrade. Without this: cost trends require manual spreadsheet work |
| /finops:azure-anomaly-detect v1.0.0 | Command | Analyze | STUB | 266 LOC, template-ready; needs Azure Cost Management API wiring to activate |
| /finops:azure-rightsizing v1.0.0 | Command | Optimize | STUB | 318 LOC, template-ready; needs Azure Advisor integration to activate |
| /finops:azure-validate v1.0.0 | Command | Validate | ACTIVE | 317 LOC, validation specialist. Without this: Azure MCP accuracy unverified |
| finops-base | Skill | All | ACTIVE | 609 LOC, foundation: evidence, PDCA, quality gates. Without this: no shared evidence or PDCA patterns across commands |
| cross-validation | Skill | Validate | ACTIVE | 564 LOC, 4-way validation engine. Without this: single-source cost data accepted without verification |
| auth-preflight | Skill | Collect | ACTIVE | 428 LOC, SSO + profile validation. Without this: commands fail mid-execution on expired SSO |
| quality-gates | Skill | All | ACTIVE | 330 LOC, G0-G12 gate definitions. Without this: no standardized pass/fail criteria for reports |
| accuracy-calculation | Skill | Validate | ACTIVE | 339 LOC, 99.5% threshold math. Without this: accuracy claims have no mathematical basis |
| awslabs-cost-explorer v0.0.21 | MCP | Collect | ACTIVE | CLI is primary; MCP supplements interactive queries. Without this: no programmatic access to Cost Explorer data |
| azure-cost-management (beta) | MCP | Collect | ACTIVE | RBAC-scoped, not SSOT; portal CSV is SSOT. Without this: no programmatic access to Azure cost data |
| vizro-analytics v0.1.4 | MCP | Report | ACTIVE | Chart generation from FOCUS data. Without this: manual chart creation for every report |
| validate-finops-params.sh | Hook | Collect | ADVISORY | Phase 1: warns on missing billing period, does not block. Without this: commands run with wrong billing period silently |
| validate-aws-profile-routing.sh | Hook | Collect | BLOCKING | Prevents profile semantic mismatch. Without this: wrong AWS profile produces silent zero-results |
| detect-hardcoded-env-data.sh | Hook | All | BLOCKING | Blocks account IDs in product docs. Without this: account IDs leak into public documentation |
| enforce-coordination.sh | Hook | All | BLOCKING | PO+CA required before FinOps operations. Without this: agents bypass PO+CA review on cost operations |
| coding-discipline.md (Karpathy) | Rule | All | WIRED | Rules-layer (loaded every session), not in skill frontmatter. Without this: over-engineering and drive-by refactoring go unchecked |
| cloud-architect | Agent | All | ACTIVE | Technical design + risk assessment for all commands. Without this: no technical risk assessment before cost operations |
| product-owner | Agent | All | ACTIVE | Requirements validation + accuracy targets. Without this: no business value validation on cost reports |
| qa-engineer | Agent | Report | ACTIVE | CxO report scoring (Phase 9/11). Without this: CxO reports shipped without quality scoring |
| finops-engineer | Agent | | GAP | Referenced in docs but NOT in command requires_agents. Impact: doc references a non-wired agent, so CxO reads misleading ownership |

Agent Team

| Agent | Role in This Path | Phase/Stage | Talent Bench |
|---|---|---|---|
| finops-engineer | Cost analysis + spend optimization recommendations | Collect/Analyze/Optimize | Profile |
| python-engineer | CLI execution engine for all runbooks finops commands | All phases | Profile |
| cloud-architect | Cost architecture review + multi-cloud strategy | Report/Optimize | Profile |
| product-owner | Budget priorities + business case validation | Optimize/Target | Profile |
| observability-engineer | Cost monitoring dashboards + DORA FinOps metrics | Track/Monitor | Profile |

7 Skills Coverage

| Skill | Coverage in This Path | Implementation |
|---|---|---|
| S1 System Design | 5-phase FinOps cycle (Collect→Validate→Analyze→Report→Optimize), FOCUS 1.2+ data standard | Pipeline architecture, phase gating, data harmonization |
| S2 Tool Design | runbooks CLI typed inputs + FOCUS 1.2+ schema enforcement + Cost Explorer API contract | Tool integration, parameter validation, error specificity |
| S3 Retrieval | AWS Cost Explorer + Azure Cost Management APIs + FOCUS-normalized CSV | Multi-cloud cost retrieval, API pagination (Rule 6), org-wide discovery patterns |
| S4 Reliability | API pagination (Cost Explorer >100 items per page) + retry config + query timeout enforcement | Pagination loops, exponential backoff, SLO enforcement per operational-efficiency.md Rule 6 |
| S5 Security | READONLY profiles only ($AWS_BILLING_PROFILE, $AZURE_SUBSCRIPTION_ID read-only) | Access control, no mutations, environment-based authorization |
| S6 Evaluation | 4-way cost cross-validation (Cost Explorer API vs FOCUS CSV vs runbooks CLI vs Azure portal) | Independent data sources, ≥99.5% accuracy threshold, discrepancy investigation |
| S7 Product Thinking | CFO-focused cost reports (spend trend, savings recommendations, budget vs actual) + CTO-focused optimization (unit economics, cloud waste, commitment utilization) | Persona-specific framing, business impact in $, action owners + timelines |
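
The pagination loop and exponential backoff named under S4 could look like the sketch below. It is not the runbooks implementation: `collect_all_pages` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to wrap a Cost Explorer call that returns a dict with `ResultsByTime` and, when more data remains, `NextPageToken`, passing the token to the API only when it is set.

```python
import time
from typing import Callable, Optional


def collect_all_pages(fetch_page: Callable[[Optional[str]], dict],
                      max_retries: int = 3,
                      base_delay: float = 0.5) -> list:
    """Follow NextPageToken until exhausted, retrying each page with backoff."""
    results: list = []
    token: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(token)
                break
            except Exception:
                # Sketch only: real code should catch throttling errors specifically.
                if attempt == max_retries:
                    raise
                time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
        results.extend(page.get("ResultsByTime", []))
        token = page.get("NextPageToken")
        if not token:
            return results
```

Injecting `fetch_page` keeps the loop testable without cloud credentials: a fake that returns canned pages exercises both the token handling and the retry path.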

Last Updated: April 2026 | Status: Active | Maintenance: cloud-architect + product-owner