Cost Optimisation Tier Classification

Extracted from CloudOps-Runbooks cost optimisation work. The specific dollar figures are project-dependent — the METHODOLOGY (tier classification + prioritisation matrix) is reusable.

Purpose

When facing dozens of potential cost optimisation scenarios, you need a systematic way to prioritise. This tier classification separates scenarios by implementation effort, complexity, and risk, and validates them through cross-checking before committing resources.

Three-Tier Classification

Tier 1: Quick Wins

| Attribute | Characteristic |
| --- | --- |
| Implementation time | 1–2 days per scenario |
| Complexity | Low — single-service, single-account |
| Risk | Minimal — unused/orphaned resources |
| Validation | Automated discovery + manual confirmation |
| Examples | Unused NAT Gateways, unattached Elastic IPs, orphaned EBS volumes, idle load balancers |

Prioritisation criteria: Start with resources that have zero utilisation (easiest to justify), then move to low-utilisation resources.
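
As an illustration of the automated-discovery half of that validation, the sketch below uses boto3 (assumed available, with AWS credentials configured; the region is a placeholder) to flag Elastic IPs with no current association, one of the zero-utilisation candidates listed above.

```python
import boto3

# Assumes credentials are configured; the region is a placeholder.
ec2 = boto3.client("ec2", region_name="eu-west-1")

# describe_addresses lists every Elastic IP in the region; an entry without
# an AssociationId is allocated but attached to nothing, and still billable.
unattached = [
    addr for addr in ec2.describe_addresses()["Addresses"]
    if "AssociationId" not in addr
]

for addr in unattached:
    print(f"Release candidate: {addr['PublicIp']} ({addr['AllocationId']})")
```

Discovery output is only a candidate list; the manual confirmation step in the Validation Framework below still applies before anything is released.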

Tier 2: Strategic Optimisations

| Attribute | Characteristic |
| --- | --- |
| Implementation time | 1–2 weeks per scenario |
| Complexity | Medium — requires capacity planning or commitment decisions |
| Risk | Moderate — requires workload analysis before commitment |
| Validation | Historical usage analysis + forecasting |
| Examples | Reserved Instance purchases, Savings Plans, right-sizing, container optimisation |

Prioritisation criteria: Highest utilisation-to-commitment ratio first. Validate usage patterns across 30–90 days before committing.
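
A minimal sketch of that usage validation, assuming boto3 with Cost Explorer access; the service filter, 90-day window, and 0.15 volatility cutoff are all illustrative choices, not fixed rules:

```python
import statistics
from datetime import date, timedelta

import boto3

ce = boto3.client("ce", region_name="us-east-1")  # Cost Explorer endpoint

end = date.today()
start = end - timedelta(days=90)

# Daily unblended EC2 spend over the lookback window (a single results page
# is assumed here; longer windows may paginate via NextPageToken).
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    Filter={
        "Dimensions": {
            "Key": "SERVICE",
            "Values": ["Amazon Elastic Compute Cloud - Compute"],
        }
    },
)
daily = [float(r["Total"]["UnblendedCost"]["Amount"]) for r in resp["ResultsByTime"]]

# A low coefficient of variation suggests spend steady enough to commit
# against; assumes non-zero average spend over the window.
cv = statistics.stdev(daily) / statistics.mean(daily)
print(f"90-day mean ${statistics.mean(daily):.2f}/day, CV {cv:.2f}")
print("Commitment candidate" if cv < 0.15 else "Too volatile, stay on-demand")
```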

Tier 3: Transformational Changes

| Attribute | Characteristic |
| --- | --- |
| Implementation time | 2–8 weeks per scenario |
| Complexity | High — multi-account, cross-service, requires architecture changes |
| Risk | Significant — involves service migrations or architectural redesign |
| Validation | Proof-of-concept + staged rollout |
| Examples | Multi-account governance, enterprise RI portfolio strategy, workload migration, architecture modernisation |

Prioritisation criteria: Business-case approval is required, and a proof of concept must precede full implementation.

Prioritisation Matrix

For each identified scenario, score on two axes:

| Axis | Low (1) | Medium (2) | High (3) |
| --- | --- | --- | --- |
| Impact | Marginal savings | Meaningful savings | Transformational savings |
| Effort | Days | Weeks | Months |

Then prioritise:

| Priority | Criteria | Action |
| --- | --- | --- |
| P1 | High impact, low effort | Execute immediately (Tier 1) |
| P2 | High impact, medium effort | Plan and schedule (Tier 2) |
| P3 | Medium impact, low effort | Batch with other quick wins |
| P4 | High impact, high effort | Business case required (Tier 3) |
| P5 | Low impact, any effort | Backlog — revisit quarterly |
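
The matrix is mechanical enough to encode. A small sketch; note that combinations the table leaves unspecified (medium impact with medium or high effort) default to the backlog here, which is an assumption:

```python
def priority(impact: int, effort: int) -> str:
    """Map 1-3 impact/effort scores to the P1-P5 buckets in the table above."""
    if impact == 1:
        return "P5"  # low impact, any effort: backlog
    if impact == 3:
        if effort == 1:
            return "P1"  # execute immediately (Tier 1)
        if effort == 2:
            return "P2"  # plan and schedule (Tier 2)
        return "P4"  # business case required (Tier 3)
    if effort == 1:
        return "P3"  # medium impact, low effort: batch with quick wins
    return "P5"  # combinations the matrix leaves unspecified: backlog (assumption)
```

Scoring every identified scenario this way (`priority(3, 1)` returns `"P1"`) yields a sortable worklist rather than an ad-hoc queue.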

Validation Framework

Before executing any optimisation:

For Tier 1 (Quick Wins)

  1. Discovery: Automated scan identifies candidate resources
  2. Confirmation: Validate resource is truly unused (check CloudTrail, flow logs, access patterns)
  3. Safety: Tag for decommission, wait N days (scream test), then remove (see the tagging sketch after this list)
  4. Evidence: Log before/after cost in evidence directory
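
Step 3 can be as simple as tagging the resource and letting a scheduled sweep act once the window expires. A sketch, assuming boto3; the volume ID is hypothetical and the 14-day window stands in for N:

```python
from datetime import date, timedelta

import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Tag the candidate instead of deleting it; the tag records when the
# scream-test window expires so a later sweep can act on it.
removal_date = (date.today() + timedelta(days=14)).isoformat()  # N = 14 here
ec2.create_tags(
    Resources=["vol-0123456789abcdef0"],  # hypothetical orphaned EBS volume
    Tags=[
        {"Key": "decommission", "Value": "pending"},
        {"Key": "decommission-after", "Value": removal_date},
    ],
)
```

A later sweep can then remove anything whose decommission-after date has passed and whose tag was never contested.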

For Tier 2 (Strategic)

  1. Usage analysis: 30–90 day utilisation data
  2. Forecast: Project future usage based on business growth plans
  3. Commitment modelling: Compare on-demand vs reserved vs savings plan costs (see the break-even sketch after this list)
  4. Approval: Finance/procurement review for commitment purchases
  5. Evidence: Model assumptions + actual vs projected tracking
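
The commitment-modelling step often reduces to a break-even utilisation check: below some fraction of running hours, on-demand stays cheaper. A worked sketch with hypothetical rates (substitute your own validated prices):

```python
def breakeven_utilisation(on_demand_hourly: float, committed_hourly: float) -> float:
    """Fraction of hours a workload must run for the commitment to break even."""
    return committed_hourly / on_demand_hourly

# Illustrative rates only, not real pricing.
on_demand = 0.192  # hypothetical on-demand $/hour
reserved = 0.121   # hypothetical 1-year no-upfront effective $/hour

threshold = breakeven_utilisation(on_demand, reserved)
print(f"Commit only if the instance runs more than {threshold:.0%} of hours")  # 63%
```

The same comparison extends to Savings Plans by substituting the effective committed rate.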

For Tier 3 (Transformational)

  1. Business case: ROI model with conservative estimates (a payback sketch follows this list)
  2. Proof of concept: Validate approach on non-production workload
  3. Staged rollout: One account/service at a time
  4. Executive approval: Board-level for significant commitments
  5. Evidence: Stage gate reviews at each milestone
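
For the business-case step, a payback calculation that haircuts projected savings is a reasonable conservative starting point. The figures and the 0.7 confidence factor below are illustrative assumptions, not project numbers:

```python
def payback_months(one_off_cost: float, monthly_saving: float,
                   confidence: float = 0.7) -> float:
    """Months to recover a one-off cost, discounting projected savings."""
    return one_off_cost / (monthly_saving * confidence)

# Hypothetical figures: $120k migration cost, $15k/month projected saving.
print(f"Payback: {payback_months(120_000, 15_000):.1f} months")  # ~11.4 months
```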

Implementation Phases

| Phase | Duration | Focus | Gate |
| --- | --- | --- | --- |
| Foundation | Weeks 1–2 | Tooling, discovery, baseline | Baseline established |
| Quick Wins | Weeks 3–6 | Tier 1 scenarios | Validated savings |
| Strategic | Weeks 7–14 | Tier 2 scenarios | Commitment decisions approved |
| Transformational | Weeks 15–30 | Tier 3 scenarios | Architecture changes validated |

Anti-Patterns

| Anti-Pattern | Why It Fails |
| --- | --- |
| Projecting total savings across all scenarios simultaneously | Creates unrealistic expectations; scenarios have dependencies |
| Committing to RIs without usage analysis | Over-commitment if workloads change |
| Skipping the scream test for "obviously unused" resources | Resources that look unused may have infrequent but critical access patterns |
| Reporting projected savings as achieved savings | Projection ≠ realisation; only report what has been measured post-implementation |

Origin: CloudOps-Runbooks 28-scenario cost optimisation classification. Dollar figures removed — the methodology is reusable across consuming projects, each with its own validated numbers.