Cost Optimization Tier Classification
Extracted from CloudOps-Runbooks cost optimization work. The specific dollar figures are project-dependent — the METHODOLOGY (tier classification + prioritisation matrix) is reusable.
Purpose
When facing dozens of potential cost optimisation scenarios, you need a systematic way to prioritise. This tier classification groups scenarios by implementation effort and risk, and requires each one to be validated through cross-checking before resources are committed.
Three-Tier Classification
Tier 1: Quick Wins
| Attribute | Characteristic |
|---|---|
| Implementation time | 1–2 days per scenario |
| Complexity | Low — single-service, single-account |
| Risk | Minimal — unused/orphaned resources |
| Validation | Automated discovery + manual confirmation |
| Examples | Unused NAT Gateways, unattached Elastic IPs, orphaned EBS volumes, idle load balancers |
Prioritisation criteria: Start with resources that have zero utilisation (easiest to justify), then move to low-utilisation resources.
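The Tier 1 triage rule above can be sketched as a small sort: zero-utilisation resources first, then low-utilisation ones. The `Candidate` type, the `triage_quick_wins` name, and the 5% "low utilisation" threshold are illustrative assumptions, not part of the methodology.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    resource_id: str
    resource_type: str
    utilisation_pct: float  # e.g. a 30-day average from monitoring

def triage_quick_wins(candidates, low_threshold=5.0):
    """Order Tier 1 candidates: zero utilisation first (easiest to
    justify), then low utilisation in ascending order."""
    zero = [c for c in candidates if c.utilisation_pct == 0.0]
    low = [c for c in candidates if 0.0 < c.utilisation_pct <= low_threshold]
    return (sorted(zero, key=lambda c: c.resource_id)
            + sorted(low, key=lambda c: c.utilisation_pct))
```

The discovery scan feeds this list; anything above the threshold falls out of Tier 1 entirely.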
Tier 2: Strategic Optimisations
| Attribute | Characteristic |
|---|---|
| Implementation time | 1–2 weeks per scenario |
| Complexity | Medium — requires capacity planning or commitment decisions |
| Risk | Moderate — requires workload analysis before commitment |
| Validation | Historical usage analysis + forecasting |
| Examples | Reserved Instance purchases, Savings Plans, right-sizing, container optimisation |
Prioritisation criteria: Highest utilisation-to-commitment ratio first. Validate usage patterns across 30–90 days before committing.
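The "highest utilisation-to-commitment ratio first" rule can be expressed as a ranking over observed usage. The function names and the tuple layout are assumptions for illustration; the 30–90 day usage figures come from the historical analysis the tier requires.

```python
def commitment_ratio(avg_usage_hours: float, committed_hours: float) -> float:
    """Fraction of a proposed commitment that observed usage would
    actually consume, capped at 1.0 (you can't over-use a commitment)."""
    if committed_hours <= 0:
        raise ValueError("commitment must be positive")
    return min(avg_usage_hours / committed_hours, 1.0)

def rank_scenarios(scenarios):
    """scenarios: list of (name, avg_usage_hours, committed_hours).
    Highest ratio first: those commitments are safest to make."""
    return sorted(scenarios,
                  key=lambda s: commitment_ratio(s[1], s[2]),
                  reverse=True)
```

A scenario with a ratio near 1.0 is nearly risk-free; one at 0.4 means 60% of the commitment would sit idle if usage holds steady.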
Tier 3: Transformational Changes
| Attribute | Characteristic |
|---|---|
| Implementation time | 2–8 weeks per scenario |
| Complexity | High — multi-account, cross-service, requires architecture changes |
| Risk | Significant — involves service migrations or architectural redesign |
| Validation | Proof-of-concept + staged rollout |
| Examples | Multi-account governance, enterprise RI portfolio strategy, workload migration, architecture modernisation |
Prioritisation criteria: Business case approval required. Proof-of-concept before full implementation.
Prioritisation Matrix
For each identified scenario, score on two axes:
| Axis | Low (1) | Medium (2) | High (3) |
|---|---|---|---|
| Impact | Marginal savings | Meaningful savings | Transformational savings |
| Effort | Days | Weeks | Months |
Then prioritise:
| Priority | Criteria | Action |
|---|---|---|
| P1 | High impact, Low effort | Execute immediately (Tier 1) |
| P2 | High impact, Medium effort | Plan and schedule (Tier 2) |
| P3 | Medium impact, Low effort | Batch with other quick wins |
| P4 | High impact, High effort | Business case required (Tier 3) |
| P5 | Low impact, any effort | Backlog — revisit quarterly |
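The matrix above maps directly to a lookup table. Note the matrix only defines five cells explicitly; the sketch below returns `"unclassified"` for combinations it leaves open (e.g. Medium impact with Medium effort), on the assumption that those fall to human judgment.

```python
# Scores: 1 = Low, 2 = Medium, 3 = High, per the scoring axes above.
PRIORITY_RULES = {
    (3, 1): "P1",  # High impact, Low effort    -> execute immediately
    (3, 2): "P2",  # High impact, Medium effort -> plan and schedule
    (2, 1): "P3",  # Medium impact, Low effort  -> batch with quick wins
    (3, 3): "P4",  # High impact, High effort   -> business case required
}

def priority(impact: int, effort: int) -> str:
    """Map (impact, effort) scores to a priority per the matrix."""
    if impact == 1:
        return "P5"  # Low impact, any effort -> backlog, revisit quarterly
    return PRIORITY_RULES.get((impact, effort), "unclassified")
```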
Validation Framework
Before executing any optimisation:
For Tier 1 (Quick Wins)
- Discovery: Automated scan identifies candidate resources
- Confirmation: Validate resource is truly unused (check CloudTrail, flow logs, access patterns)
- Safety: Tag for decommission, wait N days (scream test), then remove
- Evidence: Log before/after cost in evidence directory
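The scream-test safety step reduces to a date gate: a tagged resource may only be removed once the full waiting period has elapsed with no complaints. `safe_to_remove` and its parameters are illustrative; the wait period stays whatever N your project chooses.

```python
from datetime import date, timedelta

def safe_to_remove(tagged_on: date, today: date, wait_days: int) -> bool:
    """Scream-test gate: True only once the decommission-tagged resource
    has sat untouched for the full waiting period."""
    return today >= tagged_on + timedelta(days=wait_days)
```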
For Tier 2 (Strategic)
- Usage analysis: 30–90 day utilisation data
- Forecast: Project future usage based on business growth plans
- Commitment modelling: Compare on-demand vs reserved vs savings plan costs
- Approval: Finance/procurement review for commitment purchases
- Evidence: Model assumptions + actual vs projected tracking
For Tier 3 (Transformational)
- Business case: ROI model with conservative estimates
- Proof of concept: Validate approach on non-production workload
- Staged rollout: One account/service at a time
- Executive approval: Board-level for significant commitments
- Evidence: Stage gate reviews at each milestone
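A minimal sketch of the "ROI model with conservative estimates" step: discount projected savings by a haircut before computing break-even. The 50% default haircut is an assumption, not a prescribed value.

```python
def payback_months(implementation_cost, projected_monthly_savings,
                   haircut=0.5):
    """Conservative payback period: apply a haircut to projected
    savings before computing how many months to break even."""
    conservative = projected_monthly_savings * (1 - haircut)
    if conservative <= 0:
        return float("inf")
    return implementation_cost / conservative
```

If the payback period is unattractive even at half the projected savings, the scenario likely does not clear the business-case gate.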
Implementation Phases
| Phase | Duration | Focus | Gate |
|---|---|---|---|
| Foundation | Weeks 1–2 | Tooling, discovery, baseline | Baseline established |
| Quick Wins | Weeks 3–6 | Tier 1 scenarios | Validated savings |
| Strategic | Weeks 7–14 | Tier 2 scenarios | Commitment decisions approved |
| Transformational | Weeks 15–30 | Tier 3 scenarios | Architecture changes validated |
Anti-Patterns
| Anti-Pattern | Why It Fails |
|---|---|
| Projecting total savings across all scenarios simultaneously | Creates unrealistic expectations; scenarios have dependencies |
| Committing to RIs without usage analysis | Over-commitment if workloads change |
| Skipping scream test for "obviously unused" resources | Resources that look unused may have infrequent but critical access patterns |
| Reporting projected savings as achieved savings | Projection ≠ realisation; only report what's been measured post-implementation |
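The last anti-pattern (projection ≠ realisation) suggests keeping the two numbers in separate fields. A minimal sketch, with hypothetical names: realised savings are measured against the pre-change baseline, month by month, and reported alongside (never merged with) the original projection.

```python
def realised_savings(baseline_monthly_cost, actual_monthly_cost):
    """Only measured, post-implementation differences count as savings."""
    return baseline_monthly_cost - actual_monthly_cost

def savings_report(projected_monthly, baseline_monthly, actual_costs):
    """actual_costs: per-month costs observed after the change.
    Keeps the projection clearly separate from realised figures."""
    realised = [realised_savings(baseline_monthly, a) for a in actual_costs]
    return {"projected_monthly": projected_monthly,
            "realised_monthly": realised,
            "realised_total": sum(realised)}
```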
Origin: CloudOps-Runbooks 28-scenario cost optimisation classification. Dollar figures removed — methodology is reusable across consumer projects with their own validated numbers.