Skip to main content

aws.incident-triage

Type: commands | Track: Enterprise

Triage AWS incidents — Health events, Security Hub findings, and Cost anomalies — with structured investigation, decommission feasibility scoring, and evidence collection.

Benefits

  1. Structured incident classification routes each incident type to the correct specialist skill (EC2 health → aws-health-event-triage, cost anomaly → cost-explorer-analysis)
  2. Org-wide discovery first — uses Config Aggregator before per-account search, preventing NARROW_SEARCH_SCOPE anti-pattern
  3. Decommission feasibility scoring (S1-S5 scream test) before any maintenance preparation
  4. Complete evidence package with HITL summary and change management requirements

When to Use

AttributeDetail
PersonaSRE / CloudOps Engineer
TriggerAWS Health event received, Security Hub finding escalated, or a Cost anomaly alert fires — any unplanned event requiring structured investigation with evidence before raising a change request
Business ValueStructured triage with evidence collection — prevents REBOOT_FIRST_DECOMMISSION_SECOND (treating maintenance as the fix when decommission is correct) and TECHNICAL_WITHOUT_PROCESS (specifying what to do without how to get approval)
FrequencyOn-demand

Example: As an SRE, I need to triage an AWS Health event for an EC2 instance scheduled for host maintenance because the MSP cannot modify environments without a change order, and I need to determine whether to prepare for reboot or recommend decommission instead. I run /aws:incident-triage which classifies the incident, runs org-wide discovery to correlate the affected resource with its VPC/account context, scores decommission feasibility, and produces a HITL summary with the recommended action and change management steps.

Enterprise-only. Contact sales for licensing details.