Principle V: Observability & Resilience

Source: .specify/memory/constitution.md

Overview

Enterprise agents must be continuously observed and managed to maintain reliability, performance, and operational excellence. Agents are adaptive, non-deterministic systems that require fundamentally different observability than traditional applications.

The paradigm shifts from "is it up?" to "is it right?": incorrect, biased, or hallucinated outputs pose operational and security risks even when the system is technically performant.

Non-Negotiable Rules

| Rule | Description |
| --- | --- |
| MELT Telemetry | Complete coverage of Metrics, Events, Logs, and Traces |
| Agent Observability | Reasoning traces, tool calls, token usage, and cost tracking |
| Real-time Monitoring | Quality, safety, and operations metrics dashboards |
| Drift Detection | Anomaly identification with automated alerting |
| SLOs + Error Budgets | Defined service level objectives with incident runbooks |
| Root Cause Analysis | Correlate failures to prompts, tools, and models |

MELT Framework

PDCA Cycle Tracking

The enforce-pdca-cycle.sh hook tracks Plan-Do-Check-Act (PDCA) iterations per specialist agent per session. This prevents infinite retry loops and ensures escalation to a human in the loop (HITL) when an agent cannot converge.

| Property | Value |
| --- | --- |
| Default cycle limit | 7 (ADLC_MAX_PDCA_CYCLES) |
| State file | tmp/<project>/pdca-cycles/<agent>-YYYY-MM-DD.json |
| Escalation | HITL warning via stderr at cycle limit |
| Scope | Per specialist, per day (coordination agents excluded) |
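
The cycle tracking described above can be sketched in Python to make the mechanics concrete. This is not the contents of enforce-pdca-cycle.sh. It is a minimal illustration under stated assumptions: the function name `record_pdca_cycle` and the state file schema (`{"agent": ..., "cycles": ...}`) are hypothetical. Only the `ADLC_MAX_PDCA_CYCLES` variable, its default of 7, the dated file name pattern, and the stderr warning come from the table.

```python
import json
import os
import sys
from datetime import date
from pathlib import Path
from typing import Optional


def record_pdca_cycle(state_dir: Path, agent: str, today: Optional[date] = None) -> int:
    """Increment the PDCA cycle count for `agent` for the given day and return it."""
    # The limit comes from the environment, defaulting to 7 as in the table above.
    limit = int(os.environ.get("ADLC_MAX_PDCA_CYCLES", "7"))
    today = today or date.today()

    # File name follows the <agent>-YYYY-MM-DD.json pattern.
    state_file = state_dir / f"{agent}-{today.isoformat()}.json"

    # Hypothetical state schema; the real hook may store more fields.
    state = {"agent": agent, "cycles": 0}
    if state_file.exists():
        state = json.loads(state_file.read_text())

    state["cycles"] += 1
    state_dir.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(state))

    # At or beyond the limit, warn on stderr so a human can step in.
    if state["cycles"] >= limit:
        print(
            f"HITL escalation: {agent} reached {state['cycles']}/{limit} PDCA cycles",
            file=sys.stderr,
        )
    return state["cycles"]
```

Because the state file name includes the date, the count resets naturally each day, which matches the "per specialist, per day" scope.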

Evidence Audit Trail

The log-coordination-wrapper.sh hook automatically logs every agent completion (both coordination and specialist) to structured JSON files. This provides a complete audit trail of every agent invocation.

| Agent Class | Agreement Score | Log Path |
| --- | --- | --- |
| Coordination (PO, CA) | 97% (design-level) | coordination-logs/<agent>-YYYY-MM-DD.json |
| Specialist | 100% (binary done/not) | coordination-logs/<agent>-YYYY-MM-DD.json |
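
To illustrate what appending to such a log might look like, here is a minimal Python sketch. It is not the actual log-coordination-wrapper.sh implementation. The function name `log_completion`, the entry fields (`timestamp`, `agent`, `agent_class`, `summary`), the choice of a JSON array per file, and the use of UTC dates are all assumptions made for the example. Only the `<agent>-YYYY-MM-DD.json` file name pattern comes from the table.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_completion(log_dir: Path, agent: str, agent_class: str, summary: str) -> Path:
    """Append one completion record to the agent's daily JSON log and return its path."""
    now = datetime.now(timezone.utc)  # assumption: dates are taken in UTC
    log_file = log_dir / f"{agent}-{now.date().isoformat()}.json"

    # Assumed layout: one JSON array of entries per agent per day.
    entries = json.loads(log_file.read_text()) if log_file.exists() else []
    entries.append(
        {
            "timestamp": now.isoformat(),
            "agent": agent,
            "agent_class": agent_class,  # e.g. "coordination" or "specialist"
            "summary": summary,
        }
    )

    log_dir.mkdir(parents=True, exist_ok=True)
    log_file.write_text(json.dumps(entries, indent=2))
    return log_file
```

Keeping one file per agent per day means a single day's activity for an agent can be reviewed without scanning unrelated entries.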

See Hook Enforcement Reference for the complete log schema and PDCA state machine details.

Reference