Executive summary
A prolonged gap in production image deployments meant that our ECR lifecycle policy (seven-day retention) expired container images that active ECS task definitions still referenced. When ECS later tried to replace tasks during routine churn or to scale services, the new tasks failed to start because the image tags or digests they required were no longer available in ECR. The result was service disruption and customer-facing downtime.
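The failure mode above can be illustrated with a minimal sketch of a pre-expiry check. It is not the tooling used during the incident. The `EcrImage` type, the `images_at_risk` helper, the image names, and the dates are all hypothetical, and the sketch assumes the lifecycle rule expires images purely by age since push; in practice the inputs would come from the ECR and ECS APIs rather than in-memory lists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EcrImage:
    """Hypothetical, simplified view of an image stored in ECR."""
    reference: str       # tag or digest, e.g. "api:1.4.2" (illustrative)
    pushed_at: datetime  # time the image was pushed


def images_at_risk(images, referenced, now, retention=timedelta(days=7)):
    """Return references that task definitions still use but whose image
    is at or past the retention window (assumed to be age since push)."""
    cutoff = now - retention
    return sorted(
        img.reference
        for img in images
        if img.reference in referenced and img.pushed_at <= cutoff
    )


# Illustrative data only: the date and image names are made up.
now = datetime(2024, 1, 20, tzinfo=timezone.utc)
images = [
    EcrImage("api:1.4.2", now - timedelta(days=12)),    # old, still in use
    EcrImage("worker:2.0.0", now - timedelta(days=2)),  # recent, in use
    EcrImage("api:1.3.0", now - timedelta(days=30)),    # old, unused
]
referenced = {"api:1.4.2", "worker:2.0.0"}  # from active task definitions

at_risk = images_at_risk(images, referenced, now)
print(at_risk)  # ['api:1.4.2']
```

Only `api:1.4.2` is flagged: it is both past the seven-day window and still referenced. An old but unreferenced image such as `api:1.3.0` is safe to expire, which is the distinction an age-only retention rule cannot make.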
Detection was delayed because third-party monit...