The incident lasted 0.5 hours from discovery to remediation.
No customers were impacted.
The instance was re-started.
Implemented strict monitoring for the impacted services, alerting the DevOps & Development teams immediately, with automatic re-start of the impacted service,
In addition, before triggering any background jobs, a verification is done to test if the service is running, and pause the process until the service has re-started.