Alerts

This page lists the alerts configured in Prometheus, the servers they apply to, and how long each alert waits before re-triggering.

ZWE MoH - Production Virtual Servers

Alert	Trigger	Severity	Time to re-trigger
Host down	Metrics cannot be retrieved from the host	Critical	15 minutes
High CPU usage	CPU usage rises above 85%	Medium	15 minutes
Low memory	Available machine memory falls below 15%	Medium	15 minutes
Disk space low	Available machine storage falls below 10%	Medium	1 hour
High disk I/O latency	Machine average disk input/output exceeds 70%	Medium	30 minutes
Network error	Machine average network errors exceed 0 (any value above 0 counts as an error)	Medium	2 hours
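
The "Time to re-trigger" column can be illustrated with a small sketch. It assumes the column means the minimum interval between repeated notifications while the alert condition still holds; that reading, and the names AlertRule, RULES and should_notify, are hypothetical and not taken from the actual Prometheus or Alertmanager configuration. Severities and intervals are copied from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    """One row of the alert table (hypothetical structure)."""
    name: str
    severity: str
    retrigger: timedelta  # the "Time to re-trigger" column


# Values copied from the table; the dictionary itself is illustrative only.
RULES = {
    "Host down": AlertRule("Host down", "Critical", timedelta(minutes=15)),
    "High CPU usage": AlertRule("High CPU usage", "Medium", timedelta(minutes=15)),
    "Low memory": AlertRule("Low memory", "Medium", timedelta(minutes=15)),
    "Disk space low": AlertRule("Disk space low", "Medium", timedelta(hours=1)),
    "High disk I/O latency": AlertRule("High disk I/O latency", "Medium", timedelta(minutes=30)),
    "Network error": AlertRule("Network error", "Medium", timedelta(hours=2)),
}


def should_notify(rule: AlertRule, condition_met: bool,
                  last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True if a notification should go out at `now`.

    Assumed semantics: notify immediately on the first occurrence, then
    again only once the re-trigger interval has elapsed.
    """
    if not condition_met:
        return False
    if last_sent is None:
        return True
    return now - last_sent >= rule.retrigger
```

Under this reading, a High CPU usage alert that was last sent at 12:00 and is still firing would be sent again at 12:15, but not at 12:10.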

Container	Trigger	Severity	Time to re-trigger