| Alert | Trigger | Time to re-trigger |
|---|---|---|
| Host down | Metrics cannot be retrieved from the host | 15 minutes |
| High CPU usage | CPU usage rises above 85% | 2 hours |
| Low memory | Available machine memory falls below 15% | 2 hours |
| Disk space low | Available machine storage falls below 10% | 2 hours |
| High disk I/O latency | Machine average disk I/O exceeds 70% | 30 minutes |
| Network error | Machine average network error count exceeds 0 (any value above 0 counts as an error) | 2 hours |

[PRD] ZWE MoH - VMMC

| Alert | Trigger | Time to re-trigger |
|---|---|---|
| Host down | Metrics cannot be retrieved from the host | 15 minutes |
| High CPU usage | CPU usage rises above 85% | 2 hours |
| Low memory | Available machine memory falls below 15% | 2 hours |
| Disk space low | Available machine storage falls below 10% | 2 hours |
| High disk I/O latency | Machine average disk I/O exceeds 70% | 30 minutes |
| Network error | Machine average network error count exceeds 0 (any value above 0 counts as an error) | 2 hours |

[PRD] ZWE MoH - Monitoring

| Alert | Trigger | Time to re-trigger |
|---|---|---|
| Host down | Metrics cannot be retrieved from the host | 15 minutes |
| High CPU usage | CPU usage rises above 85% | 2 hours |
| Low memory | Available machine memory falls below 15% | 2 hours |
| Disk space low | Available machine storage falls below 10% | 2 hours |
| High disk I/O latency | Machine average disk I/O exceeds 70% | 30 minutes |
| Network error | Machine average network error count exceeds 0 (any value above 0 counts as an error) | 2 hours |
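
The interaction between a trigger threshold and its re-trigger time can be sketched as follows. This is a minimal illustration, not the configuration of the monitoring tool itself: the metric keys (`cpu_used_pct`, `mem_free_pct`, and so on), the `AlertRule` and `AlertState` names, and the choice of 2 hours for the network error interval are all assumptions made for the example. It also assumes that the re-trigger timer is not reset when a condition clears.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class AlertRule:
    name: str
    breached: Callable[[float], bool]  # True when the value crosses the threshold
    retrigger: timedelta               # minimum gap between repeated notifications


# Thresholds and intervals follow the alert tables; metric keys are hypothetical.
HOST_DOWN = AlertRule("Host down", lambda _: True, timedelta(minutes=15))
RULES: Dict[str, AlertRule] = {
    "cpu_used_pct": AlertRule("High CPU usage", lambda v: v > 85, timedelta(hours=2)),
    "mem_free_pct": AlertRule("Low memory", lambda v: v < 15, timedelta(hours=2)),
    "disk_free_pct": AlertRule("Disk space low", lambda v: v < 10, timedelta(hours=2)),
    "disk_io_pct": AlertRule("High disk I/O latency", lambda v: v > 70, timedelta(minutes=30)),
    # Interval assumed to be 2 hours for this sketch.
    "net_errors": AlertRule("Network error", lambda v: v > 0, timedelta(hours=2)),
}


class AlertState:
    """Remembers when each alert last fired so repeats are suppressed."""

    def __init__(self) -> None:
        self._last_fired: Dict[str, datetime] = {}

    def _due(self, key: str, rule: AlertRule, now: datetime) -> bool:
        last = self._last_fired.get(key)
        if last is None or now - last >= rule.retrigger:
            self._last_fired[key] = now
            return True
        return False

    def evaluate(self, metrics: Optional[Dict[str, float]], now: datetime) -> List[str]:
        # metrics=None models "metrics cannot be retrieved from the host".
        if metrics is None:
            return [HOST_DOWN.name] if self._due("host_down", HOST_DOWN, now) else []
        fired = []
        for key, rule in RULES.items():
            if key in metrics and rule.breached(metrics[key]) and self._due(key, rule, now):
                fired.append(rule.name)
        return fired
```

Calling `evaluate` with a CPU reading of 90% fires "High CPU usage" once, stays silent for readings taken within the next 2 hours, and fires again once 2 hours have passed while the condition still holds.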