Alerts

    This page lists the alerts configured in Prometheus, the servers they apply to, and how long each alert waits before re-triggering.

    ZWE MoH - Production Virtual Servers

    Alert                  Trigger                                                          Severity  Time to re-trigger
    Host down              Metrics cannot be retrieved from the host                        Critical  15 minutes
    High CPU usage         CPU usage rises above 85%                                        Medium    15 minutes
    Low memory             Available memory falls below 15%                                 Medium    15 minutes
    Disk space low         Free disk space falls below 10%                                  Medium    1 hour
    High disk I/O latency  Average disk I/O exceeds 70%                                     Medium    30 minutes
    Network error          Average network errors exceed 0 (any value above 0 is an error)  Medium    2 hours
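
The host-level thresholds above can be written down as data and checked against a sample reading. The following is only a minimal sketch in Python, not the Prometheus rule configuration itself, which this page does not show. The metric keys (`host_reachable`, `cpu_usage_pct`, and so on) and the `firing_alerts` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AlertRule:
    name: str
    metric: str                      # hypothetical metric key, not a real Prometheus metric name
    fires: Callable[[float], bool]   # condition taken from the "Trigger" column
    severity: str
    retrigger_minutes: int


# Thresholds copied from the table; metric keys are assumptions.
RULES = [
    AlertRule("Host down", "host_reachable", lambda v: v == 0, "Critical", 15),
    AlertRule("High CPU usage", "cpu_usage_pct", lambda v: v > 85, "Medium", 15),
    AlertRule("Low memory", "memory_available_pct", lambda v: v < 15, "Medium", 15),
    AlertRule("Disk space low", "disk_free_pct", lambda v: v < 10, "Medium", 60),
    AlertRule("High disk I/O latency", "disk_io_pct", lambda v: v > 70, "Medium", 30),
    AlertRule("Network error", "network_errors", lambda v: v > 0, "Medium", 120),
]


def firing_alerts(sample: Dict[str, float]) -> List[str]:
    """Return the names of rules whose condition holds; metrics missing from the sample are skipped."""
    return [r.name for r in RULES if r.metric in sample and r.fires(sample[r.metric])]


sample = {
    "host_reachable": 1,
    "cpu_usage_pct": 91.0,
    "memory_available_pct": 40.0,
    "disk_free_pct": 8.5,
    "disk_io_pct": 12.0,
    "network_errors": 0,
}
print(firing_alerts(sample))  # ['High CPU usage', 'Disk space low']
```

Keeping the thresholds in one list makes it easy to compare them with the table when either one changes.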

    VMMC MoH - Production Virtual Servers

    Container        Trigger         Severity  Time to re-trigger
    Tomcat: DWS/WFA  Container down  Critical  5 minutes
    Mongo            Container down  Critical  5 minutes
    Kafka            Container down  Critical  5 minutes
    ZooKeeper        Container down  Critical  5 minutes
    Node.js          Container down  Critical  5 minutes

    e-Learning MoH - Production Virtual Servers

    Container           Trigger         Severity  Time to re-trigger
    Postgres/Moodle     Container down  Critical  5 minutes
    Moodle              Container down  Critical  5 minutes
    Postgres/Warehouse  Container down  Critical  5 minutes
    NiFi                Container down  Critical  5 minutes
    Superset            Container down  Critical  5 minutes
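
One way to read the "Time to re-trigger" column is as the minimum interval between repeated notifications while an alert keeps firing. That reading is an assumption; the page does not say how the interval is configured. It resembles the `repeat_interval` route setting in Alertmanager. The sketch below illustrates the idea in Python for the 5-minute container alerts; `should_notify` is a hypothetical helper, not part of Prometheus or Alertmanager.

```python
from datetime import datetime, timedelta
from typing import Optional

# 5 minutes, as listed for the container-down alerts (interpretation assumed).
RETRIGGER = timedelta(minutes=5)


def should_notify(last_sent: Optional[datetime], now: datetime,
                  retrigger: timedelta = RETRIGGER) -> bool:
    """Notify if nothing was sent yet, or if the re-trigger interval has elapsed."""
    return last_sent is None or now - last_sent >= retrigger


t0 = datetime(2024, 1, 1, 12, 0)
print(should_notify(None, t0))                        # True: first notification
print(should_notify(t0, t0 + timedelta(minutes=3)))   # False: still inside the interval
print(should_notify(t0, t0 + timedelta(minutes=5)))   # True: interval has elapsed
```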

     
