...

ZWE MoH -

...

Production Virtual Servers

Alert

Trigger

Time to re-trigger

Host down

Metrics cannot be retrieved from the host

15 minutes

High CPU usage

CPU usage gets above 85%

2 hours

Low memory

Machine memory gets below 15%

2 hours

Alert

Trigger

Severity

Time to re-trigger

Host down

Metrics cannot be retrieved from the host

Critical

15 minutes

High CPU usage

CPU usage gets above 85%

2 hours

Medium

15 minutes

Low memory

Machine memory gets below 15%

2 hours

Disk space low

Machine storage gets below 10%

2 hours

High disk I/O latency

Machine average input/output (I/O) exceeds 70%

30 minutes

Network error

Machine average network error count exceeds 0 (any value above 0 is treated as an error)

2 hours

[PRD] ZWE MoH - eLearning

Medium

15 minutes

Disk space low

Machine storage gets below 10%

2 hours

Medium

1 hour

High disk I/O latency

Machine average input/output (I/O) exceeds 70%

Medium

30 minutes

Network error

Machine average network error count exceeds 0 (any value above 0 is treated as an error)

Medium

2 hours
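The triggers listed above can be read as a small set of threshold rules. The following is a minimal sketch of that reading, not the monitoring tool's actual configuration. The metric keys (`cpu_used_pct`, `mem_free_pct`, `disk_free_pct`, `disk_io_pct`, `net_errors`) are hypothetical names, and the sketch assumes that "memory" and "storage" refer to the free percentage and that "above" and "below" are strict comparisons.

```python
from typing import Optional

# Hypothetical metric keys; thresholds and severities follow the table above.
# Each rule: (alert name, metric key, breach predicate, severity).
RULES = [
    ("High CPU usage", "cpu_used_pct", lambda v: v > 85, "Medium"),
    ("Low memory", "mem_free_pct", lambda v: v < 15, "Medium"),
    ("Disk space low", "disk_free_pct", lambda v: v < 10, "Medium"),
    ("High disk I/O latency", "disk_io_pct", lambda v: v > 70, "Medium"),
    ("Network error", "net_errors", lambda v: v > 0, "Medium"),
]


def evaluate(metrics: Optional[dict]) -> list:
    """Return the (alert, severity) pairs that fire for one host sample.

    Assumption: `metrics` is None when metrics could not be retrieved
    from the host, which is treated as the "Host down" alert.
    """
    if metrics is None:
        return [("Host down", "Critical")]
    fired = []
    for name, key, breached, severity in RULES:
        value = metrics.get(key)
        # A missing metric is skipped rather than treated as a breach.
        if value is not None and breached(value):
            fired.append((name, severity))
    return fired
```

For example, `evaluate(None)` yields only the "Host down" alert, while a sample with `cpu_used_pct` at 90 and every other metric within its limit yields only "High CPU usage".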

...

VMMC MoH -

...

Production Virtual Servers

Alert

Container

Trigger

Severity

Time to re-trigger

Host down

Metrics cannot be retrieved from the host

15 minutes

High CPU usage

CPU usage gets above 85%

2 hours

Low memory

Machine memory gets below 15%

2 hours

Disk space low

Machine storage gets below 10%

2 hours

High disk I/O latency

Machine average input/output (I/O) exceeds 70%

30 minutes

Network error

Machine average network error count exceeds 0 (any value above 0 is treated as an error)

2 hours

[PRD] ZWE MoH - Monitoring

Alert

Trigger

Tomcat: DWS/ WFA

Container down

Critical

5 minutes

Mongo

Kafka

ZooKeeper

Node.js

e-Learning MoH - Production Virtual Servers

Container

Trigger

Severity

Time to re-trigger

Host

Postgres/ Moodle

Container down

Metrics cannot be retrieved from the host

15 minutes

High CPU usage

CPU usage gets above 85%

2 hours

Low memory

Machine memory gets below 15%

2 hours

Disk space low

Machine storage gets below 10%

2 hours

High disk I/O latency

Machine average input/output (I/O) exceeds 70%

30 minutes

Network error

Machine average network error count exceeds 0 (any value above 0 is treated as an error)

2 hours

Critical

5 minutes

Moodle

Postgres/ Warehouse

NiFi

Superset
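
The "Time to re-trigger" column can be illustrated with a short sketch. It assumes the value is the minimum interval before the same alert is sent again while the condition persists. That interpretation, the class name `RetriggerGate`, and its methods are hypothetical and are not taken from the monitoring tool.

```python
from datetime import datetime, timedelta


class RetriggerGate:
    """Suppress repeat notifications for one alert key within an interval.

    Assumption: "Time to re-trigger" is the minimum time between two
    notifications for the same alert while it keeps firing.
    """

    def __init__(self, interval: timedelta):
        self.interval = interval
        self._last_sent = {}  # alert key -> time of the last notification

    def should_notify(self, key: str, now: datetime) -> bool:
        """Return True and record `now` if a notification is due for `key`."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.interval:
            return False
        self._last_sent[key] = now
        return True

    def clear(self, key: str) -> None:
        """Forget `key` once it recovers, so the next failure alerts at once."""
        self._last_sent.pop(key, None)
```

With a 5-minute interval, as listed for the "container down" alerts, a container that stays down is reported on the first check and again once 5 minutes have passed. Checks in between are suppressed, and each container is tracked independently.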