Component Health#

Components can be deployed anywhere in your data centers or clouds, making it important to have real-time indicators (status, availability, health) for all of them.

From the Unryo Connect Console, users can:

* see the status and health for all components,
* be notified if the component is experiencing an availability issue or a performance issue,
* get a description of the detected issue.

A colored severity marker appears on component icons in the Topology View and the GeoMap View.

In the Overview panel, information appears in the Availability and Performance columns.

Security: Components send health information to the Unryo Connect Service within the Unryo ping message, so the communication is always outbound.

Severity Classification#

A component can have five severity states:

| Severity | Message | Color | Icon | Meaning |
|---|---|---|---|---|
| 5 | OK | green | check_circle | No issue found. |
| 4 | UNKNOWN | cyan blue | help | Unknown issue found. |
| 3 | MINOR | yellow | warning | High utilization issue found. |
| 2 | MAJOR | orange | error | Full capacity issue found. |
| 1 | CRITICAL | red | highlight_off | Service is down. |

Understanding Most Common Availability Problems#

Availability problems indicate that the service is not working: the component is either down or unable to communicate with the Unryo Connect Service and its peers.

| Problem | Severity | Message |
|---|---|---|
| Connection refused on Unryo Broker | 1 | CRITICAL |
| Connection error on Unryo Broker | 1 | CRITICAL |

Understanding Most Common Performance Problems#

Performance problems indicate that the service is still working but is either degraded or at risk.

| Problem | Severity | Message |
|---|---|---|
| Storage Disk Almost Full | 1 | Critical |
| Storage Disk Full | 1 | Critical |
| High Swap Utilization | 2 | Major |
| High Memory High CPU | 3 | Minor |
| High Memory Utilization | 3 | Minor |

Creating your own monitoring checks#

Unryo is open and extensible. To complement the predefined monitoring checks, you can add your own scripts or binaries to a specific component folder.

Issue properties#

All issues need to contain these attributes:

- Category=String - Whether it is an Availability or a Performance issue.
- Description=String - Human-readable estimation of the module health.
- Severity=1-5 - Issue severity.
- Weight=0-100 - Tie breaker when issues have the same severity (0 for low, 100 for high).
    - Availability: represents the degradation score.
    - Performance: represents the load (0 for low, 100 for high).

Script Format#

All scripts and executables must follow a file naming format: the file name must start with the prefix health-<category>-, that is, health- followed by the category that the script is associated with.

Note: Once the category is loaded, it is capitalized, so availability becomes Availability, as in our issue object examples.

You need to ensure that every script has the executable permission (+x).
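The naming rule and the capitalization note above can be illustrated with a small helper. This is only a sketch: the function name `health_category` is hypothetical and is not part of Unryo, and the way Unryo itself parses file names is assumed from the description above.

```shell
#!/bin/bash
# Hypothetical helper: derive the capitalized category from a file name
# that follows the health-<category>- prefix convention.
# Prints the category and returns 0 on a match, returns 1 otherwise.
health_category() {
  local name=${1##*/}   # drop any leading directory
  local rest category first
  case $name in
    health-*-?*) ;;     # health-<category>-<anything>
    *) return 1 ;;
  esac
  rest=${name#health-}
  category=${rest%%-*}
  [ -n "$category" ] || return 1
  # Capitalize the first letter (portable across bash versions).
  first=$(printf '%s' "${category:0:1}" | tr '[:lower:]' '[:upper:]')
  printf '%s%s\n' "$first" "${category:1}"
}

health_category health-availability-sample
health_category health-performance-example
```

The executable permission itself can be set with the standard `chmod +x <file>` command before the file is deployed.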

Output format#

To share issues, your script should print a single line for each issue, following the CSV specification. All issues are communicated via the standard output (stdout) and comply with the following line protocol:

<severity>,<weight>,<description>

Script Example health-availability-sample:

#!/bin/bash
printf "1,100,This is a sample issue\n"

How the output can look:

1,100,This is a sample issue
1,100,Connection error on Unryo Broker
3,25,"This description is quoted because it contains a , char"
3,25,"CSV uses a "" character to quote "" character"
...
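Because descriptions may contain commas or double quotes, a script may want to quote them before printing. The following is a minimal sketch of one way to do that in bash; the helper names `csv_field` and `emit_issue` are hypothetical and only illustrate the line protocol described above.

```shell
#!/bin/bash
# Hypothetical helpers for printing issues in the
# <severity>,<weight>,<description> line protocol.

# Quote a field per CSV rules when it contains a comma, a double quote,
# or a newline; embedded double quotes are doubled.
csv_field() {
  local value=$1 q='"'
  case $value in
    *,*|*'"'*|*$'\n'*)
      value=${value//$q/$q$q}
      printf '"%s"' "$value"
      ;;
    *)
      printf '%s' "$value"
      ;;
  esac
}

# Print one issue line on stdout.
emit_issue() {
  printf '%s,%s,%s\n' "$1" "$2" "$(csv_field "$3")"
}

emit_issue 1 100 'This is a sample issue'
emit_issue 3 25 'This description is quoted because it contains a , char'
```

Descriptions without special characters are printed unchanged, so simple scripts that use a plain `printf` remain valid.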

Here is the current set of health attributes. You can use it as a reference to evaluate your issues:

| Weight (0-100) | Issue Severity (1-5) | Issue Short visual description (less than 15 chars) |
|---|---|---|
| 0 | 5 | No Unryo connections requested |
| 25 | 4 | Storage Disk Almost Full |
| 25 | 3 | High CPU Utilization |
| 50 | 3 | High Memory Utilization |
| 100 | 1 | Storage Disk Full |
| 100 | 2 | High Swap Utilization |
| 100 | 1 | Connection refused on Unryo Broker |
| 100 | 1 | Connection error on |

---
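To get a feel for how severity and weight interact, the sketch below orders issue lines so that the most pressing one comes first. It assumes that a lower severity number is more serious (1 is CRITICAL) and that a higher weight wins a tie, as the attribute descriptions suggest. How Unryo itself orders issues is not documented here, and `rank_issues` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: sort issue lines read from stdin by ascending
# severity (1 = CRITICAL first), then by descending weight.
# Only the first two comma-separated fields are used as sort keys.
rank_issues() {
  sort -t, -k1,1n -k2,2nr
}

printf '%s\n' \
  '3,25,High CPU Utilization' \
  '1,100,Storage Disk Full' \
  '3,50,High Memory Utilization' \
  '2,100,High Swap Utilization' | rank_issues
```

With this ordering, the two severity 3 issues are separated by weight, so High Memory Utilization (50) is listed before High CPU Utilization (25).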

Add new health-check to an Unryo Node#

Once your script or executable is ready, copy the file into the /health folder, which is present in every Unryo Node file system.

docker cp health-availability-example.sh <container-id>:/health
docker cp health-performance-example <container-id>:/health