Component Health#
Unryo Components can be deployed anywhere in your data centers or clouds, making it important to have real-time indicators (status, availability, health) for all of them.
From the Unryo Connect Console, users can:
- see the status and health for all components,
- be notified if a component is experiencing an availability issue or a performance issue,
- get a description of the detected issue.
A colored severity marker appears on component icons in the Topology View and the GeoMap View.
In the Overview panel, information appears in the Availability and Performance columns.
Security: Components send health information to the Unryo Connect Service within the Unryo ping message, so the communication is always outbound.
Severity Classification#
A component can be in one of five severity states:
Severity | Message | Color | Icon | Meaning |
---|---|---|---|---|
5 | OK | green | check_circle | No issue found. |
4 | UNKNOWN | cyan blue | help | Unknown issue found. |
3 | MINOR | yellow | warning | High utilization issue found. |
2 | MAJOR | orange | error | Full capacity issue found. |
1 | CRITICAL | red | highlight_off | Service is down. |
Understanding Most Common Availability Problems#
Availability problems indicate that the service is not working: the component is either down or unable to communicate with the Unryo Connect Service and its peers.
Problem | Severity | Message |
---|---|---|
Connection refused on Unryo Broker | 1 | CRITICAL |
Connection error on Unryo Broker | 1 | CRITICAL |
Understanding Most Common Performance Problems#
Performance problems indicate that the service is still working, but is either degraded or at risk.
Problem | Severity | Message |
---|---|---|
Storage Disk Almost Full | 1 | CRITICAL |
Storage Disk Full | 1 | CRITICAL |
High Swap Utilization | 2 | MAJOR |
High CPU Utilization | 3 | MINOR |
High Memory Utilization | 3 | MINOR |
Creating your own monitoring checks#
Unryo is open and extensible. To complement the predefined monitoring checks, you can add your own scripts or binaries to a specific folder on the component.
Issue properties#
All issues must contain these attributes:
- Category=String
  - Whether it is an Availability or a Performance issue.
- Description=String
  - Human-readable assessment of the module health.
- Severity=1-5
  - Issue severity.
- Weight=0-100
  - Tie breaker when issues have the same severity (0 for low, 100 for high).
    - Availability: represents the degradation score.
    - Performance: represents the load.
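One way to read Severity and Weight together is as a sort order: lower severity numbers are more serious (1 is CRITICAL), and among issues with the same severity the higher weight comes first. The sketch below ranks issue lines of the form severity,weight,description under that assumption. The function name rank_issues is hypothetical, and this is not necessarily how the Unryo Connect Console orders issues internally.

```shell
#!/bin/bash
# Hypothetical helper: order issue lines from most to least serious.
# Assumes severity 1 is worst and a higher weight breaks ties.
rank_issues() {
  # -t,      fields are comma separated
  # -k1,1n   severity ascending (1 = CRITICAL first)
  # -k2,2nr  weight descending within the same severity
  sort -t, -k1,1n -k2,2nr
}

printf '%s\n' \
  '3,25,High CPU Utilization' \
  '1,100,Storage Disk Full' \
  '3,50,High Memory Utilization' | rank_issues
```

With this ordering, the Storage Disk Full line comes first, and High Memory Utilization (weight 50) is listed before High CPU Utilization (weight 25).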
Script Format#
All scripts and executables must follow a filename convention: the filename must start with the prefix health-<category>-, that is, a health- prefix followed by the category that the script is associated with.
Note: Once the category is loaded, it is capitalized. For example, availability becomes Availability, as in the issue examples.
You need to ensure that every script has the executable permission (+x).
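The naming and permission rules above can be checked before a script is deployed. The following is a minimal sketch, not part of Unryo: the helper names category_of and check_ready are hypothetical, and it assumes the only valid categories are availability and performance, as listed under the issue properties.

```shell
#!/bin/bash
# Hypothetical pre-flight helpers; not shipped with Unryo.

# Print the capitalized category encoded in a check's file name,
# or fail if the name does not start with health-<category>-.
category_of() {
  case "$(basename "$1")" in
    health-availability-?*) echo "Availability" ;;
    health-performance-?*)  echo "Performance" ;;
    *) echo "bad name: $1" >&2; return 1 ;;
  esac
}

# Succeed only if the file is named correctly and is executable.
check_ready() {
  category_of "$1" >/dev/null && [ -x "$1" ]
}

category_of /health/health-availability-sample   # prints: Availability
```

If check_ready fails on a correctly named file, running chmod +x on it is usually the missing step.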
Output format#
To report issues, your script should print one line per issue, following the CSV specification. All issues are communicated via standard output (stdout) and comply with the following line protocol:
<severity>,<weight>,<description>
Script example health-availability-sample:
#!/bin/bash
printf "1,100,This is a sample issue\n"
How the output can look:
1,100,This is a sample issue
1,100,Connection error on Unryo Broker
3,25,"This description is quoted because it contains a , char"
3,25,"CSV uses a "" character to quote "" character"
...
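The quoting shown in the last two sample lines can be handled by a small helper. The sketch below assumes the usual CSV rule: wrap the description in double quotes when it contains a comma or a double quote, and double any embedded quote. The function name emit_issue is hypothetical, and descriptions containing line breaks are not handled.

```shell
#!/bin/bash
# Hypothetical helper: print one issue line in the
# <severity>,<weight>,<description> format, quoting the description
# when it contains a comma or a double quote.
emit_issue() {
  severity="$1"; weight="$2"; description="$3"
  case "$description" in
    *,*|*\"*)
      # Double every embedded quote, then wrap the field in quotes.
      escaped="$(printf '%s' "$description" | sed 's/"/""/g')"
      printf '%s,%s,"%s"\n' "$severity" "$weight" "$escaped"
      ;;
    *)
      printf '%s,%s,%s\n' "$severity" "$weight" "$description"
      ;;
  esac
}

emit_issue 1 100 'Connection error on Unryo Broker'
emit_issue 3 25 'This description is quoted because it contains a , char'
```

The two calls reproduce the second and third sample lines above.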
Here is the current set of health attributes. You can use it as a reference when rating your own issues.
Weight (0-100) | Issue Severity (1-5) | Issue Short visual description (less than 15 chars) |
---|---|---|
0 | 5 | No Unryo connections requested |
25 | 4 | Storage Disk Almost Full |
25 | 3 | High CPU Utilization |
50 | 3 | High Memory Utilization |
100 | 1 | Storage Disk Full |
100 | 2 | High Swap Utilization |
100 | 1 | Connection refused on Unryo Broker |
100 | 1 | Connection error on Unryo Broker |
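As an illustration of how the reference values might be used in a custom check, here is a sketch of a disk usage script that could be saved as health-performance-disk (a hypothetical name). The severity and weight pairs are copied from the reference table above. The 90% and 98% thresholds and the choice of the / file system are assumptions, not Unryo defaults.

```shell
#!/bin/bash
# Sketch of a check that could be saved as health-performance-disk.
# The 90% and 98% thresholds are assumptions, not Unryo defaults.

# Map a disk usage percentage to an issue line (prints nothing if healthy).
disk_issue() {
  used="$1"
  if [ "$used" -ge 98 ]; then
    printf '1,100,Storage Disk Full\n'
  elif [ "$used" -ge 90 ]; then
    printf '4,25,Storage Disk Almost Full\n'
  fi
}

# Read the usage of / from POSIX-format df output (column 5, e.g. "42%").
used_pct="$(df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')"
if [ -n "$used_pct" ]; then
  disk_issue "$used_pct"
fi
```

Keeping the threshold logic in a function that takes the percentage as an argument makes it easy to try values by hand, for example disk_issue 92.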
Add a new health check to an Unryo Node#
Once your script or executable is ready, copy the file into the /health folder, which is present in every Unryo Node file system.
docker cp health-availability-example.sh <container-id>:/health
docker cp health-performance-example <container-id>:/health
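Before copying, it can help to preview what the checks print. The sketch below runs every executable health-* file in a local directory and prefixes each issue line with the script name. The function name preview_checks is hypothetical, and this only previews output locally; it does not reproduce how an Unryo Node loads or schedules checks.

```shell
#!/bin/bash
# Hypothetical dry run: execute every health-* file in a directory and
# prefix each issue line with the name of the script that produced it.
preview_checks() {
  dir="$1"
  for script in "$dir"/health-*; do
    # Skip non-matches and files without the executable bit.
    [ -f "$script" ] && [ -x "$script" ] || continue
    "$script" | while IFS= read -r line; do
      printf '%s: %s\n' "$(basename "$script")" "$line"
    done
  done
}

# Preview the directory given as the first argument (default: current one).
preview_checks "${1:-.}"
```

A file that matches the prefix but produces no output here is worth checking for a missing executable permission.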