AI Assistant#
Role of the AI Assistant#
The AI assistant offloads your operations teams by automating investigation work.
It summarizes, in one view, the situation around the selected alert: all the necessary information, why the alert triggered, the context around it, and the correlation results. It also prepares troubleshooting steps for the user.
The AI assistant view is dynamic: the information presented adapts to the alarm type and to the monitored domain.
AI widgets#
The AI Assistant view combines a set of widgets. Some widgets are generic, meaning they run for every alert, and some widgets focus on a specific alert situation.
Every widget performs a specialized check, using topology, metrics, logs, or metadata, to give precise insights:
- surface probable root cause(s),
- estimate the impact on your important services,
- propose solutions to remediate.
Situation Analysis#
A clear title and summary for the alarm, with only the level of detail needed.
Resource Overview#
Displays details for the affected resource. These can be generic tags (such as reachability, in_maintenance?, ...) and resource_type-specific tags (for example CPU% (current, average over the last day), status, number of pods, version, …).
Probable Root-Cause#
Drills down the topology path to find the probable root cause.
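To make the drill-down idea concrete, here is a minimal sketch. It is not the product's implementation. It assumes the topology is available as a hypothetical `depends_on` mapping (resource to the resources it depends on) and that the set of currently alerting resources is known. All names and sample data are illustrative.

```python
from typing import Dict, List, Set


def probable_root_causes(
    start: str,
    depends_on: Dict[str, List[str]],
    alerting: Set[str],
) -> List[str]:
    """Walk down dependency edges through alerting resources.

    A resource is reported as a probable root cause when it is reachable
    from `start` via alerting resources and none of its own dependencies
    are alerting.
    """
    roots: List[str] = []
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        failing_deps = [d for d in depends_on.get(node, []) if d in alerting]
        if failing_deps:
            stack.extend(failing_deps)  # keep drilling down
        else:
            roots.append(node)  # deepest alerting resource on this path
    return sorted(roots)


# Hypothetical topology and alert state, for illustration only.
depends_on = {
    "checkout-service": ["payment-pod", "cart-pod"],
    "payment-pod": ["node-1"],
    "cart-pod": ["node-2"],
    "node-1": ["switch-a"],
}
alerting = {"checkout-service", "payment-pod", "node-1"}

print(probable_root_causes("checkout-service", depends_on, alerting))
# ['node-1']
```

In this sketch the walk stops at `node-1` because its only dependency, `switch-a`, is not alerting.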
Impact Analysis#
Lists all impacted resources, based on topology knowledge.
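One way to picture impact analysis is as a traversal in the opposite direction: from the failing resource up to everything that depends on it, directly or indirectly. The sketch below assumes a hypothetical `depends_on` mapping (resource to its dependencies). It illustrates the idea and does not describe the actual widget.

```python
from collections import deque
from typing import Dict, List, Set


def impacted_resources(failed: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return every resource that depends, directly or not, on `failed`."""
    # Invert the dependency map: resource -> resources that depend on it.
    dependents: Dict[str, Set[str]] = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    impacted: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    impacted.discard(failed)  # guard against dependency cycles
    return sorted(impacted)


# Hypothetical topology, for illustration only.
topology = {
    "web-shop": ["checkout-service"],
    "checkout-service": ["payment-pod"],
    "payment-pod": ["node-1"],
    "cart-pod": ["node-2"],
}

print(impacted_resources("node-1", topology))
# ['checkout-service', 'payment-pod', 'web-shop']
```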
Historical Analysis#
Checks whether the problem is recurring (number of occurrences) and provides statistics for those past alerts (alert lifetime, workflow: notification, ticketing, …).
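As a rough illustration of these statistics, the sketch below computes the occurrence count, alert lifetimes, and the number of ticketed alerts from a list of past alerts. The record fields (`raised_at`, `cleared_at`, `ticket_id`) are assumptions made for this example, not the product's data model.

```python
from datetime import datetime
from statistics import mean
from typing import Any, Dict, List


def alert_history_stats(past_alerts: List[Dict[str, Any]]) -> Dict[str, float]:
    """Summarize past occurrences of the same alert."""
    # Lifetime of each past alert, in minutes.
    lifetimes = [
        (a["cleared_at"] - a["raised_at"]).total_seconds() / 60
        for a in past_alerts
    ]
    return {
        "occurrences": len(past_alerts),
        "mean_lifetime_min": mean(lifetimes) if lifetimes else 0.0,
        "max_lifetime_min": max(lifetimes, default=0.0),
        "ticketed": sum(1 for a in past_alerts if a.get("ticket_id")),
    }


# Hypothetical history, for illustration only.
history = [
    {
        "raised_at": datetime(2024, 1, 10, 8, 0),
        "cleared_at": datetime(2024, 1, 10, 8, 10),
        "ticket_id": "T-100",
    },
    {
        "raised_at": datetime(2024, 1, 17, 8, 0),
        "cleared_at": datetime(2024, 1, 17, 8, 30),
        "ticket_id": None,
    },
]

stats = alert_history_stats(history)
print(stats)
```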
Metadata Changes#
This widget detects metadata changes over time for the affected resource, which may give clues about possible root causes. For example, a system upgrade detected just before the alert is relevant information for operations teams.
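Detecting a metadata change can be thought of as comparing two snapshots of the resource's metadata, one taken before the alert and one taken at alert time. The sketch below assumes metadata is a flat key/value dictionary. The snapshot contents are made up for the example.

```python
from typing import Any, Dict, Tuple


def metadata_changes(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (old_value, new_value)} for every key that differs.

    A key missing from one snapshot is reported with None on that side.
    """
    changes: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots, for illustration only.
snapshot_before = {"os_version": "12.1", "kernel": "5.10", "owner": "team-a"}
snapshot_after = {"os_version": "12.2", "kernel": "5.10", "owner": "team-a", "patched": True}

print(metadata_changes(snapshot_before, snapshot_after))
# {'os_version': ('12.1', '12.2'), 'patched': (None, True)}
```

Here the `os_version` change is the kind of clue (a system upgrade just before the alert) that the widget is meant to surface.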
Topology Changes#
Summarizes topology changes over time.
Service Affected#
Lists impacted services, if any. We check whether the node is in the "Business Service" layer, and we display the impacts.
Events of Interest#
Finds related alerts: alerts issued from neighbors (topology proximity), alerts sharing common attributes (attribute similarity), or alerts with the same arrival time (time proximity). Grouping these alerts together improves visibility.
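The three correlation criteria can be sketched as three independent checks applied to each candidate alert. Everything below is an assumption made for illustration: the alert fields, the `neighbors` mapping, the five-minute window, and the two-shared-attributes threshold. None of it is taken from the product's configuration.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Set


def related_alerts(
    selected: Dict[str, Any],
    candidates: List[Dict[str, Any]],
    neighbors: Dict[str, Set[str]],
    window: timedelta = timedelta(minutes=5),
    min_shared: int = 2,
) -> Dict[str, List[str]]:
    """Map each related alert id to the reasons it was considered related."""
    related: Dict[str, List[str]] = {}
    for alert in candidates:
        if alert["id"] == selected["id"]:
            continue
        reasons: List[str] = []
        # Topology proximity: the alert comes from a neighboring resource.
        if alert["resource"] in neighbors.get(selected["resource"], set()):
            reasons.append("topology")
        # Attribute similarity: enough attributes with identical values.
        shared = sum(
            1
            for key, value in selected["attributes"].items()
            if alert["attributes"].get(key) == value
        )
        if shared >= min_shared:
            reasons.append("attributes")
        # Time proximity: arrival times within the window.
        if abs(alert["time"] - selected["time"]) <= window:
            reasons.append("time")
        if reasons:
            related[alert["id"]] = reasons
    return related


# Hypothetical alerts and neighbor map, for illustration only.
selected_alert = {
    "id": "a1", "resource": "node-1",
    "attributes": {"site": "paris", "severity": "critical"},
    "time": datetime(2024, 1, 10, 10, 0),
}
other_alerts = [
    {"id": "a2", "resource": "switch-a",
     "attributes": {"site": "paris", "severity": "minor"},
     "time": datetime(2024, 1, 10, 10, 2)},
    {"id": "a3", "resource": "db-9",
     "attributes": {"site": "paris", "severity": "critical"},
     "time": datetime(2024, 1, 10, 9, 0)},
    {"id": "a4", "resource": "lb-3",
     "attributes": {"site": "london"},
     "time": datetime(2024, 1, 10, 12, 0)},
]
neighbor_map = {"node-1": {"switch-a"}}

groups = related_alerts(selected_alert, other_alerts, neighbor_map)
print(groups)
# {'a2': ['topology', 'time'], 'a3': ['attributes']}
```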
AI Chat#
Submits alert information to the OpenAI ChatGPT engine, which analyzes it and returns insights in natural language.
Cloud Provider Status#
Checks whether the cloud provider is reporting issues.