Dynamic Alert Suppression Policy for Noise Reduction in AIOps
Abstract
As IT environments grow in both size and complexity, observability tools are needed to monitor their health. When anomalous events are detected, alerts are generated and Site Reliability Engineers (SREs) are notified. However, most of these notifications turn out to be false alarms, leading to alert fatigue and inefficiency. Existing approaches for reducing alert noise rely on static policies that quickly become outdated in dynamic IT environments and are therefore difficult to maintain. In this work, we propose a novel unsupervised approach, Dynamic-X-Y, guided by the well-known moving average envelope statistical method, to learn custom-tailored alert suppression policies from historical alert and event data. At run-time, the learned policies are applied to incoming events and alerts to reduce false alert notifications. We validate our approach on two different datasets, log anomaly and metric anomaly events/alerts, and show a percentage increase in accuracy over state-of-the-art methods of $7.39\%$ and $35.7\%$, respectively.
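For readers unfamiliar with the moving average envelope mentioned above, the following is a minimal sketch of the generic technique only; it is not the Dynamic-X-Y algorithm. It assumes a series of per-interval event counts and builds a band of plus or minus a fixed percentage around a trailing moving average. The function names, the `window` and `pct` parameters, and the sample counts are all hypothetical and chosen purely for illustration.

```python
from collections import deque


def moving_average_envelope(values, window=3, pct=0.2):
    """Return one (average, lower, upper) tuple per input value.

    Entries are None until `window` values have been seen.
    `window` and `pct` are illustrative assumptions, not values from the paper.
    """
    recent = deque(maxlen=window)
    bands = []
    for v in values:
        recent.append(v)
        if len(recent) < window:
            bands.append(None)
            continue
        avg = sum(recent) / window
        bands.append((avg, avg * (1 - pct), avg * (1 + pct)))
    return bands


def outside_envelope(values, window=3, pct=0.2):
    """Flag each value that falls outside the envelope computed
    from the `window` values that precede it."""
    bands = moving_average_envelope(values, window, pct)
    flags = []
    for i, v in enumerate(values):
        prev = bands[i - 1] if i > 0 else None
        if prev is None:
            # Not enough history yet to form an envelope.
            flags.append(False)
        else:
            _, lower, upper = prev
            flags.append(v < lower or v > upper)
    return flags


# Hypothetical per-interval event counts; only the spike leaves the band.
counts = [10, 10, 10, 11, 30]
print(outside_envelope(counts))
```

In this sketch, values that stay inside the band are treated as normal variation, while values outside it are candidates for notification. How such a band is turned into a learned suppression policy is specific to the proposed approach and is not captured here.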