Root Cause Detection using Dynamic Dependency Graphs from Time Series Data
Abstract
Detecting changes in system behavior and identifying their root causes are essential for many large-scale systems, such as manufacturing plants, in order to keep them running uninterrupted and to avoid costly machine breakdowns through predictive maintenance. In this paper, we present a novel graph-based technique that uses time-variant interdependencies and lagged dependencies among the components of a system to detect changes in system behavior. We then find the root causes of the detected changes by identifying the component, and its historical values, responsible for initiating the transition of the system to the new state. The proposed mechanism extracts the time-variant dependencies using a deep learning system, converts them into weighted directed graphs, and applies graph-based techniques for change detection. For each detected change, our system uses graph-theoretic techniques to uncover the root causes. This approach offers insights into the inner workings of a system from a different perspective than traditional root cause analysis techniques, which apply statistical models directly to the time series data. Experimental results on real manufacturing data show that we can detect changes in system behavior and accurately identify the root causes in almost 71% of the cases for which we have the ground truth. For synthetic data, our system correctly identifies root causes in 87% of the cases.
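To make the pipeline concrete, the following minimal sketch illustrates the general idea of detecting a change between consecutive weighted directed dependency graphs and nominating a root cause candidate. It is an illustration under stated assumptions, not the method evaluated in this paper: the dependency weights are supplied by hand rather than extracted by a deep learning system, and the distance measure, the threshold, the component names, and the functions `graph_distance`, `detect_changes`, and `root_cause_candidate` are all hypothetical.

```python
from typing import Dict, List, Tuple

# A dependency graph for one time window:
# (source component, target component) -> dependency weight.
Graph = Dict[Tuple[str, str], float]


def graph_distance(a: Graph, b: Graph) -> float:
    """Sum of absolute edge-weight differences (an assumed distance measure)."""
    edges = set(a) | set(b)
    return sum(abs(a.get(e, 0.0) - b.get(e, 0.0)) for e in edges)


def detect_changes(graphs: List[Graph], threshold: float) -> List[int]:
    """Indices t at which graph t differs from graph t-1 by more than threshold."""
    return [
        t
        for t in range(1, len(graphs))
        if graph_distance(graphs[t - 1], graphs[t]) > threshold
    ]


def root_cause_candidate(before: Graph, after: Graph) -> str:
    """Source component whose outgoing edge weights changed the most (a heuristic)."""
    change: Dict[str, float] = {}
    for src, dst in set(before) | set(after):
        delta = abs(after.get((src, dst), 0.0) - before.get((src, dst), 0.0))
        change[src] = change.get(src, 0.0) + delta
    return max(change, key=lambda node: change[node])


# Hypothetical dependency graphs for three consecutive time windows.
graphs: List[Graph] = [
    {("pump", "valve"): 0.8, ("valve", "motor"): 0.5},
    {("pump", "valve"): 0.8, ("valve", "motor"): 0.5},
    {("pump", "valve"): 0.1, ("pump", "motor"): 0.9, ("valve", "motor"): 0.5},
]

changes = detect_changes(graphs, threshold=0.5)
for t in changes:
    print(t, root_cause_candidate(graphs[t - 1], graphs[t]))
```

In this toy input, only the edges leaving the hypothetical `pump` component change between the second and third windows, so that window is flagged and `pump` is nominated. The threshold is a free parameter here and would need to be tuned for real data.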