Incident ticket analytics for IT application management services
Ta-Hsin Li, Rong Liu, et al.
SCC 2014
IT incident management aims to restore the normal service quality and availability of IT systems after interruptions. IT incidents often have complicated causes arising from an IT environment composed of thousands of interdependent components. Incident diagnosis therefore requires collecting and analyzing large volumes of data about these components, often in real time, to find suspected causes. It is extremely difficult to meet this requirement using traditional techniques. In this paper, we propose a new analysis architecture using Big Data techniques. This architecture leverages stream computing and MapReduce techniques to analyze data from various data sources, uses NoSQL databases to store incident-related documents and their relationships, and applies further analytical techniques to examine the documents for root causes and to predict failures. We demonstrate this approach using a real-world example and present evaluation results from a recent pilot study.
Feng Li, Hao Chen, et al.
SOLI 2014
Chunhua Tian, Hao Zhang, et al.
SOLI 2008
Chunhua Tian, Feng Li, et al.
SOLI 2010