Publication
SAC 2007
Conference paper

Performance problem localization in self-healing, service-oriented systems using Bayesian networks

Abstract

In distributed, service-oriented environments, performance problem localization is required to provide self-healing capabilities and deliver the desired quality of service (QoS). This paper presents an automated approach to identifying system elements causing performance problems. Applying probabilistic inference to collected response time and elapsed time data, the approach 1) infers elapsed time for services where data is missing, 2) estimates the response time degradation caused by different services using the duration, abnormality and response time correlation of their elapsed times, and 3) identifies the services that are the most important causes of slow response time and yield the most benefit if recovered. The approach has been used to localize a performance problem on the test bed of a real-world service-oriented Grid. Evaluation using simulations shows that the approach consistently achieves better accuracy than traditional techniques in various service-oriented settings. Copyright 2007 ACM.
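
The three signals named in the abstract (duration, abnormality, and response time correlation of elapsed times) can be illustrated with a small ranking heuristic. The sketch below is not the paper's Bayesian network model, and it does not cover the inference of missing elapsed times. It is a simplified stand-in that multiplies the three signals into one score. The function names `pearson` and `rank_services`, the per-service baseline of `(mean, std)`, and the multiplicative scoring rule are all assumptions made for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_services(elapsed, baseline, response):
    """Rank services from most to least suspicious.

    elapsed:  {service: [elapsed time per request]}   (hypothetical input shape)
    baseline: {service: (normal mean, normal std)}    (hypothetical input shape)
    response: [end-to-end response time per request]
    """
    total_response = sum(response)
    scores = {}
    for name, times in elapsed.items():
        # Duration: share of the total response time spent in this service.
        duration = sum(times) / total_response
        # Abnormality: how far the observed mean sits above the normal mean,
        # in baseline standard deviations (negative deviations count as 0).
        base_mean, base_std = baseline[name]
        abnormality = max(0.0, (mean(times) - base_mean) / base_std) if base_std > 0 else 0.0
        # Correlation: how closely elapsed time tracks response time.
        correlation = max(0.0, pearson(times, response))
        # Assumed scoring rule: a simple product, not the paper's inference.
        scores[name] = duration * abnormality * correlation
    return sorted(scores, key=scores.get, reverse=True)
```

Under this scoring rule, a service ranks highly only if it is slow relative to its own baseline, accounts for a large share of the response time, and varies together with the response time. A service that is long-running but behaving normally receives a score of zero.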
