A supervised learning model for identifying inactive VMs in private cloud data centers
Abstract
Private clouds have become essential computing infrastructure for many enterprises. However, according to a recent study, 30% of VMs in data centers are not being used for any productive work. These "inactive" (or "zombie") VMs can arise from faulty VM management code within the cloud infrastructure but are usually the result of human neglect. Inactive VMs can hurt the performance of productive VMs, can distort internal cost management, and in the extreme can leave the cloud infrastructure unable to allocate resources for new VMs. Correctly assessing the productivity of a VM can be challenging. For example, is a VM with low CPU utilization being used to slowly edit source code, or is it an inactive VM that happens to be performing routine maintenance (e.g., virus scans and software updates)? To address this problem, we develop a supervised learning model that leverages primitive information about VMs (e.g., running processes, login history, network connections), collected periodically by a lightweight data collection framework. This model employs a linear support vector machine (SVM) approach that reflects single VM behavior as well as coordinated VM behaviors. We evaluated the identification accuracy of this model on a real-world dataset of more than 750 VMs within IBM. Results show that our model achieves 90% accuracy, 20% higher than state-of-the-art approaches. An accurate model is an important first step toward enabling private cloud infrastructures to achieve better resource management through actions such as suspending or dynamically downsizing inactive VMs.
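To make the classification idea concrete, the following is a minimal sketch of a linear SVM trained on per-VM features. It is not the model evaluated in this work: the four feature names, their scaling to [0, 1], the toy data points, and the hyperparameters are all assumptions chosen for illustration, and the sketch covers only single-VM behavior, not coordinated VM behaviors. Training uses plain batch subgradient descent on the regularized hinge loss, with no external libraries.

```python
# Hypothetical per-VM features, each scaled to [0, 1] (illustrative, not from the paper):
# [cpu_utilization, interactive_logins, external_connections, non_maintenance_processes]
TOY_X = [
    [0.10, 0.8, 0.7, 0.9],  # active: low CPU, but a user is logged in and editing
    [0.60, 0.9, 0.8, 0.8],  # active
    [0.05, 0.7, 0.6, 0.7],  # active
    [0.40, 1.0, 0.9, 1.0],  # active
    [0.10, 0.0, 0.1, 0.0],  # inactive
    [0.30, 0.0, 0.0, 0.1],  # inactive: CPU spike caused by routine maintenance only
    [0.05, 0.1, 0.1, 0.0],  # inactive
    [0.20, 0.0, 0.2, 0.1],  # inactive
]
TOY_Y = [-1, -1, -1, -1, 1, 1, 1, 1]  # +1 = inactive, -1 = active


def decision_value(w, b, x):
    """Signed score of the linear model: w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=1000):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only points inside the margin contribute a hinge-loss subgradient.
            if yi * decision_value(w, b, xi) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Label a VM feature vector as 'inactive' or 'active'."""
    return "inactive" if decision_value(w, b, x) > 0 else "active"


w, b = train_linear_svm(TOY_X, TOY_Y)
```

With the trained weights, `predict(w, b, features)` labels a new feature vector. In this toy setup, CPU utilization alone does not separate the two classes, which mirrors the point above that a low-CPU VM may still be productive; the login, connection, and process features carry most of the signal.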