Future Workload and Cloud Resource Usage: Insights from an Interpretable Forecasting Model
Abstract
Proactive autoscaling aims to guarantee the Quality of Service (QoS) specified in the Service Level Agreement (SLA) between cloud providers and users, while promoting efficient resource utilization and minimizing operational costs. Despite these benefits, proactive autoscaling remains difficult: resource needs must be forecast accurately amid workload fluctuations, and effective scaling actions must then be taken based on those forecasts. To address these issues, we propose a mechanism for forecasting workload and cloud resource usage using a variant of the Transformer. This multivariate, multi-horizon forecasting approach provides both forecasts and insights into the importance of the features associated with the forecasting results, enabling time-granular autoscaling. Through experiments with real-world data, we demonstrate that, rather than first forecasting workload and then estimating resource usage, we can forecast resource usage directly. This method yields the same conclusions regarding feature importance in workload and resource forecasting, thereby simplifying existing autoscaling approaches.