A holistic system software integration of disaggregated memory for next-generation cloud infrastructures
Abstract
Modern cloud computing workloads are becoming steadily more demanding in terms of computational resources: they feature multiple complex components, utilize heterogeneous hardware, and require tremendous amounts of memory. These attributes of emerging workloads disrupt the traditional design of cloud infrastructures, which are bound to decisions made at design time, and mandate dynamic composability in next-generation cloud infrastructures.
Scaling beyond the physical boundaries of the server trays, and minimizing over-provisioning of the nodes by disaggregating computational resources such as CPUs and memory, is a timely research problem. However, prior research focuses on either NVM technologies or other disk-related approaches that try to relax the resource pressure induced by over-provisioning.
In this work, we design, implement, and evaluate all necessary changes in the system software stack to support the dynamic allocation of disaggregated memory, depending on the needs of the workloads at hand. As a result, we can transparently increase the available memory of a system and achieve 57% better performance on average compared to Swap, the de facto mechanism used today for over-provisioning. Furthermore, we present a memory balancing policy that autonomously migrates memory pages between local and disaggregated memory. Our experimental results show that, even with a limited amount of local memory, memory balancing improves performance by 41% on average compared with using only disaggregated memory.