Towards complete dis-aggregation of data center rack power using light-weight mechanisms
- Kalyan Dasgupta
- Umamaheswari Devi
- et al.
- 2022
- CLOUD 2022
Enterprises depend heavily on information technology to digitize and automate their operations. Many of these enterprise IT workloads are already deployed on the public cloud or in private data centers, or are expected to migrate to a data center in the near future. The estimated annual electricity consumption of data centers is on the order of 200 terawatt-hours, approximately 1% of global electricity consumption. Although the grim energy predictions of the past for data centers have not come to pass, AI-powered workloads and other trends are on the rise. Keeping data center energy consumption contained therefore requires continued and significant investment to identify and eliminate inefficiencies in the elements and operations that consume power.
At the same time, in an effort to tackle climate change, governments across the globe are mandating that enterprises report the carbon emissions from all their operations, including those from their computing workloads, both on-premise and in the cloud, and act to reduce them. To meet this requirement, enterprises are turning to data center and cloud operators, as well as third-party tools and service providers, to assess their current emissions and evaluate optimization options.
At IBM Research, we address this problem with an ambitious, comprehensive, and multi-pronged strategy. It includes carbon quantification for tenants and workloads on IBM Cloud and in on-premise data centers, AI-infused sustainability transformations for enterprise customers, and multi-disciplinary sustainable computing research spanning multi-cluster infrastructure, hardware systems, platform, software, and AIOps. The research aims to improve manufacturing processes, design specialized hardware and cooling systems, build carbon-aware software solutions, and develop innovative run-time algorithms that manage systems and software to mitigate environmental cost over the complete lifecycle.
Our near-term effort spans three major phases: quantification, analysis (or assessment), and optimization or remediation of carbon emissions in data centers and the cloud. These phases are performed cyclically, in consultation with application owners, at an appropriate time granularity.
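As a rough illustration of one pass through the quantify, assess, and optimize cycle, the sketch below estimates operational emissions as energy multiplied by grid carbon intensity, which is a common accounting approach but is an assumption here rather than the method used in this work. The names `WorkloadSample`, `quantify`, `assess`, and `run_cycle`, and the per-workload budget, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSample:
    """Hypothetical per-interval measurement for one workload."""
    name: str
    energy_kwh: float                # energy attributed to the workload (assumed input)
    grid_intensity_g_per_kwh: float  # grid carbon intensity in gCO2e per kWh (assumed input)


def quantify(sample: WorkloadSample) -> float:
    """Quantification: operational emissions in gCO2e (energy x carbon intensity)."""
    return sample.energy_kwh * sample.grid_intensity_g_per_kwh


def assess(emissions_g: float, budget_g: float) -> float:
    """Assessment: emissions above an assumed budget, or 0.0 if within it."""
    return max(0.0, emissions_g - budget_g)


def run_cycle(samples, budget_g):
    """One pass of the cycle: quantify, assess, and flag optimization candidates."""
    report = []
    for sample in samples:
        emitted = quantify(sample)
        excess = assess(emitted, budget_g)
        report.append((sample.name, emitted, excess))
    # Workloads over budget are the ones to discuss with application owners.
    flagged = [name for name, _, excess in report if excess > 0]
    return report, flagged


if __name__ == "__main__":
    samples = [
        WorkloadSample("batch", 10.0, 400.0),
        WorkloadSample("web", 2.0, 400.0),
    ]
    report, flagged = run_cycle(samples, budget_g=1000.0)
    print(report)
    print(flagged)
```

In practice the cycle would repeat at whatever time granularity is agreed with application owners, and the optimization step itself (not modeled here) would act on the flagged workloads.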
The overall architecture is provided below:
Within the above overall framework, we explore problems in the areas of:
Through these efforts, we aspire to infuse the needed observability throughout the entire computing hardware and software stack, and to use it to build carbon performance monitoring and management solutions akin to those in the application performance monitoring and management space.