Active memory cube: A processing-in-memory architecture for exascale systems

Ravi Nair; Samuel F. Antao; Carlo Bertolli; Pradip Bose; Jose Brunheroto; Tong Chen; Chen-Yong Cher; Carlos H.A. Costa; Jun Doi; Constantinos Evangelinos; Bruce Fleischer; Thomas Fox; Diego S. Gallo; Leopold Grinberg; John Gunnels; Arpith Chacko Jacob; P. Jacob; Hans Jacobson; T. Karkhanis; Changhoan Kim; Jaime Moreno; Kevin O'Brien; M. Ohmacht; Yoonho Park; Daniel A. Prener; Bryan S. Rosenburg; Kyung Dong Ryu; Olivier Sallenave; Mauricio Serrano; P.D.M. Siegl; Krishnan Sugavanam; Zehra Sura

doi:10.1147/JRD.2015.2409732

IBM J. Res. Dev.

01 Mar 2015

Abstract

Many studies point to the difficulty of scaling existing computer architectures to meet the needs of an exascale system (i.e., one capable of executing $10^{18}$ floating-point operations per second) that consumes no more than 20 MW of power by around the year 2020. This paper outlines a new architecture, the Active Memory Cube, which reduces the energy of computation significantly by performing computation in the memory module, rather than moving data through large memory hierarchies to the processor core. The architecture leverages a commercially demonstrated 3D memory stack called the Hybrid Memory Cube, placing sophisticated computational elements on the logic layer below its stack of dynamic random-access memory (DRAM) dies. The paper also describes an Active Memory Cube tuned to the requirements of a scientific exascale system. The computational elements have a vector architecture and are capable of performing a comprehensive set of floating-point and integer instructions, predicated operations, and gather-scatter accesses across memory in the Cube. The paper outlines the software infrastructure used to develop applications and to evaluate the architecture, and describes results of experiments on application kernels, along with performance and power projections.
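
As a rough worked figure, the two targets quoted in the abstract imply a system-wide efficiency and an energy budget per operation. The numbers below are derived only from the stated $10^{18}$ operations per second and 20 MW; they are not quoted from the paper, and they assume the whole power budget is divided evenly across floating-point operations:

$$
\frac{10^{18}\ \text{flop/s}}{20 \times 10^{6}\ \text{W}} = 5 \times 10^{10}\ \text{flop/s per watt} = 50\ \text{GFLOPS/W},
\qquad
\frac{20 \times 10^{6}\ \text{J/s}}{10^{18}\ \text{flop/s}} = 2 \times 10^{-11}\ \text{J/flop} = 20\ \text{pJ per operation}.
$$

Under that assumption, the roughly 20 pJ available per operation has to cover data movement as well as arithmetic, which is consistent with the abstract's emphasis on computing inside the memory module rather than moving data to the processor core.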