Accurate deep neural network inference using computational phase-change memory
- Vinay Joshi
- Manuel Le Gallo
- et al.
- 2020
- Nature Communications
In-memory computing (IMC) is an emerging non-von Neumann computational paradigm that keeps alive the promise of achieving energy efficiencies on the order of one femtojoule per operation in a computing system. The key idea is to perform certain computational tasks in place in memory, thereby obviating the need to shuttle data back and forth between the processing and memory units. The time and energy cost associated with this data movement is by far the most severe roadblock for modern computing systems.
IMC is often achieved by exploiting the physical attributes of the memory devices and their array-level organization. It has been applied in areas such as scientific computing, database queries, and machine learning. However, the most promising application of IMC is the efficient realization of deep neural networks (DNNs), which have revolutionized AI in recent years. A key challenge for DNNs is their computational inefficiency. In fact, the lack of sufficient compute power was one of the key factors that held back progress in the field for almost 30 years. More specialized hardware such as graphics processing units and application-specific ICs has had a significant impact on the hardware realization of DNNs and helped unleash the recent AI revolution. However, even with better hardware, a significant energy gap remains to be bridged before the true potential of AI can be realized.
In our group, we are exploring custom accelerators for DNNs based on IMC. The essential idea is to realize attributes such as synaptic efficacy and plasticity in place in the memory itself by exploiting the physical attributes of memory devices. Physically stationary neurons and synapses, along with lower-precision analog computation, are salient features of biological neural networks, and hence this work also comes under the purview of neuromorphic computing.
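To make the idea of computing in place more concrete, the following is a minimal sketch of an analog matrix-vector multiplication, not a model of any particular chip. It assumes, for illustration only, that each signed weight is stored as a pair of non-negative device conductances (G_plus, G_minus), that inputs are applied as voltages, and that each output is the summed current on a shared line. The function names, the conductance range g_max, and the Gaussian programming noise are hypothetical choices and are not taken from the text above.

```python
import random


def program_conductances(weights, g_max=25.0, noise_std=0.0, seed=0):
    """Map signed weights to conductance pairs (hypothetical scheme).

    Each weight w becomes G_plus - G_minus, with both values kept in
    [0, g_max] (arbitrary units). Optional Gaussian noise mimics
    imprecise device programming; this noise model is an assumption.
    """
    rng = random.Random(seed)
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_max  # conductance units per unit of weight

    def noisy(g):
        # Add programming noise, then clip to the allowed device range.
        return min(max(g + rng.gauss(0.0, noise_std), 0.0), g_max)

    g_plus = [[noisy(max(w, 0.0) * scale) for w in row] for row in weights]
    g_minus = [[noisy(max(-w, 0.0) * scale) for w in row] for row in weights]
    return g_plus, g_minus, scale


def analog_mvm(g_plus, g_minus, scale, x):
    """Compute y = W x from the stored conductances.

    Each product is a device current (conductance times input voltage)
    and each sum is the total current on an output line. The weights
    stay where they are stored; only inputs and outputs move.
    """
    y = []
    for gp_row, gm_row in zip(g_plus, g_minus):
        current = sum((gp - gm) * v for gp, gm, v in zip(gp_row, gm_row, x))
        y.append(current / scale)  # convert back to weight units
    return y


weights = [[0.5, -1.0], [0.25, 0.75]]
x = [1.0, 2.0]

# Ideal devices: the result matches the exact product W x.
y_ideal = analog_mvm(*program_conductances(weights), x)

# Noisy devices: the result is only approximately correct.
y_noisy = analog_mvm(*program_conductances(weights, noise_std=0.5), x)

print("ideal:", y_ideal)
print("noisy:", y_noisy)
```

With noise_std set to zero the sketch reproduces the exact product, while a non-zero value shows how device-level imprecision turns into lower-precision outputs, which is the trade-off that the analog approach accepts in exchange for avoiding weight movement.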
We have designed and fabricated, in collaboration with other members of the IBM Research AI Hardware Center, the most advanced IMC compute chips, or Analog AI chips, for deep learning to date.
We have assembled a highly multidisciplinary team that tackles the entire stack, namely the memory technology, mixed-signal circuit design, system-level architecture, the software stack, and applications. As far as memory technology is concerned, our primary focus is on improving the compute density, compute precision, and weight capacity. There is also substantial research effort aimed at improving the energy efficiency of the peripheral circuitry associated with the IMC cores. The overall system-level architecture of a multi-core IMC chip, as well as the appropriate communication fabric, is being actively researched. The software stack, which spans from the hardware-aware training libraries down to the compiler, is being actively developed. Finally, on the application front, we explore neural architectures that are particularly well suited for this new kind of accelerator. We are also exploring applications that transcend conventional deep learning inference, such as more bio-realistic DNNs and neuro-vector symbolic architectures.
Horizon Europe EIC Pathfinder HYBRAIN
Horizon Europe EIC Pathfinder BioPIM
European Research Council (ERC) Consolidator grant
Jain et al., “A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh”, IEEE Trans. VLSI (2023)
Le Gallo et al., “Using the IBM analog in-memory hardware acceleration kit for neural network training and inference”, APL Machine Learning (2023)
Rasch et al., “Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators”, Nature Comm. (2023)
Le Gallo et al., “A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference”, Nature Electr. (2023)
Langenegger et al., “In-memory factorization of holographic perceptual representations”, Nature Nanotech. (2023)
Sarwat et al., “Phase-change memtransistive synapses for mixed-plasticity neural computations”, Nature Nanotech. (2022)
Lanza et al., “Memristive technologies for data storage, computation, encryption, and radio-frequency communication”, Science (2022)
Feldmann et al., “Parallel convolutional processing using an integrated photonic tensor core”, Nature (2021)
Sarwat et al., “Projected mushroom type phase-change memory”, Adv. Func. Mat. (2021)
Sebastian et al., “Memory devices and applications for in-memory computing”, Nature Nanotech. (2020)
Karunaratne et al., “In-memory hyperdimensional computing”, Nature Electr. (2020)
Le Gallo and Sebastian, “An overview of phase-change memory device physics”, J. Phys. D: Appl. Phys. (2020)
Joshi et al., “Accurate deep neural network inference using computational phase-change memory”, Nature Comm. (2020)
Boybat et al., “Neuromorphic computing with multi-memristive synapses”, Nature Comm. (2018)