Publication
ISCAS 2022
Conference paper
Analog-memory-based 14nm Hardware Accelerator for Dense Deep Neural Networks including Transformers
Abstract
Analog non-volatile memory (NVM)-based accelerators for deep neural networks perform high-throughput and energy-efficient multiply-accumulate (MAC) operations (e.g., high TeraOPS/W) by taking advantage of massively parallelized analog MAC operations, implemented with Ohm's law and Kirchhoff's current law on array-matrices of resistive devices. While the wide-integer and floating-point operations offered by conventional digital CMOS computing are much more suitable than analog computing for conventional applications that require high accuracy and true reproducibility, deep neural networks can still provide competitive end-to-end results even with modest (e.g., 4-bit) precision in synaptic operations. In this paper, we describe a 14-nm inference chip, comprising multiple 512 × 512 arrays of Phase Change Memory (PCM) devices, which can deliver software-equivalent inference accuracy for MNIST handwritten-digit recognition and recurrent LSTM benchmarks by using compensation techniques to finesse analog-memory challenges such as conductance drift and noise. We also project accuracy for Natural Language Processing (NLP) tasks performed with a state-of-the-art large Transformer-based model, BERT, when mapped onto an extended version of this same fundamental chip architecture.
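To make the analog-MAC idea concrete, here is a minimal numerical sketch, not taken from the paper, of how a 512 × 512 PCM tile performs a matrix-vector multiply: each weight is programmed as a differential pair of 4-bit conductances, each device contributes a current by Ohm's law, and Kirchhoff's current law sums the currents along a column. The constants (G_MAX, the drift exponent DRIFT_NU, the read-noise level) and the global drift-compensation rescaling are illustrative assumptions, not the chip's actual calibration scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and device constants, chosen only for illustration.
ROWS, COLS = 512, 512          # one PCM tile, as in the abstract
G_MAX = 25e-6                  # assumed maximum device conductance (S)
DRIFT_NU = 0.05                # assumed PCM drift exponent
T0, T = 1.0, 3600.0            # programming-time reference and read time (s)

def program_weights(W):
    """Map signed weights to a differential pair of 4-bit conductances."""
    levels = 2**4 - 1
    w_max = np.abs(W).max()
    q = np.round(np.clip(W / w_max, -1.0, 1.0) * levels) / levels
    g_pos = np.where(q > 0,  q, 0.0) * G_MAX
    g_neg = np.where(q < 0, -q, 0.0) * G_MAX
    return g_pos, g_neg, w_max

def drift(g, t, t0=T0, nu=DRIFT_NU):
    """PCM conductance drift: G(t) = G(t0) * (t / t0) ** (-nu)."""
    return g * (t / t0) ** (-nu)

def analog_mac(x, g_pos, g_neg, w_max):
    """Ohm's law per device, Kirchhoff current summation per column."""
    v = x                                  # activations applied as read voltages
    i = v @ (g_pos - g_neg)                # column currents (ideal readout)
    i += rng.normal(0.0, 1e-8, i.shape)    # additive read noise (assumed level)
    return i * (w_max / G_MAX)             # convert currents back to weight units

# Tiny end-to-end check against the ideal digital MAC.
W = rng.standard_normal((ROWS, COLS)) * 0.1
x = rng.standard_normal(ROWS)

g_pos, g_neg, w_max = program_weights(W)
y_drifted = analog_mac(x, drift(g_pos, T), drift(g_neg, T), w_max)

# Global drift compensation: rescale outputs by the known drift factor.
y_comp = y_drifted * (T / T0) ** DRIFT_NU

ref = x @ W
print("rel. error, uncompensated:", np.linalg.norm(y_drifted - ref) / np.linalg.norm(ref))
print("rel. error, compensated:  ", np.linalg.norm(y_comp - ref) / np.linalg.norm(ref))
```

In this sketch the compensation is a single scale factor per read, which undoes the uniform part of the conductance decay; the residual error then comes from 4-bit quantization and read noise, mirroring the kinds of analog-memory imperfections the abstract says the chip must finesse.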