Conference paper

A framework for analog-digital mixed-precision neural network training and inference

Abstract

Recent advances in AI hardware highlight the potential of mixed-signal accelerators, which combine analog computation of matrix multiplications with reduced-precision digital operations to achieve superior performance and energy efficiency. In this paper, we present a framework for hardware-aware training and inference evaluation of neural networks (NNs) on such accelerators. The framework extends an existing toolkit, the IBM Analog Hardware Acceleration Kit (AIHWKit), with a quantization library, enabling flexible layer-wise deployment in either analog or digital units, the latter with configurable precision and quantization options. The combined framework supports simultaneous quantization- and analog-aware training as well as post-training calibration routines, and it can evaluate the accuracy of NNs deployed on mixed-signal accelerators. We demonstrate the need for such a framework through ablation studies on a ResNet-based vision model and a BERT-based language model, highlighting the importance of its functionality for maximizing accuracy at deployment. Our contribution is open-sourced as part of the core code of AIHWKit [1].
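
To make the layer-wise analog/digital partitioning described above concrete, the following is a minimal sketch written against AIHWKit's public PyTorch modules (AnalogLinear, InferenceRPUConfig, AnalogSGD). It is an assumption-level illustration of the general idea, not the paper's exact configuration, and it does not show the quantization extension introduced in this work.

```python
# Minimal sketch (assumed usage): map one layer to an analog tile, keep another
# digital, and train with AIHWKit's analog-aware optimizer. Illustrative only;
# the paper's quantization extension for digital layers is not shown here.
import torch
from torch import nn

from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()  # analog inference tile configuration

model = nn.Sequential(
    AnalogLinear(64, 128, rpu_config=rpu_config),  # deployed on an analog tile
    nn.ReLU(),
    nn.Linear(128, 10),  # kept as a digital layer
)

optimizer = AnalogSGD(model.parameters(), lr=0.05)
optimizer.regroup_param_groups(model)  # register analog tiles with the optimizer

# One hardware-aware training step on random data.
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```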