DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator
Abstract
The ubiquitous adoption of systems specialized for AI requires reconciling two seemingly conflicting demands: delivering extreme processing efficiency while retaining familiar programming interfaces that make these systems compelling even for non-expert users. We take a significant first step towards this goal and present an end-to-end software stack for the RaPiD AI accelerator developed by IBM Research. We present a set of software extensions, called DeepTools, that leverage and work within popular deep learning frameworks. DeepTools requires no additional user input and enables aggressive, accelerator-specific performance optimization akin to a full, custom framework. DeepTools has two key components: 1) a compiler runtime called DeepRT, which automatically identifies how best to execute a given DNN graph on RaPiD and constructs the requisite program binaries; and 2) an execution runtime called RaPiDLib, which triggers and manages the execution of compute and data-transfer operations on RaPiD. We integrate DeepTools with TensorFlow and map popular DNNs (AlexNet, VGG, ResNet, LSTM) to RaPiD. We demonstrate substantial improvement in performance over hand-tuned mappings.