Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators. Swagath Venkataramani, Jungwook Choi, et al. IISWC 2019.
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator. Swagath Venkataramani, Jungwook Choi, et al. IEEE Micro, 2019.
Dynamic Spike Bundling for Energy-Efficient Spiking Neural Networks. Sarada Krithivasan, Sanchari Sen, et al. ISLPED 2019.
BiScaled-DNN: Quantizing long-tailed datastructures with two scale factors for deep neural networks. Shubham Jain, Swagath Venkataramani, et al. DAC 2019.
SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks. Sanchari Sen, Shubham Jain, et al. IEEE TC, 2019.
A Compiler for Deep Neural Network Accelerators to Generate Optimized Code for a Wide Range of Data Parameters from a Hand-crafted Computation Kernel. Eri Ogawa, Kazuaki Ishizaki, et al. COOL CHIPS 2019.
Data Subsetting: A Data-Centric Approach to Approximate Computing. Younghoon Kim, Swagath Venkataramani, et al. DATE 2019.
A Scalable Multi-TeraOPS Core for AI Training and Inference. Sunil Shukla, Bruce Fleischer, et al. IEEE SSC-L, 2018.
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference. Bruce Fleischer, Sunil Shukla, et al. VLSI Circuits 2018.
DyHard-DNN: Even more DNN acceleration with dynamic hardware reconfiguration. Mateja Putic, Alper Buyuktosunoglu, et al. DAC 2018.
US10565285: Processor And Memory Transparent Convolutional Lowering And Auto Zero Padding For Deep Neural Network Implementations. Granted 17 Feb 2020.
Mori Ohara. Deputy Director, IBM Research Tokyo; Distinguished Engineer; Chief SW Engineer for Hybrid Cloud on IBM HW.