Irem Boybat-Kara
IEDM 2023
Recent advances in AI systems (software and hardware) have driven unprecedented growth in demand for AI-infused computational models across a very broad range of application domains. Predictably, challenges in the efficiency, reliability, security (including privacy), trustworthiness and safety of AI systems have emerged concurrently as major themes of R&D. Closely tied to the challenge of energy efficiency is the issue of sustainable cost, as evidenced by the recent sensation caused by the emergence of DeepSeek AI. Agile co-design of hardware and software for AI is a key element of efficiency enhancement and of the drive towards sustainable AI compute. To this end, this tutorial will focus on the agile design of secure and resilient AI (SARA) systems.
Our methodology builds on prior work on agile domain-specific system-on-chip (DSSoC) design carried out during the 5-year, DARPA-sponsored EPOCHS project, which was led by IBM with Columbia University participating as one of the key university partners. At the core of the EPOCHS project is ESP, an open-source platform for heterogeneous SoC design from Columbia University. By combining a scalable, modular, tile-based architecture with a flexible system-level design methodology, ESP simplifies the design of individual accelerators and automates their hardware/software integration into complete SoCs. This tutorial will begin with a hands-on introduction to ESP. We will detail how ESP was used to design two complex heterogeneous SoCs [3, 5] during the EPOCHS project, and we will demonstrate the software stack that runs on our hardware prototypes. We will also cover the novel distributed hardware power management architecture [5, 6], detailing its pre-silicon modeling and design challenges.
In the second part of the tutorial, we will pivot to the SARA application domain, detailing key challenges in “efficient resilience” such as side-channel attack mitigation and data security [9], as well as shortfalls in data integrity or inferential accuracy under low-power constraints [10-12]. We will then present our ongoing work to design large systems with support for data-secure AI (e.g., the design proposals [7, 8] from IBM, or other recent FHE hardware accelerator papers published at top-tier architecture conferences). This work includes enhancements to ESP that make its NoC-based infrastructure more flexible and performant, which is critical for these complex applications. We will also present the software stack that enables the deployment of SARA workloads across multiple accelerator instances; this covers both privacy-preserving computing and plaintext inference with CNN and Transformer networks. We will wrap up by presenting a vision for future SARA systems-in-package (SiPs) that are composed of multiple chiplets. A high-level outline of the proposed tutorial lecture plan is provided in Appendix-I of this proposal document.
SUMMARY REFERENCE LIST (Key papers co-authored by the speakers; many other papers in the field are covered, but not listed here)
[1] P. Mantovani et al. Agile SoC Development with Open ESP (invited paper). International Conference on Computer-Aided Design (ICCAD), 2020.
[2] J. Zuckerman et al. Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs. International Symposium on Microarchitecture (MICRO), 2021.
[3] T. Jia et al. A 12nm Agile-Designed SoC for Swarm-Based Perception with Heterogeneous IP Blocks, a Reconfigurable Memory Hierarchy, and an 800MHz Multi-Plane NoC. Proceedings of the 48th European Solid-State Circuits Conference (ESSCIRC), 2022.
[4] M. Cassel dos Santos et al. A Scalable Methodology for Agile Chip Development with Open-Source Hardware Components. Proceedings of the International Conference on Computer-Aided Design (ICCAD), 2022.
[5] M. Cassel dos Santos et al. A 12nm Linux-SMP-Capable RISC-V SoC with 14 Accelerator Types, Distributed Hardware Power Management and Flexible NoC-Based Data Orchestration. International Solid-State Circuits Conference (ISSCC), 2024.
[6] M. Cochet et al. BlitzCoin: Fully Decentralized Hardware Power Management for Accelerator-Rich SoCs. International Symposium on Computer Architecture (ISCA), 2024.
[7] Y. Park et al. Dramaton: A Near-DRAM Accelerator for Large Number Theoretic Transforms. IEEE Comput. Archit. Lett. 23(1): 108-111, 2024.
[8] Y. Park et al. FHENDI: A Near-DRAM Accelerator for Compiler-Generated Fully Homomorphic Encryption Applications. International Symposium on High Performance Computer Architecture (HPCA), 2025 (to appear).
[9] E. Aharoni et al. Efficient Pruning for Machine Learning Under Homomorphic Encryption. ESORICS (4), 2023.
[10] D. Stutz et al. Bit Error Robustness for Energy-Efficient DNN Accelerators. MLSys, 2021.
[11] Z. Wan et al. BERRY: Bit Error Robustness for Energy-Efficient Reinforcement Learning-Based Autonomous Systems. DAC, 2023: 1-6.
[12] Z. Wan et al. MulBERRY: Enabling Bit-Error Robustness for Energy-Efficient Multi-Agent Autonomous Systems. ASPLOS (2), 2024: 746-762.