Exploring the vulnerability of CMPs to soft errors with 3D stacked nonvolatile memory
Abstract
Improving the vulnerability to soft errors is one of the important design goals for future architecture design of Chip-MultiProcessors (CMPs). In this study, we explore the soft error characteristics of CMPs with 3D stacked NonVolatile Memory (NVM), in particular, the Spin-Transfer Torque Random Access Memory (STTRAM), whose cells are immune to radiation-induced soft errors and do not have endurance problems.We use 3D stacking as an enabler for modular integration of STT-RAM memories with minimum disruption in the baseline processor design flow, while providing further interconnection and capacity advantages. We take an in-depth look at alternative replacement schemes to explore the soft error resilience benefits and design trade-offs of 3D stacked STT-RAM and capture the multivariable optimization challenges microprocessor architectures face. We propose a vulnerability metric, with respect to the instruction and data in the core pipeline and through the cache hierarchy, to present a comprehensive system evaluation with respect to reliability, performance, and power consumption for our CMP architectures. Our experimental results show that, for the average workload, replacing memories with an STT-RAM alternative significantly mitigates soft errors on-chip, improves the performance by 14.15%, and reduces power consumption by 13.44%. © 2013 ACM.