On the Feasibility of Compressing Certifiably Robust Neural Networks
Abstract
Knowledge distillation is a popular approach to compress high-performance neural networks for use in resource-constrained environments. However, the threat of adversarial machine learning poses the question: Is it possible to compress adversarially robust networks and achieve similar or better adversarial robustness as the original network? In this paper, we explore this question with respect to certifiable robustness defenses, in which the defense establishes a formal robustness guarantee irrespective of the adversarial attack methodology. We present our preliminary findings answering two main questions: 1) Is the traditional knowledge distillation sufficient to compress certifiably robust neural networks? and 2) What aspects of the transfer process can we modify to improve the compression effectiveness? Our work represents the first study of the interaction between machine learning model compression and certifiable robustness