On the utility of conditional generation based mutual information for characterizing adversarial subspaces
Abstract
Recent studies have found that deep learning systems are vulnerable to adversarial examples; e.g., visually unrecognizable adversarial images can easily be crafted to result in misclassification. The robustness of neural networks has been studied extensively in the context of adversary detection, which compares a metric that exhibits strong discriminate power between natural and adversarial examples. In this paper, we propose to characterize the adversarial subspaces through the lens of mutual information (MI) approximated by conditional generation methods. We use MI as an information-theoretic metric to strengthen existing defenses and improve the performance of adversary detection. Experimental results on Mag-Net defense demonstrate that our proposed MI detector can strengthen its robustness against powerful adversarial attacks.