NeuralFP: Out-of-distribution detection using fingerprints of neural networks
Abstract
Edge devices use neural network models learnt on cloud to predict labels of its data records, which may lead to incorrect predictions especially for records that are different from the data involved in the training process, i.e., out-of-distribution (OOD) records. However, recent efforts in OOD detection either require the retraining of the model or assume the existence of a certain amount of OOD records, thus limiting their application in practice. In this work, we propose a novel OOD detection method (named as NeuralFP) without requiring any access to OOD records, which constructs non-linear fingerprints of neural network models memorizing the information of data observed during training. The key idea of NeuralFP is to exploit the difference in how the neural network model responds to data records in its training set versus data records that are anomalous. Specifically, NeuralFP builds autoencoders for each layer of the neural network model and then carefully analyzes the error distribution of the autocoders in reconstructing the training set to identify OOD records. Through extensive experiments on multiple real-world datasets, we show the effectiveness of NeuralFP in detecting OOD records as well as its advantages over previous approaches. Furthermore, we provide useful guidelines for parameter selection in the practical adoption of NeuralFP.