NeuralTran: Optimal data transformation for privacy-preserving machine learning by leveraging neural networks
Abstract
In this work, we develop a new data transformation technique that mediates privacy-preserving access to data while still supporting machine learning (ML) tasks. Specifically, we first use mutual information from information theory to quantify both the utility-providing information (corresponding to any ML task) and the private information (which may be arbitrary information specified by the users). We then convert the optimization of the utility-privacy tradeoff into the training of a novel neural network, named NeuralTran, which consists of three modules: a transformation module, a utility module, and a privacy module. NeuralTran automatically transforms the input data so that only the utility-providing information is kept while the private information is removed. Through extensive experiments on real-world datasets, we show the effectiveness of NeuralTran in balancing utility and privacy, as well as its advantages over previous approaches.
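
To make the mutual-information view of utility and privacy concrete, the following is a minimal sketch on toy discrete data. It is not the NeuralTran network itself: the two hand-written transformations, the toy records, and the tradeoff form I(T(X); U) - lam * I(T(X); S) with weight `lam` are all illustrative assumptions, since the abstract does not state the exact objective. The sketch only shows how a transformation that keeps the utility attribute and drops the private one would score better under such a tradeoff.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: each record is (utility_bit, private_bit),
# with the two bits independent and uniformly distributed.
records = [(u, s) for u in (0, 1) for s in (0, 1)] * 25
utility = [u for u, _ in records]
private = [s for _, s in records]


def identity(record):
    """Release the record unchanged (keeps both attributes)."""
    return record


def keep_utility_only(record):
    """Release only the utility bit (drops the private attribute)."""
    return record[0]


def tradeoff(transform, lam=1.0):
    """Assumed tradeoff: utility information minus lam times private information."""
    z = [transform(r) for r in records]
    return mutual_information(z, utility) - lam * mutual_information(z, private)


if __name__ == "__main__":
    for name, t in [("identity", identity), ("keep_utility_only", keep_utility_only)]:
        print(name, tradeoff(t))
```

In this toy setting the identity transformation leaks as much private information as it preserves utility information, while dropping the private bit preserves the utility information with no leakage. In the approach described above, the transformation is instead learned by a neural network rather than written by hand.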