Publication
IJCAI 2016
Conference paper
Weight features for predicting future model performance of deep neural networks
Abstract
Deep neural networks (DNNs) frequently require careful tuning of model hyperparameters. Recent research has shown that automated early termination of underperforming runs can speed up hyperparameter searches. However, these studies have used only the learning curve to predict the eventual model performance. In this study, we propose using weight features, extracted from network weights at an early stage of the learning process, as explanatory variables for predicting the eventual model performance. We conduct experiments on hyperparameter searches with various types of convolutional neural network architectures on three image datasets and apply the random forest method to predict the eventual model performance. The results show that using the weight features improves predictive performance compared with using the learning curve. In all three datasets, the most important feature for the prediction was related to weight changes in the last convolutional layers. Our findings demonstrate that, compared with the conventional use of learning curves, weight features can help construct prediction models with fewer training samples and terminate underperforming runs at an earlier stage of the DNN learning process, thus speeding up hyperparameter searches.
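To make the idea of weight features concrete, the following is a minimal sketch of how summary statistics might be computed from a layer's weights at an early epoch. The abstract does not list the exact features used, so the statistics chosen here (mean, standard deviation, norms, and change since initialization) are assumptions, and the function names `weight_features` and `run_features` are hypothetical.

```python
import math
import statistics


def weight_features(initial, early):
    """Summarize one layer's early-stage weights and their change since initialization.

    `initial` and `early` are flat, equal-length lists of floats (hypothetical inputs).
    The chosen statistics are illustrative assumptions, not the paper's feature set.
    """
    if len(initial) != len(early) or len(early) < 2:
        raise ValueError("need two equal-length weight lists with at least 2 values")
    delta = [e - i for i, e in zip(initial, early)]
    return {
        "mean": statistics.fmean(early),
        "std": statistics.pstdev(early),
        "l2_norm": math.sqrt(sum(w * w for w in early)),
        "delta_mean_abs": statistics.fmean([abs(d) for d in delta]),
        "delta_l2_norm": math.sqrt(sum(d * d for d in delta)),
    }


def run_features(learning_curve, layers):
    """Build one feature row: early learning-curve values plus per-layer weight features.

    `layers` is a list of (initial_weights, early_weights) pairs, one per layer.
    Feature values are appended in sorted key order so every row has the same layout.
    """
    row = list(learning_curve)
    for initial, early in layers:
        feats = weight_features(initial, early)
        row.extend(feats[key] for key in sorted(feats))
    return row


if __name__ == "__main__":
    # Toy example: two validation accuracies and a single two-weight "layer".
    print(run_features([0.5, 0.4], [([0.0, 0.0], [3.0, 4.0])]))
```

Rows built this way, one per hyperparameter configuration, could then be paired with the final accuracy of each completed run and passed to a random forest regressor (for example, scikit-learn's `RandomForestRegressor`) to predict the eventual performance of new runs.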