Abstract
We previously discussed how classifiers based on logistic regression and decision trees can be used for predicting the class of an observation. Unfortunately, when such classifiers are trained on a dataset in which one of the response classes is rare, they can underestimate the probability of observing a rare event — the greater the imbalance, the greater this small-sample bias. This month, we illustrate how to mitigate the negative effect of class imbalance on the training of classifiers.