Robert C. Durbeck
IEEE TACON
Defining outliers by their distance to neighboring data points has been shown to be an effective non-parametric approach to outlier detection. In recent years, many research efforts have looked at developing fast distance-based outlier detection algorithms. Several of the existing distance-based outlier detection algorithms report log-linear time performance as a function of the number of data points on many real low-dimensional datasets. However, these algorithms are unable to deliver the same level of performance on high-dimensional datasets, since their scaling behavior is exponential in the number of dimensions. In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional datasets. RBRP scales log-linearly as a function of the number of data points and linearly as a function of the number of dimensions. Our empirical evaluation demonstrates that we outperform the state-of-the-art algorithm, often by an order of magnitude. © 2008 Springer Science+Business Media, LLC.
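To make the notion of a distance-based outlier concrete, the sketch below scores each point by the distance to its k-th nearest neighbor and reports the highest-scoring points. This is one common formulation and is an assumption here, since the abstract does not say which definition the paper uses. The code is a naive quadratic-time baseline, not RBRP; the function names `knn_distance_scores` and `top_outliers` and the sample data are hypothetical.

```python
import heapq
import math


def knn_distance_scores(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor.

    Naive O(n^2 * d) baseline for illustration only; this is NOT RBRP.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Euclidean distances from p to every other point.
        dists = [math.dist(p, q) for j, q in enumerate(points) if j != i]
        # The largest of the k smallest distances is the k-th NN distance.
        scores.append(heapq.nsmallest(k, dists)[-1])
    return scores


def top_outliers(points, k, n):
    """Return indices of the n points with the largest k-th NN distance."""
    scores = knn_distance_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical sample data: a tight cluster plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_outliers(points, k=2, n=1))
```

Every point is compared against every other point, so the cost grows quadratically with the number of data points. That cost is what fast distance-based algorithms aim to avoid.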