Case studies in high-dimensional classification
Abstract
We consider the application of several compute-intensive classification techniques to two significant real-world applications: disk drive manufacturing quality control and the prediction of chronic problems in large-scale communication networks. These applications are characterized by very high dimensions, with hundreds of features or tens of thousands of cases. The results of several learning techniques are compared, including linear discriminants, nearest-neighbor methods, decision rules, decision trees, and neural nets. Both applications described in this article are good candidates for rule-based solutions because humans currently resolve these problems, and explanations are critical to determining the causes of faults. While several learning techniques achieved competitive results, machine learning with decision rule induction was most effective for these applications. It is demonstrated that decision (production) rule induction is practical in high dimensions, providing strong results and insightful explanations. © 1994 Kluwer Academic Publishers.