Abstract
Sparse PCA is a fundamental technique for obtaining combinations of features that explain the variance in high-dimensional datasets in an interpretable manner. Most prior works either analyze the single-principal-component case or assume that all PCs share the same support or have fully disjoint supports, assumptions that allow the orthogonality constraint to be omitted and simplify the problem dramatically. By reformulating sparse PCA as a sparsity- and rank-constrained optimization problem, we design exact, approximate, and feasible methods, together with second-order cone and semidefinite relaxations, that collectively obtain bound gaps on the order of 5% for real-world datasets with hundreds or thousands of features. We further demonstrate that enforcing the orthogonality and sparsity constraints simultaneously can improve the Area Under the ROC Curve by 14%-20% compared to deflation methods.
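To make the comparison concrete, the following is a minimal sketch of the greedy deflation baseline that the abstract contrasts against: each sparse PC is extracted one at a time (here via truncated power iteration with hard thresholding, one common heuristic for single-component sparse PCA), then projected out of the covariance. The function names, the choice of truncated power iteration, and the projection deflation scheme are illustrative assumptions, not the paper's actual algorithms; note that the resulting components need not be exactly orthogonal, which is the limitation the joint sparsity-and-orthogonality formulation addresses.

```python
import numpy as np

def sparse_pc(Sigma, k, iters=200, seed=0):
    """One sparse PC via truncated power iteration (illustrative heuristic):
    multiply by Sigma, keep the k largest-magnitude entries, renormalize."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(Sigma.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = Sigma @ x
        y[np.argsort(np.abs(y))[:-k]] = 0.0  # hard-threshold to k nonzeros
        x = y / np.linalg.norm(y)
    return x

def deflation_sparse_pca(Sigma, r, k):
    """Deflation baseline: extract r sparse PCs greedily, projecting each
    component out of the covariance after extraction. Orthogonality is only
    approximate and supports are handled one component at a time."""
    d = Sigma.shape[0]
    S = Sigma.copy()
    pcs = []
    for _ in range(r):
        x = sparse_pc(S, k)
        pcs.append(x)
        P = np.eye(d) - np.outer(x, x)  # projection deflation
        S = P @ S @ P
    return np.array(pcs)
```

Each returned component has unit norm and at most k nonzero loadings; the joint formulation in the abstract instead optimizes all r components simultaneously under explicit sparsity and orthogonality constraints.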