James S. Albus, George A. Bekey, et al.
Science
Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingency table analysis is co-clustering: simultaneous clustering of the rows and columns. A novel theoretical formulation views the contingency table as an empirical joint probability distribution of two discrete random variables and poses the co-clustering problem as an optimization problem in information theory: the optimal co-clustering maximizes the mutual information between the clustered random variables subject to constraints on the number of row and column clusters. We present an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages. Using the practical example of simultaneous word-document clustering, we demonstrate that our algorithm works well in practice, especially in the presence of sparsity and high dimensionality. Copyright 2003 ACM.
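The objective described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only evaluates the quantity being maximized, the mutual information that remains after rows and columns are merged into clusters. The function names (`mutual_information`, `aggregate`), the toy count table, and the two candidate cluster assignments are all assumptions made for illustration.

```python
from math import log2


def mutual_information(table):
    """Mutual information, in bits, of the joint distribution obtained
    by normalizing a table of non-negative counts."""
    total = sum(sum(row) for row in table)
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [sum(col) / total for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # zero cells contribute nothing
                p = count / total
                mi += p * log2(p / (row_marginals[i] * col_marginals[j]))
    return mi


def aggregate(table, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Sum the counts of all cells that fall in the same
    (row cluster, column cluster) pair."""
    clustered = [[0.0] * n_col_clusters for _ in range(n_row_clusters)]
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            clustered[row_labels[i]][col_labels[j]] += count
    return clustered


# Hypothetical word-document counts with an obvious two-block structure.
counts = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 5, 5],
    [0, 0, 5, 5],
]

# One co-clustering that follows the blocks, and one that cuts across them.
good = aggregate(counts, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
bad = aggregate(counts, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)

full_mi = mutual_information(counts)
good_mi = mutual_information(good)
bad_mi = mutual_information(bad)

print(f"original table:      {full_mi:.3f} bits")
print(f"block co-clustering: {good_mi:.3f} bits preserved")
print(f"mixed co-clustering: {bad_mi:.3f} bits preserved")
```

Clustering can never increase mutual information, so the gap between the original value and the clustered value measures what a given co-clustering loses. In this toy table the block-aligned assignment loses nothing, while the assignment that mixes the two blocks loses everything.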
Binny S. Gill, Dharmendra S. Modha
USENIX ATC 2005
Dharmendra S. Modha, Filipp Akopyan, et al.
HCS 2023
Steven K. Esser, Jeffrey L. McKinstry, et al.
ICLR 2020