The bionic DBMS is coming, but what will it look like?
Ryan Johnson, Ippokratis Pandis
CIDR 2013
Multiway data analysis deals with multiway arrays, i.e., tensors, and the goal is twofold: predicting missing entries by modeling the interactions between array elements, and discovering hidden patterns, such as clusters or communities in each mode. Despite the success of existing tensor factorization approaches, they are either unable to capture nonlinear interactions or too computationally expensive to handle massive data. In addition, most existing methods lack a principled way to discover latent clusters, which is important for a better understanding of the data. To address these issues, we propose a scalable nonparametric tensor decomposition model. It employs a Dirichlet process mixture (DPM) prior to model the latent clusters; it uses local Gaussian processes (GPs) to capture nonlinear relationships and to improve scalability. An efficient online variational Bayes Expectation-Maximization algorithm is proposed to learn the model. Experiments on both synthetic and real-world data show that the proposed model is able to discover latent clusters with higher prediction accuracy than competitive methods. Furthermore, the proposed model obtains significantly better predictive performance than the state-of-the-art large-scale tensor decomposition algorithm, GigaTensor, on two large datasets with billions of entries.
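The abstract names a Dirichlet process mixture prior over latent clusters but does not say how it is represented. As a rough illustration only, the sketch below uses the standard truncated stick-breaking construction of Dirichlet process weights and then assigns items (for example, the indices of one tensor mode) to clusters. The function names, the concentration value `alpha`, and the truncation level are assumptions made for this example and are not taken from the paper.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw cluster weights from a truncated stick-breaking construction.

    alpha and truncation are illustrative choices, not values from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a cluster
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # Beta(1, alpha) break point
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last cluster absorbs the leftover mass
    return weights


def assign_clusters(num_items, weights, rng):
    """Sample a cluster label for each item according to the weights."""
    return rng.choices(range(len(weights)), weights=weights, k=num_items)


rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)
labels = assign_clusters(num_items=20, weights=weights, rng=rng)
```

A smaller `alpha` concentrates mass on the first few clusters, while a larger one spreads it over more of them. This is how a prior of this kind lets the effective number of clusters adapt to the data rather than being fixed in advance.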