Publication
KDD 2021
Workshop paper
Short Text Clustering in Continuous Time Using Stacked Dirichlet-Hawkes Process with Inverse Cluster Frequency Prior
Abstract
Traditional models for short text clustering ignore the time information associated with the text documents. However, existing work has shown that the temporal characteristics of streaming documents are significant features for clustering. In this paper, we propose a stacked Dirichlet-Hawkes process with an inverse cluster frequency prior as a simple but effective solution for short text clustering using temporal features in continuous time. Based on the classical formulation of the Dirichlet-Hawkes process, our model provides an elegant, theoretically grounded, and interpretable solution while performing on par with recent state-of-the-art models in short text clustering.