Saurabh Paul, Christos Boutsidis, et al.
JMLR
Mining online discussions to extract answers is an important research problem. Previous methods used supervised classifiers trained on labeled data. However, collecting training data for each target forum is labor-intensive and time-consuming, which limits their deployment. A recent approach proposed extracting answers in an unsupervised manner by taking cues from their repetition. The assumption that answers are repeated, however, does not hold in many cases. In this paper, we propose two semi-supervised methods for extracting answers from discussions, which use the large amount of available unlabeled data alongside a very small training set to obtain improved accuracy. We show that performance can be boosted by introducing a related but parallel task: identifying acknowledgments of the answers. Our experiments show that the accuracy achieved by our approaches surpasses the baselines by a wide margin.
Saurabh Paul, Christos Boutsidis, et al.
JMLR
C.A. Micchelli, W.L. Miranker
Journal of the ACM
Ankur Gandhe, Rashmi Gangadharaiah
IJCNLP 2013
Joxan Jaffar
Journal of the ACM