Multiple Instance Learning on structured data
Dan Zhang, Yan Liu, et al.
NeurIPS 2011
We participated in the triage task for biomedical documents in the TREC Genomics track. In this paper we describe the methods we developed for the four triage subtasks. Logistic regression and support vector machine classifiers were first trained to generate ranked lists of test documents. A subset of the test documents was then labeled positive by selecting the top-k documents of each ranked list. Choosing a good value for k requires a thresholding strategy. We first describe two thresholding strategies, based on i) logistic regression and ii) support vector machines. We then describe a joint thresholding strategy that combines the outputs of logistic regression and the support vector machine.
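The top-k selection described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only: the document ids and scores are made up, and the abstract does not say how the joint strategy combines the two outputs, so the mean-rank rule in `joint_top_k` is an assumption, not the authors' method.

```python
def top_k(scores, k):
    """Return the ids of the k highest-scoring documents."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def joint_top_k(scores_a, scores_b, k):
    """Hypothetical joint rule: average each document's rank position
    in the two ranked lists and keep the k best mean ranks.
    Assumes both score dicts cover the same documents."""
    def ranks(scores):
        ordered = sorted(scores, key=scores.get, reverse=True)
        return {doc: pos for pos, doc in enumerate(ordered)}

    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    mean_rank = {doc: (rank_a[doc] + rank_b[doc]) / 2 for doc in scores_a}
    return set(sorted(mean_rank, key=mean_rank.get)[:k])


# Made-up scores for five test documents (not data from the paper).
lr_scores = {"d1": 0.91, "d2": 0.40, "d3": 0.75, "d4": 0.10, "d5": 0.60}
svm_scores = {"d1": 0.2, "d2": 1.3, "d3": 0.9, "d4": -1.1, "d5": -0.3}

k = 2  # the threshold; choosing it well is the problem the abstract raises
lr_positive = top_k(lr_scores, k)
svm_positive = top_k(svm_scores, k)
joint_positive = joint_top_k(lr_scores, svm_scores, k)
```

Here the two single-model lists disagree on one document at k = 2, and the joint rule favors the documents ranked highly by both models. In practice k would be tuned on held-out data against the task's evaluation measure rather than fixed by hand.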
Tapas Kanungo, David M. Mount, et al.
SCG 2002
Stephen Dill, Nadav Eiron, et al.
Web Semantics
Rie Kubota Ando, Mark Dredze, et al.
TREC 2005