Context-Specific Recommendation System for Predicting Similar PubMed Articles
Abstract
Prioritizing a database of items in response to a given query object is a fundamental task in information retrieval and machine learning. We examine a specific realization of this problem in the context of a collection of biomedical articles: given a query PubMed article, we investigate the problem of identifying and ranking recommended papers that are topically related to it. Existing methods for this task fall into two major classes: those based on Natural Language Processing (NLP) techniques (including algebraic analyses), and those that incorporate structural information among articles, such as their co-citation networks or content similarity. In this paper, we propose a statistically rigorous method, called Context-Specific Recommendation System (CSRS), along with associated algorithmic machinery, to integrate structural and context-based sources of information into a single context-specific interaction network. We use this specialized network to rank papers (nodes) by their similarity to query papers. Using a manually curated dataset of PubMed articles, we show that our method significantly outperforms other methods based on either the citation networks or the content similarity of articles. Our method provides a general framework that can be used to integrate other types of relationships into the recommendation process.