Intelligently querying incomplete instances for improving classification performance
Abstract
The problem of intelligently acquiring missing input information given a limited number of queries to enhance classification performance has gained substantial interest in the last decade or so. This is primarily due to the emergence of the targeted advertising industry, which is trying to best match products to its potential consumer base in the absence of complete consumer profile information. In this paper, we propose a novel active feature acquisition technique, to tackle this problem of instance completion prevalent in these domains. We show theoretically that our technique is optimal given the current classifier and derive a probabilistic lower bound on the error reduction achieved with our technique. We also show that a simplification of our technique is equivalent to the expected utility approach, which is one of the most sophisticated solutions for this problem in existing literature. We then demonstrate the efficacy of our approach through experiments on real data. Finally, we show that our technique can be easily extended to the scenario where we have a cost matrix associated with acquiring missing information for each instance or instance-feature combinations. Copyright 2013 ACM.