Feedback-driven multiclass active learning for data streams
Abstract
Active learning is a promising way to efficiently build up training sets with minimal supervision. Most existing methods consider the learning problem in a pool-based setting. However, in a lot of real-world learning tasks, such as crowd-sourcing, the unlabeled samples, arrive sequentially in the form of continuous rapid streams. Thus, preparing a pool of unlabeled data for active learning is impractical. Moreover, performing exhaustive search in a data pool is expensive, and therefore unsuitable for supporting on-the-fly interactive learning in large scale data. In this paper, we present a systematic framework for stream-based multi-class active learning. Following the reinforcement learning framework, we propose a feedback-driven active learning approach by adaptively combining different criteria in a time-varying manner. Our method is able to balance exploration and exploitation during the learning process. Extensive evaluation on various benchmark and real-world datasets demonstrates the superiority of our framework over existing methods. Copyright 2013 ACM.