Mining hidden constrained streams in practice: Informed search in dynamic filter spaces
Abstract
In this paper we tackle the recently proposed problem of hidden streams. In many situations, the data stream of interest is not directly accessible; instead, only part of the data can be accessed, by applying filters (e.g. keyword filtering). This is the case for the most discussed social stream today, Twitter. The problem is then how to retrieve as many relevant documents as possible by applying the most appropriate set of filters to the original stream while respecting a number of constraints (e.g. a maximum number of filters that can be applied). In this work we introduce a search approach on a dynamic filter space. We utilize heterogeneous filters (not only keywords) and make no assumptions about the attributes of the individual filters. We advance current research by considering realistically hard constraints, based on real-world scenarios that require tracking multiple dynamic topics. We demonstrate the effectiveness of our approaches on a set of topics of both static and dynamic nature. The development of the approach was motivated by a real application: our system is deployed in Dublin City's Traffic Management Center, where it allows city officers to analyze large sources of heterogeneous data and identify events related to traffic as well as emergencies.