Subscription Covering for Relevance-Based Filtering in Content-Based Publish/Subscribe Systems
Abstract
Large-scale applications require a scalable data dissemination service with advanced filtering capabilities. We propose the use of a content-based publish/subscribe system with support for top-k filtering in the context of such applications. We focus on the problem of top-k subscription filtering, where a publication is delivered only to the k highest scoring subscribers. The naive approach to perform filtering early at the publisher edge works only if complete knowledge of the subscriptions is available, which is not compatible with the well-established covering optimization in scalable content-based publish/subscribe systems. We propose an efficient rank-cover technique to reconcile top-k subscription filtering with covering. We extend the covering model to support top-k and describe a novel algorithm for forwarding subscriptions to publishers while maintaining correctness. Finally, we compare our solutions to a baseline covering system. In a typical setting, our optimized solution is scalable and provides over 81% of the covering benefit.