Data management in the MIRABEL smart grid system
Matthias Boehm, Lars Dannecker, et al.
EDBT/ICDT-WS 2012
A variety of schemes have been proposed in the literature to speed up query processing and analytics by incrementally maintaining a bounded-size uniform sample from a dataset in the presence of a sequence of insertion, deletion, and update transactions. These algorithms vary according to whether the dataset is an ordinary set or a multiset and whether the transaction sequence consists only of insertions or can include deletions and updates. We report on subtle non-uniformity issues that we found in a number of these prior bounded-size sampling schemes, including some of our own. We provide workarounds that can avoid the non-uniformity problem; these workarounds are easy to implement and incur negligible additional cost. We also consider the impact of non-uniformity in practice and describe simple statistical tests that can help detect non-uniformity in new algorithms. © 2013 Springer-Verlag Berlin Heidelberg.
Matthias Boehm, Lars Dannecker, et al.
EDBT/ICDT-WS 2012
Graham Cormode, Minos Garofalakis, et al.
Foundations and Trends in Databases
Peter J. Haas
Communications in Statistics. Part C: Stochastic Models
Paul G. Brown, Peter J. Haas
ICDE 2006