Paper
Probabilistic lossy counting: An efficient algorithm for finding heavy hitters
Abstract
Knowledge of the largest traffic flows in a network is important for many network management applications. The problem of finding these flows is known as the heavy-hitter problem and has been the subject of many studies in recent years. One of the most efficient and well-known algorithms for finding heavy hitters is lossy counting [29]. In this work we introduce probabilistic lossy counting (PLC), which enhances lossy counting for computing network traffic heavy hitters. PLC uses a tighter error bound on the estimated sizes of traffic flows and provides probabilistic rather than deterministic guarantees on its accuracy. The probabilistic error bound substantially reduces the memory consumption of the algorithm. In addition, PLC reduces the rate of false positives of lossy counting and achieves a low estimation error, although slightly higher than that of lossy counting. We compare PLC with state-of-the-art algorithms for finding heavy hitters. Our experiments using real traffic traces show that PLC has 1) between 34.4% and 74% lower memory consumption, 2) between 37.9% and 40.5% fewer false positives than lossy counting, and 3) a small estimation error.
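For context, the sketch below illustrates classic lossy counting, the baseline that PLC builds on. It is a minimal Python illustration, not the paper's algorithm: the class name `LossyCounter` and its parameters are hypothetical, and the PLC-specific probabilistic error bound is not reproduced here because the abstract does not state its formula.

```python
from math import ceil

class LossyCounter:
    """Minimal sketch of classic lossy counting (Manku & Motwani).

    epsilon is the deterministic error bound: every reported count
    underestimates the true count by at most epsilon * N, where N is
    the number of items seen so far. PLC (this paper) replaces this
    deterministic bound with a tighter probabilistic one, which is
    what reduces memory use; that variant is not shown here.
    """

    def __init__(self, epsilon=0.001):
        self.epsilon = epsilon
        self.n = 0                              # total items processed
        self.bucket_width = ceil(1 / epsilon)   # items per bucket
        self.current_bucket = 1
        # key -> (observed count, max possible undercount at insertion time)
        self.entries = {}

    def add(self, key):
        self.n += 1
        count, delta = self.entries.get(key, (0, self.current_bucket - 1))
        self.entries[key] = (count + 1, delta)
        # At each bucket boundary, prune entries that cannot be heavy hitters.
        if self.n % self.bucket_width == 0:
            self.entries = {k: (c, d) for k, (c, d) in self.entries.items()
                            if c + d > self.current_bucket}
            self.current_bucket += 1

    def heavy_hitters(self, support):
        """Report keys whose estimated frequency exceeds `support` (e.g. 0.01)."""
        threshold = (support - self.epsilon) * self.n
        return {k: c for k, (c, d) in self.entries.items() if c >= threshold}


# Example usage: count flow identifiers and report those above 20% of traffic.
lc = LossyCounter(epsilon=0.001)
for flow_id in ["a", "b", "a", "c", "a"]:
    lc.add(flow_id)
print(lc.heavy_hitters(support=0.2))
```

The memory savings reported for PLC come from pruning small entries more aggressively than the deterministic rule above allows, at the cost of a small, probabilistically bounded chance of underestimating a flow's size.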