About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
SoCC 2017
Conference paper
Reducing tail latencies in micro-batch streaming workloads
Abstract
Spark Streaming discretizes streams of data into micro-batches, each of which is further sub-divided into tasks and processed in parallel to improve job throughput. Previous work [2, 3] has lowered end-to-end latency in Spark Streaming. However, two causes of high tail latencies remain unaddressed: 1) data is not load-balanced across tasks, and 2) straggler tasks can increase end-to-end latency by 8 times more than the median task on a production cluster [1].We propose a feedback-control mechanism that allows frameworks to adaptively load-balance workloads across tasks according to their processing speeds. The task runtimes are thus equalized, lowering end-to-end tail latency. Further, this reduces load on machines that have transient resource bottlenecks, thus resolving the bottlenecks and preventing them from having an enduring impact on task runtimes.