Publication
SIGMETRICS/IFIP 2012
Conference paper

Delay tails in MapReduce scheduling

View publication

Abstract

MapReduce/Hadoop production clusters exhibit heavy-tailed characteristics for job processing times. These phenomena are resultant of the workload features and the adopted scheduling algorithms. Analytically understanding the delays under different schedulers for MapReduce can facilitate the design and deployment of large Hadoop clusters. The map and reduce tasks of a MapReduce job have fundamental difference and tight dependence between them, complicating the analysis. This also leads to an interesting starvation problem with the widely used Fair Scheduler due to its greedy approach to launching reduce tasks. To address this issue, we design and implement Coupling Scheduler, which gradually launches reduce tasks depending on map task progresses. Real experiments demonstrate improvements to job response times by up to an order of magnitude. Based on extensive measurements and source code investigations, we propose analytical models for the default FIFO and Fair Scheduler as well as our implemented Coupling Scheduler. For a class of heavy-tailed map service time distributions, i.e., regularly varying of index -a, we derive the distribution tail of the job processing delay under the three schedulers, respectively. The default FIFO Scheduler causes the delay to be regularly varying of index -a+1. Interestingly, we discover a criticality phenomenon for Fair Scheduler, the delay under which can change from regularly varying of index -a to -a+1, depending on the maximum number of reduce tasks of a job. Other more subtle behaviors also exist. In contrast, the delay distribution tail under Coupling Scheduler can be one order lower than Fair Scheduler under some conditions, implying a better performance. © 2012 ACM.

Date

Publication

SIGMETRICS/IFIP 2012

Authors

Share