Scheduling multiple queries on a parallel machine
Abstract
There has been a good deal of progress made recently towards the efficient parallelization of individual phases of single queries in multiprocessor database systems. In this paper we devise and evaluate a number of scheduling algorithms designed to handle multiple parallel queries. One of these algorithms emerges as a clear winner. This algorithm is hierarchical in nature: In the first phase, a good quality precedence-based schedule is created for each individual query and each possible number of processors. This component employs dynamic programming. In the second phase, the results of the first phase are used to create an overall schedule of the full set of queries. This component is based on previously published work on nonprecedence-based malleable scheduling. Even though the problem we are considering is NP-hard in the strong sense, the multiple query schedules generated by our hierarchical algorithm are seen experimentally to achieve results which are close to optimal.