Matching techniques in global schema design
Michael V. Mannino, Wolfgang Effelsberg
ICDE 1984
We present a novel preprocessing technique for distributed query optimization which achieves nearly all of the benefits that full database reduction by semijoin achieves, but for a small fraction of the cost. Most previous researchers have approached the problem of distributed query optimization with the idea of achieving maximum database reduction at any cost. We take an engineering approach-that, past a point of diminishing returns, database reduction is not cost-effective. We introduce a technique called 'partial reduction', which uses a sequence of 'bucket semijoins' to eliminate nearly all of the irrelevant tuples eliminated by a fully reducing sequence of semijoins. We provide a performance analysis for a simple binary semijoin compared to a bucket semijoin, and show that a bucket semijoin can achieve, say, 90% of the reduction that a semijoin can, but for, say, 10% of the cost. Although we used some simplifying assumptions in our analysis, we argue that relaxing these assumptions does not diminish our results in any way. We discuss the sensitivity of our technique to its parameters, and we show that our results improve for multiple join and inequality join queries.
Michael V. Mannino, Wolfgang Effelsberg
ICDE 1984
Sakti P. Ghosh
ICDE 1984
Kyu-Young Whang, Art Ammann, et al.
ACM Transactions on Information Systems (TOIS)