Computer Sciences Dept.

A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment

Donovan A Schneider and David J DeWitt

In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based methods. Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.

Download this report (PDF)

Return to tech report index

Computer Science | UW Home