Memory-efficient key/Foreign-key join size estimation via multiplicity and intersection size


Müller, Magnus ; Flachs, Daniel ; Moerkotte, Guido



DOI: https://doi.org/10.1109/ICDE51399.2021.00090
URL: https://store.computer.org/csdl/proceedings-articl...
Additional URL: https://www.researchgate.net/publication/352675505...
Document Type: Conference or workshop publication
Year of publication: 2021
Book title: 2021 IEEE 37th International Conference on Data Engineering (ICDE)
Page range: 984-995
Conference title: ICDE 2021
Location of the conference venue: Online
Date of the conference: 19.-22.04.2021
Publisher: Ailamaki, Anastasia ; Garofalakis, Minos N.
Place of publication: Chania
Publishing house: IEEE
ISBN: 78-1-7281-9185-0 , 978-1-7281-9184-3
ISSN: 1063-6382 , 2375-026X
Related URLs:
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science III (Moerkotte 1996-)
Subject: 004 Computer science, internet
Keywords (English): query processing , conferences , estimation , data structures , data engineering
Abstract: Join size estimation plays a crucial role in query optimization. In this paper, we present a technique to estimate the size of a key/foreign-key join of two filtered relations. We build on a model by Allen Van Gelder, in which there is no notion of join selectivity. Instead, the size of a join is estimated as a multiple of the intersection size of the join attributes. We present both a data structure to approximate the number of distinct values in a join attribute after a filter operation, and formulas to estimate the factor by which a join size exceeds the intersection size. In addition, we evaluate three existing intersection size estimation methods that are based on HyperLogLog sketches, to which our approach is closely linked. For both real-world and generated data sets, our estimator competes well, in terms of accuracy and memory footprint, against several industry-strength and state-of-the-art join size estimation methods. In particular, our experiments indicate that our approach is less prone to heavy underestimates.




Dieser Eintrag ist Teil der Universitätsbibliographie.




Metadata export


Citation


+ Search Authors in

+ Page Views

Hits per month over past year

Detailed information



You have found an error? Please let us know about your desired correction here: E-Mail


Actions (login required)

Show item Show item