Small selectivities matter: Lifting the burden of empty samples


Hertzschuch, Axel ; Moerkotte, Guido ; Lehner, Wolfgang ; May, Norman ; Wolf, Florian ; Fricke, Lars



DOI: https://doi.org/10.1145/3448016.3452805
URL: https://dl.acm.org/doi/10.1145/3448016.3452805
Additional URL: https://www.researchgate.net/publication/352535983...
Document Type: Conference or workshop publication
Year of publication: 2021
Book title: Proceedings of the 2021 International Conference on Management of Data : SIGMOD/PODS '21
Page range: 697-709
Conference title: SIGMOD/PODS '21
Location of the conference venue: Online
Date of the conference: 20.-25.06.2021
Publisher: Li, Guoliang ; Li, Zhanhuai
Place of publication: New York, NY
Publishing house: Association for Computing Machinery
ISBN: 978-1-4503-8343-1
Related URLs:
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science III (Moerkotte 1996-)
Subject: 004 Computer science, internet
Abstract: Every year more and more advanced approaches to cardinality estimation are published, using learned models or other data and workload specific synopses. In contrast, the majority of commercial in-memory systems still relies on sampling. It is arguably the most general and easiest estimator to implement. While most methods do not seem to improve much over sampling-based estimators in the presence of non-selective queries, sampling struggles with highly selective queries due to limitations of the sample size. Especially in situations where no sample tuple qualifies, optimizers fall back to basic heuristics that ignore attribute correlations and lead to large estimation errors. In this work, we present a novel approach, dealing with these 0-Tuple Situations. It is ready to use in any DBMS capable of sampling, showing a negligible impact on optimization time. Our experiments on real world and synthetic data sets demonstrate up to two orders of magnitude reduced estimation errors. Enumerating single filter predicates according to our estimates reveals 1.3 to 1.8 times faster query responses for complex filters.




Dieser Eintrag ist Teil der Universitätsbibliographie.




Metadata export


Citation


+ Search Authors in

+ Page Views

Hits per month over past year

Detailed information



You have found an error? Please let us know about your desired correction here: E-Mail


Actions (login required)

Show item Show item