The accuracy of cardinality estimators: Unraveling the evaluation result conundrum
Rashedi, Nazanin; Moerkotte, Guido
PDF: 3749646.3749651.pdf (Published Version, 2MB)
DOI: https://doi.org/10.14778/3749646.3749651
URL: https://dl.acm.org/doi/10.14778/3749646.3749651
Further URL: https://www.researchgate.net/publication/395274986...
URN: urn:nbn:de:bsz:180-madoc-716385
Document type: Journal article
Year of publication: 2025
Journal or series title: Proceedings of the VLDB Endowment
Volume: 18
Issue: 11
Page range: 3744-3756
Place of publication: New York, NY
Publisher: Association for Computing Machinery
ISSN: 2150-8097
Language of publication: English
Institution: Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Practical Computer Science III (Moerkotte 1996-)
Pre-existing license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Subject area: 004 Computer science

Abstract:
Existing research on the accuracy of cardinality estimators generally suffers from a lack of diversity and sufficient quantity in its experimental datasets, particularly in relation to the claimed scope of the study and the generality of its conclusions. We argue that a sufficiently large number of varied datasets is essential for comprehensive evaluations. However, the prevailing per-dataset evaluation method (PDE), producing one result table per dataset, has so far hindered this necessary expansion of the experiments. Moreover, as we demonstrate, this evaluation method often leaves the reader with contradictory results, where one estimator excels on certain datasets or queries, while another performs better elsewhere. To address these and similar limitations, we propose a multidimensional evaluation framework. This framework unravels the conundrum of analyzing the evaluation results across multiple datasets through the use of discretization. It establishes a robust foundation for aggregating the evaluation results and conducting pairwise comparisons between estimators. Furthermore, it facilitates informed decision making in the presence of conflicting results through a customizable ranking mechanism.
To empirically highlight the shortcomings of the aforementioned per-dataset evaluation and demonstrate the advantages of our proposed framework, we conduct a benchmarking study of cardinality estimators, incorporating both learned and traditional approaches.
We focus on a fundamental challenge: estimating the cardinality of range queries on a single 2-D geographical relation in a static environment. Despite the apparent simplicity of this task, our findings reveal that many estimators struggle to handle this challenge effectively. To further enhance the quality of our study, we provide valuable insights by addressing some critical aspects that were overlooked in previous benchmarking studies.
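The abstract describes discretizing per-query estimation errors so that results can be aggregated across many datasets and estimators can be compared pairwise under a customizable ranking. The paper's concrete discretization and ranking weights are not given here; the following Python sketch only illustrates the general idea, using the standard q-error metric with hypothetical bucket edges and penalty weights (`BUCKETS`, `weights`, and all function names are illustrative assumptions, not the authors' implementation):

```python
from collections import Counter

# Hypothetical q-error bucket edges: [1,2), [2,10), [10,100), [100,inf).
# The paper's actual discretization may differ.
BUCKETS = [2.0, 10.0, 100.0]

def q_error(estimate, truth):
    """Standard q-error: max(est/true, true/est), floored at 1 to avoid division by zero."""
    e, t = max(estimate, 1.0), max(truth, 1.0)
    return max(e / t, t / e)

def bucket(q):
    """Map a q-error to a discrete bucket index (0 = most accurate)."""
    for i, edge in enumerate(BUCKETS):
        if q < edge:
            return i
    return len(BUCKETS)

def profile(results):
    """Aggregate (estimate, truth) pairs, possibly from many datasets, into a bucket histogram."""
    return Counter(bucket(q_error(e, t)) for e, t in results)

def pairwise_better(profile_a, profile_b, weights=(0, 1, 4, 16)):
    """Compare two estimators via a customizable per-bucket penalty weighting."""
    cost = lambda p: sum(weights[b] * n for b, n in p.items())
    return cost(profile_a) < cost(profile_b)
```

Because each query contributes only a bucket index, histograms from different datasets can be summed directly, and changing `weights` changes the ranking without re-running any experiments, which is the kind of customizable decision making the abstract alludes to.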
This entry is part of the university bibliography.
The document is provided by the publication server of the University Library Mannheim.