alpha to omega: the G(r)eek alphabet of sampling
Moerkotte, Guido
;
Hertzschuch, Axel
URL:
|
https://madoc.bib.uni-mannheim.de/58070
|
Weitere URL:
|
http://cidrdb.org/cidr2020/papers/p25-moerkotte-ci...
|
URN:
|
urn:nbn:de:bsz:180-madoc-580704
|
Dokumenttyp:
|
Konferenzveröffentlichung
|
Erscheinungsjahr:
|
2020
|
Buchtitel:
|
CIDR 2020 : 10th Conference on Innovative Data Systems Research, Amsterdam, The Netherlands, January 12-15, 2020, Online Proceedings
|
Seitenbereich:
|
1-15
|
Veranstaltungstitel:
|
CIDR 2020
|
Veranstaltungsort:
|
Online
|
Veranstaltungsdatum:
|
12.-15.01.2020
|
Ort der Veröffentlichung:
|
Amsterdam
|
Verlag:
|
CIDR
|
Sprache der Veröffentlichung:
|
Englisch
|
Einrichtung:
|
Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Practical Computer Science III (Moerkotte 1996-)
|
Bereits vorhandene Lizenz:
|
Creative Commons Namensnennung 3.0 Unported (CC BY 3.0)
|
Fachgebiet:
|
004 Informatik
|
Abstract:
|
Sampling is the most versatile and easiest to implement car-dinality estimation method. Therefore, it is implemented inalmost every database management system, commercial ornot. Consequently, the main purpose of this paper is to pro-vide the reader with an intuition about sampling precision.In the context of query optimization, the basic procedurecan be described as follows. From a relationRcontainingntuples, a sample ofm < ntuples is drawn. Then, a querypredicatepis evaluated on themsample tuples, and thenumberkof qualifying sample tuples is recorded. Assumethe evaluation of the same predicatepon the relationRresults inlqualifying tuples. The task now is to producean estimateˆlforlwheren,m,kare given. The standardanswer to this task isˆl=knm. However, there are some (yetunanswered) fundamental questions:1. Is the standard estimator the best way to derive anestimate?2. What are the upper and lower bounds forl?3. How can we derive an estimate that minimizes the q-error?4. How large is the q-error we can expect for this esti-mate?5. For a given maximal allowed q-error, which sample sizemshould we choose?Since sampling is a probabilistic process, we will give prob-abilistic answers to these questions. Further, we show howresult cardinality estimates for selections and joins can significantly be improved.
|
Zusätzliche Informationen:
|
Originalt.: α to ω
|
| Dieser Eintrag ist Teil der Universitätsbibliographie. |
| Das Dokument wird vom Publikationsserver der Universitätsbibliothek Mannheim bereitgestellt. |
Suche Autoren in
Sie haben einen Fehler gefunden? Teilen Sie uns Ihren Korrekturwunsch bitte hier mit: E-Mail
Actions (login required)
|
Eintrag anzeigen |
|
|