Using low-discrepancy points for data compression in machine learning: an experimental comparison

Göttlich, Simone ; Heieck, Jacob ; Neuenkirch, Andreas

PDF: s13362-024-00166-5.pdf - Published version (2 MB)

DOI:	https://doi.org/10.1186/s13362-024-00166-5
URL:	https://mathematicsinindustry.springeropen.com/art...
Additional URL:	https://www.researchgate.net/publication/387719812...
URN:	urn:nbn:de:bsz:180-madoc-686621
Document type:	Journal article
Year of publication:	2025
Journal or series title:	Journal of Mathematics in Industry
Volume:	15
Issue:	1
Pages:	1-25
Place of publication:	Berlin ; Heidelberg
Publisher:	Springer
ISSN:	2190-5983
Language of publication:	English
Institution:	Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Wirtschaftsmathematik II: Stochastische Numerik (Neuenkirch 2013-) ; Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Scientific Computing (Göttlich 2011-)
Existing license:	Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject:	510 Mathematics
Subject classification:	MSC: 41A99, 65C05, 65D15, 68T07
Keywords (English):	data reduction, low-discrepancy points, quasi-Monte Carlo, digital nets, k-means algorithm, neural networks
Abstract:	Low-discrepancy points (also called Quasi-Monte Carlo points) are deterministically and cleverly chosen point sets in the unit cube, which provide an approximation of the uniform distribution. We explore two methods based on such low-discrepancy points to reduce large data sets in order to train neural networks. The first one is the method of Dick and Feischl (J Complex 67:101587, 2021), which relies on digital nets and an averaging procedure. Motivated by our experimental findings, we construct a second method, which again uses digital nets, but Voronoi clustering instead of averaging. Both methods are compared to the supercompress approach of (Stat Anal Data Min ASA Data Sci J 14:217–229, 2021), which is a variant of the K-means clustering algorithm. The comparison is done in terms of the compression error for different objective functions and the accuracy of the training of a neural network.
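
Illustration (not part of the published record): the general idea of compressing a data set onto low-discrepancy points can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of Dick and Feischl and not the authors' second method: it assumes the features already lie in the unit cube, uses unscrambled Sobol' points from SciPy as the digital net, assigns each data point to its nearest net point (a Voronoi assignment), and averages within each cell. The function name `compress_with_sobol` and its parameters are hypothetical.

```python
import numpy as np
from scipy.stats import qmc
from scipy.spatial import cKDTree


def compress_with_sobol(X, y, m=6):
    """Reduce (X, y), with X in [0, 1]^d, to at most 2**m weighted points.

    Hypothetical helper for illustration only; the paper's methods differ
    in how cells and weights are constructed.
    """
    d = X.shape[1]
    # Unscrambled Sobol' sequence: the first 2**m points form a digital net.
    centers = qmc.Sobol(d=d, scramble=False).random_base2(m=m)
    # Voronoi assignment: index of the nearest net point for every sample.
    _, labels = cKDTree(centers).query(X)
    counts = np.bincount(labels, minlength=len(centers))
    keep = counts > 0  # drop cells that received no data
    # Sum features and responses per cell, then divide by the cell size.
    X_sum = np.zeros_like(centers)
    np.add.at(X_sum, labels, X)
    y_sum = np.bincount(labels, weights=y, minlength=len(centers))
    X_red = X_sum[keep] / counts[keep, None]
    y_red = y_sum[keep] / counts[keep]
    return X_red, y_red, counts[keep]
```

The returned counts can serve as sample weights when a model is trained on the reduced set, so that densely populated cells keep their influence on the loss.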

	This entry is part of the university bibliography.
	The document is provided by the publication server of the Mannheim University Library.
