Novel techniques for accelerating statistical operations on compressed genomic data

Freudenberg, Alexander

Full text: Dissertation_Freudenberg.pdf (PDF, published version, 1MB)

URN: urn:nbn:de:bsz:180-madoc-652264
Document Type: Doctoral dissertation
Year of publication: 2023
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Schlather, Martin
Date of oral examination: 1 September 2023
Publication language: English
Institution: School of Business Informatics and Mathematics > Applied Stochastics (Schlather 2012-)
Subject: 500 Science
Keywords (English): high-performance computing, quantitative genetics, genomics, SNP data, GPU programming
Abstract: Over the last decades, the availability of genetic data has exploded and genomic information is widely used in a variety of fields today. While the cost of genotyping and sequence assembly has been steadily decreasing, software in quantitative genetics has been struggling to keep up with increasing computational demands. Many existing software solutions use strategies for shared-memory parallelism and instruction-level parallelism. However, partly due to a lack of suitable hardware instructions, the dissemination of software that utilizes accelerator hardware has been limited. In this thesis, novel methods for the efficient processing of genomic data are presented. By utilizing low-precision integer instructions on modern NVIDIA® GPUs, the necessity to decompress SNP data for statistical evaluations is avoided. Due to the memory efficiency of compressed genomic storage formats, datasets of large populations with a high number of SNPs can be analyzed on a single datacenter GPU. The benefits of these new techniques are demonstrated through examples of important quantities in quantitative genetics. First, it is shown that the analytical calculation of population statistics, such as the genomic relationship matrix or linkage disequilibrium, is significantly accelerated compared to existing methods. Second, the numerical evaluation of a single-step BLUP model is used to demonstrate that the use of accelerators can significantly reduce computing times required for estimating genetic values based on iterative-solver methods. Lastly, it is illustrated that the estimation of parameters for an important covariance model can be significantly improved.
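
Illustrative sketch (not taken from the dissertation): the abstract's central idea, evaluating statistics on SNP data without decompressing it, can be shown on a small scale with plain integer bit operations. The example below is a CPU-side analogue in Python, not the GPU method of the thesis. It assumes genotypes coded as 0, 1 or 2 and stored as two bit planes (low bit and high bit); this encoding and the function names `pack_bitplanes`, `popcount` and `packed_dot` are hypothetical choices made for illustration. The inner product of two individuals' genotype vectors, which is the basic building block of an uncentered relationship matrix, is then obtained from four AND-and-popcount operations.

```python
def pack_bitplanes(genotypes):
    """Pack a sequence of 0/1/2 genotypes into two integer bit masks.

    Bit i of `low` holds the low bit of genotype i, bit i of `high`
    holds its high bit, so that genotype = 2 * high_bit + low_bit.
    (Assumed encoding for this sketch only.)
    """
    low = 0
    high = 0
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2):
            raise ValueError(f"genotype must be 0, 1 or 2, got {g!r}")
        low |= (g & 1) << i
        high |= ((g >> 1) & 1) << i
    return low, high


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def packed_dot(a, b):
    """Inner product of two packed genotype vectors, without unpacking.

    Expands (2*h_a + l_a) * (2*h_b + l_b) and sums each term with a
    bitwise AND followed by a popcount.
    """
    a_low, a_high = a
    b_low, b_high = b
    return (
        popcount(a_low & b_low)
        + 2 * popcount(a_low & b_high)
        + 2 * popcount(a_high & b_low)
        + 4 * popcount(a_high & b_high)
    )


# Usage: compare against the ordinary (decompressed) inner product.
a = [0, 1, 2, 2, 1, 0]
b = [2, 1, 2, 0, 1, 1]
reference = sum(x * y for x, y in zip(a, b))
result = packed_dot(pack_bitplanes(a), pack_bitplanes(b))
print(result == reference)
```

Applying `packed_dot` to every pair of individuals yields the uncentered cross-product matrix of the genotypes; centering and scaling by allele frequencies would still be needed to obtain a genomic relationship matrix.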

This entry is part of the university bibliography.

The document is provided by the publication server of the Universitätsbibliothek Mannheim.
