high-performance computing, quantitative genetics, genomics, SNP data, GPU programming
Abstract:
Over the past decades, the availability of genetic data has grown rapidly, and genomic information is now widely used in a variety of fields. While the cost of genotyping and sequence assembly has been steadily decreasing, software in quantitative genetics has struggled to keep up with increasing computational demands. Many existing software solutions use strategies for shared-memory parallelism and instruction-level parallelism. However, partly due to a lack of suitable hardware instructions, software that utilizes accelerator hardware has seen only limited adoption.
In this thesis, novel methods for the efficient processing of genomic data are presented. By utilizing low-precision integer instructions on modern NVIDIA® GPUs, the necessity to decompress SNP data for statistical evaluations is avoided. Due to the memory efficiency of compressed genomic storage formats, datasets of large populations with a high number of SNPs can be analyzed on a single datacenter GPU. The benefits of these new techniques are demonstrated through examples of important quantities in quantitative genetics. First, it is shown that the analytical calculation of population statistics, such as the genomic relationship matrix or linkage disequilibrium, is significantly accelerated compared to existing methods. Second, the numerical evaluation of a single-step BLUP model is used to demonstrate that the use of accelerators can significantly reduce computing times required for estimating genetic values based on iterative-solver methods. Lastly, it is illustrated that the estimation of parameters for an important covariance model can be significantly improved.
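The idea of evaluating statistics directly on compressed SNP data can be illustrated with a small CPU-side sketch. It is not the GPU implementation of the thesis: it assumes genotypes coded as 0, 1 or 2 and stored in 2 bits per SNP, and it swaps in a bit-plane and popcount technique in place of the low-precision integer GPU instructions mentioned above. All function names (`pack_genotypes`, `packed_dot`, `gram_matrix`) are hypothetical. The result is only the uncentered product of the genotype matrix with its transpose; a genomic relationship matrix would additionally require centering and scaling, which are omitted here.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1 or 2) into one integer, 2 bits per SNP."""
    packed = 0
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2):
            raise ValueError("genotype codes must be 0, 1 or 2")
        packed |= g << (2 * i)
    return packed


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def packed_dot(a, b, n_snps):
    """Dot product of two packed genotype vectors without unpacking them.

    Each genotype is g = 2*hi + lo, so for two genotypes
    g_a * g_b = 4*hi_a*hi_b + 2*(hi_a*lo_b + lo_a*hi_b) + lo_a*lo_b.
    Summing over SNPs reduces to four popcounts on masked bit planes.
    """
    low_mask = int("01" * n_snps, 2) if n_snps else 0
    a_lo, a_hi = a & low_mask, (a >> 1) & low_mask
    b_lo, b_hi = b & low_mask, (b >> 1) & low_mask
    return (4 * popcount(a_hi & b_hi)
            + 2 * (popcount(a_hi & b_lo) + popcount(a_lo & b_hi))
            + popcount(a_lo & b_lo))


def gram_matrix(individuals):
    """Uncentered M @ M.T for a list of equally long genotype vectors."""
    n_snps = len(individuals[0])
    packed = [pack_genotypes(ind) for ind in individuals]
    return [[packed_dot(x, y, n_snps) for y in packed] for x in packed]


if __name__ == "__main__":
    a = pack_genotypes([0, 1, 2, 2])
    b = pack_genotypes([2, 1, 0, 2])
    print(packed_dot(a, b, 4))  # 0*2 + 1*1 + 2*0 + 2*2 = 5
```

The packed representation needs a quarter of the memory of one byte per genotype, which is the same memory argument the abstract makes for fitting large populations on a single GPU.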
This entry is part of the university bibliography.
The document is provided by the publication server of the Universitätsbibliothek Mannheim (Mannheim University Library).