Knowledge graph embeddings: link prediction and beyond

Ruffinelli, Daniel

PDF: Dissertation_Daniel_Ruffinelli.pdf (Published, 1MB)

URN: urn:nbn:de:bsz:180-madoc-660210
Document Type: Doctoral dissertation
Year of publication: 2023
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Gemulla, Rainer
Date of oral examination: 22 November 2023
Publication language: English
Institution: School of Business Informatics and Mathematics > Practical Computer Science I: Data Analytics (Gemulla 2014-)
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science, internet
Keywords (English): knowledge graphs, representation learning, machine learning
Abstract: Knowledge graph embeddings, or KGEs, are models that learn vector representations of knowledge graphs. These representations have been used for tasks such as predicting missing links in the graph, or as pre-trained representations that encode structured data for downstream applications, such as question answering or recommender systems. Despite the large number of models developed for this purpose, the variety in experimental settings has made it difficult to compare results across studies: models are often trained using different training and hyperparameter optimization strategies. In addition, most of the literature has focused on a specific form of predicting missing links, known as link prediction. Almost no attention has been given to predicting other types of structures in a knowledge graph, and despite the use of KGE models in downstream applications, there are virtually no studies on their usability as pre-trained representations of knowledge graphs. In this thesis, we propose new training and evaluation methods and conduct several large-scale empirical studies, all aimed at studying KGE models as a form of knowledge representation. First, we compare model performance in a fair experimental setting that allows us to separate the contributions of new models from those of new training strategies. We find that differences in training approaches, and not necessarily in model architectures, may account for much of the previously reported progress in link prediction. Second, we study some potential limitations that may result from focusing almost exclusively on the link prediction task in KGE research. We find that good link prediction models are not necessarily able to successfully predict missing links in a knowledge graph, and that link prediction performance is not an indication that models generally capture information in the graph. This contradicts the common argument that KGE models are able to generally preserve the structure in a knowledge graph. Finally, we look beyond the link prediction task and study different training objectives aimed at capturing more information in the graph, and the impact that the resulting representations have on downstream applications. We find that models trained with the standard approach based on link prediction do not capture as much information about the graph as possible, and that link prediction performance is also not a good indicator of downstream performance. These results suggest that the relation between pre-training objectives and downstream performance is not as clear as suggested in the literature, and that more research is needed to better understand how to learn generally useful representations of knowledge graphs.

This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.
