Column property annotation using large language models
Korini, Keti; Bizer, Christian

DOI: https://doi.org/10.1007/978-3-031-78952-6_6
URL: https://link.springer.com/chapter/10.1007/978-3-03...
Further URL: https://www.researchgate.net/publication/388437234...
Document type: Conference publication
Year of publication: 2025
Book title: The Semantic Web: ESWC 2024 Satellite Events: Hersonissos, Crete, Greece, May 26–30, 2024, Proceedings, Part I
Journal or series title: Lecture Notes in Computer Science
Volume: 15344
Page range: 61-70
Event title: ESWC 2024, Extended Semantic Web Conference
Event location: Hersonissos, Crete, Greece
Event date: May 26–30, 2024
Editors: Meroño Peñuela, Albert; Corcho, Oscar; Groth, Paul; Simperl, Elena; Tamma, Valentina; Nuzzolese, Andrea Giovanni; Poveda-Villalón, Maria; Sabou, Marta; Presutti, Valentina; Celino, Irene; Revenko, Artem; Raad, Joe; Sartini, Bruno; Lisena, Pasquale
Place of publication: Berlin [et al.]
Publisher: Springer
ISBN: 978-3-031-78951-9, 978-3-031-78952-6
ISSN: 0302-9743, 1611-3349
Language of publication: English
Institution: School of Business Informatics and Mathematics > Information Systems V: Web-based Systems (Bizer 2012-)
Subject area: 004 Computer science
Keywords (English): table annotation, large language models, column property annotation
Abstract: Column property annotation (CPA), also known as column relationship prediction, is the task of predicting the semantic relationship between two columns in a table given a set of candidate relationships. CPA annotations are used in downstream tasks such as data search, data integration, or knowledge graph enrichment. This paper explores the usage of generative large language models (LLMs) for the CPA task. We experiment with different zero-shot prompts for the CPA task, which we evaluate using GPT-3.5, GPT-4, and the open-source model SOLAR. We find GPT-3.5 to be quite sensitive to variations of the prompt, while GPT-4 reaches a high performance independent of the prompt variation. We further explore the scenario where training data for the CPA task is available and can be used for selecting demonstrations or fine-tuning the model. We show that a fine-tuned GPT-3.5 model outperforms a RoBERTa model that was fine-tuned on the same data by 11% in F1. Comparing in-context learning via demonstrations with fine-tuning shows that the fine-tuned GPT-3.5 performs 9% better in F1 than the same model given demonstrations. The fine-tuned GPT-3.5 model also outperforms zero-shot GPT-4 by around 2% F1 on the dataset on which it was fine-tuned, while not generalizing to tasks that require a different vocabulary.
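To make the zero-shot setup described in the abstract concrete, the following is a minimal sketch of how a CPA prompt could be issued through the OpenAI chat API: two table columns are serialized row by row and the model is asked to pick exactly one relationship from a candidate set. The prompt wording, the candidate vocabulary, and the helper annotate_column_pair are illustrative assumptions, not the prompts or label sets evaluated in the paper.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical candidate relationship vocabulary; the paper's benchmarks use their own label sets.
CANDIDATE_RELATIONS = ["capital of", "population", "currency", "head of state"]

def annotate_column_pair(main_column, other_column):
    """Zero-shot CPA: ask the model which candidate relationship links the two columns."""
    rows = "\n".join(f"{a} | {b}" for a, b in zip(main_column, other_column))
    prompt = (
        "Classify the relationship between the second column and the first column "
        f"of the following table. Answer with exactly one of: {', '.join(CANDIDATE_RELATIONS)}.\n\n"
        + rows
    )
    response = client.chat.completions.create(
        model="gpt-4",  # or "gpt-3.5-turbo"; the paper also evaluates the open-source model SOLAR
        temperature=0,  # deterministic output for a classification-style task
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

# Example call; the expected answer for these columns would be "population".
print(annotate_column_pair(["Germany", "France"], ["83,200,000", "68,000,000"]))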
This entry is part of the university bibliography.