Using ChatGPT for Entity Matching
Peeters, Ralph; Bizer, Christian
DOI: https://doi.org/10.1007/978-3-031-42941-5_20
URL: https://link.springer.com/chapter/10.1007/978-3-03...
Additional URL: https://www.researchgate.net/publication/370594448...
Document type: Conference publication
Year of publication: 2023
Book title: New Trends in Database and Information Systems : ADBIS 2023 short papers, doctoral consortium and workshops: AIDMA, DOING, K-Gals, MADEISD, PeRS, Barcelona, Spain, September 4-7, 2023, Proceedings
Journal or series title: Communications in Computer and Information Science
Volume: 1850
Page range: 221-230
Event title: ADBIS 2023
Event location: Barcelona, Spain
Event date: September 4-7, 2023
Editors: Abelló, Alberto; Bugiotti, Francesca; Gamper, Johann; Romero, Oscar; Vargas Solar, Genoveva; Vassiliadis, Panos; Wrembel, Robert; Zumpano, Ester
Place of publication: Cham
Publisher: Springer
ISBN: 978-3-031-42940-8 (print), 978-3-031-42941-5 (eBook)
ISSN: 1865-0929, 1865-0937
Language of publication: English
Institution: School of Business Informatics and Mathematics > Information Systems V: Web-based Systems (Bizer 2012-)
Subject area: 004 Computer Science
Keywords (English): Entity Matching, Large Language Models, ChatGPT
Abstract:
Entity matching is the task of deciding whether two entity descriptions refer to the same real-world entity. State-of-the-art entity matching methods often rely on fine-tuning Transformer models such as BERT or RoBERTa. Two major drawbacks of using these models for entity matching are that (i) the models require significant amounts of fine-tuning data to reach good performance and (ii) the fine-tuned models are not robust with respect to out-of-distribution entities. In this paper, we investigate using ChatGPT for entity matching as a more robust, training-data-efficient alternative to traditional Transformer models. We perform experiments along three dimensions: (i) general prompt design, (ii) in-context learning, and (iii) provision of higher-level matching knowledge. We show that ChatGPT is competitive with a fine-tuned RoBERTa model, reaching a zero-shot performance of 82.35% F1 on a challenging matching task on which RoBERTa requires 2000 training examples to reach similar performance. Adding in-context demonstrations to the prompts further improves the F1 by up to 7.85% when using similarity-based example selection. Always using the same set of 10 handpicked demonstrations leads to an improvement of 4.92% over the zero-shot performance. Finally, we show that ChatGPT can also be guided by adding higher-level matching knowledge in the form of rules to the prompts. Providing matching rules leads to performance gains similar to those from in-context demonstrations.
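The abstract outlines three prompt dimensions: general prompt design, in-context learning, and the provision of matching rules. The following Python sketch illustrates what the zero-shot variant of such a prompt could look like against the OpenAI chat API; the prompt wording, the model name, and the match_zero_shot helper are illustrative assumptions, not the authors' exact setup.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def match_zero_shot(entity_a: str, entity_b: str) -> bool:
    # Zero-shot prompt: ask directly whether the two descriptions match.
    prompt = (
        "Do the following two entity descriptions refer to the same "
        "real-world entity?\n"
        f"Entity 1: {entity_a}\n"
        f"Entity 2: {entity_b}\n"
        "Answer with 'Yes' or 'No'."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption; the paper evaluates ChatGPT
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for a binary classification task
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

The in-context and rule-based variants described in the abstract would extend this prompt by prepending labeled demonstration pairs (selected by similarity to the test pair, or a fixed handpicked set) or a textual matching rule before the question.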
This entry is part of the university bibliography.