Fighting with the sparsity of the synonymy dictionaries for automatic synset induction
Ustalov, Dmitry; Chernoskutov, Mikhail; Panchenko, Alexander; Biemann, Chris
DOI: https://doi.org/10.1007/978-3-319-73013-4_9
URL: https://www.researchgate.net/publication/321976522...
Additional URL: https://link.springer.com/chapter/10.1007%2F978-3-...
Document type: Conference publication
Year of publication: 2018
Book title: Analysis of Images, Social Networks and Texts : 6th International Conference, AIST 2017, Moscow, Russia, July 27-29, 2017, Revised Selected Papers
Journal or series title: Lecture Notes in Computer Science
Volume: 10716
Page range: 94-105
Event title: Analysis of Images, Social Networks and Texts, AIST 2017
Event location: Moscow, Russia
Event date: July 27-29, 2017
Editor: Aalst, Wil M. P. van der
Place of publication: Berlin [et al.]
Publisher: Springer
ISBN: 978-3-319-73012-7, 978-3-319-73013-4
ISSN: 0302-9743, 1611-3349
Language of publication: Other
Institution: School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-)
Subject: 004 Computer science
Keywords (English): lexical semantics, word embeddings, synset induction, synonyms, word sense induction, sense embeddings
Abstract:
Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of sparsity of the synonymy dictionaries.
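The two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how missing edges are proposed or how cluster similarity is measured, so Jaccard overlap is assumed here as a stand-in for both, and the function names (`build_graph`, `add_missing_edges`, `merge_similar_clusters`) and the 0.5 thresholds are hypothetical. The clustering step itself (MaxMax or Watset) is left out; the second function simply takes whatever clusters such a method returns.

```python
from itertools import combinations


def build_graph(pairs):
    """Build an undirected synonymy graph as a dict of neighbour sets."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def jaccard(x, y):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def add_missing_edges(graph, threshold=0.5):
    """Pre-processing sketch: link unconnected words with similar neighbourhoods.

    Assumption: neighbourhood overlap stands in for whatever similarity
    signal is actually used to propose new edges.
    """
    # Score every candidate on the original graph so the result does not
    # depend on the order in which edges are added.
    new_edges = [
        (a, b)
        for a, b in combinations(sorted(graph), 2)
        if b not in graph[a] and jaccard(graph[a], graph[b]) >= threshold
    ]
    densified = {word: set(neighbours) for word, neighbours in graph.items()}
    for a, b in new_edges:
        densified[a].add(b)
        densified[b].add(a)
    return densified


def merge_similar_clusters(clusters, threshold=0.5):
    """Post-processing sketch: greedily merge clusters with high word overlap."""
    merged = [set(c) for c in clusters]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            if jaccard(merged[i], merged[j]) >= threshold:
                merged[i] |= merged[j]
                del merged[j]
                changed = True
                break  # indices are stale after the deletion, so rescan
    return merged
```

For example, a dictionary that lists only `("car", "auto")` and `("car", "automobile")` leaves "auto" and "automobile" unconnected; because both share the single neighbour "car", `add_missing_edges(build_graph(pairs))` links them. Overlap-based merging only has an effect when the clustering method can place a word in more than one cluster.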
This entry is part of the University Bibliography.
ORCID:
Ustalov, Dmitry: https://orcid.org/0000-0002-9979-2188
Panchenko, Alexander: https://orcid.org/0000-0001-6097-6118