Human and machine judgements for Russian semantic relatedness


Panchenko, Alexander ; Ustalov, Dmitry ; Arefyev, Nikolay ; Paperno, Denis ; Konstantinova, Natalia ; Loukachevitch, Natalia ; Biemann, Chris



DOI: https://doi.org/10.1007/978-3-319-52920-2_21
URL: https://arxiv.org/abs/1708.09702
Additional URL: https://www.lt.informatik.tu-darmstadt.de/fileadmi...
Document Type: Conference or workshop publication
Year of publication: 2017
Book title: Analysis of Images, Social Networks and Texts : 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers
Journal or series title: Communications in Computer and Information Science
Volume: 661
Page range: 221-235
Conference title: 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016
Location of the conference venue: Yekaterinburg, Russia
Date of the conference: April 7-9, 2016
Editor: Ignatov, Dmitry I.
Place of publication: Cham
Publishing house: Springer
ISBN: 978-3-319-52919-6, 978-3-319-52920-2
ISSN: 1865-0929, 1865-0937
Publication language: English
Institution: School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-)
Subject: 004 Computer science, internet
Keywords (English): Semantic similarity, Semantic relatedness, Evaluation, Distributional thesaurus, Crowdsourcing, Language resources
Abstract: Semantic relatedness of terms represents similarity of meaning by a numerical score. On the one hand, humans easily make judgements about semantic relatedness. On the other hand, this kind of information is useful in language processing systems. While semantic relatedness has been extensively studied for English using numerous language resources, such as associative norms, human judgements and datasets generated from lexical databases, no evaluation resources of this kind have been available for Russian to date. Our contribution addresses this problem. We present five language resources of different scale and purpose for Russian semantic relatedness, each being a list of triples (word_i, word_j, similarity_ij). Four of them are designed for evaluation of systems for computing semantic relatedness, complementing each other in terms of the semantic relation type they represent. These benchmarks were used to organise a shared task on Russian semantic relatedness, which attracted 19 teams. We use one of the best approaches identified in this competition to generate the fifth high-coverage resource, the first open distributional thesaurus of Russian. Multiple evaluations of this thesaurus, including a large-scale crowdsourcing study involving native speakers, indicate its high accuracy.




This entry is part of the university bibliography.



