Tagging named entities in Croatian tweets

Baksa, Krešimir ; Golović, Dino ; Glavaš, Goran ; Šnajder, Jan

DOI: https://doi.org/10.4312/slo2.0.2016.1.20-41
URL: https://madoc.bib.uni-mannheim.de/60356
Additional URL: https://revije.ff.uni-lj.si/slovenscina2/article/v...
URN: urn:nbn:de:bsz:180-madoc-603568
Document Type: Article
Year of publication: 2016
Journal or publication series: Slovenščina 2.0
Volume: 4
Issue number: 1
Page range: 20-41
Place of publication: Ljubljana
Publishing house: Ljubljana University Press, Faculty of Arts
ISSN: 2335-2736
Publication language: English
Institution: School of Business Informatics and Mathematics > Text Analytics for Interdisciplinary Research (Juniorprofessur) (Glavaš 2017-2021)
Pre-existing license: Creative Commons Attribution, Share Alike 4.0 International (CC BY-SA 4.0)
Subject: 004 Computer science, internet
Abstract: Named entity extraction tools designed for recognizing named entities in texts written in standard language (e.g., news stories or legal texts) have been shown to be inadequate for user-generated textual content (e.g., tweets, forum posts). In this work, we propose a supervised approach to named entity recognition and classification for Croatian tweets. We compare two sequence labelling models: a hidden Markov model (HMM) and conditional random fields (CRF). Our experiments reveal that CRF is the better model for the task, achieving a micro-averaged F1 score of over 87%. We analyse the contributions of different feature groups and the influence of the training set size on the performance of the CRF model.
Additional information: Online resource
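
The abstract reports a micro-averaged F1 score. As a minimal sketch of how such a score is typically computed, the example below pools true positives, false positives, and false negatives across all entity classes before computing precision and recall. The class labels and counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the function name micro_f1 is an assumption.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-class (tp, fp, fn) counts."""
    # Pool the counts over all classes first (this is what "micro" means).
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        # No correct predictions: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-class counts: (true positives, false positives, false negatives).
example_counts = {
    "PER": (8, 2, 2),
    "ORG": (3, 1, 3),
    "LOC": (5, 1, 1),
}
score = micro_f1(example_counts)
print(f"micro-averaged F1: {score:.4f}")
```

Because the counts are pooled, frequent classes weigh more heavily in the result than rare ones, in contrast to a macro average, which would average the per-class F1 scores equally.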

The document is provided by the publication server of the Mannheim University Library.

This record was not published during employment at the University of Mannheim; it is an external publication.
