Extending a multilingual lexical resource by bootstrapping named entity classification using Wikipedia's category system


Knopp, Johannes


Knopp11ExtendingHeiNER.pdf - Published


URL: https://madoc.bib.uni-mannheim.de/29542
Additional URL: https://www.aclweb.org/anthology/W11-3607/
URN: urn:nbn:de:bsz:180-madoc-295426
Document Type: Conference or workshop publication
Year of publication: 2011
Book title: Proceedings of the Fifth International Workshop On Cross Lingual Information Access
Page range: 35-43
Conference title: Fifth International Workshop On Cross Lingual Information Access
Location of the conference venue: Chiang Mai, Thailand
Date of the conference: 8-13 November 2011
Place of publication: Chiang Mai, Thailand
Publishing house: Asian Federation of Natural Language Processing
Publication language: English
Institution: School of Business Informatics and Mathematics > Praktische Informatik II (Stuckenschmidt 2009-)
Subject: 004 Computer science, internet
Individual keywords (German): Named Entities, Wikipedia, HeiNER, NERC
Abstract: Named Entity Recognition and Classification (NERC) is a well-studied NLP task which is typically approached using machine learning algorithms that rely on training data whose creation usually is expensive. The high costs result in the lack of NERC training data for many languages. An approach to create a multilingual NE corpus was presented in Wentland et al. (2008). The resulting resource called HeiNER describes a valuable number of NEs but does not include their types. We present a bootstrap approach based on Wikipedia’s category system to classify the NEs contained in HeiNER that is able to classify more than two million named entities to improve the resource’s quality.
Additional information: Online resource

This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.




Citation example: Knopp, Johannes (2011). Extending a multilingual lexical resource by bootstrapping named entity classification using Wikipedia's category system. In: Proceedings of the Fifth International Workshop On Cross Lingual Information Access, Chiang Mai, Thailand, pp. 35-43. Open Access. [Conference or workshop publication]
