Code-switching ubique est - Language identification and part-of-speech tagging for historical mixed text
Schulz, Sarah; Keller, Mareike

DOI: https://doi.org/10.18653/v1/W16-2105
URL: http://anthology.aclweb.org/W16-2105
Additional URL: http://aclweb.org/anthology/W16-2100
Document Type: Conference or workshop publication
Year of publication: 2016
Book title: Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2016): August 11, 2016, Berlin, Germany
Page range: 43-51
Conference title: 10th SIGHUM Workshop
Location of the conference venue: Berlin, Germany
Date of the conference: August 11, 2016
Editor: Reiter, Nils
Place of publication: Stroudsburg, PA
Publishing house: Association for Computational Linguistics
ISBN: 978-1-945626-09-8
Publication language: English
Institution: School of Humanities > Anglistik IV - Anglistische Linguistik/Diachronie (Trips 2006-)
Subject: 400 Language, linguistics
Individual keywords (German): Linguistische Annotation, Mittelenglisch, Latein
Keywords (English): linguistic annotation, POS tagging, code-switching, Middle English, Latin
Abstract: In this paper, we describe the development of a language identification system and a part-of-speech tagger for Latin-Middle English mixed text. To this end, we annotate data with language IDs and Universal POS tags (Petrov et al., 2012). We train a conditional random field classifier for both sub-tasks, including features generated by the TreeTagger models of both languages. The focus lies on both a general and a task-specific evaluation. Moreover, we describe our effort to move beyond a proof-of-concept implementation of tools and towards a more task-oriented approach, showing how to apply our techniques in the context of Humanities research.
This entry is part of the university bibliography.