Leveraging event-based semantics for automated text simplification
Štajner, Sanja; Glavaš, Goran
DOI: https://doi.org/10.1016/j.eswa.2017.04.005
URL: http://www.sciencedirect.com/science/article/pii/S...
Additional URL: https://www.researchgate.net/publication/315804776...
Document type: Journal article
Year of publication: 2017
Journal or series title: Expert Systems with Applications
Volume: 82
Page range: 383-395
Place of publication: Amsterdam [et al.]
Publisher: Elsevier
ISSN: 0957-4174
Language of publication: English
Institution: School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-); School of Business Informatics and Mathematics > Practical Computer Science II: Artificial Intelligence (Stuckenschmidt 2009-)
Subject area: 004 Computer science
Free keywords (English): automated text simplification, event extraction, semantics
Abstract:
Automated Text Simplification (ATS) aims to transform complex texts into simpler variants that are easier for wider audiences to understand and easier to process with natural language processing (NLP) tools. While simplification can be applied at the lexical, syntactic, and discourse levels, all previously proposed ATS systems operated only on the first two levels and thus failed to simplify texts at the discourse level. We present a semantically motivated ATS system, the first to operate at the discourse level. By exploiting a state-of-the-art event extraction system, it is the first ATS system able to eliminate large portions of irrelevant information from texts, retaining only those parts of the original text that belong to factual event mentions. A few handcrafted rules ensure that the output of the system is syntactically simple by placing each factual event mention in a separate short sentence, while a state-of-the-art unsupervised lexical simplification module, based on word embeddings, replaces complex and infrequent words with simpler variants. We perform a thorough evaluation, both automatic and manual, showing that our system produces more readable and simpler texts than state-of-the-art ATS systems. Our newly proposed post-editing evaluation further reveals that, on news articles, our system requires less human effort to correct grammaticality and meaning preservation than the state-of-the-art ATS system.
This entry is part of the university bibliography.