Learning distributional token representations from visual features

Broscheit, Samuel ; Gemulla, Rainer ; Keuper, Margret

URL: https://madoc.bib.uni-mannheim.de/45649
Additional URL: http://aclweb.org/anthology/W18-3025
URN: urn:nbn:de:bsz:180-madoc-456495
Document Type: Conference or workshop publication
Year of publication: 2018
Book title: ACL 2018, Representation Learning for NLP : Proceedings of the Third Workshop : July 20, 2018 Melbourne, Australia
Page range: 187-194
Conference title: 3rd Workshop on Representation Learning for NLP
Location of the conference venue: Melbourne, Australia
Date of the conference: 20.7.2018
Editor: Augenstein, Isabelle
Place of publication: Stroudsburg, PA
Publishing house: Association for Computational Linguistics
ISBN: 978-1-948087-43-8
Publication language: English
Institution: School of Business Informatics and Mathematics > Praktische Informatik I (Gemulla 2014-)
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Subject: 004 Computer science, internet
Abstract: In this study, we compare token representations constructed from visual features (i.e., pixels) with standard lookup-based embeddings. Our goal is to gain insight into the challenges of encoding a text representation from low-level features, e.g., from characters or pixels. We focus on Chinese, which, as a logographic language, has properties that make a representation via visual features challenging and interesting. To train and evaluate different models for the token representation, we chose the task of character-based neural machine translation (NMT) from Chinese to English. We found that a token representation computed only from visual features can achieve results competitive with lookup embeddings. However, we also show different strengths and weaknesses in the models’ performance in a part-of-speech tagging task and a semantic similarity task. In summary, we show that it is possible to achieve a text representation only from pixels. We hope that this is a useful stepping stone for future studies that exclusively rely on visual input, or aim at exploiting visual features of written language.

This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.
