Using object detection, NLP, and knowledge bases to understand the message of images
Weiland, Lydia; Hulpus, Ioana; Ponzetto, Simone Paolo; Dietz, Laura
DOI: https://doi.org/10.1007/978-3-319-51814-5_34
URL: http://link.springer.com/chapter/10.1007/978-3-319...
Further URL: https://www.springerprofessional.de/using-object-d...
Document type: Conference publication
Year of publication: 2017
Book title: MultiMedia Modeling : 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6, 2017, Proceedings, Part II
Series title: Lecture Notes in Computer Science
Volume: 10133
Page range: 405-418
Event title: 23rd International Conference, MMM 2017
Event location: Reykjavik, Iceland
Event date: 04-07 January 2017
Editor: Amsaleg, Laurent
Place of publication: Berlin [et al.]
Publisher: Springer
ISBN: 978-3-319-51813-8, 978-3-319-51814-5
ISSN: 0302-9743, 1611-3349
Language of publication: English
Institution: Fakultät für Wirtschaftsinformatik und Wirtschaftsmathematik > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-)
Subject area: 004 Computer science
Abstract: With the increasing amount of multimodal content from social media posts and news articles, there has been an intensified effort towards conceptual labeling and multimodal (topic) modeling of images and their affiliated texts. Nonetheless, the problem of identifying and automatically naming the core abstract message (gist) behind images has received less attention. This problem is especially relevant for the semantic indexing and subsequent retrieval of images. In this paper, we propose a solution that makes use of external knowledge bases such as Wikipedia and DBpedia. Its aim is to leverage complex semantic associations between the image objects and the textual caption in order to uncover the intended gist. The results of our evaluation demonstrate the ability of our approach to detect gist, with a best MAP score of 0.74 when assessed against human annotations. Furthermore, an automatic image tagging and caption generation API is compared to manually set image and caption signals. We show and discuss the difficulty of finding the correct gist, especially for abstract, non-depictable gists, as well as the impact of different types of signals on gist detection quality.
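The MAP figure cited in the abstract refers to mean average precision, a standard metric for evaluating ranked retrieval results. A minimal sketch of how such a score is computed is given below; the ranked candidate lists and relevant-gist sets are invented for illustration and are not the paper's evaluation data.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits, precisions = 0, []
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant hit
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example: two images, each with a ranked list of candidate gist concepts
# and a (hypothetical) set of human-annotated relevant gists.
runs = [
    (["flood", "rain", "city"], {"flood"}),
    (["sport", "drought", "famine"], {"drought", "famine"}),
]
print(round(mean_average_precision(runs), 4))
```

A perfect ranking yields an AP of 1.0 per image; MAP averages these per-image scores, so a MAP of 0.74 indicates that the correct gist concepts tend to appear near the top of the ranked candidates.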
This entry is part of the university bibliography.
ORCID: Ponzetto, Simone Paolo (https://orcid.org/0000-0001-7484-2049)