Using object detection, NLP, and knowledge bases to understand the message of images

Weiland, Lydia; Hulpus, Ioana; Ponzetto, Simone Paolo; Dietz, Laura
DOI: https://doi.org/10.1007/978-3-319-51814-5_34
URL: http://link.springer.com/chapter/10.1007/978-3-319...
Additional URL: https://www.springerprofessional.de/using-object-d...
Document type: Conference or workshop publication
Year of publication: 2017
Book title: MultiMedia Modeling: 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6, 2017, Proceedings, Part II
Series: Lecture Notes in Computer Science
Volume: 10133
Page range: 405-418
Conference title: 23rd International Conference, MMM 2017
Conference venue: Reykjavik, Iceland
Conference date: 04-07 January 2017
Editor: Amsaleg, Laurent
Place of publication: Berlin [et al.]
Publisher: Springer
ISBN: 978-3-319-51813-8, 978-3-319-51814-5
ISSN: 0302-9743, 1611-3349
Publication language: English
Institution: School of Business Informatics and Mathematics > Information Systems III: Enterprise Data Analysis (Ponzetto 2016-)
Subject: 004 Computer science, internet
Abstract: With the increasing amount of multimodal content from social media posts and news articles, there has been an intensified effort towards conceptual labeling and multimodal (topic) modeling of images and their affiliated texts. Nonetheless, the problem of identifying and automatically naming the core abstract message (gist) behind images has received less attention. This problem is especially relevant for the semantic indexing and subsequent retrieval of images. In this paper, we propose a solution that makes use of external knowledge bases such as Wikipedia and DBpedia. Its aim is to leverage complex semantic associations between the image objects and the textual caption in order to uncover the intended gist. The results of our evaluation demonstrate the ability of our proposed approach to detect gist, with a best MAP score of 0.74 when assessed against human annotations. Furthermore, an automatic image tagging and caption generation API is compared to manually set image and caption signals. We show and discuss the difficulty of finding the correct gist, especially for abstract, non-depictable gists, as well as the impact of different types of signals on gist detection quality.
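For context on the evaluation metric named in the abstract: mean average precision (MAP) averages, over all queries (here, images), the mean of the precision values at each rank where a relevant item appears. A minimal sketch of the computation follows; the example rankings and annotator labels are purely hypothetical and are not taken from the paper.

```python
def average_precision(ranked, relevant):
    """AP: mean of precision@k over the ranks k where a relevant item occurs."""
    hits, precisions = 0, []
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP: mean AP over all queries (here: images)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical gist rankings for two images, judged against annotator labels.
runs = [
    (["flood", "rain", "disaster"], {"flood", "disaster"}),
    (["peace", "war", "protest"], {"protest"}),
]
print(round(mean_average_precision(runs), 3))  # prints 0.583
```

A MAP of 0.74, as reported in the abstract, would thus mean that correct gist labels tend to appear near the top of the system's ranked candidate lists.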
This entry is part of the university bibliography.
ORCID: Ponzetto, Simone Paolo, https://orcid.org/0000-0001-7484-2049