Presentation adaptation for multimodal interface systems: Three essays on the effectiveness of user-centric content and modality adaptation


Heck, Melanie


PDF: Dissertation.pdf (Published, 4 MB)

URN: urn:nbn:de:bsz:180-madoc-642882
Document Type: Doctoral dissertation
Year of publication: 2023
Place of publication: Mannheim
University: Universität Mannheim
Evaluator: Becker, Christian
Publication language: English
Institution: Business School > Wirtschaftsinformatik II (Becker 2006-2021)
Subject: 004 Computer science, internet
Keywords (English): multimodal interfaces, adaptation, user context, context acquisition
Abstract: The use of devices is becoming increasingly ubiquitous, and the contexts of their users more and more dynamic. This often leads to situations in which one communication channel is impractical: text-based communication is inconvenient when the hands are occupied with another task, while audio messages pose privacy risks and may disturb other people in public spaces. Multimodal interfaces therefore offer users the flexibility to choose between multiple interaction modalities. While the choice of a suitable input modality lies in the hands of the users, they may also require output in a different modality depending on their situation. To adapt a system's output to a particular context, rules are needed that specify how information should be presented given the users' situation and state. This thesis therefore tests three adaptation rules that, based on observations from cognitive science, have the potential to improve the interaction with an application by adapting the presented content or its modality.

Following modality alignment, the output (audio versus visual) of a smart home display is matched to the user's input (spoken versus manual) to the system. Experimental evaluations reveal that preferences for an input modality are initially too unstable to infer a clear preference for either interaction modality: the data shows no clear relation between the users' modality choice for the first interaction and their attitude towards output in different modalities.

To apply multimodal redundancy, information is displayed in multiple modalities. An application of the rule in a video conference reveals that captions can significantly reduce confusion. However, the effect is limited to confusion resulting from language barriers, whereas contradictory auditory reports leave participants confused regardless of whether captions are available. We therefore suggest activating captions only when the user's facial expression, captured through action units, expressions of positive or negative affect, and a reduced blink rate, indicates that captions would effectively improve comprehension.

Content filtering in movies puts into the spotlight the character that the users prefer, according to the distribution of their gaze across elements in the previous scene. If preferences are predicted with machine learning classifiers, this has the potential to significantly improve the users' involvement compared to scenes featuring elements that the user does not prefer. Focused attention is additionally higher than in scenes in which multiple characters take a lead role.
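To make the three adaptation rules concrete, the following Python sketch expresses each as a simple decision function. This is a minimal illustration only: all names and signals (UserContext, input_was_spoken, gaze_shares, and so on) are hypothetical simplifications rather than the thesis's actual implementation, and the argmax over gaze shares merely stands in for the trained machine learning classifiers the thesis uses to predict character preferences.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Modality(Enum):
    AUDIO = auto()
    VISUAL = auto()


@dataclass
class UserContext:
    """Hypothetical context signals; the thesis derives such signals from
    richer sensor data (speech input, facial action units, gaze tracking)."""
    input_was_spoken: bool         # last input modality (spoken vs. manual)
    confusion_detected: bool       # e.g. action units, affect, reduced blink rate
    language_barrier_likely: bool  # e.g. the audio content is in a foreign language


def modality_alignment(ctx: UserContext) -> Modality:
    """Rule 1: match the output modality to the user's last input modality."""
    return Modality.AUDIO if ctx.input_was_spoken else Modality.VISUAL


def multimodal_redundancy(ctx: UserContext) -> bool:
    """Rule 2: activate captions only when facial cues suggest confusion of
    the kind that captions can actually resolve (language-related confusion)."""
    return ctx.confusion_detected and ctx.language_barrier_likely


def content_filtering(gaze_shares: dict[str, float]) -> str:
    """Rule 3: spotlight the character who received the largest share of the
    user's gaze in the previous scene (a stand-in for the ML classifiers)."""
    return max(gaze_shares, key=gaze_shares.get)


# Usage example with made-up values:
ctx = UserContext(input_was_spoken=True, confusion_detected=True,
                  language_barrier_likely=False)
print(modality_alignment(ctx))                        # Modality.AUDIO
print(multimodal_redundancy(ctx))                     # False: captions would not help
print(content_filtering({"Alice": 0.7, "Bob": 0.3}))  # "Alice"
```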




This entry is part of the university bibliography.

The document is provided by the publication server of the Mannheim University Library.





ORCID: 0000-0002-9601-0064
