Addressing Data Scarcity in Multimodal User State Recognition by Combining Semi-Supervised and Supervised Learning

Hendric Voß; Heiko Wersing; Stefan Kopp

arXiv:2202.03775·cs.CV·February 9, 2022

TL;DR

This paper presents a multimodal machine learning method that detects dis-/agreement and confusion in human-robot interaction from only a small amount of manually annotated data. It combines semi-supervised and supervised learning, which is reported to improve robustness over purely supervised methods.

Contribution

The paper presents a novel approach that combines semi-supervised and supervised learning to recognize user states in multimodal settings from minimal labeled data.

Findings

01. Achieved 81.1% F1-score in dis-/agreement detection.

02. Demonstrated improved robustness over purely supervised methods.

03. Utilized a new preprocessing pipeline for multimodal data.

Abstract

Detecting mental states of human users is crucial for the development of cooperative and intelligent robots, as it enables the robot to understand the user's intentions and desires. Although such detection is important, it is difficult to obtain a large amount of high-quality data for training automatic recognition algorithms, because the time and effort required to collect and label such data are prohibitively high. In this paper we present a multimodal machine learning approach for detecting dis-/agreement and confusion states in a human-robot interaction environment, using just a small amount of manually annotated data. We collect a data set by conducting a human-robot interaction study and develop a novel preprocessing pipeline for our machine learning approach. By combining semi-supervised and supervised architectures, we are able to achieve an average F1-score of 81.1% for dis-/agreement detection…
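
For readers unfamiliar with the metric, the following is a minimal sketch of how an average F1-score over classes can be computed. It assumes macro-averaging (an unweighted mean of per-class F1); the abstract does not say which averaging scheme the authors used. The function names `f1_for_label` and `macro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Sequence


def f1_for_label(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 label: Hashable) -> float:
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0.0 when the class never occurs
    # in either the ground truth or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy labels for illustration only; these are not data from the study.
y_true = ["agree", "agree", "disagree", "disagree"]
y_pred = ["agree", "disagree", "disagree", "disagree"]
score = macro_f1(y_true, y_pred)
```

With the toy labels above, the per-class scores are 2/3 for "agree" and 4/5 for "disagree", so the macro average is their mean. Macro-averaging weights rare and frequent classes equally, which is one common choice when annotated examples are scarce and classes are imbalanced.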
