"I'm Not Mad, Just Focused'': Understanding Human Emotions in Human-Robot Collaboration
Seung Chan Hong, Dana Kulić, Leimin Tian

TL;DR
This paper introduces a vision language model (VLM)-based emotion recognition (ER) system for human-robot collaboration, showing closer agreement with human annotations than baseline models and a user preference for emotion-adaptive robot behavior.
Contribution
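The paper presents a novel VLM-based emotion recognition (ER) system, referred to as VLM-ER, that aligns more closely with human annotations than baseline ER models and lets a robot adapt its behavior to the user's inferred emotional state in collaborative tasks.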
Findings
VLM-ER achieves higher semantic similarity with human annotations.
Participants preferred emotion-adaptive robot behavior.
VLM-ER improves sentiment alignment over baseline models.
Abstract
Human-robot collaboration (HRC) can benefit from robots' abilities to interpret human emotional states. However, current emotion recognition (ER) models in HRC often fall short, particularly due to their reliance on acted datasets and single-modality inputs like facial expressions. We propose a novel vision language model (VLM)-based ER system that leverages contextual understanding to improve emotion interpretation in HRC. We first evaluate the VLM-ER system by assessing its semantic and sentiment similarity with human annotations on an existing HRC dataset. Then, in a user study with a service robot in a collaborative delivery task, we evaluate the effects of modulating the robot's behaviour based on the user's emotional state inferred by the VLM-ER system. The results show that the proposed VLM-ER system achieves higher semantic similarity and positive sentiment alignment with human…
