"I'm Not Mad, Just Focused": Understanding Human Emotions in Human-Robot Collaboration

Seung Chan Hong; Dana Kulić; Leimin Tian

arXiv:2605.16816·cs.RO·May 19, 2026

TL;DR

This paper introduces a vision language model (VLM)-based emotion recognition (ER) system for human-robot collaboration. Its output aligns more closely with human annotations than baseline models, and user study participants preferred emotion-adaptive robot behavior.

Contribution

The paper presents a novel VLM-based ER system that outperforms traditional models and enhances robot adaptability in collaborative tasks.

Findings

01. VLM-ER achieves higher semantic similarity with human annotations.

02. Participants preferred emotion-adaptive robot behavior.

03. VLM-ER improves sentiment alignment over baseline models.

Abstract

Human-robot collaboration (HRC) can benefit from robots' abilities to interpret human emotional states. However, current emotion recognition (ER) models in HRC often fall short, particularly due to their reliance on acted datasets and single-modality inputs like facial expressions. We propose a novel vision language model (VLM)-based ER system that leverages contextual understanding to improve emotion interpretation in HRC. We first evaluate the VLM-ER system by assessing its semantic and sentiment similarity with human annotations on an existing HRC dataset. Then, in a user study with a service robot in a collaborative delivery task, we evaluate the effects of modulating the robot's behaviour based on the user's emotional state inferred by the VLM-ER system. The results show that the proposed VLM-ER system achieves higher semantic similarity and positive sentiment alignment with human…
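
The abstract does not say how semantic similarity against human annotations is computed. One common approach, assumed here purely for illustration, is cosine similarity between embeddings of the predicted and annotated emotion descriptions. The sketch below uses hand-written toy vectors (`TOY_EMBEDDINGS` is hypothetical and not from the paper); a real pipeline would obtain embeddings from a sentence-embedding model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings, made up for illustration only.
# They are NOT values from the paper or from any real embedding model.
TOY_EMBEDDINGS = {
    "focused": [0.9, 0.1, 0.3],
    "concentrating": [0.8, 0.2, 0.4],
    "angry": [0.1, 0.9, 0.2],
}

# Compare a human annotation against two candidate model predictions.
human_label = "focused"
for predicted in ("concentrating", "angry"):
    score = cosine_similarity(
        TOY_EMBEDDINGS[human_label], TOY_EMBEDDINGS[predicted]
    )
    print(f"{human_label} vs {predicted}: {score:.3f}")
```

Under this assumed metric, a prediction that differs in wording from the annotation but is close in meaning can still score highly, which an exact-match comparison of category labels would not capture.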
