EmoTrans: A Benchmark for Understanding, Reasoning, and Predicting Emotion Transitions in Multimodal LLMs

He Hu; Tengjin Weng; Zebang Cheng; Yu Wang; Jiachen Luo; Björn Schuller; Zheng Lian; Laizhong Cui

arXiv:2604.23348 · cs.CV · April 28, 2026

TL;DR

EmoTrans is a comprehensive benchmark designed to evaluate multimodal large language models' ability to understand, reason about, and predict emotion transitions in dynamic social video scenarios.

Contribution

The paper introduces EmoTrans, a new benchmark with annotated videos and QA pairs to assess emotion dynamics understanding in multimodal models, covering four progressive tasks.

Findings

01. Current models perform well on coarse emotion change detection but struggle with fine-grained dynamics.

02. Multi-person social scenarios are particularly challenging for existing models.

03. Reasoning tasks do not always improve model performance significantly.

Abstract

Recent multimodal large language models (MLLMs) have shown strong capabilities in perception, reasoning, and generation, and are increasingly used in applications such as social robots and human-computer interaction, where understanding human emotions is essential. However, existing benchmarks mainly formulate emotion understanding as a static recognition problem, leaving it largely unclear whether current MLLMs can understand emotion as a dynamic process that evolves, shifts between states, and unfolds across diverse social contexts. To bridge this gap, we present EmoTrans, a benchmark for evaluating emotion dynamics understanding in multimodal videos. EmoTrans contains 1,000 carefully collected and manually annotated video clips, covering 12 real-world scenarios, and further provides over 3,000 task-specific question-answer (QA) pairs for fine-grained evaluation. The benchmark…

Code & Models

Repositories

Emo-gml/EmoTrans (GitHub)
