MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis
Yingjie Zhou, Zicheng Zhang, Jiezhang Cao, Jun Jia, Yanwei Jiang, Farong Wen, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai

TL;DR
MEMO-Bench is a comprehensive benchmark for evaluating how well Text-to-Image (T2I) models generate human emotion and how well Multimodal Large Language Models (MLLMs) understand it, revealing current limitations in fine-grained emotion analysis.
Contribution
This work presents MEMO-Bench, a novel benchmark that pairs a dataset of 7,145 portraits generated by 12 T2I models with a progressive evaluation framework for assessing AI models' understanding of human emotion.
Findings
T2I models generate positive emotions more effectively than negative ones.
MLLMs can recognize emotions but lack human-level accuracy.
Fine-grained emotion analysis remains challenging for current models.
Abstract
Artificial Intelligence (AI) has demonstrated significant capabilities in various fields, and in areas such as human-computer interaction (HCI), embodied intelligence, and the design and animation of virtual digital humans, both practitioners and users are increasingly concerned with AI's ability to understand and express emotion. Consequently, the question of whether AI can accurately interpret human emotions remains a critical challenge. To date, two primary classes of AI models have been involved in human emotion analysis: generative models and Multimodal Large Language Models (MLLMs). To assess the emotional capabilities of these two classes of models, this study introduces MEMO-Bench, a comprehensive benchmark consisting of 7,145 portraits, each depicting one of six different emotions, generated by 12 Text-to-Image (T2I) models. Unlike previous works, MEMO-Bench provides a…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining
