VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning
Alexandros Xenos, Niki Maria Foteinopoulou, Ioanna Ntinou, Ioannis Patras, Georgios Tzimiropoulos

TL;DR
This paper introduces a simple two-stage method that uses Vision-and-Large-Language Models (VLLMs) to generate natural language descriptions of a subject's apparent emotion in context, which improves emotion recognition on the BoLD, EMOTIC, and CAER-S datasets.
Contribution
The work demonstrates that using VLLMs to produce emotion-related descriptions enhances in-context emotion classification, simplifying training and achieving state-of-the-art results.
Findings
Outperforms previous methods on the BoLD, EMOTIC, and CAER-S datasets.
Textual descriptions help constrain noisy visual inputs.
Achieves state-of-the-art performance without complex training pipelines.
Abstract
Recognising emotions in context involves identifying an individual's apparent emotions while considering contextual cues from the surrounding scene. Previous approaches to this task have typically designed explicit scene-encoding architectures or incorporated external scene-related information, such as captions. However, these methods often utilise limited contextual information or rely on intricate training pipelines to decouple noise from relevant information. In this work, we leverage the capabilities of Vision-and-Large-Language Models (VLLMs) to enhance in-context emotion classification in a more straightforward manner. Our proposed method follows a simple yet effective two-stage approach. First, we prompt VLLMs to generate natural language descriptions of the subject's apparent emotion in relation to the visual context. Second, the descriptions, along with the visual input, are…
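The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The prompt wording, the `describe` and `classify` callables, and the stub functions are all hypothetical. In particular, the abstract is cut off before it says how the description and the visual input are combined in the second stage, so that step is left as an opaque callable here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence

# Hypothetical prompt wording; the source does not give the actual prompt.
PROMPT = (
    "Describe the apparent emotion of the person in the image "
    "and how the surrounding scene relates to it."
)


@dataclass
class Prediction:
    description: str   # stage-1 output: natural language description
    labels: List[str]  # stage-2 output: predicted emotion labels


def recognise_emotion(
    image: Any,
    describe: Callable[[Any, str], str],
    classify: Callable[[Any, str], Sequence[str]],
    prompt: str = PROMPT,
) -> Prediction:
    """Run the two stages in order (both callables are placeholders)."""
    # Stage 1: prompt a VLLM for a description of the subject's apparent
    # emotion in relation to the visual context.
    description = describe(image, prompt)
    # Stage 2: pass the description together with the visual input to a
    # classifier. How the two are combined is an assumption left to `classify`.
    labels = list(classify(image, description))
    return Prediction(description=description, labels=labels)


# Stub stand-ins so the sketch runs without any model (purely illustrative).
def fake_describe(image: Any, prompt: str) -> str:
    return "The person is smiling while surrounded by friends at a party."


def fake_classify(image: Any, description: str) -> Sequence[str]:
    return ["happiness"] if "smiling" in description else ["neutral"]


result = recognise_emotion(b"", fake_describe, fake_classify)
print(result.labels)
```

In practice, `describe` would wrap a call to a vision-language model and `classify` would wrap a trained classifier. Keeping both as injected callables makes the ordering of the two stages explicit without committing to any particular model or fusion scheme.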
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Robotics and Automated Systems
