EmoCaps: Emotion Capsule based Model for Conversational Emotion Recognition
Zaijing Li, Fengxiao Tang, Ming Zhao, Yusen Zhu

TL;DR
This paper introduces EmoCaps, a multi-modal model for emotion recognition in conversation that captures the emotional tendency of each utterance and outperforms existing state-of-the-art models on two benchmark datasets.
Contribution
The paper proposes Emoformer, a structure that extracts emotion vectors from different modalities, and integrates it into EmoCaps, an end-to-end model for conversational emotion recognition.
Findings
Outperforms existing state-of-the-art models on two benchmark datasets
Captures multi-modal emotional tendencies of utterances
Provides an end-to-end framework for emotion recognition in conversation (ERC)
Abstract
Emotion recognition in conversation (ERC) aims to analyze the speaker's state and identify their emotion in the conversation. Recent works in ERC focus on context modeling but ignore the representation of contextual emotional tendency. To extract multi-modal information and the emotional tendency of the utterance effectively, we propose a new structure named Emoformer, which extracts multi-modal emotion vectors from different modalities and fuses them with the sentence vector to form an emotion capsule. Furthermore, we design an end-to-end ERC model called EmoCaps, which extracts emotion vectors through the Emoformer structure and obtains the emotion classification results from a context analysis model. In experiments on two benchmark datasets, our model shows better performance than the existing state-of-the-art models.
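The data flow described in the abstract (per-modality emotion vectors, fusion with the sentence vector into a capsule, then a context model that classifies each utterance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the chunk-averaging stand-in for Emoformer, fusion by concatenation, and the running-mean context model are hypothetical placeholders, not the authors' architecture.

```python
from typing import Dict, List

Vector = List[float]


def emoformer_stub(features: Vector, dim: int = 2) -> Vector:
    """Toy stand-in for Emoformer (assumption): average `features` in `dim` equal chunks.

    Trailing values that do not fill a whole chunk are ignored.
    """
    chunk = len(features) // dim
    if chunk == 0:
        raise ValueError("need at least `dim` feature values")
    return [sum(features[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


def build_capsule(modal_features: Dict[str, Vector], sentence_vector: Vector) -> Vector:
    """Fuse per-modality emotion vectors with the sentence vector.

    Concatenation is an assumed fusion rule chosen for simplicity.
    """
    capsule: Vector = []
    for name in sorted(modal_features):  # fixed order keeps the capsule layout stable
        capsule.extend(emoformer_stub(modal_features[name]))
    capsule.extend(sentence_vector)
    return capsule


def classify_with_context(capsules: List[Vector], weights: List[Vector]) -> List[int]:
    """Toy context model (assumption): score the running mean of capsules seen so far.

    `weights` holds one row per emotion class; the predicted label is the argmax.
    """
    if not capsules:
        return []
    labels: List[int] = []
    running = [0.0] * len(capsules[0])
    for t, capsule in enumerate(capsules, start=1):
        running = [r + c for r, c in zip(running, capsule)]
        context = [r / t for r in running]
        scores = [sum(w * x for w, x in zip(row, context)) for row in weights]
        labels.append(scores.index(max(scores)))
    return labels
```

In this sketch each utterance becomes one capsule, and the label for utterance t depends on all capsules up to t, which mirrors only the general idea of combining utterance-level emotion vectors with conversational context.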
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition
