Conversational Emotion Analysis via Attention Mechanisms
Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang

TL;DR
This paper introduces a multimodal framework for conversational emotion analysis that leverages attention mechanisms and speaker embeddings to improve emotion recognition accuracy in dialogues.
Contribution
It proposes a novel approach combining relation modeling, attention-based fusion, and speaker embeddings for enhanced conversational emotion analysis.
Findings
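Achieves an absolute 2.42% performance improvement over state-of-the-art methods on the IEMOCAP database.
Captures long-term contextual information in dialogues with a self-attention based bi-directional GRU layer.
Fuses acoustic and lexical features with an attention mechanism, and uses speaker embeddings to distinguish speaker identities.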
Abstract
Unlike emotion recognition on individual utterances, we propose a multimodal learning framework that uses the relations and dependencies among utterances for conversational emotion analysis. An attention mechanism is applied to fuse the acoustic and lexical features. These fused representations are then fed into a self-attention based bi-directional gated recurrent unit (GRU) layer to capture long-term contextual information. To imitate the real interaction patterns of different speakers, speaker embeddings are also used as additional inputs to distinguish speaker identities during conversational dialogs. To verify the effectiveness of the proposed method, we conduct experiments on the IEMOCAP database. Experimental results demonstrate that our method achieves an absolute 2.42% performance improvement over state-of-the-art strategies.
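To make the fusion step concrete, here is a minimal sketch of attention-based fusion of one utterance's acoustic and lexical vectors, followed by appending a speaker embedding. The abstract does not specify the scoring function, the feature dimensions, or how speaker embeddings enter the model, so the dot-product scoring vector `w`, the concatenation in `add_speaker`, and all names and values below are illustrative assumptions, not the authors' implementation. The self-attention based bi-directional GRU that would consume these per-utterance inputs is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(acoustic, lexical, w):
    """Fuse two same-sized modality vectors with attention weights.

    Assumption: each modality is scored by a dot product with a scoring
    vector w (hypothetical; in a trained model it would be learned).
    """
    modalities = [acoustic, lexical]
    # One scalar score per modality.
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in modalities]
    # Normalize the scores into weights that sum to 1.
    alphas = softmax(scores)
    # The fused vector is the attention-weighted sum of the modalities.
    fused = [sum(a * x[i] for a, x in zip(alphas, modalities))
             for i in range(len(w))]
    return fused, alphas

def add_speaker(fused, speaker_id, speaker_table):
    """Append the speaker's embedding to the fused utterance vector.

    Assumption: concatenation is one plausible way to supply speaker
    identity as an additional input.
    """
    return fused + speaker_table[speaker_id]

# Toy values, chosen only for illustration.
acoustic = [0.2, -0.1, 0.4]
lexical = [0.5, 0.3, -0.2]
w = [1.0, 0.0, 1.0]
speaker_table = {"A": [1.0, 0.0], "B": [0.0, 1.0]}

fused, alphas = attention_fuse(acoustic, lexical, w)
utterance_input = add_speaker(fused, "A", speaker_table)
```

In a full model, one such `utterance_input` would be built for every utterance in the dialog, and the resulting sequence would be passed to the recurrent layer so that each prediction can draw on surrounding context.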
