Speaker-Guided Encoder-Decoder Framework for Emotion Recognition in Conversation
Yinan Bao, Qianwen Ma, Lingwei Wei, Wei Zhou, Songlin Hu

TL;DR
This paper introduces a novel speaker-guided encoder-decoder framework for emotion recognition in conversations, effectively modeling dynamic intra- and inter-speaker dependencies to improve emotion prediction accuracy.
Contribution
The paper proposes a flexible, scalable framework that jointly explores speaker dependencies and leverages speaker information for emotion decoding in conversations.
Findings
The Speaker-Guided Encoder-Decoder (SGED) framework outperforms existing methods in emotion recognition accuracy.
The framework demonstrates high scalability with different context encoders.
Experimental results confirm the effectiveness of joint speaker dependency modeling.
Abstract
The emotion recognition in conversation (ERC) task aims to predict the emotion label of an utterance in a conversation. Because the dependencies between speakers, which comprise intra- and inter-speaker dependencies, are complex and dynamic, modeling speaker-specific information plays a vital role in ERC. Although prior work has proposed various methods for modeling speaker interaction, these methods cannot explore dynamic intra- and inter-speaker dependencies jointly, which leads to insufficient comprehension of context and in turn hinders emotion prediction. To this end, we design a novel speaker modeling scheme that explores intra- and inter-speaker dependencies jointly in a dynamic manner. In addition, we propose a Speaker-Guided Encoder-Decoder (SGED) framework for ERC, which fully exploits speaker information for the decoding of emotion. We use different existing methods as the…
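To make the two relation types in the abstract concrete, the following is a minimal sketch and not the paper's method. It only labels, for each utterance, which earlier utterances come from the same speaker (intra-speaker) and which come from a different speaker (inter-speaker). The function name `speaker_dependency_masks` and the toy speaker list are hypothetical, and the static masks shown here do not capture the dynamic, joint modeling that SGED is described as performing.

```python
from typing import List, Tuple


def speaker_dependency_masks(
    speakers: List[str],
) -> Tuple[List[List[bool]], List[List[bool]]]:
    """Return (intra, inter) boolean masks over past utterances.

    intra[i][j] is True when utterance j precedes utterance i and
    both were spoken by the same speaker.
    inter[i][j] is True when utterance j precedes utterance i and
    the two were spoken by different speakers.
    Hypothetical helper for illustration only; not taken from the paper.
    """
    n = len(speakers)
    intra = [[False] * n for _ in range(n)]
    inter = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):  # only look at earlier utterances
            if speakers[j] == speakers[i]:
                intra[i][j] = True
            else:
                inter[i][j] = True
    return intra, inter


# Toy two-party conversation: speaker IDs per utterance (assumed data).
speakers = ["A", "B", "A", "B"]
intra, inter = speaker_dependency_masks(speakers)
```

In this toy conversation, the third utterance (index 2, speaker A) has an intra-speaker link to index 0 and an inter-speaker link to index 1. A context encoder could in principle use such masks to restrict attention, though how SGED actually combines the two dependency types is not specified in the text above.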
Taxonomy
Topics: Speech Recognition and Synthesis · Emotion and Mood Recognition · Speech and dialogue systems
