M2R2: Missing-Modality Robust emotion Recognition framework with iterative data augmentation
Ning Wang

TL;DR
This paper introduces M2R2, a framework for emotion recognition in conversations that is robust to missing modalities, using iterative data augmentation and a novel network to improve accuracy despite incomplete data.
Contribution
The paper proposes a new framework, M2R2, which combines a Party Attentive Network with adversarial data augmentation to handle missing modalities in emotion recognition tasks.
Findings
M2R2 outperforms baseline models on two datasets.
Iterative data augmentation improves robustness to missing modalities.
Attention mechanisms effectively reduce dependence on complete multi-party data.
Abstract
This paper addresses the problem of utterance-level missing modalities with uncertain patterns in the emotion recognition in conversation (ERC) task. Existing models generally predict a speaker's emotion from the current utterance and its context, and their performance degrades considerably when modalities are missing. Our work proposes a framework, Missing-Modality Robust emotion Recognition (M2R2), which trains an emotion recognition model with iterative data augmentation based on a learned common representation. First, a network called the Party Attentive Network (PANet) is designed to classify emotions; it tracks the states and context of all speakers. An attention mechanism between the speaker, the other participants, and the dialogue topic is used to spread the model's dependence across multi-time and multi-party utterances instead of relying on a single, possibly incomplete one. Moreover, the Common Representation Learning (CRL) problem is defined for…
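To make the problem setting concrete, the following is a minimal sketch of how utterance-level missing modalities with no fixed pattern could be simulated on a dialogue. It is not taken from the paper: the modality names (`text`, `audio`, `visual`), the function `mask_modalities`, the `missing_rate` parameter, and the rule that at least one modality survives per utterance are all assumptions made for illustration.

```python
import random

# Assumed modality names; the paper's actual feature set may differ.
MODALITIES = ("text", "audio", "visual")


def mask_modalities(dialogue, missing_rate, rng):
    """Return a copy of `dialogue` with modalities randomly dropped per utterance.

    Each modality present in an utterance is dropped independently with
    probability `missing_rate`. As an illustrative assumption, at least one
    modality is always kept so that no utterance becomes completely empty.
    Dropped modalities are represented by None.
    """
    masked = []
    for utterance in dialogue:
        present = [m for m in MODALITIES if m in utterance]
        kept = [m for m in present if rng.random() >= missing_rate]
        if not kept and present:
            # Everything was dropped: restore one randomly chosen modality.
            kept = [rng.choice(present)]
        masked.append({m: (utterance[m] if m in kept else None) for m in present})
    return masked


# Toy dialogue: two utterances, each with three small feature vectors.
dialogue = [
    {"text": [0.1, 0.2], "audio": [0.3], "visual": [0.5, 0.6]},
    {"text": [0.7, 0.8], "audio": [0.9], "visual": [1.0, 1.1]},
]

masked = mask_modalities(dialogue, missing_rate=0.5, rng=random.Random(0))
for index, utterance in enumerate(masked):
    available = [name for name, features in utterance.items() if features is not None]
    print(f"utterance {index}: available modalities = {available}")
```

Because the mask is drawn independently for every utterance, the pattern of missing modalities differs from one utterance to the next, which is the "uncertain patterns" condition the abstract describes. A model evaluated on such masked dialogues cannot rely on any single modality being present at a given turn.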
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
