Leveraging Self-Paced Curriculum Learning for Enhanced Modality Balance in Multimodal Conversational Emotion Recognition
Phuong-Anh Nguyen, The-Son Le, Duc-Trong Le, Cam-Van Thi Nguyen

TL;DR
This paper introduces a Self-Paced Curriculum Learning framework to improve modality balance and robustness in Multimodal Emotion Recognition in Conversations, demonstrating consistent performance gains across datasets.
Contribution
The paper proposes a dual-level Difficulty Measurer and a Learning Scheduler that together guide training dynamically, addressing modality imbalance in Multimodal Emotion Recognition in Conversations (MERC).
Findings
SPCL improves weighted F1-score by up to 6.6% on IEMOCAP.
SPCL achieves gains of up to 10.4% on MELD.
The method enhances robustness across different architectures.
Abstract
Multimodal Emotion Recognition in Conversations (MERC) is a crucial task for understanding human interactions, where multimodal approaches integrating language, facial expressions, and vocal tone have achieved significant progress. However, modality misalignment and imbalanced learning remain major challenges, limiting the effective utilization of multimodal information. To address this issue, we propose a plug-and-play framework based on Self-Paced Curriculum Learning (SPCL) for MERC. We introduce a dual-level Difficulty Measurer that captures both utterance-level and conversation-level challenges. The utterance-level score models fine-grained modality-specific difficulty, while the conversation-level score captures broader dialogue structures, including emotional dependencies and modality coherence. Based on these scores, the Learning Scheduler dynamically guides training from easier…
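The abstract is cut off before it describes the scheduler in detail, so the following is only a minimal sketch of the general idea: score each utterance at two levels, then admit progressively harder samples as training proceeds. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Utterance` container, the mixing weight `alpha`, the linear pacing function, and the hard threshold used to select samples.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Utterance:
    conv_id: int           # conversation the utterance belongs to
    utt_difficulty: float  # hypothetical utterance-level difficulty in [0, 1]


def combined_difficulty(
    utts: List[Utterance],
    conv_difficulty: Dict[int, float],
    alpha: float = 0.5,
) -> List[float]:
    """Blend utterance-level and conversation-level scores.

    alpha is an assumed mixing weight, not a value from the paper.
    """
    return [
        alpha * u.utt_difficulty + (1.0 - alpha) * conv_difficulty[u.conv_id]
        for u in utts
    ]


def pace_threshold(epoch: int, total_epochs: int,
                   start: float = 0.3, end: float = 1.0) -> float:
    """Assumed linear pacing: the admitted difficulty grows from start to end."""
    if total_epochs <= 1:
        return end
    frac = min(epoch / (total_epochs - 1), 1.0)
    return start + frac * (end - start)


def select_samples(difficulty: List[float], threshold: float) -> List[int]:
    """Return indices of samples that are easy enough for the current pace."""
    return [i for i, d in enumerate(difficulty) if d <= threshold]


# Toy data: two conversations, the second assumed to be harder overall.
utts = [Utterance(0, 0.1), Utterance(0, 0.6), Utterance(1, 0.4), Utterance(1, 0.9)]
conv_difficulty = {0: 0.2, 1: 0.8}
scores = combined_difficulty(utts, conv_difficulty)

for epoch in range(3):
    active = select_samples(scores, pace_threshold(epoch, total_epochs=3))
    print(f"epoch {epoch}: training on samples {active}")
```

In this toy run the training subset grows from the single easiest utterance to the full set by the last epoch. A real scheduler could instead weight samples softly rather than dropping them, which is a common variant of self-paced learning.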
