Hierarchical MoE: Continuous Multimodal Emotion Recognition with Incomplete and Asynchronous Inputs
Yitong Zhu, Lei Han, Guanxuan Jiang, PengYuan Zhou, Yuyang Wang

TL;DR
This paper introduces Hi-MoE, a hierarchical mixture-of-experts framework that enhances continuous multimodal emotion recognition by effectively handling incomplete and asynchronous data, improving robustness and accuracy in real-world scenarios.
Contribution
The proposed Hi-MoE framework introduces a dual-layer expert structure with dynamic routing and cross-modal alignment, advancing robustness and adaptability in multimodal emotion recognition.
Findings
Achieves state-of-the-art performance on DEAP and DREAMER datasets.
Demonstrates robustness under modality incompleteness and asynchrony.
Outperforms existing methods in continuous emotion regression.
Abstract
Multimodal emotion recognition (MER) is crucial for human-computer interaction, yet real-world challenges like dynamic modality incompleteness and asynchrony severely limit its robustness. Existing methods often assume consistently complete data or lack dynamic adaptability. To address these limitations, we propose a novel Hi-MoE (Hierarchical Mixture-of-Experts) framework for robust continuous emotion prediction. This framework employs a dual-layer expert structure. A Modality Expert Bank utilizes soft routing to dynamically handle missing modalities and achieve robust information fusion. A subsequent Emotion Expert Bank leverages differential-attention routing to flexibly attend to emotional prototypes, enabling fine-grained emotion representation. Additionally, a cross-modal alignment module explicitly addresses temporal shifts and semantic inconsistencies between modalities.…
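The abstract does not spell out how the Modality Expert Bank's soft routing is computed. As a rough illustration only, one common way to make soft routing tolerate missing modalities is a masked softmax: gate scores for absent modalities are excluded, and the remaining weights are renormalized before fusing the expert outputs. The function names `masked_soft_routing` and `fuse`, the gate scores, and the two-dimensional expert outputs below are all hypothetical and are not taken from the paper.

```python
import math


def masked_soft_routing(scores, present):
    """Softmax over gate scores, restricted to modalities that are present.

    Assumption for illustration: missing modalities get weight 0 and the
    remaining weights are renormalized to sum to 1.
    """
    if len(scores) != len(present):
        raise ValueError("scores and present must have the same length")
    if not any(present):
        raise ValueError("at least one modality must be present")
    # Subtract the largest present score for numerical stability.
    top = max(s for s, p in zip(scores, present) if p)
    exps = [math.exp(s - top) if p else 0.0 for s, p in zip(scores, present)]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(expert_outputs, weights):
    """Weighted sum of per-modality expert output vectors."""
    dim = len(expert_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, expert_outputs))
        for d in range(dim)
    ]


# Hypothetical gate scores for three modalities; the third is missing.
weights = masked_soft_routing([1.0, 1.0, 5.0], [True, True, False])
fused = fuse([[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]], weights)
print(weights)  # [0.5, 0.5, 0.0]
print(fused)    # [0.5, 0.5]
```

In this sketch the missing modality's high gate score has no effect on the fused vector, which is the property the abstract attributes to soft routing under modality incompleteness. The paper's actual gating may differ.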
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Mental Health via Writing
