LMR-CBT: Learning Modality-fused Representations with CB-Transformer for Multimodal Emotion Recognition from Unaligned Multimodal Sequences
Ziwang Fu, Feng Liu, Hanyang Wang, Siyuan Shen, Jiahao Zhang, Jiayin Qi, Xiangling Fu, Aimin Zhou

TL;DR
This paper introduces LMR-CBT, a novel neural network that effectively fuses multimodal features using CB-Transformer for emotion recognition from unaligned sequences, outperforming existing methods in accuracy and efficiency.
Contribution
The paper proposes a new CB-Transformer based model that enhances multimodal feature fusion and learning from unaligned sequences, achieving state-of-the-art results.
Findings
Outperforms existing methods on IEMOCAP, CMU-MOSI, and CMU-MOSEI datasets.
Achieves state-of-the-art accuracy with fewer parameters.
Demonstrates superior efficiency in processing unaligned multimodal sequences.
Abstract
Learning modality-fused representations and processing unaligned multimodal sequences are meaningful and challenging in multimodal emotion recognition. Existing approaches use directional pairwise attention or a message hub to fuse language, visual, and audio modalities. However, those approaches introduce information redundancy when fusing features and are inefficient without considering the complementarity of modalities. In this paper, we propose an efficient neural network to learn modality-fused representations with CB-Transformer (LMR-CBT) for multimodal emotion recognition from unaligned multimodal sequences. Specifically, we first perform feature extraction for the three modalities respectively to obtain the local structure of the sequences. Then, we design a novel transformer with cross-modal blocks (CB-Transformer) that enables complementary learning of different modalities,…
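To illustrate why attention-based fusion can handle unaligned sequences, here is a minimal sketch of generic cross-modal scaled dot-product attention. It is not the authors' CB-Transformer, whose details are not given in the truncated abstract. The function names, the random projection matrices, and the sequence lengths (5 text steps, 12 audio steps) are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def cross_modal_attention(target, source, w_q, w_k, w_v):
    """Let each target-modality step attend over every source-modality step.

    target: (T_t, d) features, source: (T_s, d) features. T_t and T_s may
    differ, so no word-level alignment between modalities is required.
    """
    q = matmul(target, w_q)                      # (T_t, d)
    k = matmul(source, w_k)                      # (T_s, d)
    v = matmul(source, w_v)                      # (T_s, d)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]         # (d, T_s)
    scores = matmul(q, k_t)                      # (T_t, T_s)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights           # fused: (T_t, d)


rng = random.Random(0)
d_model = 8
text = random_matrix(5, d_model, rng)    # assumed: 5 text steps
audio = random_matrix(12, d_model, rng)  # assumed: 12 audio steps
w_q = random_matrix(d_model, d_model, rng)
w_k = random_matrix(d_model, d_model, rng)
w_v = random_matrix(d_model, d_model, rng)

fused, weights = cross_modal_attention(text, audio, w_q, w_k, w_v)
print(len(fused), len(fused[0]), len(weights[0]))  # prints: 5 8 12
```

The output keeps the length of the target sequence (5) while each row of attention weights spans all 12 source steps, which is why this family of methods does not need the modalities to be sampled at the same rate.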
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Advanced Computing and Algorithms
