MSAF: Multimodal Split Attention Fusion
Lang Su, Chuqing Hu, Guofa Li, Dongpu Cao

TL;DR
This paper introduces MSAF, a multimodal fusion module that emphasizes important features across modalities. The module is compatible with various neural network architectures and demonstrates improved performance in emotion, sentiment, and action recognition tasks.
Contribution
The paper presents MSAF, a flexible and effective multimodal fusion module that can be integrated into different neural networks and leverages pretrained models for enhanced multimodal learning.
Findings
MSAF improves accuracy in emotion recognition.
MSAF outperforms existing multimodal fusion methods.
The module is compatible with CNNs and RNNs.
Abstract
Multimodal learning mimics the reasoning process of the human multi-sensory system, which is used to perceive the surrounding world. While making a prediction, the human brain tends to relate crucial cues from multiple sources of information. In this work, we propose a novel multimodal fusion module that learns to emphasize more contributive features across all modalities. Specifically, the proposed Multimodal Split Attention Fusion (MSAF) module splits each modality into channel-wise equal feature blocks and creates a joint representation that is used to generate soft attention for each channel across the feature blocks. Further, the MSAF module is designed to be compatible with features of various spatial dimensions and sequence lengths, suitable for both CNNs and RNNs. Thus, MSAF can be easily added to fuse features of any unimodal networks and utilize existing pretrained unimodal…
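The split-and-attend idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (split_blocks, msaf_sketch), the random stand-in weights, the reduction factor, the zero-padding of a short final block, and the choice to normalize the softmax over all blocks of all modalities are assumptions made only for illustration. It shows each modality being cut into equal channel blocks, a joint descriptor being formed by pooling and summing the blocks, and a per-channel soft attention being computed across the blocks and applied back to the features.

```python
import numpy as np


def split_blocks(feat, block_channels):
    """Split a (channels, *dims) array into equal channel blocks.

    Assumption: a short final block is zero-padded to full size.
    """
    pad = (-feat.shape[0]) % block_channels
    if pad:
        pad_width = [(0, pad)] + [(0, 0)] * (feat.ndim - 1)
        feat = np.pad(feat, pad_width)
    n_blocks = feat.shape[0] // block_channels
    return [feat[i * block_channels:(i + 1) * block_channels]
            for i in range(n_blocks)]


def msaf_sketch(features, block_channels, rng, reduction=2):
    """Illustrative split-attention fusion over a list of modality features.

    Weights are random stand-ins for learned parameters (hypothetical).
    Returns the reweighted features and the attention matrix.
    """
    blocks, owners = [], []
    for m, feat in enumerate(features):
        for blk in split_blocks(feat, block_channels):
            blocks.append(blk)
            owners.append(m)

    # Global average pooling over all non-channel axes, so spatial maps
    # and sequences of any size reduce to a (block_channels,) descriptor.
    pooled = [blk.reshape(block_channels, -1).mean(axis=1) for blk in blocks]

    # Joint representation: element-wise sum of every block descriptor.
    joint = np.sum(pooled, axis=0)

    # Shared bottleneck projection followed by ReLU.
    hidden = max(block_channels // reduction, 1)
    w_joint = rng.standard_normal((hidden, block_channels)) * 0.1
    z = np.maximum(w_joint @ joint, 0.0)

    # One projection per block yields per-channel logits, shape (n_blocks, C).
    w_block = rng.standard_normal((len(blocks), block_channels, hidden)) * 0.1
    logits = w_block @ z

    # Softmax across blocks, computed independently for each channel.
    logits = logits - logits.max(axis=0, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)

    # Scale each block by its attention, then reassemble each modality
    # and drop any padded channels.
    outputs = []
    for m, feat in enumerate(features):
        scaled = [blk * attn[i].reshape((-1,) + (1,) * (blk.ndim - 1))
                  for i, blk in enumerate(blocks) if owners[i] == m]
        outputs.append(np.concatenate(scaled, axis=0)[:feat.shape[0]])
    return outputs, attn


# Usage: a CNN-style map (8 channels, 4x4) and an RNN-style sequence
# (6 channels, 10 steps), fused with blocks of 4 channels.
rng = np.random.default_rng(0)
video = rng.standard_normal((8, 4, 4))
audio = rng.standard_normal((6, 10))
fused, attention = msaf_sketch([video, audio], block_channels=4, rng=rng)
```

Because the blocks are pooled down to channel descriptors before fusion, the two inputs do not need matching spatial sizes or sequence lengths, which is the compatibility property the abstract claims. Each output keeps the shape of its input, so in this sketch the fused features could be passed on to the next layer of each unimodal network.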
Taxonomy
Topics: Emotion and Mood Recognition · Human Pose and Action Recognition · Multimodal Machine Learning Applications
Methods: Average Pooling · Softmax · Batch Normalization · Residual Connection · Global Average Pooling · Dense Connections · Split Attention
