AlignMamba-2: Enhancing Multimodal Fusion and Sentiment Analysis with Modality-Aware Mamba
Yan Li, Yifei Xing, Xiangyuan Lan, Xin Li, Haifeng Chen, Dongmei Jiang

TL;DR
AlignMamba-2 introduces a novel, efficient multimodal fusion framework that employs dual alignment strategies and a modality-aware Mamba layer, significantly improving sentiment analysis and pattern recognition across diverse datasets.
Contribution
The paper presents AlignMamba-2, combining dual alignment regularization with a modality-aware Mamba layer to enhance multimodal fusion and sentiment analysis without additional inference costs.
Findings
Achieves state-of-the-art results on four benchmark datasets.
Demonstrates improved efficiency over Transformer-based models.
Effectively handles heterogeneity in multimodal data.
Abstract
In the era of large-scale pre-trained models, effectively adapting general knowledge to specific affective computing tasks remains a challenge, particularly regarding computational efficiency and multimodal heterogeneity. While Transformer-based methods have excelled at modeling inter-modal dependencies, their quadratic computational complexity limits their use with long-sequence data. Mamba-based models have emerged as a computationally efficient alternative; however, their inherent sequential scanning mechanism struggles to capture the global, non-sequential relationships that are crucial for effective cross-modal alignment. To address these limitations, we propose \textbf{AlignMamba-2}, an effective and efficient framework for multimodal fusion and sentiment analysis. Our approach introduces a dual alignment strategy that regularizes the model using both Optimal Transport distance…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsEmotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications
