MoLAN: A Unified Modality-Aware Noise Dynamic Editing Framework for Multimodal Sentiment Analysis
Xingle Xu, Yongkang Liu, Dexian Cai, Shi Feng, Xiaocui Yang, Daling Wang, Yifei Zhang

TL;DR
MoLAN introduces a flexible, modality-aware noise suppression framework for multimodal sentiment analysis, improving the preservation of critical information and achieving state-of-the-art results across multiple models and datasets.
Contribution
The paper presents MoLAN, a framework that divides each modality's features into blocks and dynamically assigns each block a denoising strength based on its noise level and semantic relevance, enhancing multimodal sentiment analysis.
Findings
MoLAN improves sentiment prediction accuracy across diverse models.
MoLAN+ achieves state-of-the-art performance on multiple datasets.
The framework is adaptable and can be integrated into various multimodal models.
Abstract
Multimodal Sentiment Analysis aims to integrate information from various modalities, such as audio, visual, and text, to make complementary predictions. However, it often struggles with irrelevant or misleading visual and auditory information. Most existing approaches treat the entire modality information (e.g., a whole image, audio segment, or text paragraph) as an independent unit for feature enhancement or denoising. They often suppress redundant and noisy information at the risk of losing critical information. To address this challenge, we propose MoLAN, a unified ModaLity-aware noise dynAmic editiNg framework. Specifically, MoLAN performs modality-aware blocking by dividing the features of each modality into multiple blocks. Each block is then dynamically assigned a distinct denoising strength based on its noise level and semantic relevance, enabling fine-grained…
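The blocking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how noise level or semantic relevance are computed, so the sketch assumes a single stand-in score, the cosine similarity between a block and the matching block of a text feature vector. The names `split_into_blocks`, `block_strength`, and `edit_features` are hypothetical, and a real model would presumably learn these strengths rather than compute them with a fixed rule.

```python
import math


def split_into_blocks(features, num_blocks):
    """Split a flat feature list into `num_blocks` contiguous, near-equal blocks."""
    size, extra = divmod(len(features), num_blocks)
    blocks, start = [], 0
    for i in range(num_blocks):
        end = start + size + (1 if i < extra else 0)
        blocks.append(features[start:end])
        start = end
    return blocks


def cosine(a, b):
    """Cosine similarity of two equal-length lists; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def block_strength(block, anchor_block):
    """ASSUMPTION: relevance to the text anchor stands in for both noise level
    and semantic relevance. Low relevance maps to a strength near 1."""
    relevance = cosine(block, anchor_block)
    return min(1.0, max(0.0, 0.5 * (1.0 - relevance)))


def edit_features(features, anchor, num_blocks=4):
    """Scale each block by (1 - strength) instead of suppressing the whole modality."""
    if len(features) != len(anchor):
        raise ValueError("features and anchor must have the same length")
    blocks = split_into_blocks(features, num_blocks)
    anchors = split_into_blocks(anchor, num_blocks)
    strengths = [block_strength(b, a) for b, a in zip(blocks, anchors)]
    edited = [x * (1.0 - s) for b, s in zip(blocks, strengths) for x in b]
    return edited, strengths


# Toy example: the first visual block agrees with the text, the second opposes it.
visual = [1.0, 2.0, 3.0, 4.0, -1.0, -2.0, -3.0, -4.0]
text = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
edited, strengths = edit_features(visual, text, num_blocks=2)
```

In this toy case the first block is kept almost unchanged while the second is almost fully suppressed, which is the per-block behaviour the abstract contrasts with treating a whole modality as one unit.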
