DaQ-MSA: Denoising and Qualifying Diffusion Augmentations for Multimodal Sentiment Analysis
Jiazhang Liang, Jianheng Dai, Miaosen Luo, Menghua Jiang, Sijie Mai

TL;DR
DaQ-MSA enhances multimodal sentiment analysis by using diffusion models to generate augmented data, coupled with a quality scoring system to select high-fidelity samples, improving model robustness and generalization.
Contribution
This paper introduces a novel quality-aware augmentation method using diffusion models and a reliability scoring module for multimodal sentiment analysis.
Findings
Improved sentiment analysis accuracy with diffusion-based augmentation.
Effective filtering of low-quality augmented samples enhances training stability.
Demonstrated robustness without additional human annotations.
Abstract
Multimodal large language models (MLLMs) have demonstrated strong performance on vision-language tasks, yet their effectiveness on multimodal sentiment analysis remains constrained by the scarcity of high-quality training data, which limits accurate multimodal understanding and generalization. To alleviate this bottleneck, we leverage diffusion models to perform semantics-preserving augmentation on the video and audio modalities, expanding the multimodal training distribution. However, increasing data quantity alone is insufficient, as diffusion-generated samples exhibit substantial quality variation and noisy augmentations may degrade performance. We therefore propose DaQ-MSA (Denoising and Qualifying Diffusion Augmentations for Multimodal Sentiment Analysis), which introduces a quality scoring module to evaluate the reliability of augmented samples and assign adaptive training…
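The abstract is cut off before it says what the quality scores control. As a minimal sketch only, assume each score is a per-sample loss weight in [0, 1], with original samples given weight 1.0 and diffusion-augmented samples given their score. The function name `quality_weighted_loss`, the `min_score` cutoff, and the squared-error loss are all hypothetical choices for illustration, not the authors' method.

```python
def quality_weighted_loss(predictions, targets, quality_scores, min_score=0.0):
    """Quality-weighted mean squared error (illustrative sketch, not the paper's loss).

    predictions, targets: sequences of floats (e.g. sentiment intensities).
    quality_scores: per-sample weights in [0, 1]; assumed to be 1.0 for original
        samples and a reliability score for augmented samples.
    min_score: hypothetical cutoff; samples scoring below it are dropped entirely.
    """
    if not (len(predictions) == len(targets) == len(quality_scores)):
        raise ValueError("all inputs must have the same length")

    weighted_error = 0.0
    total_weight = 0.0
    for pred, target, score in zip(predictions, targets, quality_scores):
        if score < min_score:
            continue  # hard filtering of low-quality augmentations
        weighted_error += score * (pred - target) ** 2  # soft down-weighting
        total_weight += score

    # Normalize by the total weight so the loss scale does not depend on
    # how many samples were kept or how heavily they were weighted.
    return weighted_error / total_weight if total_weight > 0 else 0.0
```

Under these assumptions, a noisy augmentation with a low score contributes little to the loss, and raising `min_score` turns the soft weighting into hard filtering.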
Taxonomy
Topics: Multimodal Machine Learning Applications · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
