Robust Multimodal Sentiment Analysis of Image-Text Pairs by Distribution-Based Feature Recovery and Fusion
Daiqing Wu, Dongbao Yang, Yu Zhou, Can Ma

TL;DR
This paper introduces a novel distribution-based feature recovery and fusion method that enhances the robustness of multimodal sentiment analysis in image-text pairs, effectively handling low-quality and missing modalities in real-world social media data.
Contribution
The paper proposes a unified framework that uses per-modality feature queues and distribution estimation to recover missing modalities and down-weight low-quality ones, making sentiment prediction more robust to data-quality issues.
Findings
DRF outperforms state-of-the-art methods on three datasets.
The method effectively handles low-quality and missing modalities.
Experimental results show significant robustness improvements.
Abstract
As posts on social media increase rapidly, analyzing the sentiments embedded in image-text pairs has become a popular research topic in recent years. Although existing works make impressive progress in jointly exploiting image and text information, they do not account for possible low-quality and missing modalities. In real-world applications, these issues may occur frequently, creating an urgent need for models that can predict sentiment robustly. Therefore, we propose a Distribution-based feature Recovery and Fusion (DRF) method for robust multimodal sentiment analysis of image-text pairs. Specifically, we maintain a feature queue for each modality to approximate its feature distribution, through which we can handle low-quality and missing modalities simultaneously in a unified framework. For low-quality modalities, we reduce their contributions to…
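The feature-queue idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated before it states how contributions are reduced or how missing modalities are recovered, so everything below is an assumption made for illustration. The class name FeatureQueue, the diagonal-Gaussian estimate (per-dimension mean and variance), the typicality score, and the fusion_weights helper are all hypothetical. The sketch only shows how a bounded queue of recent features can approximate a modality's feature distribution and how a feature far from that distribution could receive a smaller fusion weight.

```python
from collections import deque
import math


class FeatureQueue:
    """Fixed-size FIFO of recent feature vectors for one modality (hypothetical helper)."""

    def __init__(self, dim, max_size=1024):
        self.dim = dim
        # deque with maxlen drops the oldest feature once the queue is full
        self.items = deque(maxlen=max_size)

    def push(self, feature):
        if len(feature) != self.dim:
            raise ValueError("feature has wrong dimensionality")
        self.items.append(list(feature))

    def mean(self):
        if not self.items:
            raise ValueError("queue is empty")
        n = len(self.items)
        return [sum(f[d] for f in self.items) / n for d in range(self.dim)]

    def variance(self, eps=1e-6):
        # Per-dimension (diagonal) variance; eps avoids division by zero later.
        mu = self.mean()
        n = len(self.items)
        return [
            sum((f[d] - mu[d]) ** 2 for f in self.items) / n + eps
            for d in range(self.dim)
        ]

    def typicality(self, feature):
        """Assumed score in (0, 1]: 1 at the queue mean, decaying with
        the variance-normalized squared distance from it."""
        mu, var = self.mean(), self.variance()
        dist = sum((x - m) ** 2 / v for x, m, v in zip(feature, mu, var)) / self.dim
        return math.exp(-0.5 * dist)


def fusion_weights(scores):
    """Normalize per-modality scores so they sum to 1 (assumed weighting rule)."""
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}
```

In use, one queue per modality would be filled with features from training batches, and a new sample's image and text features would each be scored with typicality before being passed to fusion_weights. A learned or density-based score could replace the assumed exponential without changing the overall structure.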
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications · Emotion and Mood Recognition
