FINE: Factorized multimodal sentiment analysis via mutual INformation Estimation
Yadong Liu, Shangfei Wang

TL;DR
This paper introduces FINE, a novel factorized multimodal fusion framework that disentangles shared and unique modality representations using mutual information estimation, enhancing sentiment analysis accuracy and robustness.
Contribution
The paper proposes a new factorization approach guided by mutual information, along with auxiliary modules for improved feature extraction and temporal modeling in multimodal sentiment analysis.
Findings
Outperforms existing methods on multiple datasets
Enhances representation quality by reducing redundancy
Improves class-level separability through contrastive learning
Abstract
Multimodal sentiment analysis remains a challenging task due to the inherent heterogeneity across modalities. Such heterogeneity often manifests as asynchronous signals, imbalanced information between modalities, and interference from task-irrelevant noise, hindering the learning of robust and accurate sentiment representations. To address these issues, we propose a factorized multimodal fusion framework that first disentangles each modality into shared and unique representations, and then suppresses task-irrelevant noise within both to retain only sentiment-critical representations. This fine-grained decomposition improves representation quality by reducing redundancy, promoting cross-modal complementarity, and isolating task-relevant sentiment cues. Rather than manipulating the feature space directly, we adopt a mutual information-based optimization strategy to guide the factorization…
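To make the idea of mutual-information-guided factorization concrete, here is a minimal sketch of one common estimator, the InfoNCE lower bound on I(X; Y), computed from paired samples. This excerpt does not say which estimator FINE uses, so the choice of InfoNCE, the dot-product critic, and the names `infonce_lower_bound`, `xs`, `ys`, and `temperature` are all assumptions made for illustration, not the authors' method.

```python
import math


def _dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def infonce_lower_bound(xs, ys, temperature=1.0):
    """InfoNCE estimate of a lower bound on I(X; Y).

    xs[i] and ys[i] are assumed to be paired feature vectors (for example,
    hypothetical shared representations of the same utterance from two
    modalities). The critic is a plain dot product, which is an
    illustrative assumption. The result never exceeds log(len(xs)).
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")
    total = 0.0
    for i in range(n):
        # Score x_i against every y_j; the matching pair is at index i.
        scores = [_dot(xs[i], ys[j]) / temperature for j in range(n)]
        total += scores[i] - _logsumexp(scores)
    return math.log(n) + total / n


# Well-aligned pairs push the bound toward log(n).
aligned = infonce_lower_bound([[5.0, 0.0], [0.0, 5.0]],
                              [[5.0, 0.0], [0.0, 5.0]])

# If every y is identical, y says nothing about x and the bound is 0.
uninformative = infonce_lower_bound([[1.0, 0.0], [0.0, 1.0]],
                                    [[1.0, 1.0], [1.0, 1.0]])
```

In a factorized setup, a bound like this could be maximized between the shared representations of two modalities. Pushing shared and unique representations apart is a minimization problem, which generally calls for an upper-bound estimator instead, so this sketch covers only the maximization side.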
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition · Multimodal Machine Learning Applications
