CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis
Yangmin Li, Ruiqi Zhu, Wengen Li

TL;DR
CorMulT is a semi-supervised multimodal transformer that uses learned modality correlation coefficients to improve sentiment analysis, especially when the correlations between modalities are weak. The authors report that it outperforms existing methods.
Contribution
The paper presents a novel two-stage semi-supervised model that explicitly learns and utilizes modality correlation coefficients for enhanced multimodal sentiment analysis.
Findings
CorMulT surpasses state-of-the-art methods on the CMU-MOSEI dataset.
The correlation contrastive learning module effectively captures modality relationships.
Fusing learned correlation coefficients improves sentiment prediction accuracy.
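The idea of fusing a correlation coefficient with modality features can be illustrated with a minimal sketch. This is not the paper's implementation: CorMulT learns its coefficients in a pre-training stage, whereas the sketch below substitutes a plain cosine similarity as a stand-in coefficient. The function names, the two-modality setup (text and audio), and the concatenation-based fusion are all assumptions made for illustration.

```python
import numpy as np


def cosine_correlation(a, b, eps=1e-8):
    """Stand-in correlation coefficient (an assumption, not the learned one).

    Computes per-sample cosine similarity between two feature matrices of
    shape (batch, dim) and rescales it from [-1, 1] to [0, 1].
    """
    a_norm = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_norm = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    cos = np.sum(a_norm * b_norm, axis=-1)
    return 0.5 * (cos + 1.0)


def correlation_weighted_fusion(text, audio):
    """Hypothetical fusion: scale audio features by the coefficient, then
    concatenate text features, scaled audio features, and the coefficient."""
    coeff = cosine_correlation(text, audio)        # shape (batch,)
    scaled_audio = coeff[:, None] * audio          # weakly correlated audio is damped
    fused = np.concatenate([text, scaled_audio, coeff[:, None]], axis=-1)
    return fused, coeff


if __name__ == "__main__":
    # Toy features: the first pair is aligned, the second pair is opposed.
    text = np.array([[1.0, 0.0], [1.0, 0.0]])
    audio = np.array([[1.0, 0.0], [-1.0, 0.0]])
    fused, coeff = correlation_weighted_fusion(text, audio)
    print(fused.shape, coeff)
```

In this sketch a weakly correlated modality contributes less to the fused representation, and the coefficient itself is passed along as an extra feature so a downstream classifier could condition on it.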
Abstract
Multimodal sentiment analysis is an active research area that combines multiple data modalities, e.g., text, image and audio, to analyze human emotions, and it benefits a variety of applications. Existing multimodal sentiment analysis methods can be classified as modality interaction-based, modality transformation-based and modality similarity-based methods. However, most of these methods rely heavily on strong correlations between modalities and cannot fully uncover and utilize those correlations to enhance sentiment analysis. As a result, they usually perform poorly when identifying the sentiment of multimodal data with weak correlations. To address this issue, we propose a two-stage semi-supervised model termed Correlation-aware Multimodal Transformer (CorMulT), which consists of a pre-training stage and a prediction stage. At the pre-training…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Advanced Text Analysis Techniques
Methods: Linear Layer · Multi-Head Attention · Attention Is All You Need · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam
