Bridging the Emotional Semantic Gap via Multimodal Relevance Estimation
Chuan Zhang, Daoxin Zhang, Ruixiu Zhang, Jiawei Li, Jianke Zhu

TL;DR
This paper introduces a multimodal relevance estimation network that combines an attention mechanism, a relevant semantic estimation loss, and contrastive learning to bridge the semantic gap between emotional modalities and improve emotion recognition accuracy.
Contribution
It proposes a multimodal relevance estimation framework, together with a new dataset, SDME, for capturing relevant semantics across modalities in emotion analysis.
Findings
Captures relevant semantics even when modalities deviate widely from one another
Improves multimodal emotion recognition accuracy
Provides a new dataset for semantic relevance research
Abstract
Human beings express emotion in many ways, including facial actions, voice, and natural language. Because individuals are diverse and complex, the emotions expressed through different modalities may be semantically irrelevant to one another. Directly fusing information from different modalities can therefore expose the model to noise from semantically irrelevant modalities. To tackle this problem, we propose a multimodal relevance estimation network to capture the relevant semantics among modalities in multimodal emotions. Specifically, we take advantage of an attention mechanism to reflect the semantic relevance weight of each modality. Moreover, we propose a relevant semantic estimation loss to weakly supervise the semantics of each modality. Furthermore, we make use of contrastive learning to optimize the similarity of category-level modality-relevant semantics…
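The attention-based relevance weighting mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: the function names, the dot-product scoring, the `query` vector (standing in for a learned parameter), and the toy feature values are all assumptions made for illustration. The sketch only shows the general idea of scoring each modality, normalizing the scores with a softmax, and fusing features by the resulting weights so that a semantically deviating modality contributes less.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relevance_weighted_fusion(features, query):
    """Fuse per-modality features using softmax relevance weights.

    features: dict mapping modality name -> feature vector (list of floats).
    query:    hypothetical scoring vector (assumed learned in a real model).
    Returns (weights by modality, fused feature vector).
    """
    names = list(features)
    # Dot-product score per modality (an assumed scoring function).
    scores = [sum(f * q for f, q in zip(features[n], query)) for n in names]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of modality features, dimension by dimension.
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), fused


# Toy example: the text feature points away from the other two modalities.
features = {
    "face": [1.0, 0.0],
    "voice": [0.9, 0.1],
    "text": [-1.0, 0.5],
}
query = [2.0, 0.0]

weights, fused = relevance_weighted_fusion(features, query)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

In this toy setup the text modality receives the smallest weight because its feature disagrees with the scoring vector, so it adds little to the fused representation. A real model would learn the scoring function and would combine this weighting with the estimation loss and contrastive objective described in the abstract.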
Taxonomy
Topics: Emotion and Mood Recognition · Multimodal Machine Learning Applications · Sentiment Analysis and Opinion Mining
Methods: Contrastive Learning
