Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs

Daiqing Wu; Dongbao Yang; Yu Zhou; Can Ma

arXiv:2511.17103·cs.CV·November 24, 2025

Open Access

TL;DR

This paper introduces Partitioned Adaptive Contrastive Learning (PACL), a method that borrows textual knowledge from noisy social media image-text pairs to bridge the affective gap in visual emotion recognition and improve the performance of pre-trained visual models.

Contribution

The paper proposes a contrastive learning framework that partitions samples by type and exploits noisy image-text pairs to improve visual emotion recognition models.

Findings

01. Bridging the affective gap improves emotion recognition accuracy.

02. PACL outperforms existing methods on multiple downstream tasks.

03. Dynamic construction of positive and negative pairs enhances learning from noisy data.

Abstract

Visual emotion recognition (VER) is a longstanding field that has garnered increasing attention with the advancement of deep neural networks. Although recent studies have achieved notable improvements by leveraging the knowledge embedded within pre-trained visual models, the lack of direct association between factual-level features and emotional categories, called the "affective gap", limits the applicability of pre-training knowledge for VER tasks. In contrast, the explicit emotional expression and high information density of the textual modality eliminate the "affective gap". Therefore, we propose borrowing knowledge from a pre-trained textual model to enhance the emotional perception of pre-trained visual models. We focus on the factual and emotional connections between images and texts in noisy social media data, and propose Partitioned Adaptive Contrastive Learning (PACL) to…
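
For readers unfamiliar with contrastive learning over image-text pairs, the following is a minimal sketch of a standard image-to-text InfoNCE loss in plain Python. It is not PACL: the partitioning of samples and the adaptive construction of positive and negative pairs are not specified in the abstract above and are not reproduced here. The function names `l2_normalize` and `info_nce`, the temperature value, and the toy embeddings are all assumptions chosen for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Mean image-to-text InfoNCE loss.

    Hypothetical setup: image_embs[i] and text_embs[i] form the positive
    pair; every other text in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(imgs):
        # Cosine similarity of this image to every text, scaled by temperature.
        logits = [sum(a * b for a, b in zip(img, t)) / temperature for t in txts]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability assigned to the matching text.
        total += log_denom - logits[i]
    return total / len(imgs)


# Toy 2-D embeddings (illustrative values only).
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(images, aligned_texts)
loss_swapped = info_nce(images, swapped_texts)
```

In this toy case the loss is near zero when each image matches its own text and large when the pairing is swapped. With noisy social media data, the fixed "index i is the only positive" assumption in this sketch is exactly what breaks down, which is the situation the dynamic pair construction listed in the findings is meant to address.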

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications