CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss

Rakshith Sharma Srinivasa; Jaejin Cho; Chouchang Yang; Yashas Malur Saidutta; Ching-Hua Lee; Yilin Shen; Hongxia Jin

arXiv:2309.14580 · cs.LG · September 27, 2023 · 5 cites

Open Access

TL;DR

This paper introduces CWCL, a novel continuous similarity-based contrastive loss for cross-modal zero-shot transfer, improving alignment and performance across image-text and speech-text tasks over existing methods.

Contribution

The paper proposes the Continuously Weighted Contrastive Loss (CWCL), a new loss function that models similarity as a continuous measure, enhancing cross-modal representation alignment.

Findings

01. Achieves a 5-8% (absolute) improvement in zero-shot image classification over prior methods.

02. Achieves a 20-30% (absolute) improvement in zero-shot speech-to-intent classification over prior methods.

03. Outperforms existing zero-shot transfer methods across multiple models, datasets, and modalities.

Abstract

This paper considers contrastive training for cross-modal zero-shot transfer wherein a pre-trained model in one modality is used for representation learning in another domain using pairwise data. The learnt models in the latter domain can then be used for a diverse set of tasks in a zero-shot way, similar to "Contrastive Language-Image Pre-training (CLIP)" and "Locked-image Tuning (LiT)" that have recently gained considerable attention. Most existing works for cross-modal representation alignment (including CLIP and LiT) use the standard contrastive training objective, which employs sets of positive and negative examples to align similar and repel dissimilar training data samples. However, similarity amongst training examples has a more continuous nature, thus calling for a more "non-binary" treatment. To address this, we propose a novel loss function called Continuously Weighted…
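
To make the binary versus continuous distinction concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the function names, the one-directional form of the loss, the temperature value, and the mapping of cosine similarity to a weight in [0, 1] are all assumptions chosen for illustration. The idea shown is only that the standard objective gives weight 1 to the paired example and 0 to everything else, while a continuously weighted variant lets every other example in the batch contribute in proportion to how similar it is within the frozen modality.

```python
import math


def _normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def weighted_contrastive_loss(p, q, weights, temperature=0.1):
    """One-directional weighted contrastive loss (illustrative only).

    p: embeddings from the modality being trained (list of N vectors)
    q: embeddings from the frozen, pre-trained modality (list of N vectors)
    weights: N x N non-negative matrix; weights[i][j] says how strongly
             example i should be pulled toward example j.
    """
    p = [_normalize(v) for v in p]
    q = [_normalize(v) for v in q]
    total = 0.0
    for i, p_i in enumerate(p):
        logits = [_dot(p_i, q_j) / temperature for q_j in q]
        log_probs = _log_softmax(logits)
        w = weights[i]
        # Weighted average of log-probabilities, normalized by the row's total weight.
        total -= sum(w_ij * lp for w_ij, lp in zip(w, log_probs)) / sum(w)
    return total / len(p)


def binary_weights(n):
    # Standard contrastive setting: only the paired example is a positive.
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def continuous_weights(q):
    # Assumed mapping: cosine similarity in the frozen modality, shifted to [0, 1].
    q = [_normalize(v) for v in q]
    return [[0.5 * _dot(a, b) + 0.5 for b in q] for a in q]
```

With `binary_weights`, the function reduces to the usual softmax cross-entropy over the batch; swapping in `continuous_weights(q)` is the only change needed to give similar but unpaired examples a non-zero pull.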

Videos

CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss · SlidesLive

Taxonomy

Topics: Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research · Domain Adaptation and Few-Shot Learning

Methods: ALIGN · Contrastive Language-Image Pre-training