Unified Contrastive Learning in Image-Text-Label Space
Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Bin Xiao, Ce Liu, Lu Yuan, Jianfeng Gao

TL;DR
This paper introduces Unified Contrastive Learning (UniCL), which combines human-annotated image-label data and web-crawled image-text pairs under a single contrastive learning objective, improving both zero-shot and supervised recognition performance.
Contribution
The work places image-label and image-text data in a common image-text-label space and proposes a learning paradigm with a single objective that trains on both data types together.
Findings
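Achieves average gains of up to 9.2% and 14.5% on zero-shot recognition benchmarks over language-image contrastive learning and supervised learning baselines, respectively.
Boosts linear probe performance over the same two baselines by 7.3% and 3.4%, respectively.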
Rivals supervised learning methods on image-label datasets.
Abstract
Visual recognition is recently learned via either supervised learning on human-annotated image-label data or language-image contrastive learning with webly-crawled image-text pairs. While supervised learning may result in a more discriminative representation, language-image pretraining shows unprecedented zero-shot recognition capability, largely due to the different properties of data sources and learning objectives. In this work, we introduce a new formulation by combining the two data sources into a common image-text-label space. In this space, we propose a new learning paradigm, called Unified Contrastive Learning (UniCL) with a single learning objective to seamlessly prompt the synergy of two data types. Extensive experiments show that our UniCL is an effective way of learning semantically rich yet discriminative representations, universally for image recognition in zero-shot,…
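The single objective mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the loss, so the code below is an assumption: a bidirectional contrastive loss in which every image-text pair that shares a label counts as a positive. The function name `unified_contrastive_loss`, the `temperature` default, and the use of batch means are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def unified_contrastive_loss(img_feats, txt_feats, labels, temperature=0.07):
    """Illustrative bidirectional contrastive loss with label-defined positives.

    Hypothetical sketch: img_feats and txt_feats are (N, D) arrays, labels is a
    length-N sequence. Pairs (i, j) with labels[i] == labels[j] are positives.
    """
    # L2-normalize so the dot product is a cosine similarity.
    u = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    v = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    logits = u @ v.T / temperature  # (N, N) image-to-text similarities

    labels = np.asarray(labels)
    pos = (labels[:, None] == labels[None, :]).astype(float)  # positive mask

    # Image-to-text: softmax over texts (rows), averaged over each image's positives.
    i2t = -(pos * log_softmax(logits, axis=1)).sum(axis=1) / pos.sum(axis=1)
    # Text-to-image: softmax over images (columns), averaged over each text's positives.
    t2i = -(pos * log_softmax(logits, axis=0)).sum(axis=0) / pos.sum(axis=0)
    return i2t.mean() + t2i.mean()


# Usage: four image-text pairs, the first two sharing a class label.
rng = np.random.default_rng(0)
images = rng.normal(size=(4, 8))
texts = rng.normal(size=(4, 8))
loss = unified_contrastive_loss(images, texts, labels=[0, 0, 1, 2])
```

Under this assumption, giving every web-crawled image-text pair its own unique label makes the positive mask the identity matrix, so the loss reduces to a standard image-text contrastive loss, while image-label data contributes several positives per sample.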
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Linear Layer · Contrastive Learning · Average Pooling · Dropout · Stochastic Depth · Absolute Position Encodings · 1x1 Convolution · Label Smoothing
