CLIP model is an Efficient Online Lifelong Learner

Leyuan Wang; Liuyu Xiang; Yujie Wei; Yunlong Wang; Zhaofeng He

arXiv:2405.15155 · cs.CV · May 27, 2024

TL;DR

This paper demonstrates that CLIP, a vision-language model, can be effectively adapted for online lifelong learning by introducing a symmetric image-text tuning strategy, improving continual learning performance in dynamic data environments.

Contribution

The authors propose the Symmetric Image-Text (SIT) tuning method for CLIP, enabling efficient online lifelong learning without fixing the total number of classes or a maximum memory capacity in advance.

Findings

01. SIT improves CLIP's performance in lifelong learning benchmarks.

02. Maintaining symmetry between image and text is crucial during tuning.

03. Tuning the image encoder benefits lifelong learning, while tuning the text encoder enhances zero-shot capabilities.

Abstract

Online Lifelong Learning (OLL) addresses the challenge of learning from continuous and non-stationary data streams. Existing online lifelong learning methods based on image classification models often require preset conditions such as the total number of classes or maximum memory capacity, which hinders the realization of real never-ending learning and renders them impractical for real-world scenarios. In this work, we propose that vision-language models, such as Contrastive Language-Image Pretraining (CLIP), are more suitable candidates for online lifelong learning. We discover that maintaining symmetry between image and text is crucial during Parameter-Efficient Tuning (PET) for the CLIP model in online lifelong learning. To this end, we introduce the Symmetric Image-Text (SIT) tuning strategy. We conduct extensive experiments on multiple lifelong learning benchmark datasets and elucidate…
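
For readers unfamiliar with what "symmetry between image and text" refers to in a contrastive setting, the sketch below shows the generic symmetric image-text contrastive loss that CLIP-style models are commonly trained with: one cross-entropy term matches each image to its text, the other matches each text to its image, and the two are averaged. This is an illustration only, not the authors' SIT strategy, whose details the abstract excerpt does not give. The function names and the temperature value of 0.07 are assumptions chosen for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image cross-entropy.

    Assumes image_embs[i] and text_embs[i] form a matched pair.
    The temperature is a hypothetical value chosen for illustration.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image view
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))
```

Swapping the two arguments leaves the loss unchanged, which is the sense in which this objective treats the image and text sides symmetrically.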

Code & Models

Repositories

Debatrix/LifeLong-CLIP (PyTorch, official)

Taxonomy

Topics: Online and Blended Learning

Methods: Contrastive Language-Image Pre-training