Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models

Yicheng Xu; Yuxin Chen; Jiahao Nie; Yusong Wang; Huiping Zhuang; Manabu Okumura

arXiv:2406.18868·cs.CV·December 19, 2024

TL;DR

This paper introduces RAIL, a novel continual learning method for vision-language models that preserves zero-shot capabilities across domains without extra data, and proposes the new X-TAIL setting for domain-agnostic incremental learning.

Contribution

The paper presents RAIL, a recursive ridge regression adapter that decouples cross-domain correlations and maintains zero-shot abilities without reference data, and introduces the X-TAIL setting for domain-agnostic continual learning.
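The recursive ridge regression idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `RecursiveRidgeAdapter`, the helper `joint_ridge`, the parameters `hidden_dim` and `lam`, and the choice of a fixed random projection followed by ReLU as the higher-dimensional mapping are all assumptions made for illustration. The sketch keeps an inverse regularized Gram matrix `R`, updates it with the Woodbury identity as each domain arrives, and pads the weight matrix with zero columns when new classes appear.

```python
import numpy as np


class RecursiveRidgeAdapter:
    """Hypothetical sketch of a ridge-regression adapter updated domain by domain."""

    def __init__(self, in_dim, hidden_dim=256, lam=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Assumption: a fixed (never trained) random projection to a wider space.
        self.P = rng.standard_normal((in_dim, hidden_dim)) / np.sqrt(in_dim)
        # R tracks (H^T H + lam * I)^{-1} over all data seen so far.
        self.R = np.eye(hidden_dim) / lam
        # Weights start with zero class columns and grow as classes arrive.
        self.W = np.zeros((hidden_dim, 0))

    def project(self, X):
        # Random projection followed by ReLU.
        return np.maximum(X @ self.P, 0.0)

    def update(self, X, Y):
        """X: (n, in_dim) features. Y: (n, C) one-hot over ALL classes seen so far."""
        H = self.project(X)
        new_classes = Y.shape[1] - self.W.shape[1]
        if new_classes > 0:
            # Earlier data carried zero targets for these classes, so zero columns are exact.
            pad = np.zeros((self.W.shape[0], new_classes))
            self.W = np.hstack([self.W, pad])
        # Woodbury identity: R <- R - R H^T (I + H R H^T)^{-1} H R
        G = np.linalg.solve(np.eye(H.shape[0]) + H @ self.R @ H.T, H @ self.R)
        self.R = self.R - self.R @ H.T @ G
        # Correct the weights using the residual on the new batch only.
        self.W = self.W + self.R @ H.T @ (Y - H @ self.W)

    def predict(self, X):
        return self.project(X) @ self.W


def joint_ridge(H, Y, lam):
    """Closed-form ridge solution on all data at once, for comparison."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ Y)
```

Under these assumptions, calling `update` once per domain should yield the same weights as `joint_ridge` applied to the concatenated projected features of every domain, up to floating-point error. That equivalence is what makes the update non-forgetting in this sketch, since no earlier data has to be stored or replayed.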

Findings

01 RAIL achieves state-of-the-art results in X-TAIL and multi-domain task-incremental learning.

02 RAIL preserves zero-shot abilities on unseen domains without reference datasets.

03 The paper provides a theoretical proof of RAIL's absolute memorization on learned domains.

Abstract

Continual learning (CL) with Vision-Language Models (VLMs) has overcome the constraints of traditional CL, which focuses only on previously encountered classes. During the CL of VLMs, we need not only to prevent catastrophic forgetting of incrementally learned knowledge but also to preserve the zero-shot ability of VLMs. However, existing methods require additional reference datasets to maintain such zero-shot ability and rely on domain-identity hints to classify images across different domains. In this study, we propose Regression-based Analytic Incremental Learning (RAIL), which utilizes a recursive ridge regression-based adapter to learn from a sequence of domains in a non-forgetting manner and decouples the cross-domain correlations by projecting features to a higher-dimensional space. Cooperating with a training-free fusion module, RAIL absolutely preserves the VLM's zero-shot…

Code & Models

Repositories

linghan1997/regression-based-analytic-incremental-learning
PyTorch · Official

Videos

Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Adapter