Caption, Create, Continue: Continual Learning with Pre-trained Generative Vision-Language Models

Indu Solomon; Aye Phyu Phyu Aung; Uttam Kumar; Senthilnath Jayavelu

arXiv:2409.17806 · cs.LG · November 14, 2025

Open Access

TL;DR

This paper introduces CLTS, a continual learning framework that uses pre-trained vision-language models to mitigate forgetting without storing real data, reporting up to 54% higher average task accuracy and 63 times better memory efficiency than recent baselines.

Contribution

CLTS is a novel continual learning approach that leverages generative models and task routing to reduce data storage needs and improve performance.

Findings

01. Up to 54% improvement in average task accuracy

02. 63 times better memory efficiency than recent baselines

03. Effective handling of class-incremental tasks without real data storage

Abstract

Continual learning (CL) enables models to adapt to evolving data streams without catastrophic forgetting, a fundamental requirement for real-world AI systems. However, current methods often depend on large replay buffers or heavily annotated datasets, which are impractical due to storage, privacy, and cost constraints. We propose CLTS (Continual Learning via Text-Image Synergy), a novel class-incremental framework that mitigates forgetting without storing real task data. CLTS leverages two pre-trained models: BLIP (Bootstrapping Language-Image Pre-training) for caption generation and Stable Diffusion for sample generation. Each task is handled by a dedicated Task Head, while a Task Router learns to assign inputs to the correct Task Head using the generated data. On three benchmark datasets, CLTS improves average task accuracy by up to 54% and achieves 63 times better…
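
The abstract describes a caption, generate, route pipeline without giving implementation details, so the following is only a toy sketch of that control flow, not the authors' method. Everything in it is an assumption made for illustration: the names `TaskHead`, `TaskRouter`, `ContinualLearner`, `toy_caption`, and `toy_generate` are hypothetical, nearest-centroid classifiers stand in for the learned head and router networks, and two small stub functions over 2-D points stand in for BLIP and Stable Diffusion. The sketch also assumes that each head is fitted on the real data of its own task while that task is being learned, and that only captions are kept afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class TaskHead:
    """Stand-in for a per-task classifier: nearest class centroid."""
    centroids: Dict[str, Vector]

    @classmethod
    def fit(cls, samples: Dict[str, List[Vector]]) -> "TaskHead":
        return cls({label: mean(vs) for label, vs in samples.items()})

    def predict(self, x: Vector) -> str:
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))


@dataclass
class TaskRouter:
    """Stand-in router: nearest task centroid, fitted on generated samples only."""
    task_centroids: Dict[int, Vector] = field(default_factory=dict)

    def add_task(self, task_id: int, generated: List[Vector]) -> None:
        self.task_centroids[task_id] = mean(generated)

    def route(self, x: Vector) -> int:
        return min(self.task_centroids,
                   key=lambda t: sq_dist(x, self.task_centroids[t]))


class ContinualLearner:
    def __init__(self,
                 caption: Callable[[Vector], str],
                 generate: Callable[[str, int], List[Vector]]) -> None:
        self.caption = caption      # hypothetical stand-in for a captioning model
        self.generate = generate    # hypothetical stand-in for a generative model
        self.heads: Dict[int, TaskHead] = {}
        self.router = TaskRouter()
        self.captions: Dict[int, List[str]] = {}  # text only, no real samples kept

    def learn_task(self, task_id: int,
                   real_samples: Dict[str, List[Vector]],
                   n_generated: int = 4) -> None:
        # Assumption: the head sees real data only while its task is current.
        self.heads[task_id] = TaskHead.fit(real_samples)
        # Caption one example per class, then discard the real data.
        caps = [self.caption(vs[0]) for vs in real_samples.values()]
        self.captions[task_id] = caps
        # The router is fitted on samples regenerated from the captions.
        generated = [g for c in caps for g in self.generate(c, n_generated)]
        self.router.add_task(task_id, generated)

    def predict(self, x: Vector) -> Tuple[int, str]:
        task_id = self.router.route(x)
        return task_id, self.heads[task_id].predict(x)


def toy_caption(x: Vector) -> str:
    """Toy 'caption': the point written out as text."""
    return ",".join(f"{v:.1f}" for v in x)


def toy_generate(caption: str, n: int) -> List[Vector]:
    """Toy 'generator': n deterministic points spread around the captioned point."""
    center = tuple(float(v) for v in caption.split(","))
    return [tuple(c + 0.1 * (i - (n - 1) / 2) for c in center) for i in range(n)]


learner = ContinualLearner(toy_caption, toy_generate)
learner.learn_task(0, {"cat": [(0.0, 0.0), (0.2, 0.0)],
                       "dog": [(1.0, 0.0), (1.2, 0.0)]})
learner.learn_task(1, {"car": [(10.0, 10.0), (10.2, 10.0)],
                       "bus": [(11.0, 10.0), (11.2, 10.0)]})
print(learner.predict((1.05, 0.1)))
print(learner.predict((10.1, 9.9)))
```

In this toy setup, adding a second task leaves the first head untouched, which is the sense in which dedicated heads avoid overwriting earlier knowledge; the only state carried across tasks is the caption text and the router's per-task summary.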

Taxonomy

Topics: Problem and Project Based Learning · Intelligent Tutoring Systems and Adaptive Learning

Methods: Diffusion