LoRA-Loop: Closing the Synthetic Replay Cycle for Continual VLM Learning
Kaihong Wang, Donghyun Kim, Margrit Betke

TL;DR
This paper introduces LoRA-Loop, a novel synthetic replay method for continual vision-language model learning that adapts a frozen generator with low-rank adapters, improving sample relevance and knowledge retention.
Contribution
It proposes LoRA-based adaptation of the generator and confidence-based sample selection to improve synthetic replay in continual VLM learning.
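For readers unfamiliar with low-rank adapters, the following is a minimal sketch of the standard LoRA update on a single linear layer: the base weight stays frozen and only two small matrices are trained. It is not the authors' code. The class name `LoRALinear`, the `rank` and `alpha` values, and the toy weights are assumptions for illustration, and this summary does not say which Stable Diffusion layers the paper adapts.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


class LoRALinear:
    """Hypothetical linear layer with a frozen weight W and a trainable update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight        # frozen base weight, never updated
        self.scale = alpha / rank   # standard LoRA scaling factor
        # A starts with small random values and B with zeros,
        # so the adapter is a no-op before any training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + (alpha / rank) * B @ A."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted weight to one input vector."""
        return [
            sum(w * xi for w, xi in zip(row, x))
            for row in self.effective_weight()
        ]


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1, alpha=1.0)
base_output = layer.forward([1.0, 1.0])  # equals the frozen layer's output
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly. Only `A` and `B` would receive gradient updates, which is what keeps per-task adaptation cheap relative to finetuning the full generator.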
Findings
Outperforms previous synthetic-replay methods on the MTIL benchmark.
Achieves better balance among plasticity, stability, and zero-shot capability.
Demonstrates robustness in real-world domain-specific scenarios.
Abstract
Continual learning for vision-language models has achieved remarkable performance through synthetic replay, where samples are generated using Stable Diffusion to regularize during finetuning and retain knowledge. However, real-world downstream applications often exhibit domain-specific nuances and fine-grained semantics not captured by generators, causing synthetic-replay methods to produce misaligned samples that misguide finetuning and undermine retention of prior knowledge. In this work, we propose a LoRA-enhanced synthetic-replay framework that injects task-specific low-rank adapters into a frozen Stable Diffusion model, efficiently capturing each new task's unique visual and semantic patterns. Specifically, we introduce a two-stage, confidence-based sample selection: we first rank real task data by post-finetuning VLM confidence to focus LoRA finetuning on the most representative…
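The first selection stage described above, ranking real task data by VLM confidence, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract as shown here does not define the confidence score, so the sketch assumes it is the softmax probability the finetuned model assigns to the ground-truth class. The function name `select_by_confidence`, the `keep_fraction` parameter, and the toy logits are hypothetical. The second stage is truncated in this excerpt and is not sketched.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_by_confidence(logits_per_sample, labels, keep_fraction=0.5):
    """Return indices of the most confident samples, highest confidence first.

    Confidence is ASSUMED to be the softmax probability of the true label.
    """
    scored = []
    for index, (logits, label) in enumerate(zip(logits_per_sample, labels)):
        confidence = softmax(logits)[label]
        scored.append((confidence, index))
    # Sort by confidence only, descending; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_keep = max(1, int(len(scored) * keep_fraction))
    return [index for _, index in scored[:n_keep]]


# Toy logits standing in for outputs of the finetuned VLM on four real samples.
logits = [
    [4.0, 0.5, 0.1],  # confident and correct for label 0
    [1.0, 1.1, 0.9],  # nearly uniform
    [0.2, 3.0, 0.1],  # confident and correct for label 1
    [0.3, 0.2, 0.4],  # nearly uniform
]
labels = [0, 1, 1, 2]
selected = select_by_confidence(logits, labels, keep_fraction=0.5)
print(selected)  # -> [0, 2]
```

In this sketch, the selected indices would then determine which real images are used to finetune the task-specific adapter.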
