Provable Contrastive Continual Learning
Yichen Wen, Zhiquan Tan, Kaipeng Zheng, Chuanlong Xie, Weiran Huang

TL;DR
This paper provides theoretical guarantees for contrastive continual learning, introduces a novel algorithm CILA with adaptive distillation, and demonstrates state-of-the-art results on benchmarks.
Contribution
The paper offers the first theoretical analysis of contrastive continual learning and proposes a new algorithm, CILA, whose adaptive distillation coefficients improve performance.
Findings
The model's performance is provably bounded by the training losses of previous tasks.
CILA outperforms existing methods on standard continual learning benchmarks.
The theoretical analysis supports the idea that pre-training can benefit continual learning.
Abstract
Continual learning requires learning incremental tasks with dynamic data distributions. So far, it has been observed that employing a combination of contrastive loss and distillation loss for training in continual learning yields strong performance. To the best of our knowledge, however, this contrastive continual learning framework lacks convincing theoretical explanations. In this work, we fill this gap by establishing theoretical performance guarantees, which reveal how the performance of the model is bounded by training losses of previous tasks in the contrastive continual learning framework. Our theoretical explanations further support the idea that pre-training can benefit continual learning. Inspired by our theoretical analysis of these guarantees, we propose a novel contrastive continual learning algorithm called CILA, which uses adaptive distillation coefficients for different…
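As an illustration of the general framework the abstract describes (a contrastive loss combined with a distillation loss), here is a minimal NumPy sketch. It is not the authors' CILA implementation: the function names `info_nce`, `relation_distillation`, and `combined_loss` are hypothetical, the InfoNCE and similarity-based distillation terms are generic stand-ins, and the coefficient `lam` is simply supplied by the caller. How CILA adapts that coefficient is not specified in the text above and is not reproduced here.

```python
import numpy as np


def _normalize(z):
    # Project each embedding onto the unit sphere (cosine-similarity geometry).
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def _log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))


def info_nce(z_a, z_b, temperature=0.5):
    """Generic InfoNCE loss; row i of z_a and row i of z_b are two views of sample i."""
    z_a, z_b = _normalize(z_a), _normalize(z_b)
    logits = z_a @ z_b.T / temperature
    # The positive pair for row i sits on the diagonal.
    return -np.mean(np.diag(_log_softmax(logits)))


def relation_distillation(z_new, z_old, temperature=0.5):
    """KL divergence between the old and new models' pairwise-similarity distributions."""
    n = z_new.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # drop self-similarities

    def log_probs(z):
        z = _normalize(z)
        sims = (z @ z.T / temperature)[off_diag].reshape(n, n - 1)
        return _log_softmax(sims)

    log_p_old, log_p_new = log_probs(z_old), log_probs(z_new)
    return np.mean(np.sum(np.exp(log_p_old) * (log_p_old - log_p_new), axis=1))


def combined_loss(z_a, z_b, z_old, lam):
    # Assumption: lam is a distillation coefficient chosen by the caller.
    return info_nce(z_a, z_b) + lam * relation_distillation(z_a, z_old)
```

In this sketch `z_a` and `z_b` would be the current model's embeddings of two augmented views of a batch, and `z_old` the embeddings of the same batch from a frozen copy of the model saved after the previous task. Setting `lam` to zero recovers plain contrastive training; larger values penalize drift away from the old model's similarity structure more heavily.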
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
