C-Flat++: Towards a More Efficient and Powerful Framework for Continual Learning

Wei Li; Hangjie Yuan; Zixiang Zhao; Yifan Zhu; Aojun Lu; Tao Feng; Yanan Sun

arXiv:2508.18860·cs.LG·September 1, 2025


TL;DR

C-Flat++ introduces a flexible framework for continual learning that promotes flatter loss landscapes to enhance stability and performance, with an efficient variant reducing computational costs.

Contribution

The paper proposes C-Flat, a novel method promoting flatter minima in continual learning, and C-Flat++, an efficient extension reducing update costs while maintaining effectiveness.

Findings

01. C-Flat improves performance across various continual learning settings.

02. C-Flat++ significantly reduces update costs compared to C-Flat.

03. The methods are compatible with multiple CL paradigms and datasets.

Abstract

Balancing sensitivity to new tasks and stability for retaining past knowledge is crucial in continual learning (CL). Recently, sharpness-aware minimization has proven effective in transfer learning and has also been adopted in CL to improve memory retention and learning efficiency. However, relying on zeroth-order sharpness alone may favor sharper minima over flatter ones in certain settings, leading to less robust and potentially suboptimal solutions. In this paper, we propose Continual Flatness (C-Flat), a method that promotes flatter loss landscapes tailored for CL. C-Flat offers plug-and-play compatibility, enabling easy integration with minimal modifications to the code pipeline. In addition, we present a general framework that integrates C-Flat into all major CL paradigms and conduct comprehensive comparisons with loss-minima optimizers…
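
For readers unfamiliar with the term, the following is a minimal sketch of what "zeroth-order sharpness" usually means in sharpness-aware minimization: the rise in loss under a worst-case perturbation within a small ball of radius rho, approximated by one normalized gradient-ascent step. This is a generic illustration on a toy quadratic, not the C-Flat or C-Flat++ algorithm, whose details are not given in the abstract. The names `loss`, `grad`, `sam_step`, `zeroth_order_sharpness`, `rho`, and `lr` are all hypothetical.

```python
import numpy as np

def loss(theta):
    # Toy quadratic loss; an assumed stand-in for a real training loss.
    return 0.5 * float(theta @ theta)

def grad(theta):
    # Analytic gradient of the toy loss above.
    return theta

def perturbation(theta, rho):
    # One-step approximation of the worst-case perturbation in the rho-ball:
    # move a distance rho along the normalized gradient direction.
    g = grad(theta)
    return rho * g / (np.linalg.norm(g) + 1e-12)

def zeroth_order_sharpness(theta, rho=0.05):
    # Approximate loss increase at the perturbed point relative to theta.
    return loss(theta + perturbation(theta, rho)) - loss(theta)

def sam_step(theta, lr=0.1, rho=0.05):
    # Sharpness-aware update: descend using the gradient evaluated
    # at the perturbed point rather than at theta itself.
    g_perturbed = grad(theta + perturbation(theta, rho))
    return theta - lr * g_perturbed

theta = np.array([1.0, -2.0])
print("sharpness:", zeroth_order_sharpness(theta))
print("loss before:", loss(theta), "loss after:", loss(sam_step(theta)))
```

The sketch measures only the loss gap inside the ball, which is the quantity the abstract argues can be insufficient on its own to distinguish flat from sharp minima.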
