Make Continual Learning Stronger via C-Flat
Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Mang Wang, Zixiang Zhao, Aojun Lu, Pengliang Ji, Tao Feng

TL;DR
This paper introduces C-Flat, a simple and effective method to improve continual learning by promoting flatter loss landscapes, which enhances model generalization across sequential tasks.
Contribution
The paper proposes a novel, easy-to-implement flatness-based regularization method for continual learning that is compatible with various CL approaches and improves their performance.
Findings
C-Flat improves CL performance in most cases.
It is a plug-and-play method requiring only one line of code.
C-Flat enhances model generalization by promoting flatter minima.
Abstract
Model generalization ability upon incrementally acquiring dynamically updating knowledge from sequentially arriving tasks is crucial to tackle the sensitivity-stability dilemma in Continual Learning (CL). Weight loss landscape sharpness minimization seeking for flat minima lying in neighborhoods with uniform low loss or smooth gradient is proven to be a strong training regime improving model generalization compared with loss minimization based optimizer like SGD. Yet only a few works have discussed this training regime for CL, proving that dedicated designed zeroth-order sharpness optimizer can improve CL performance. In this work, we propose a Continual Flatness (C-Flat) method featuring a flatter loss landscape tailored for CL. C-Flat could be easily called with only one line of code and is plug-and-play to any CL methods. A general framework of C-Flat applied to all CL categories and…
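To make the idea of sharpness minimization concrete, here is a minimal sketch of a generic sharpness-aware update on a toy quadratic loss. It is not the C-Flat optimizer or the authors' code. It only illustrates the "zeroth-order sharpness" notion the abstract refers to, approximated as the loss increase after a step of radius rho along the normalized gradient. The function names (`sharpness_aware_step`, `zeroth_order_sharpness`), the matrix `A`, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T A w (hypothetical stand-in for a network loss).
def loss(w, A):
    return 0.5 * w @ A @ w

def grad(w, A):
    return A @ w

def zeroth_order_sharpness(w, A, rho=0.05):
    """Approximate max_{||eps|| <= rho} L(w + eps) - L(w) using the
    first-order worst-case direction eps = rho * g / ||g||."""
    g = grad(w, A)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    return loss(w + eps, A) - loss(w, A)

def sharpness_aware_step(w, A, lr=0.1, rho=0.05):
    """One generic sharpness-aware update: take the gradient at the
    perturbed point w + eps, then apply it to the original weights."""
    g = grad(w, A)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_perturbed = grad(w + eps, A)
    return w - lr * g_perturbed

A = np.diag([4.0, 0.5])          # one sharp and one flat direction (assumed values)
w = np.array([1.0, 1.0])
initial_loss = loss(w, A)

for _ in range(100):
    w = sharpness_aware_step(w, A)

final_loss = loss(w, A)
print("initial loss:", initial_loss)
print("final loss:", final_loss)
print("sharpness at final point:", zeroth_order_sharpness(w, A))
```

Compared with plain SGD, the only change is that the gradient is evaluated at the perturbed weights, which penalizes regions where the loss rises quickly within the rho-neighborhood.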
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics · Educational Technology and Assessment
Methods: Stochastic Gradient Descent
