Natural Gradient Descent for Online Continual Learning
Joe Khawand, David Colliaux

TL;DR
This paper introduces a Natural Gradient Descent optimizer with Fisher Information Matrix approximation to improve online continual learning, significantly reducing catastrophic forgetting and enhancing convergence on image classification tasks.
Contribution
It proposes a novel optimization approach using Natural Gradient Descent with KFAC approximation, improving performance in online continual learning scenarios.
Findings
Enhanced accuracy across multiple datasets
Significant reduction in catastrophic forgetting
Improved convergence speed in OCL models
Abstract
Online Continual Learning (OCL) for image classification is a challenging subset of Continual Learning in which images from a stream are classified without assuming that the data are independent and identically distributed (i.i.d.). The primary challenge in this setting is to prevent catastrophic forgetting, where the model's performance on previous tasks deteriorates as it learns new ones. Although various strategies have been proposed to address this issue, achieving rapid convergence remains a significant challenge in the online setting. In this work, we introduce a novel approach to training OCL models that uses the Natural Gradient Descent optimizer with an approximation of the Fisher Information Matrix (FIM) obtained through Kronecker Factored Approximate Curvature (KFAC). This method demonstrates substantial improvements in performance across all OCL methods, particularly when…
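To make the optimizer described in the abstract more concrete, here is a minimal NumPy sketch of one damped KFAC natural-gradient step. It is an illustration under stated assumptions, not the authors' implementation: it treats a single bias-free linear softmax classifier, uses the empirical Fisher (gradients computed with the observed labels rather than labels sampled from the model), and the function name `kfac_natural_gradient_step` and the `lr` and `damping` parameters are hypothetical. The layer's Fisher block is approximated as the Kronecker product of an input factor A and an output-gradient factor S, so the preconditioned gradient can be computed as S⁻¹ G A⁻¹ without ever forming the full matrix.

```python
import numpy as np


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kfac_natural_gradient_step(W, x, y, lr=0.1, damping=1e-2):
    """One damped KFAC step for a bias-free linear softmax classifier.

    W: (n_classes, n_features) weights
    x: (batch, n_features) inputs
    y: (batch,) integer class labels
    Hypothetical helper; uses the empirical Fisher as a simplifying assumption.
    """
    n = x.shape[0]
    probs = softmax(x @ W.T)                 # (batch, n_classes)
    onehot = np.eye(W.shape[0])[y]
    g = probs - onehot                       # per-sample gradient w.r.t. logits
    grad = g.T @ x / n                       # cross-entropy gradient w.r.t. W

    # Kronecker factors of the layer's Fisher block, with Tikhonov damping.
    A = x.T @ x / n + damping * np.eye(x.shape[1])   # input factor
    S = g.T @ g / n + damping * np.eye(W.shape[0])   # output-gradient factor

    # Natural gradient S^{-1} grad A^{-1}, computed with two small solves.
    nat_grad = np.linalg.solve(S, grad)
    nat_grad = np.linalg.solve(A, nat_grad.T).T      # valid because A is symmetric
    return W - lr * nat_grad


# Usage on synthetic data (random inputs and labels, purely illustrative).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(32, 4))
y_demo = rng.integers(0, 3, size=32)
W_demo = np.zeros((3, 4))
W_next = kfac_natural_gradient_step(W_demo, x_demo, y_demo)
```

The two small solves replace the inversion of a matrix of size (n_classes · n_features)², which is the main practical appeal of the Kronecker factorization. In an online stream, the factors would typically be tracked with running averages across batches; that detail is omitted here.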
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Data Stream Mining Techniques · Machine Learning and ELM
