Dissecting Continual Learning a Structural and Data Analysis
Francesco Pelosin

TL;DR
This paper analyzes continual learning challenges, especially catastrophic forgetting, and explores data, architecture, and pretraining factors affecting performance, proposing novel methods and raising future research questions.
Contribution
It provides a comprehensive analysis of continual learning, introduces a novel asymmetric loss for ViTs, and investigates the impact of data quantity and pretraining on continual learning.
Findings
Data quantity in rehearsal buffers is more crucial than data quality.
A novel asymmetric loss is proposed for Vision Transformers (ViTs).
Pretraining significantly influences continual learning performance.
Abstract
Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the disruption of previously acquired knowledge, a drawback of deep learning models known as catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the data modeled does not undergo a considerable distributional shift in subsequent learning sessions, but whenever we expose such systems to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental. First, it would allow us to build truly intelligent systems showing stability and plasticity. Second, it would remove the onerous need to retrain these architectures from scratch on the updated data. In this thesis, we tackle the problem from multiple directions. In a first…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
