Architecture Matters in Continual Learning

Seyed Iman Mirzadeh; Arslan Chaudhry; Dong Yin; Timothy Nguyen; Razvan Pascanu; Dilan Gorur; Mehrdad Farajtabar

arXiv:2202.00275·cs.LG·February 2, 2022·25 cites

Open Access

TL;DR

This paper investigates how different neural network architectures influence continual learning performance, revealing that architectural choices significantly affect the balance between remembering past tasks and learning new ones.

Contribution

It demonstrates the importance of architecture selection in continual learning and provides practical guidelines for architectural decisions to enhance performance.

Findings

01. Architectural choices impact continual learning trade-offs.

02. Different architectures lead to varying forgetting and learning capabilities.

03. Recommendations for architecture design improve continual learning outcomes.
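
The trade-off between forgetting and learning named in these findings is usually measured from a matrix of per-task accuracies. The sketch below is illustrative only: it assumes the definitions of average accuracy, learning accuracy, and average forgetting that are common in the continual learning literature, not formulas taken from this page. The function name `continual_metrics` and the accuracy values are hypothetical.

```python
def continual_metrics(acc):
    """Summarize a continual learning run.

    acc[i][j] is the accuracy on task j measured after training on task i.
    Entries with j > i (tasks not yet seen) are ignored.
    These definitions are assumed, commonly used ones, not the paper's text.
    """
    num_tasks = len(acc)
    final_row = acc[-1]

    # Average accuracy: mean accuracy over all tasks after the last task.
    average_accuracy = sum(final_row) / num_tasks

    # Learning accuracy: accuracy on each task right after it was trained.
    learning_accuracy = sum(acc[t][t] for t in range(num_tasks)) / num_tasks

    # Forgetting: best earlier accuracy on a task minus its final accuracy,
    # averaged over every task except the last one.
    forgetting = []
    for task in range(num_tasks - 1):
        peak = max(acc[step][task] for step in range(task, num_tasks - 1))
        forgetting.append(peak - final_row[task])
    average_forgetting = sum(forgetting) / len(forgetting) if forgetting else 0.0

    return {
        "average_accuracy": average_accuracy,
        "learning_accuracy": learning_accuracy,
        "average_forgetting": average_forgetting,
    }


# Hypothetical accuracies for three sequential tasks (made-up numbers).
accuracy_matrix = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.75, 0.80],
]
metrics = continual_metrics(accuracy_matrix)
print(metrics)
```

With these made-up numbers, learning accuracy (about 0.85) is higher than final average accuracy (about 0.75), and the gap shows up as average forgetting (about 0.15). Comparing such summaries across architectures trained with the same algorithm is one way to see the kind of trade-off the findings describe.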

Abstract

A large body of research in continual learning is devoted to overcoming the catastrophic forgetting of neural networks by designing new algorithms that are robust to the distribution shifts. However, the majority of these works are strictly focused on the "algorithmic" part of continual learning for a "fixed neural network architecture", and the implications of using different architectures are mostly neglected. Even the few existing continual learning methods that modify the model assume a fixed architecture and aim to develop an algorithm that efficiently uses the model throughout the learning experience. However, in this work, we show that the choice of architecture can significantly impact the continual learning performance, and different architectures lead to different trade-offs between the ability to remember previous tasks and learning new ones. Moreover, we study the impact of…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning