Continual Variational Autoencoder Learning via Online Cooperative Memorization
Fei Ye, Adrian G. Bors

TL;DR
This paper introduces a theoretical framework for understanding catastrophic forgetting in VAEs during continual learning and proposes the Online Cooperative Memorization (OCM) method with memory buffers to improve knowledge retention and generation quality.
Contribution
It develops a novel theoretical analysis of VAE forgetting using optimal transport and introduces the OCM framework with memory buffers to mitigate forgetting in continual learning.
Findings
Theoretical bounds on the data likelihood that do not require task information
OCM effectively preserves sample diversity
Enhanced VAE performance with a dynamic expansion network
Abstract
Due to their inference, data representation and reconstruction properties, Variational Autoencoders (VAEs) have been successfully used in continual learning classification tasks. However, their ability to generate images with specifications corresponding to the classes and databases learned during Continual Learning (CL) is not well understood, and catastrophic forgetting remains a significant challenge. In this paper, we first analyze the forgetting behaviour of VAEs by developing a new theoretical framework that formulates CL as a dynamic optimal transport problem. This framework establishes approximate bounds on the data likelihood without requiring task information and explains how prior knowledge is lost during the training process. We then propose a novel memory buffering approach, namely the Online Cooperative Memorization (OCM) framework, which consists of a Short-Term Memory…
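For context on what a bound on the data likelihood means for a VAE, the standard evidence lower bound (ELBO) is sketched below. This is the generic textbook bound, not the task-free bound derived in the paper, which is not reproduced in this summary. The symbols are assumptions introduced here for illustration: x is a data sample, z the latent variable, q_phi the encoder, p_theta the decoder, and p(z) the prior.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```

The first term rewards accurate reconstruction and the second keeps the approximate posterior close to the prior. In continual learning, forgetting shows up as this bound degrading on data from earlier tasks as the model trains on new ones.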
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · Multimodal Machine Learning Applications
