RVAE-LAMOL: Residual Variational Autoencoder to Enhance Lifelong Language Learning
Han Wang, Ruiliu Fu, Xuejun Zhang, Jun Zhou

TL;DR
This paper introduces RVAE-LAMOL, a novel model that enhances lifelong language learning by reducing catastrophic forgetting through a residual variational autoencoder and a new training scheme, improving task retention and sample quality.
Contribution
The paper proposes RVAE-LAMOL, which adds a residual variational autoencoder (RVAE) to LAMOL, together with an identity task for task discrimination and the Alternate Lag Training scheme, to better preserve knowledge across NLP tasks.
Findings
RVAE-LAMOL outperforms baseline LAMOL on multiple dataset permutations.
The model generates more meaningful and accurate pseudo-samples.
Experimental results show improved task retention and discrimination.
Abstract
Lifelong Language Learning (LLL) aims to train a neural network on a stream of NLP tasks while retaining knowledge from previous tasks. However, previous works that follow the data-free constraint still suffer from catastrophic forgetting, where the model forgets what it learned from previous tasks. To alleviate catastrophic forgetting, we propose the residual variational autoencoder (RVAE) to enhance LAMOL, a recent LLL model, by mapping different tasks into a limited unified semantic space. In this space, pseudo-samples make it easy to correct previous tasks back to their own distributions. Furthermore, we propose an identity task that makes the model discriminative, so that it can recognize which task a sample belongs to. To train RVAE-LAMOL better, we propose a novel training scheme, Alternate Lag Training. In the experiments, we test RVAE-LAMOL on permutations of…
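The abstract does not spell out the RVAE architecture, so the following is only a generic sketch of what a residual variational autoencoder layer could look like, not the authors' implementation. It assumes a hidden state `h` is encoded to a Gaussian latent, sampled with the reparameterization trick, decoded back to the hidden size, and added to `h` through a skip connection. The names `init_params`, `residual_vae_block`, `d_model`, and `d_latent` are hypothetical.

```python
import numpy as np


def init_params(d_model, d_latent, rng):
    """Create small random weights for the (hypothetical) RVAE block."""
    scale = 0.02
    return {
        "W_mu": rng.standard_normal((d_model, d_latent)) * scale,
        "b_mu": np.zeros(d_latent),
        "W_logvar": rng.standard_normal((d_model, d_latent)) * scale,
        "b_logvar": np.zeros(d_latent),
        "W_dec": rng.standard_normal((d_latent, d_model)) * scale,
        "b_dec": np.zeros(d_model),
    }


def residual_vae_block(h, params, rng):
    """Apply a VAE bottleneck with a residual connection.

    h: array of shape (batch, d_model).
    Returns (h_out, kl), where h_out has the shape of h and kl has
    shape (batch,) and holds KL(q(z|h) || N(0, I)) per example.
    """
    # Encoder: hidden state -> mean and log-variance of the latent Gaussian.
    mu = h @ params["W_mu"] + params["b_mu"]
    logvar = h @ params["W_logvar"] + params["b_logvar"]

    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Decoder: latent -> hidden size, then the residual (skip) connection.
    recon = z @ params["W_dec"] + params["b_dec"]
    h_out = h + recon

    # Closed-form KL divergence to a standard normal prior.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return h_out, kl
```

In a setup like this, the KL term would be added to the language-modeling loss so that hidden states from different tasks are pulled toward one shared prior, which is one plausible reading of the "limited unified semantic space" described in the abstract.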
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
