LAMOL: LAnguage MOdeling for Lifelong Language Learning
Fan-Keng Sun, Cheng-Hao Ho, and Hung-Yi Lee

TL;DR
LAMOL is a language-modeling approach to lifelong language learning. It mitigates catastrophic forgetting by generating pseudo-samples of previous tasks and replaying them during training on each new task, requires no extra memory or model capacity, and outperforms prior methods.
Contribution
The paper presents LAMOL, a novel language modeling method that enables lifelong learning in language tasks without additional memory or capacity, and effectively mitigates catastrophic forgetting.
Findings
LAMOL outperforms previous lifelong learning methods in language tasks.
LAMOL comes within 2-3% of multitask performance, which is usually considered the upper bound for lifelong language learning.
LAMOL successfully learns five diverse language tasks sequentially.
Abstract
Most research on lifelong learning applies to images or games, but not language. We present LAMOL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. LAMOL replays pseudo-samples of previous tasks while requiring no extra memory or model capacity. Specifically, LAMOL is a language model that simultaneously learns to solve the tasks and generate training samples. When the model is trained for a new task, it generates pseudo-samples of previous tasks for training alongside data for the new task. The results show that LAMOL prevents catastrophic forgetting without any sign of intransigence and can perform five very different language tasks sequentially with only one model. Overall, LAMOL outperforms previous methods by a considerable margin and is only 2-3% worse than multitasking, which is usually considered the LLL upper bound. The source code…
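The replay step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `gamma` mixing ratio, and the toy generator are hypothetical stand-ins chosen for illustration, and a real system would sample pseudo-samples from the language model itself.

```python
import random
from typing import Callable, List


def build_training_mix(
    new_task_data: List[str],
    generate_pseudo_sample: Callable[[], str],
    gamma: float = 0.2,  # hypothetical ratio of pseudo-samples to new-task examples
    seed: int = 0,
) -> List[str]:
    """Combine new-task examples with pseudo-samples of previous tasks.

    The pseudo-samples are produced by the model as it stands *before*
    training on the new task, so no stored data from old tasks is needed.
    """
    n_pseudo = int(gamma * len(new_task_data))
    pseudo = [generate_pseudo_sample() for _ in range(n_pseudo)]
    mixed = list(new_task_data) + pseudo
    # Shuffle so old-task and new-task examples are interleaved during training.
    random.Random(seed).shuffle(mixed)
    return mixed


def toy_generator() -> str:
    """Toy stand-in for sampling a training example from the language model."""
    return "[previous task] pseudo-sample"


new_task = [f"[new task] example {i}" for i in range(10)]
training_mix = build_training_mix(new_task, toy_generator, gamma=0.2)
```

With 10 new-task examples and `gamma=0.2`, the mix contains 12 items, 2 of them pseudo-samples. For the first task in a sequence there is nothing to replay, so a caller would train on the new data alone.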
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling
