An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
Alexandra Chronopoulou, Christos Baziotis, Alexandros Potamianos

TL;DR
This paper introduces a simple, effective transfer learning method that combines task-specific training with an auxiliary language model objective, preserving language knowledge while adapting to new tasks in a single end-to-end training step.
Contribution
It proposes a straightforward approach that mitigates catastrophic forgetting without pretraining or finetuning separate components of the network, and it outperforms more complex existing transfer learning methods.
Findings
Outperforms well-established, more complex transfer learning methods on affective and text classification tasks
Does not require pretraining or finetuning separate components of the network
Achieves better results with a simpler, end-to-end training process
Abstract
A growing number of state-of-the-art transfer learning methods employ language models pretrained on large generic corpora. In this paper we present a conceptually simple and effective transfer learning approach that addresses the problem of catastrophic forgetting. Specifically, we combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during the training process. This preserves language regularities captured by language models, while enabling sufficient adaptation for solving the target task. Our method does not require pretraining or finetuning separate components of the network and we train our models end-to-end in a single step. We present results on a variety of challenging affective and text classification tasks, surpassing well-established transfer learning methods with a greater level of complexity.
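One way to read the combination of a task-specific objective with an auxiliary language model objective that is adjusted during training is as a weighted sum of two losses, where the weight on the language model term changes as training proceeds. The sketch below illustrates that reading only. The abstract does not say how the objective is adjusted, so the exponential decay, the function names auxiliary_weight and combined_loss, and the default values gamma_start and gamma_end are all assumptions made for illustration, not the authors' implementation.

```python
def auxiliary_weight(step, total_steps, gamma_start=0.2, gamma_end=0.01):
    """Weight on the auxiliary LM loss at a given training step.

    Assumption: the weight decays exponentially from gamma_start to
    gamma_end. The abstract only says the objective is "adjusted during
    the training process"; the schedule and defaults are hypothetical.
    """
    if total_steps <= 1:
        return gamma_end
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    # Geometric interpolation between the two endpoints.
    return gamma_start * (gamma_end / gamma_start) ** frac


def combined_loss(task_loss, lm_loss, gamma):
    """Task loss plus the weighted auxiliary language model loss."""
    return task_loss + gamma * lm_loss


if __name__ == "__main__":
    total = 5
    for step in range(total):
        gamma = auxiliary_weight(step, total)
        # Placeholder loss values; a real loop would compute these from a model.
        loss = combined_loss(task_loss=1.0, lm_loss=2.0, gamma=gamma)
        print(f"step={step} gamma={gamma:.4f} loss={loss:.4f}")
```

In a real training loop the two loss values would come from the classifier head and the language model head of the same network, and the combined value would be the single quantity that is backpropagated, which is what makes the training single-step and end-to-end under this reading.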
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
