Evaluating Parameter Efficient Learning for Generation
Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan J. Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro

TL;DR
This paper compares parameter efficient learning methods (PERMs) with finetuning of pre-trained language models across sample sizes, model sizes, and domains. It shows that PERMs can outperform finetuning in low-data and cross-domain scenarios, and that they can achieve state-of-the-art results.
Contribution
It provides a comprehensive evaluation of PERMs versus finetuning across sample sizes, domains, and datasets, highlighting when PERMs are more effective and demonstrating new state-of-the-art results with Adapter.
Findings
In in-domain settings, PERMs outperform finetuning when the number of training samples falls below a cross point.
Adapter performs best among PERMs in cross-domain evaluations.
PERMs achieve higher faithfulness scores than finetuning, especially with small datasets.
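To give a sense of why Adapter counts as parameter efficient, the sketch below counts the trainable parameters of one bottleneck adapter (a down-projection, an up-projection, and their biases, following the general design in Houlsby et al., 2019) and compares it with a single dense layer of the same width. The function names and the sizes 1024 and 64 are illustrative assumptions, not values reported in the paper.

```python
def adapter_param_count(hidden_size: int, bottleneck_size: int) -> int:
    """Trainable parameters in one bottleneck adapter (weights plus biases)."""
    # Down-projection: hidden_size -> bottleneck_size
    down = hidden_size * bottleneck_size + bottleneck_size
    # Up-projection: bottleneck_size -> hidden_size
    up = bottleneck_size * hidden_size + hidden_size
    return down + up


def dense_param_count(hidden_size: int) -> int:
    """Parameters in one square dense layer of the same width, for comparison."""
    return hidden_size * hidden_size + hidden_size


# Hypothetical sizes chosen only for illustration.
hidden, bottleneck = 1024, 64
adapter = adapter_param_count(hidden, bottleneck)
dense = dense_param_count(hidden)
print(f"adapter: {adapter} params, dense layer: {dense} params")
print(f"adapter / dense = {adapter / dense:.3f}")
```

With a small bottleneck, each adapter adds only a fraction of the parameters of one dense layer, while the pre-trained weights stay frozen.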
Abstract
Parameter efficient learning methods (PERMs) have recently gained significant attention as they provide an efficient way for pre-trained language models (PLMs) to adapt to a downstream task. However, conclusions about their effectiveness are mostly drawn from in-domain evaluations over the full training set. In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect of sample and model size on in-domain evaluations, (2) generalization to unseen domains and new datasets, and (3) the faithfulness of generations. Our results show that for in-domain settings (a) there is a cross point of sample size below which PERMs perform better than finetuning, and (b) larger PLMs have larger cross points. For cross-domain and cross-dataset cases, we show that (a) Adapter (Houlsby et al., 2019) performs the best amongst all the PERMs…
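The cross point idea in the abstract can be made concrete with a small sketch. It takes one simple reading of the term: the smallest training-set size at which finetuning scores at least as well as the PERM. The function name `find_cross_point` and all numbers below are made-up placeholders for illustration; they are not results from the paper, and the paper may define the cross point differently.

```python
def find_cross_point(sample_sizes, perm_scores, finetune_scores):
    """Return the smallest sample size at which finetuning matches or beats
    the PERM, or None if the PERM stays ahead at every measured size."""
    rows = sorted(zip(sample_sizes, perm_scores, finetune_scores))
    for n, perm, finetune in rows:
        if finetune >= perm:
            return n
    return None


# Placeholder scores (hypothetical, NOT taken from the paper).
sizes = [100, 500, 1000, 5000, 10000]
perm = [20.0, 24.0, 26.0, 28.0, 28.5]
finetune = [15.0, 21.0, 25.0, 28.2, 29.5]

cross_point = find_cross_point(sizes, perm, finetune)
print(cross_point)
```

Below the returned size the PERM is ahead in this toy data; at and above it finetuning catches up, which is the pattern the abstract describes for in-domain settings.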
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
Methods: Adapter
