TL;DR
This paper critically re-evaluates parameter-efficient tuning methods for pretrained language models, revealing that under fair evaluation protocols, they are not consistently better than full finetuning, especially in medium- and high-resource scenarios.
Contribution
It provides the first comprehensive investigation into the training and evaluation of PETuning methods, highlighting issues with current evaluation practices and identifying key factors that affect the stability and performance of these methods.
Findings
Current evaluation practices are unreliable for PETuning.
Full finetuning remains superior in medium- and high-resource settings.
Reducing trainable parameters and increasing training iterations improve PETuning stability.
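Because the findings above turn on stability, a single run per method is a weak basis for comparison. The following is a minimal sketch, not the paper's protocol, of reporting the mean and spread of a metric over several random seeds. The function name `summarize_runs`, the method labels, and all scores are hypothetical and made up for illustration.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical per-seed accuracies, invented for illustration only.
runs = {
    "method_a": [0.70, 0.72, 0.71, 0.69, 0.73],
    "method_b": [0.60, 0.80, 0.75, 0.55, 0.85],
}

for name, scores in runs.items():
    mean, std = summarize_runs(scores)
    print(f"{name}: mean={mean:.3f} std={std:.3f}")
```

In this made-up data the two methods have the same mean but very different spreads, which is the kind of difference a single-run comparison would hide.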
Abstract
Parameter-Efficient Tuning (PETuning) methods have been deemed by many as the new paradigm for using pretrained language models (PLMs). By tuning just a fraction of the parameters compared to full model finetuning, PETuning methods claim to have achieved performance on par with or even better than finetuning. In this work, we take a step back and re-examine these PETuning methods by conducting the first comprehensive investigation into their training and evaluation. We found that the problematic validation and testing practice in current studies, when accompanied by the instability of PETuning methods, has led to unreliable conclusions. When compared under a truly fair evaluation protocol, PETuning cannot yield consistently competitive performance, while finetuning remains the best-performing method in medium- and high-resource settings. We delve deeper into the…
