Parameter Efficient Instruction Tuning: An Empirical Study
Pengfei He

TL;DR
This paper empirically compares parameter-efficient finetuning methods, especially LoRA and adapters, to full finetuning, analyzing their performance, stability, and generalization across various tasks and settings.
Contribution
It provides a systematic evaluation of PEFT methods, highlighting conditions for optimal performance and limitations in complex tasks.
Findings
LoRA and adapters can approach full finetuning performance under ideal settings.
LoRA and adapters become less stable when their hyperparameters are not well tuned.
LoRA needs a larger number of training tasks to generalize effectively to unseen tasks.
Abstract
Instruction tuning has become an important step in finetuning pretrained language models to better follow human instructions and generalize across tasks. As pretrained language models grow increasingly large, full-parameter finetuning becomes prohibitively costly. Parameter Efficient Finetuning (PEFT) has therefore emerged as a cost-effective practice for instruction tuning, because its computational, memory, and storage costs are significantly smaller than those of full finetuning. Despite the widespread adoption of PEFT methods, the vast hyperparameter space, the number of PEFT methods, and the differing focus of instruction tuning capabilities make it difficult to disentangle the impact of each aspect. This study systematically investigates several representative PEFT methods, surveying the effect of hyperparameter choices including training hyperparameters and PEFT-specific hyperparameters, how…
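To make the cost argument concrete, here is a minimal sketch of how the trainable parameter count of a LoRA update compares with full finetuning of a single dense layer. It assumes the standard LoRA formulation, in which a frozen weight matrix W of shape d_out x d_in is augmented by a low-rank product B @ A with A of shape rank x d_in and B of shape d_out x rank. The layer size (4096) and rank (8) are illustrative assumptions, not settings reported in the paper, and the function names are hypothetical.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the dense d_out x d_in matrix is updated directly."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA update B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original matrix W stays frozen and is not counted.
    """
    return rank * d_in + d_out * rank


# Illustrative values only (assumptions, not taken from the paper).
d_in = d_out = 4096
rank = 8

full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
ratio = lora / full

print(f"full finetuning: {full} trainable weights")
print(f"LoRA (rank={rank}): {lora} trainable weights")
print(f"LoRA / full: {ratio:.6f}")
```

Under these assumptions the LoRA update trains 1/256 of the weights that full finetuning of the same layer would touch. The rank is one example of the PEFT-specific hyperparameters the abstract refers to: raising it increases capacity and cost linearly.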
Taxonomy
Topics: Experimental Learning in Engineering
Methods: Adapter · Focus
