Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

Vladislav Lialin; Vijeta Deshpande; Xiaowei Yao; Anna Rumshisky

arXiv:2303.15647 · cs.CL · November 25, 2024 · 69 citations

TL;DR

This paper systematically reviews parameter-efficient fine-tuning methods for large language models, comparing 15 approaches on models up to 11B parameters, and offers practical recommendations and future research directions.

Contribution

It provides a comprehensive taxonomy, extensive experimental comparison, and practical guidelines for PEFT methods in resource-constrained settings.

Findings

01 Methods previously reported to surpass a strong LoRA baseline struggle in resource-constrained settings.

02 Hyperparameter tuning significantly affects performance.

03 Some methods outperform the LoRA baseline only under specific conditions.

Abstract

This paper presents a systematic overview of parameter-efficient fine-tuning methods, covering over 50 papers published between early 2019 and mid-2024. These methods aim to address the challenges of fine-tuning large language models by training only a small subset of parameters. We provide a taxonomy that covers a broad range of methods and present a detailed method comparison with a specific focus on real-life efficiency in fine-tuning multibillion-scale language models. We also conduct an extensive head-to-head experimental comparison of 15 diverse PEFT methods, evaluating their performance and efficiency on models up to 11B parameters. Our findings reveal that methods previously shown to surpass a strong LoRA baseline face difficulties in resource-constrained settings, where hyperparameter optimization is limited and the network is fine-tuned only for a few epochs. Finally, we…
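
To make "training only a small subset of parameters" concrete, here is a minimal sketch of a LoRA-style layer, the baseline the abstract refers to. It is an illustration under assumptions, not the paper's code: the class name `LoRALinear`, the helper `matvec`, the layer sizes, the `alpha / rank` scaling, and the zero initialization of `B` follow the commonly described LoRA formulation and are chosen here only for demonstration.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical linear layer: frozen weight W plus a trainable low-rank update B @ A."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Frozen "pretrained" weight (a random stand-in for this sketch).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable low-rank factors: A is (rank x d_in), B is (d_out x rank).
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, so the layer initially matches the frozen model.
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))  # low-rank trainable path
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=64, d_out=64, rank=4)
x = [1.0] * 64
y = layer.forward(x)
fraction = layer.trainable_params() / (layer.trainable_params() + layer.frozen_params())
print(layer.trainable_params(), layer.frozen_params(), round(fraction, 3))
```

With these assumed sizes, only the two small factors would be updated during fine-tuning while the 64 x 64 weight stays fixed; at multibillion-parameter scale the trainable share is typically far smaller than in this toy example.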

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis