Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models
Aradhye Agarwal, Suhas K Ramesh, Ayan Sengupta, Tanmoy Chakraborty

TL;DR
This paper introduces $\text{ID}^3$, a dynamic parameter importance method for selective PEFT that improves efficiency and performance in fine-tuning large language models across various tasks.
Contribution
The paper proposes $\text{ID}^3$, a novel dynamic importance scoring technique that adapts parameter selection during fine-tuning, outperforming fixed heuristics.
Findings
Reduces the number of gradient updates by a factor of two.
Demonstrates superior performance on 16 NLP tasks.
Compatible with existing PEFT methods like adapters and LoRA.
Abstract
Fine-tuning large language models (LLMs) on downstream tasks requires substantial computational resources. Selective PEFT, a class of parameter-efficient fine-tuning (PEFT) methodologies, aims to mitigate these computational challenges by selectively fine-tuning only a small fraction of the model parameters. Although parameter-efficient, these techniques often fail to match the performance of fully fine-tuned models, primarily due to inherent biases introduced during parameter selection. Traditional selective PEFT techniques use a fixed set of parameters selected using different importance heuristics, failing to capture parameter importance dynamically and often leading to suboptimal performance. We introduce $\text{ID}^3$, a novel selective PEFT method that calculates parameter importance continually, and dynamically unmasks parameters by balancing exploration and exploitation in…
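The idea of recomputing importance at every step and unmasking parameters gradually can be illustrated with a small sketch. This is not the authors' implementation: the importance score (plain gradient magnitude), the uniform per-step unmasking schedule, and the names `unmask_top_k`, `incremental_unmasking`, `budget`, and `steps` are all assumptions made for illustration, and the model is a toy list of scalars rather than an LLM.

```python
def unmask_top_k(scores, mask, k):
    """Mark the k highest-scoring, still-frozen parameters as trainable."""
    # Indices of parameters that are still frozen (masked).
    frozen = [i for i, trainable in enumerate(mask) if not trainable]
    # Highest importance first.
    frozen.sort(key=lambda i: scores[i], reverse=True)
    for i in frozen[:k]:
        mask[i] = True
    return mask


def incremental_unmasking(grad_fn, params, budget, steps, lr=0.1):
    """Toy selective fine-tuning loop with step-by-step unmasking.

    Assumptions (hypothetical, for illustration only):
    - importance score = |gradient|
    - the budget is spread evenly: budget // steps parameters per step
    """
    params = list(params)
    mask = [False] * len(params)      # everything starts frozen
    per_step = budget // steps
    for _ in range(steps):
        grads = grad_fn(params)
        # Importance is recomputed at every step from the current gradients.
        scores = [abs(g) for g in grads]
        unmask_top_k(scores, mask, per_step)
        # Only unmasked parameters receive a gradient update.
        params = [p - lr * g if m else p
                  for p, g, m in zip(params, grads, mask)]
    return params, mask


# Toy quadratic loss 0.5 * sum((p - target)^2), whose gradient is p - target.
target = [5.0, 0.1, 3.0, 0.2, 4.0, 0.05]
grad_fn = lambda p: [pi - ti for pi, ti in zip(p, target)]
new_params, mask = incremental_unmasking(grad_fn, [0.0] * 6, budget=2, steps=2)
print(mask)  # [True, False, False, False, True, False]
```

In this toy run, one parameter is unmasked per step. Parameters that were never unmasked keep their initial values, which is the property a selective PEFT method relies on to stay within a fixed parameter budget.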
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training
