LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation
Hongyun Zhou, Xiangyu Lu, Wang Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang

TL;DR
LoRA-drop is a novel method that prunes LoRA parameters based on their output importance, reducing resource consumption while maintaining performance in fine-tuning large models.
Contribution
It introduces a new importance evaluation based on LoRA output, enabling effective parameter pruning during fine-tuning.
Findings
Achieves comparable performance to full fine-tuning and LoRA.
Retains 50% of LoRA parameters on average.
Effective across various model scales and NLP tasks.
Abstract
Low-Rank Adaptation (LoRA) is currently the most commonly used parameter-efficient fine-tuning (PEFT) method. It introduces auxiliary parameters for each layer to fine-tune the pre-trained model under limited computing resources. However, it still faces resource consumption challenges during training when scaling up to larger models. Most previous studies have tackled this issue with pruning techniques, which remove LoRA parameters deemed unimportant. Nonetheless, these efforts evaluate importance using only features of the LoRA parameters themselves, such as parameter count, size, and gradient. In fact, the output of LoRA (the product of the LoRA parameters and the hidden state) directly impacts the final results. Preliminary experiments indicate that a fraction of LoRA elements possesses significantly high output values, substantially influencing the layer output. Motivated by the…
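The idea of scoring LoRA modules by their output rather than by their parameters can be illustrated with a minimal sketch. It is not the authors' implementation: the metric (mean squared norm of the LoRA output over sampled hidden states), the function names `lora_output`, `output_importance`, and `select_top`, and the toy matrices are all assumptions made for illustration only.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_output(A, B, x):
    """LoRA output for one hidden state x: B @ (A @ x)."""
    return matvec(B, matvec(A, x))


def output_importance(A, B, hidden_states):
    """Mean squared L2 norm of the LoRA output (an assumed metric)."""
    total = 0.0
    for x in hidden_states:
        total += sum(v * v for v in lora_output(A, B, x))
    return total / len(hidden_states)


def select_top(scores, keep_ratio):
    """Indices of the highest-scoring modules; always keeps at least one."""
    k = max(1, round(len(scores) * keep_ratio))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


# Toy setup (hypothetical values): four rank-1 LoRA modules on a
# 2-dimensional hidden state, differing only in the scale of B.
A = [[1.0, 0.0]]                      # shape (r=1, d=2)
scales = [0.1, 2.0, 0.5, 3.0]
modules = [(A, [[s], [0.0]]) for s in scales]   # each B has shape (d=2, r=1)
samples = [[1.0, 2.0]]                # one sampled hidden state

scores = [output_importance(a, b, samples) for a, b in modules]
kept = select_top(scores, keep_ratio=0.5)
print(kept)
```

With these toy values, the two modules with the largest output norms are retained. How the remaining modules are treated is not specified in the truncated abstract above, so the sketch stops at selection.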
Taxonomy
Topics: Target Tracking and Data Fusion in Sensor Networks · Underwater Vehicles and Communication Systems · Robotics and Automated Systems
Methods: Pruning
