ShapLoRA: Allocation of Low-rank Adaption on Large Language Models via Shapley Value Inspired Importance Estimation
Yi Zhao, Qinghua Yao, Xinyuan Song, Wei Zhu

TL;DR
ShapLoRA introduces a Shapley value-inspired importance measure for more effective rank allocation in low-rank adaptation of large language models, leading to improved performance over existing methods.
Contribution
The paper proposes ShapLoRA, a framework built around Shapley sensitivity, an explainable importance measure inspired by Shapley values, to improve rank allocation in LoRA for LLM fine-tuning.
Findings
Outperforms recent baselines at a similar parameter count
Uses Shapley sensitivity for explainable importance estimation
Demonstrates effectiveness on various challenging tasks
Abstract
Low-rank adaptation (LoRA) is a representative method in parameter-efficient fine-tuning (PEFT) and is key to democratizing modern large language models (LLMs). Vanilla LoRA uses uniform ranks, and recent literature has found that properly allocating ranks across the LLM backbone improves performance. However, previous rank allocation methods are limited because they rely on unexplainable and unreliable importance measures for the LoRA ranks. To address these issues, we propose the ShapLoRA framework. Inspired by the explainable attribution measure Shapley Value, we combine sensitivity-based measures with the idea of coalitions in collaborative games among LoRA ranks, and propose a more explainable importance measure called Shapley sensitivity. In addition, we optimize the workflow of existing works by: (a) calculating…
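The abstract is truncated and gives no formula for Shapley sensitivity, so the following is only a minimal sketch of the general idea it builds on: treating LoRA ranks as players in a coalition game and scoring each one by its average marginal contribution. It uses the standard permutation-sampling estimate of Shapley values, not the paper's actual measure. The names `shapley_estimates`, `toy_value`, and `WEIGHTS` are hypothetical, and the toy value function merely stands in for "model quality when only these ranks are active".

```python
import random
from typing import Callable, FrozenSet, List


def shapley_estimates(
    num_players: int,
    value_fn: Callable[[FrozenSet[int]], float],
    num_permutations: int = 200,
    seed: int = 0,
) -> List[float]:
    """Permutation-sampling estimate of each player's Shapley value."""
    rng = random.Random(seed)
    totals = [0.0] * num_players
    players = list(range(num_players))
    for _ in range(num_permutations):
        rng.shuffle(players)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for p in players:
            # Marginal contribution of p when it joins the current coalition.
            coalition.add(p)
            cur = value_fn(frozenset(coalition))
            totals[p] += cur - prev
            prev = cur
    return [t / num_permutations for t in totals]


# Assumption: a toy stand-in for the score obtained with a subset of ranks active.
WEIGHTS = [0.5, 0.2, 0.2, 0.1]


def toy_value(active: FrozenSet[int]) -> float:
    base = sum(WEIGHTS[i] for i in active)
    # Ranks 0 and 1 are worth more together than apart (an interaction term).
    bonus = 0.3 if {0, 1} <= active else 0.0
    return base + bonus


scores = shapley_estimates(len(WEIGHTS), toy_value)
for rank, score in enumerate(scores):
    print(f"rank {rank}: estimated importance {score:.3f}")
```

In this toy game, ranks 2 and 3 have no interactions, so their estimates equal their individual weights, while the bonus shared by ranks 0 and 1 is split between them. The estimates always sum to the value of the full coalition minus that of the empty one, which is the efficiency property that makes Shapley-style attributions easy to interpret.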
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Mobile Crowdsensing and Crowdsourcing · Domain Adaptation and Few-Shot Learning
