Enhancing Parameter Efficiency and Generalization in Large-Scale Models: A Regularized and Masked Low-Rank Adaptation Approach
Yuzhu Mao, Siqi Ping, Zihao Zhao, Yang Liu, Wenbo Ding

TL;DR
This paper introduces RM-LoRA, which regularizes and masks low-rank adaptations to improve parameter efficiency and generalization in large-scale models, outperforming the original LoRA and its variants.
Contribution
The paper proposes RM-LoRA, which uses regularization and gradient masking to increase the intrinsic dimension of the low-rank update, improving performance and reducing overfitting in low-rank model adaptation.
Findings
RM-LoRA outperforms original LoRA and variants across multiple datasets.
Regularization and masking increase intrinsic dimension and improve generalization.
RM-LoRA maintains or reduces parameter count while enhancing results.
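For readers unfamiliar with the baseline, the following is a minimal NumPy sketch of a standard LoRA update, not the paper's code. It shows why the trainable parameter count is small and why the rank of the update is bounded by r. The dimensions, the alpha / r scaling, and the use of numerical rank as a rough stand-in for "intrinsic dimension" are assumptions made for illustration.

```python
import numpy as np

def lora_update(A, B, alpha=1.0):
    """Return the low-rank weight update delta_W = (alpha / r) * B @ A."""
    r = A.shape[0]
    return (alpha / r) * (B @ A)

def trainable_params(d_out, d_in, r):
    """Number of trainable entries in A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

# Illustrative sizes only; real layers are far larger.
rng = np.random.default_rng(0)
d_out, d_in, r = 64, 48, 4
A = rng.standard_normal((r, d_in))
B = rng.standard_normal((d_out, r))

delta_W = lora_update(A, B)
update_rank = np.linalg.matrix_rank(delta_W)  # can never exceed r
full_params = d_out * d_in                    # cost of a dense update
lora_params = trainable_params(d_out, d_in, r)
```

With these sizes the dense update has 3072 entries while the LoRA factors have 448, and the update cannot have rank above 4 however it is trained.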
Abstract
Large pre-trained models, such as large language models (LLMs), present significant resource challenges for fine-tuning due to their extensive parameter sizes, especially for applications in mobile systems. To address this, Low-Rank Adaptation (LoRA) has been developed to reduce resource consumption while maintaining satisfactory fine-tuning results. Despite its effectiveness, the original LoRA method faces challenges of suboptimal performance and overfitting. This paper investigates the intrinsic dimension of the matrix updates approximated by the LoRA method and reveals the performance benefits of increasing this intrinsic dimension. By employing regularization and a gradient masking method that encourages higher intrinsic dimension, the proposed method, termed Regularized and Masked LoRA (RM-LoRA), achieves superior generalization performance with the same or lower trainable…
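The abstract does not say which regularizer or masking rule RM-LoRA uses, so the sketch below is only a generic illustration of combining a regularization term with a gradient mask in one update step. The orthogonality penalty, the random binary mask, and the names `orthogonality_penalty`, `orthogonality_grad`, and `masked_sgd_step` are hypothetical placeholders, not the paper's definitions.

```python
import numpy as np

def orthogonality_penalty(A):
    """Hypothetical regularizer: ||A A^T - I||_F^2 (not taken from the paper)."""
    M = A @ A.T - np.eye(A.shape[0])
    return float(np.sum(M ** 2))

def orthogonality_grad(A):
    """Gradient of the penalty above with respect to A: 4 (A A^T - I) A."""
    M = A @ A.T - np.eye(A.shape[0])
    return 4.0 * M @ A

def masked_sgd_step(A, task_grad, mask, lr=0.01, reg_weight=0.1):
    """One SGD step in which entries with mask == 0 receive no update."""
    grad = task_grad + reg_weight * orthogonality_grad(A)
    return A - lr * (mask * grad)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 16))             # one LoRA factor, r = 4
task_grad = rng.standard_normal(A.shape)     # stand-in for a task-loss gradient
mask = (rng.random(A.shape) < 0.5).astype(A.dtype)  # assumed random binary mask

A_new = masked_sgd_step(A, task_grad, mask)
```

Masked entries keep their previous values, so only the unmasked subset of the factor moves in a given step. How the mask should be chosen to encourage a higher intrinsic dimension is specific to the paper and is not reproduced here.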
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications
