LoRA-MGPO: Mitigating Double Descent in Low-Rank Adaptation via Momentum-Guided Perturbation Optimization

Yupeng Chang, Chenlu Guo, Yi Chang, and Yuan Wu

arXiv:2502.14538 · cs.CL · September 29, 2025


TL;DR

LoRA-MGPO uses momentum-guided perturbation optimization to stabilize low-rank adaptation of large language models, mitigating double descent and improving model performance.

Contribution

The paper proposes LoRA-MGPO, a method that stabilizes LoRA training by directing weight perturbations with the optimizer's momentum and scaling them with an adaptive normalization scheme, which reduces double descent effects.

Findings

01 LoRA-MGPO achieves better performance than existing PEFT methods.

02 It results in smoother loss curves and faster convergence.

03 The method improves generalization by avoiding sharp minima.

Abstract

Parameter-efficient fine-tuning (PEFT), particularly Low-Rank Adaptation (LoRA), adapts large language models (LLMs) by training only a small fraction of parameters. However, as the rank of the low-rank matrices used for adaptation increases, LoRA often exhibits an unstable "double descent" phenomenon, characterized by transient divergence in the training loss. This instability delays convergence and impairs generalization because the optimizer is attracted to sharp local minima. To address this, we introduce LoRA-MGPO, a framework that incorporates Momentum-Guided Perturbation Optimization (MGPO). MGPO stabilizes training dynamics and mitigates double descent by guiding weight perturbations with momentum vectors taken from the optimizer's state, which avoids the dual gradient computations that perturbation-based methods otherwise require. Additionally, an adaptive normalization scheme scales the magnitude of perturbations based…
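The core idea in the abstract, perturbing the weights along the optimizer's momentum so that each step needs only one gradient evaluation, can be illustrated with a toy sketch. This is not the paper's algorithm: the quadratic loss, the fixed perturbation radius `rho`, the plain momentum update, and all function names are assumptions made for illustration, and the adaptive normalization scheme is deliberately left out because the abstract is truncated before describing it.

```python
import math


class CountingGrad:
    """Gradient of the toy loss 0.5 * sum(c_i * w_i**2), counting calls."""

    def __init__(self, curv):
        self.curv = curv
        self.calls = 0

    def __call__(self, w):
        self.calls += 1
        return [c * x for c, x in zip(self.curv, w)]


def loss(w, curv):
    return 0.5 * sum(c * x * x for c, x in zip(curv, w))


def momentum_perturbed_step(w, m, grad_fn, lr=0.1, beta=0.9, rho=0.01):
    """One update that evaluates the gradient exactly once.

    Assumption: the perturbation is the normalized momentum vector
    scaled by a fixed radius rho (hypothetical choice, not from the paper).
    """
    norm = math.sqrt(sum(x * x for x in m))
    # Reuse the stored momentum as the perturbation direction, so no
    # extra gradient pass is needed to find it. Zero momentum -> no shift.
    eps = [rho * x / (norm + 1e-12) for x in m]
    w_pert = [x + e for x, e in zip(w, eps)]
    g = grad_fn(w_pert)  # the only gradient evaluation in this step
    m_new = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]
    w_new = [x - lr * mi for x, mi in zip(w, m_new)]
    return w_new, m_new


def run(steps=200):
    curv = [10.0, 1.0]  # one sharp and one flat direction
    grad_fn = CountingGrad(curv)
    w, m = [1.0, 1.0], [0.0, 0.0]
    for _ in range(steps):
        w, m = momentum_perturbed_step(w, m, grad_fn)
    return w, loss(w, curv), grad_fn.calls
```

Calling `run(steps=200)` returns the final weights, the final loss, and the number of gradient evaluations, which equals the number of steps. A sharpness-aware scheme that derives its perturbation from a fresh gradient would instead need two evaluations per step, which is the cost the abstract says MGPO avoids.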


Taxonomy

Topics: Inertial Sensor and Navigation · Space Satellite Systems and Control · Underwater Vehicles and Communication Systems