Rethinking the Rank Threshold for LoRA Fine-Tuning

Juneyoung Park

arXiv:2605.03724 · cs.LG · May 6, 2026

TL;DR

For binary classification with LoRA fine-tuning in the neural tangent kernel (NTK) regime, this paper shows that the prescribed rank can be reduced to $r = 1$, down from the $r \geq 12$ given by an earlier sufficient condition, using theoretical analysis and empirical validation.

Contribution

The paper gives theoretical results that lower the prescribed LoRA rank to one for binary classification, supported by empirical results across multiple tasks.

Findings

01

Rank one suffices for binary classification in the NTK regime.

02

Under cross-entropy loss, a Polyak–Łojasiewicz inequality removes the rank threshold.

03

Empirical results show rank one is competitive on binary tasks.

Abstract

A recent landscape analysis of LoRA fine-tuning in the neural tangent kernel regime establishes a sufficient condition $r(r + 1)/2 > KN$ on the LoRA rank $r$ for the absence of spurious local minima under squared-error loss, prescribing $r \geq 12$ on canonical few-shot RoBERTa setups. The condition is stated for general output dimension $K$, so its sharpness in any particular regime, and its practical implication for the cross-entropy loss actually used in fine-tuning, are open. We give three results that together reduce the prescribed rank to $r = 1$ for binary classification in this regime. First, replacing the symmetric Sard-form count with the non-symmetric LoRA manifold dimension yields a strictly weaker capacity requirement, $r(m + n) - r^{2} > C^{*} \cdot KN$ with $C^{*} \approx 1.35$ under Gaussian-iid features, satisfied at $r = 1$ on canonical setups. Second, in the cross-entropy…
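To make the two capacity conditions quoted in the abstract concrete, the following Python sketch computes the smallest rank satisfying each inequality. It is only an illustration under stated assumptions: the function names are hypothetical, and the example values of $K$, $N$, $m$, and $n$ are placeholders chosen for the demonstration, not the paper's experimental settings.

```python
def min_rank_symmetric(K: int, N: int) -> int:
    """Smallest r with r(r + 1)/2 > K * N (the earlier sufficient condition)."""
    r = 1
    # Compare r(r + 1) against 2KN to stay in integer arithmetic.
    while r * (r + 1) <= 2 * K * N:
        r += 1
    return r


def min_rank_lora_manifold(K: int, N: int, m: int, n: int, c_star: float = 1.35):
    """Smallest r with r(m + n) - r^2 > C* * K * N, or None if no r <= min(m, n) works.

    c_star defaults to the approximate value 1.35 quoted in the abstract
    for Gaussian-iid features.
    """
    for r in range(1, min(m, n) + 1):
        if r * (m + n) - r * r > c_star * K * N:
            return r
    return None


# Hypothetical illustrative values (assumptions, not taken from the paper):
# K outputs, N training examples, and an m x n adapted weight matrix.
K, N, m, n = 2, 32, 768, 768

r_old = min_rank_symmetric(K, N)
r_new = min_rank_lora_manifold(K, N, m, n)
print(f"symmetric count:     r >= {r_old}")
print(f"LoRA manifold count: r >= {r_new}")
```

With these placeholder values, the left-hand side of the second inequality is already $m + n - 1 = 1535$ at $r = 1$, far above $C^{*} \cdot KN = 86.4$, which shows why counting the full dimension of the rank-$r$ manifold of $m \times n$ matrices gives a much weaker requirement than the symmetric count $r(r + 1)/2$.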
