AdaPaD: Adaptive Parallel Deflation for PEFT with Self-Correcting Rank Discovery

Barbara Su; Fangshuo Liao; Anastasios Kyrillidis

arXiv:2605.10741·cs.LG·May 12, 2026

TL;DR

AdaPaD trains all rank-1 components of a low-rank adapter simultaneously, which enables adaptive, self-correcting rank discovery with provable error decay and competitive empirical results.

Contribution

The paper proposes AdaPaD, a self-correcting, adaptive parallel deflation method for parameter-efficient fine-tuning (PEFT) that trains all rank-1 components simultaneously and discovers the rank dynamically.

Findings

01  Error of each component decays exponentially after warm-up.

02  AdaPaD achieves competitive performance on GLUE and SQuAD benchmarks.

03  Adapter size is reduced by an average of 30.7% compared to fixed-rank LoRA.

Abstract

Fine-tuning large language models with LoRA requires choosing a rank r before training starts. Existing approaches either extract rank-1 components sequentially, freezing each component's error permanently into every subsequent residual, or optimize the full low-rank factorization jointly with guarantees that describe only the joint update, not individual rank-1 directions. We present AdaPaD (Adaptive Parallel Deflation), which trains all rank-1 components simultaneously: each worker refines its component against a deflation target built from the latest estimates of all predecessors, and as those estimates improve, the targets improve too. We call this property self-correction: deflation errors converge to zero over rounds rather than persisting as fixed residuals. On top of this backbone, AdaPaD adds advance learning (private pre-training before activation) and per-module dynamic rank…
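The self-correction idea described in the abstract can be illustrated on a much simpler problem than LoRA fine-tuning. The sketch below is not the authors' algorithm. It assumes a symmetric positive semidefinite matrix A with well-separated eigenvalues and recovers its top eigenvectors, taking one power-iteration step per component per round. The function name parallel_deflation, its parameters, and the use of Rayleigh quotients as scale estimates are hypothetical choices made for illustration only. What it shares with the description above is the structure: every component is updated in every round, and component k works against a deflation target rebuilt from the latest estimates of components 0..k-1, so early deflation errors shrink as the predecessors improve instead of being frozen in.

```python
import numpy as np


def parallel_deflation(A, num_components, num_rounds, seed=0):
    """Toy parallel deflation on a symmetric PSD matrix A (illustrative only).

    Returns an (n, num_components) array whose columns estimate the top
    eigenvectors of A. Assumes the leading eigenvalues are well separated.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Random unit-norm starting point for every rank-1 component.
    V = rng.standard_normal((n, num_components))
    V /= np.linalg.norm(V, axis=0)

    for _ in range(num_rounds):
        # Snapshot of all estimates from the previous round; every "worker"
        # reads from this snapshot, so the updates could run in parallel.
        V_prev = V.copy()
        # Rayleigh quotients v_k^T A v_k serve as eigenvalue estimates.
        lam_prev = np.einsum("ik,ij,jk->k", V_prev, A, V_prev)

        for k in range(num_components):
            # Deflation target for component k: A minus the current rank-1
            # estimates of all predecessors. It is rebuilt every round, so it
            # improves as the predecessors improve (self-correction).
            P = V_prev[:, :k]
            target = A - (P * lam_prev[:k]) @ P.T
            # One power-iteration step against the current target.
            v = target @ V_prev[:, k]
            V[:, k] = v / np.linalg.norm(v)

    return V


if __name__ == "__main__":
    # Build a test matrix with known eigenvectors (columns of Q).
    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
    eigvals = np.array([10.0, 5.0, 2.0, 1.0, 0.5, 0.1])
    A = (Q * eigvals) @ Q.T

    V = parallel_deflation(A, num_components=3, num_rounds=200)
    for k in range(3):
        # |cosine| between the estimate and the true eigenvector.
        print(k, abs(V[:, k] @ Q[:, k]))
```

A sequential scheme would instead run component 0 to completion, freeze it, and only then start component 1 on the fixed residual, so any error left in component 0 would stay in every later target. In this toy version the target for component 1 is recomputed from the newest estimate of component 0 each round, which is the behaviour the abstract refers to as self-correction.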
