A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need
Hananel Hazan, Yanbo Zhang, Benedikt Hartl, Michael Levin

TL;DR
This paper shows that training only low-rank adapters on frozen, randomly initialized backbones nearly matches fully trained performance across diverse tasks, suggesting that the task-specific signal occupies a small subspace of the parameters.
Contribution
LottaLoRA, a training paradigm in which a frozen random backbone is combined with trainable low-rank adapters, reaches near-full performance while training far fewer parameters.
Findings
Adapters recover 96-100% of fully trained performance.
Frozen backbone is actively exploited and interchangeable.
LoRA rank estimates the intrinsic task dimensionality.
Abstract
How many of a neural network's parameters actually encode task-specific information? We investigate this question with LottaLoRA, a training paradigm in which every backbone weight is drawn at random and frozen; only low-rank LoRA adapters are trained. Across nine benchmarks spanning diverse architecture families (from single-layer classifiers to 900M-parameter Transformers), low-rank adapters over frozen random backbones recover 96-100% of fully trained performance while training only 0.5-40% of the parameters. The task-specific signal therefore occupies a subspace orders of magnitude smaller than the full parameter count suggests. Three mechanistic findings underpin this result: (1) the frozen backbone is actively exploited when static: the learned scaling remains strictly positive across all architectures, but when the scaffold is destabilized, the optimizer silences it and the…
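The setup described in the abstract (a random backbone that is never updated, plus trainable low-rank factors) can be illustrated with a minimal NumPy sketch for a single linear layer. This is not the authors' implementation: the function names, layer sizes, initialization scales, learning rate, and the synthetic regression target are all assumptions made for illustration, and the learned backbone scaling mentioned in the abstract is omitted because its exact form is not given here.

```python
import numpy as np


def init_layer(d_in, d_out, rank, seed=0):
    """Frozen random backbone W plus LoRA factors A (random) and B (zeros)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)  # drawn once, never updated
    A = rng.standard_normal((d_in, rank)) / np.sqrt(d_in)   # trainable down-projection
    B = np.zeros((rank, d_out))                             # trainable up-projection
    return W, A, B


def forward(x, W, A, B):
    # Backbone output plus the low-rank correction x @ A @ B.
    return x @ W + (x @ A) @ B


def mse(x, y, W, A, B):
    err = forward(x, W, A, B) - y
    return 0.5 * np.mean(np.sum(err ** 2, axis=1))


def train_adapters(x, y, W, A, B, lr=0.1, steps=300):
    """Plain gradient descent on A and B only; W is read but never written."""
    n = x.shape[0]
    for _ in range(steps):
        h = x @ A
        err = x @ W + h @ B - y
        grad_B = h.T @ err / n
        grad_A = x.T @ (err @ B.T) / n
        A = A - lr * grad_A
        B = B - lr * grad_B
    return A, B


def trainable_fraction(d_in, d_out, rank):
    # Adapter parameters relative to the size of the full weight matrix.
    return rank * (d_in + d_out) / (d_in * d_out)


def demo(seed=0):
    # Hypothetical toy task: the target is the backbone plus a rank-2 perturbation.
    rng = np.random.default_rng(seed)
    d_in, d_out, rank, n = 16, 8, 2, 256
    W, A, B = init_layer(d_in, d_out, rank, seed=seed + 1)
    U = rng.standard_normal((d_in, rank)) / np.sqrt(d_in)
    V = rng.standard_normal((rank, d_out)) / np.sqrt(d_out)
    x = rng.standard_normal((n, d_in))
    y = x @ (W + U @ V)
    before = mse(x, y, W, A, B)
    A_new, B_new = train_adapters(x, y, W, A, B)
    after = mse(x, y, W, A_new, B_new)
    return before, after
```

Calling `demo()` returns the loss before and after adapter training on the toy task, and `trainable_fraction(16, 8, 2)` gives 0.375 for this small layer. The fractions reported in the abstract refer to the paper's own architectures, not to this example.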
