Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Can Yaras, Peng Wang, Laura Balzano, Qing Qu

TL;DR
This paper introduces a method to leverage low-dimensional structures in overparameterized models, enabling efficient training and adaptation in deep learning tasks like matrix completion and language model fine-tuning.
Contribution
It presents a theoretical and practical framework for constructing compressed, low-rank models that retain the benefits of overparameterization, including a new technique called Deep LoRA for language models.
Findings
Improved training efficiency in deep matrix completion.
Reduced overfitting and hyperparameter complexity in language model fine-tuning.
Maintained performance with highly compressed models.
Abstract
While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the computational burdens. In practice, we demonstrate the effectiveness of this approach for deep low-rank matrix completion as well as fine-tuning language models. Our approach is grounded in theoretical findings for deep overparameterized low-rank matrix recovery, where we show that the learning dynamics of each weight matrix are confined to an invariant low-dimensional subspace. Consequently, we can construct and train compact, highly compressed factorizations possessing the same benefits as…
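The invariant-subspace claim in the abstract can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes a three-factor square factorization, a fully observed rank-r target, squared loss, plain gradient descent, and each factor initialized as a scaled orthogonal matrix. The function names and the values of `d`, `r`, `eps`, `lr`, and `steps` are hypothetical choices made for this example. In this setup, each gradient maps a fixed subspace of dimension at least d - 2r onto a matching fixed subspace, so at least d - 2r singular values of every factor should remain identical throughout training, and only the remaining block of size at most 2r actually learns.

```python
import numpy as np


def random_orthogonal(d, rng):
    """Return a random d x d orthogonal matrix obtained from a QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


def factorization_loss(w1, w2, w3, target):
    """Squared Frobenius loss of the three-factor product against the target."""
    return 0.5 * np.linalg.norm(w3 @ w2 @ w1 - target) ** 2


def train_deep_factorization(d=20, r=2, eps=0.3, lr=0.01, steps=2000, seed=0):
    """Gradient descent on a three-factor model of a rank-r target (toy setup)."""
    rng = np.random.default_rng(seed)

    # Rank-r target with singular values between 3 and 2 (assumed for illustration).
    u = random_orthogonal(d, rng)[:, :r]
    v = random_orthogonal(d, rng)[:, :r]
    target = u @ np.diag(np.linspace(3.0, 2.0, r)) @ v.T

    # Assumption: every factor starts as eps times an orthogonal matrix.
    w1, w2, w3 = (eps * random_orthogonal(d, rng) for _ in range(3))
    initial_loss = factorization_loss(w1, w2, w3, target)

    for _ in range(steps):
        residual = w3 @ w2 @ w1 - target
        # Gradients of the squared loss with respect to each factor.
        g1 = (w3 @ w2).T @ residual
        g2 = w3.T @ residual @ w1.T
        g3 = residual @ (w2 @ w1).T
        w1, w2, w3 = w1 - lr * g1, w2 - lr * g2, w3 - lr * g3

    final_loss = factorization_loss(w1, w2, w3, target)
    return (w1, w2, w3), initial_loss, final_loss


def count_shared_singular_values(w, rtol=1e-6):
    """Count singular values of w that coincide with the median singular value."""
    s = np.linalg.svd(w, compute_uv=False)
    return int(np.sum(np.isclose(s, np.median(s), rtol=rtol, atol=0.0)))


if __name__ == "__main__":
    factors, initial_loss, final_loss = train_deep_factorization()
    print("loss before and after training:", initial_loss, final_loss)
    for i, w in enumerate(factors, start=1):
        print(f"factor {i}: identical singular values =", count_shared_singular_values(w))
```

Because the untouched singular directions carry no task information in this toy setting, one could in principle train only the small block that changes, which is the intuition behind the compressed factorizations the abstract describes.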
Taxonomy
Topics: Model Reduction and Neural Networks · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
