Flexora: Flexible Low Rank Adaptation for Large Language Models
Chenxing Wei, Yao Shu, Ying Tiffany He, Fei Richard Yu

TL;DR
Flexora introduces a method to automatically select the most important layers for fine-tuning in large language models, improving performance over existing methods by framing layer selection as a hyperparameter optimization problem.
Contribution
The paper proposes Flexora, a novel approach that uses hyperparameter optimization and unrolled differentiation to adaptively select layers for fine-tuning in LLMs, enhancing task-specific performance.
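The idea of treating layer selection as a hyperparameter optimized through unrolled training steps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar "layers", the made-up data, the function names, the step sizes, and the top-k selection rule. The hypergradient is approximated by central finite differences, standing in for automatic differentiation through the unrolled steps.

```python
# Toy sketch of layer selection as hyperparameter optimization (illustrative only).
# Each "layer" i has one trainable adapter weight w[i] and one gate alpha[i].
# Model output: sum_i alpha[i] * w[i] * x[i]. All names and data are hypothetical.

def predict(alpha, w, x):
    return sum(a * wi * xi for a, wi, xi in zip(alpha, w, x))

def mse(alpha, w, data):
    return sum(0.5 * (predict(alpha, w, x) - y) ** 2 for x, y in data) / len(data)

def train_unrolled(alpha, data, steps=3, lr=0.1):
    """Run a few inner gradient steps on w, starting from zero adapters."""
    w = [0.0] * len(alpha)
    for _ in range(steps):
        grads = [0.0] * len(w)
        for x, y in data:
            r = predict(alpha, w, x) - y  # residual on this sample
            for i in range(len(w)):
                grads[i] += r * alpha[i] * x[i] / len(data)
        w = [wi - lr * g for wi, g in zip(w, grads)]
    return w

def outer_loss(alpha, train, val):
    """Validation loss of the weights reached after the unrolled inner steps."""
    return mse(alpha, train_unrolled(alpha, train), val)

def hypergradient(alpha, train, val, eps=1e-4):
    """Central finite differences of the unrolled objective w.r.t. each gate."""
    grad = []
    for j in range(len(alpha)):
        up = alpha[:j] + [alpha[j] + eps] + alpha[j + 1:]
        down = alpha[:j] + [alpha[j] - eps] + alpha[j + 1:]
        diff = outer_loss(up, train, val) - outer_loss(down, train, val)
        grad.append(diff / (2 * eps))
    return grad

def select_layers(train, val, n_layers, k, outer_steps=25, outer_lr=0.1):
    """Optimize the gates on validation loss, then keep the k largest (assumed rule)."""
    alpha = [1.0] * n_layers
    for _ in range(outer_steps):
        g = hypergradient(alpha, train, val)
        alpha = [a - outer_lr * gj for a, gj in zip(alpha, g)]
    ranked = sorted(range(n_layers), key=lambda i: alpha[i], reverse=True)
    return sorted(ranked[:k]), alpha

# Made-up data: feature 1 agrees with the label in training but flips sign in
# validation, so adapting "layer 1" overfits the training split.
TRAIN = [([1.0, 1.0, 0.5], 1.0), ([-1.0, -1.0, -0.5], -1.0)]
VAL = [([1.0, -1.0, 0.5], 1.0), ([-1.0, 1.0, -0.5], -1.0)]

selected, alpha = select_layers(TRAIN, VAL, n_layers=3, k=2)
print("selected layers:", selected)
```

On this toy data the gate of layer 1 shrinks during the outer loop, because its feature flips sign between the training and validation splits, so layers 0 and 2 are kept. A real implementation would more likely backpropagate through the unrolled steps with an automatic differentiation framework than use finite differences.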
Findings
Flexora consistently outperforms baseline methods across various models and tasks.
The method effectively reduces overfitting by selecting relevant layers for fine-tuning.
Theoretical analysis supports the effectiveness of the layer selection strategy.
Abstract
Large Language Models (LLMs) are driving advancements in artificial intelligence by increasing the scale of model parameters, which has significantly enhanced generalization ability and unlocked new capabilities in practice. However, their performance on specific downstream tasks is usually hindered by their knowledge boundaries on these tasks. Fine-tuning techniques, especially the widely used Low-Rank Adaptation (LoRA) method, have therefore been introduced to expand these boundaries, yet LoRA can underperform on certain tasks because of its potential to overfit them. To overcome this overfitting and improve the performance of LoRA, we propose the flexible low-rank adaptation (Flexora) method, which automatically and flexibly selects the most important layers to fine-tune in order to achieve the best performance on different downstream tasks. Specifically,…
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling
