Hybrid and Unitary PEFT for Resource-Efficient Large Language Models
Haomin Qi, Zihan Dai, Chengbo Huang

TL;DR
This paper introduces a hybrid PEFT strategy that combines BOFT and LoRA-GA, and adapts unitary RNN (uRNN) principles to Transformer LLMs, cutting fine-tuning time and memory while staying close to full fine-tuning quality.
Contribution
It proposes a hybrid PEFT method that computes adaptive per-layer updates guided by gradient norms, and extends uRNN concepts to Transformers to improve resource efficiency and gradient stability in LLM fine-tuning.
Findings
Hybrid approach improves convergence and generalization.
Reduces training time by ~2.1x and memory by ~50%.
Achieves near full fine-tuning quality across diverse tasks.
Abstract
Fine-tuning large language models (LLMs) remains a computational bottleneck due to their scale and memory demands. This paper presents a comprehensive evaluation of parameter-efficient fine-tuning (PEFT) techniques, including LoRA, BOFT, LoRA-GA, and uRNN, and introduces a novel hybrid strategy that dynamically integrates BOFT's orthogonal stability with LoRA-GA's gradient-aligned rapid convergence. By computing per-layer adaptive updates guided by gradient norms, the hybrid method achieves superior convergence efficiency and generalization across diverse tasks. We also explore, for the first time, the adaptation of unitary RNN (uRNN) principles to Transformer-based LLMs, enhancing gradient stability through structured unitary constraints. Across GLUE, GSM8K, MT-Bench, and HumanEval, using models ranging from 7B to 405B parameters, the hybrid approach yields consistent gains across…
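The abstract describes per-layer updates that blend an orthogonal (BOFT-style) component with a low-rank (LoRA-GA-style) component, guided by gradient norms. The sketch below is one possible reading of that idea, not the authors' implementation. The names `cayley_orthogonal` and `hybrid_layer_update`, the Cayley parameterization used as a stand-in for BOFT's orthogonal factor, and the mixing rule `lam = n_low / (n_orth + n_low)` are all assumptions made for illustration; the source does not state the actual formula.

```python
import numpy as np


def cayley_orthogonal(a: np.ndarray) -> np.ndarray:
    """Map any square matrix to an orthogonal one via the Cayley transform.

    Used here only as a simple stand-in for an orthogonal (BOFT-style) factor.
    """
    skew = a - a.T  # skew-symmetric, so skew.T == -skew
    eye = np.eye(a.shape[0])
    # (I - S)^{-1} (I + S) is orthogonal for every skew-symmetric S,
    # and I - S is always invertible.
    return np.linalg.solve(eye - skew, eye + skew)


def hybrid_layer_update(w, a_orth, lora_a, lora_b, g_orth, g_lowrank, eps=1e-12):
    """Blend an orthogonal update and a low-rank update for one layer.

    ASSUMPTION: the mixing weight is the low-rank branch's share of the
    total gradient norm. The paper's actual rule is not given in the abstract.
    """
    q = cayley_orthogonal(a_orth)
    delta_orth = q @ w - w          # change implied by rotating the weights
    delta_lowrank = lora_b @ lora_a  # standard low-rank additive update

    n_orth = np.linalg.norm(g_orth)
    n_low = np.linalg.norm(g_lowrank)
    lam = n_low / (n_orth + n_low + eps)  # hypothetical per-layer weight in [0, 1]

    new_w = w + (1.0 - lam) * delta_orth + lam * delta_lowrank
    return new_w, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 3))
    new_w, lam = hybrid_layer_update(
        w,
        a_orth=rng.normal(size=(4, 4)) * 0.1,
        lora_a=rng.normal(size=(2, 3)) * 0.1,
        lora_b=rng.normal(size=(4, 2)) * 0.1,
        g_orth=rng.normal(size=(4, 4)),
        g_lowrank=rng.normal(size=(4, 3)),
    )
    print("mixing weight:", lam)
    print("updated weight shape:", new_w.shape)
```

Under this assumed rule, a layer whose low-rank branch carries most of the gradient norm leans on the low-rank update, and a layer dominated by the orthogonal branch leans on the rotation. Other weightings (for example, a softmax over norms) would fit the abstract's description equally well.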
