Hybrid-LoRA: Bridging Full Fine-Tuning and Low-Rank Adaptation for Post-Training

Chengqian Zhang; Wei Zhu; Kyumin Lee

arXiv:2605.18822 · cs.LG · May 20, 2026

TL;DR

Hybrid-LoRA is a post-training framework that combines full fine-tuning and Low-Rank Adaptation (LoRA) to adapt large language models for complex reasoning tasks, reaching near full fine-tuning performance at lower training cost.

Contribution

The paper introduces a Hybrid-LoRA Score that determines which modules receive full fine-tuning and which receive LoRA, improving on existing PEFT methods in post-training for reasoning tasks.
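
As a rough illustration of score-based selection, the sketch below splits a set of modules into a fully fine-tuned group and a LoRA group. It is not the paper's method: the summary here does not define the Hybrid-LoRA Score, so the scores are made-up placeholders, the function name `split_modules` and the parameter `fft_fraction` are hypothetical, and the assumption that higher-scoring modules go to full fine-tuning is ours.

```python
import math


def split_modules(scores, fft_fraction=0.10):
    """Partition module names into (full fine-tuning, LoRA) groups.

    `scores` maps module name -> score. The score is a placeholder;
    the actual Hybrid-LoRA Score is not defined in this summary.
    """
    if not 0.0 <= fft_fraction <= 1.0:
        raise ValueError("fft_fraction must be in [0, 1]")
    # Assumption: a higher score means the module benefits more from
    # full fine-tuning, so rank from highest to lowest.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Number of modules to fully fine-tune, rounded up.
    n_fft = math.ceil(fft_fraction * len(ranked))
    return ranked[:n_fft], ranked[n_fft:]


# Hypothetical scores for ten modules; the values are invented.
example_scores = {f"layer{i}.q_proj": float(i) for i in range(10)}
fft_modules, lora_modules = split_modules(example_scores)
print(fft_modules)        # modules that would be fully fine-tuned
print(len(lora_modules))  # modules that would receive LoRA adapters
```

With `fft_fraction=0.10`, one module in ten is assigned to full fine-tuning, mirroring the 10% figure reported in the findings; how the paper actually computes and thresholds the score may differ.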

Findings

01. Hybrid-LoRA matches full fine-tuning performance with only 10% of modules fully fine-tuned.

02. It outperforms four state-of-the-art PEFT post-training baselines by up to 5.65%.

03. Hybrid-LoRA achieves an average improvement of 4.36% over the best baseline.

Abstract

Post-training has become essential for adapting large language models (LLMs) to complex downstream behaviors, including instruction following, preference alignment, and multi-step reasoning. Reinforcement learning with verifiable rewards (RLVR) has recently emerged as a particularly effective post-training paradigm for improving reasoning capabilities, with critic-free algorithms such as GRPO and GSPO enabling scalable optimization. However, RLVR post-training with full fine-tuning (FFT) requires substantial GPU memory and incurs high training costs. Although parameter-efficient fine-tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA), effectively reduce computational costs, they often suffer from a noticeable performance gap compared to full fine-tuning in post-training for complex reasoning tasks. In this paper, we propose Hybrid-LoRA, an efficient hybrid post-training framework…
