Where Should LoRA Go? Component-Type Placement in Hybrid Language Models
Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó

TL;DR
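This paper studies where to place LoRA adapters in hybrid language models that interleave attention with recurrent components. Adapting only the attention pathway beats full-model adaptation with 5-10x fewer trainable parameters, and whether adapting the recurrent backbone helps or hurts depends on whether the hybrid is sequential or parallel.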
Contribution
The paper systematically studies component-type LoRA placement in two hybrid models, one sequential (Qwen3.5-0.8B) and one parallel (Falcon-H1-0.5B), and shows that adaptation strategy should account for the hybrid's topology.
Findings
Adapting only the attention pathway consistently outperforms full-model adaptation with 5-10x fewer trainable parameters.
Adapting the recurrent backbone is harmful in the sequential hybrid (-14.8 pp on GSM8K) but beneficial in the parallel hybrid (+8.6 pp).
Parallel hybrids show positive transfer; sequential hybrids suffer catastrophic forgetting.
Abstract
Hybrid language models that interleave attention with recurrent components are increasingly competitive with pure Transformers, yet standard LoRA practice applies adapters uniformly without considering the distinct functional roles of each component type. We systematically study component-type LoRA placement across two hybrid architectures -- Qwen3.5-0.8B (sequential, GatedDeltaNet + softmax attention) and Falcon-H1-0.5B (parallel, Mamba-2 SSM + attention) -- fine-tuned on three domains and evaluated on five benchmarks. We find that the attention pathway -- despite being the minority component -- consistently outperforms full-model adaptation with 5-10x fewer trainable parameters. Crucially, adapting the recurrent backbone is destructive in sequential hybrids (-14.8 pp on GSM8K) but constructive in parallel ones (+8.6 pp). We further document a transfer asymmetry: parallel hybrids…
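To make "component-type placement" concrete, here is a minimal sketch of picking LoRA target layers by component type and counting the trainable parameters each choice adds. It is not the authors' code. The layer names, name patterns, shapes, and rank below are hypothetical placeholders chosen for illustration; real Qwen3.5 or Falcon-H1 checkpoints may name and size their layers differently, so the toy ratio here should not be read as the paper's 5-10x figure. The only fact relied on is the standard LoRA parameter count: a rank-r adapter on a d_in x d_out linear layer adds r * (d_in + d_out) trainable parameters.

```python
import re

# Hypothetical name patterns for each component type (assumption, not taken
# from the paper or from any real checkpoint).
COMPONENT_PATTERNS = {
    "attention": r"\.self_attn\.(q_proj|k_proj|v_proj|o_proj)$",
    "recurrent": r"\.(linear_attn|mamba)\.(in_proj|out_proj)$",
    "mlp": r"\.mlp\.(gate_proj|up_proj|down_proj)$",
}

# Toy inventory of linear layers: name -> (d_in, d_out). Illustrative only.
TOY_LAYERS = {
    "layers.0.linear_attn.in_proj": (1024, 4096),
    "layers.0.linear_attn.out_proj": (2048, 1024),
    "layers.0.mlp.up_proj": (1024, 3072),
    "layers.0.mlp.down_proj": (3072, 1024),
    "layers.1.self_attn.q_proj": (1024, 1024),
    "layers.1.self_attn.v_proj": (1024, 256),
    "layers.1.mlp.up_proj": (1024, 3072),
    "layers.1.mlp.down_proj": (3072, 1024),
}


def select_targets(linear_shapes, components):
    """Return the layer names that belong to any of the chosen component types."""
    patterns = [re.compile(COMPONENT_PATTERNS[c]) for c in components]
    return [
        name
        for name in linear_shapes
        if any(p.search(name) for p in patterns)
    ]


def lora_param_count(linear_shapes, targets, rank):
    """Count LoRA parameters: A is (rank x d_in), B is (d_out x rank) per layer."""
    return sum(
        rank * (linear_shapes[name][0] + linear_shapes[name][1])
        for name in targets
    )


if __name__ == "__main__":
    rank = 8  # illustrative rank
    placements = {
        "attention only": ["attention"],
        "recurrent only": ["recurrent"],
        "full model": ["attention", "recurrent", "mlp"],
    }
    for label, components in placements.items():
        targets = select_targets(TOY_LAYERS, components)
        print(label, len(targets), lora_param_count(TOY_LAYERS, targets, rank))
```

With an adapter library such as Hugging Face PEFT, a list like the one returned by `select_targets` could plausibly be passed as `target_modules` in `LoraConfig`, though the exact layer names would have to be read from the loaded model rather than assumed.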
