TL;DR
HypeLoRA introduces a hyper-network-based framework for generating LoRA adapters, improving calibration and parameter efficiency in Transformer fine-tuning, with comprehensive evaluation and open-source code.
Contribution
It proposes a novel hyper-network approach for LoRA adaptation, enhancing calibration and efficiency in language model fine-tuning.
Findings
LoRA-based adaptation matches or exceeds full fine-tuning calibration.
Hyper-network-generated LoRA factors achieve similar or better performance.
Constraining the adaptation space improves calibration but may reduce accuracy.
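Calibration in these findings refers to how closely a model's predicted confidence tracks its observed accuracy. One common way to quantify it is the expected calibration error (ECE). The sketch below is a minimal, dependency-free version with equal-width bins. The function name, the bin count, and the choice of ECE itself are assumptions for illustration; this summary does not state which calibration metric the paper reports.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (illustrative helper, not taken from the paper).

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the prediction was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    conf_sum = [0.0] * n_bins   # summed confidence per bin
    acc_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bin k covers [k / n_bins, (k + 1) / n_bins); confidence 1.0 goes in the last bin.
        k = min(int(c * n_bins), n_bins - 1)
        conf_sum[k] += c
        acc_sum[k] += 1.0 if ok else 0.0
        count[k] += 1

    # (count / n) * |acc_sum / count - conf_sum / count| simplifies to |acc_sum - conf_sum| / n.
    return sum(abs(acc_sum[k] - conf_sum[k]) / n for k in range(n_bins) if count[k])


# An overconfident toy model: 90% confidence but only half the predictions are right.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False])
```

A lower value means confidence and accuracy agree more closely. In the toy call above, the gap between 0.9 confidence and 0.5 accuracy gives an ECE of about 0.4.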
Abstract
Modern Transformer-based models frequently suffer from miscalibration, producing overconfident predictions that do not reflect true empirical frequencies. This work investigates the calibration dynamics of Low-Rank Adaptation (LoRA) and of a novel hyper-network-based adaptation framework as parameter-efficient alternatives to full fine-tuning for RoBERTa. Evaluating across the GLUE benchmark, we demonstrate that LoRA-based adaptation consistently achieves calibration parity with full fine-tuning (and exceeds it on specific tasks) while maintaining significantly higher parameter efficiency. We further explore a dynamic approach in which a shared hyper-network generates the LoRA factors (the A and B matrices) to induce structural coupling across layers. This approach produced results similar to standard LoRA fine-tuning, and even achieved a better Matthews correlation coefficient (MCC) on the CoLA dataset. Our study also reveals a critical…
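The idea of a shared hyper-network that generates LoRA factors can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the paper's architecture: the per-layer embeddings, the single linear map per factor, the zero initialization of the B generator, and all sizes and names (`layer_emb`, `W_a`, `W_b`, `generate_lora_factors`, `adapted_forward`) are hypothetical. Only the LoRA update rule itself, a frozen weight plus a scaled product of B and A, follows the standard formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen only for illustration.
d_model, rank, n_layers, emb_dim = 16, 4, 3, 8

# Hypothetical: one embedding per layer tells the shared hyper-network which layer it serves.
layer_emb = rng.normal(size=(n_layers, emb_dim))

# Hypothetical shared hyper-network: one linear map per LoRA factor, reused by every layer.
W_a = rng.normal(scale=0.02, size=(emb_dim, rank * d_model))
W_b = np.zeros((emb_dim, d_model * rank))  # zero init so the adapter starts as a no-op


def generate_lora_factors(layer_idx):
    """Map a layer embedding to that layer's LoRA factors A (rank x d) and B (d x rank)."""
    z = layer_emb[layer_idx]
    A = (z @ W_a).reshape(rank, d_model)
    B = (z @ W_b).reshape(d_model, rank)
    return A, B


def adapted_forward(x, W0, layer_idx, alpha=8.0):
    """Frozen projection plus the generated low-rank update: x W0^T + (alpha / r) x A^T B^T."""
    A, B = generate_lora_factors(layer_idx)
    return x @ W0.T + (alpha / rank) * (x @ A.T @ B.T)


# Usage: a batch of two token vectors through layer 0 with a frozen identity weight.
x = rng.normal(size=(2, d_model))
W0 = np.eye(d_model)
y = adapted_forward(x, W0, layer_idx=0)
```

Because every layer's factors come from the same `W_a` and `W_b`, a gradient step on the hyper-network changes all layers at once, which is one way to read the "structural coupling across layers" mentioned in the abstract.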
