Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms
Omatharv Bharat Vaidya, Connor T. Jerzak, Nhat Ho, Chandrajit Bajaj

TL;DR
This paper introduces Queryable LoRA, a parameter-efficient fine-tuning method that routes over a shared memory of low-rank update atoms, combining them in a content-dependent way under instruction regularization, to improve adaptability and training stability in large neural networks.
Contribution
It proposes a routing mechanism that attends over a shared memory of low-rank update atoms, with instruction regularization, enabling input-dependent model updates while keeping the efficiency of low-rank adaptation.
Findings
Improved test performance over standard LoRA in experiments.
Enhanced training stability with the new routing approach.
Maintains parameter efficiency comparable to existing methods.
Abstract
We present a data-adaptive method for parameter-efficient fine-tuning of large neural networks. Standard low-rank adaptation methods improve efficiency by restricting each layer update to a fixed low-rank form, but this static parameterization can be too rigid when the appropriate correction depends on the input and on the evolving depth-wise computation of the network. Our approach replaces a purely layer-local adapter with a shared queryable memory of low-rank update atoms. For each block of layers, the model forms a query from the current low-rank state and a running summary of previous blocks, uses this query to retrieve a content-dependent combination of shared update components via attention, and applies the resulting routed operator within the low-rank bottleneck. In this way, the method retains the efficiency and scalability of low-rank adaptation while allowing the effective…
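The retrieval step described in the abstract can be illustrated with a minimal NumPy sketch for a single layer and a single input vector. This is not the authors' implementation. The r×r atom shape, the query projection `W_q`, the per-atom `keys`, the scaled dot-product scoring, and the placement of the routed operator between the down- and up-projections are all assumptions made here for illustration. The running summary of previous blocks is passed in as a given vector rather than computed.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def routed_lora_delta(x, summary, A, B, atoms, keys, W_q):
    """Hypothetical routed low-rank update for one input vector.

    x: (d_in,)  summary: (s,)  A: (r, d_in)  B: (d_out, r)
    atoms: (K, r, r)  keys: (K, q)  W_q: (q, r + s)
    All names and shapes are illustrative assumptions.
    """
    h = A @ x                                        # low-rank state, (r,)
    query = W_q @ np.concatenate([h, summary])       # query from state + summary, (q,)
    scores = keys @ query / np.sqrt(keys.shape[1])   # one score per atom, (K,)
    weights = softmax(scores)                        # attention weights over atoms
    M = np.tensordot(weights, atoms, axes=1)         # routed r x r operator
    delta = B @ (M @ h)                              # applied inside the bottleneck
    return delta, weights

# Toy dimensions, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
d_in, d_out, r, s, q, K = 16, 16, 4, 8, 8, 6
x = rng.normal(size=d_in)
summary = rng.normal(size=s)
A = rng.normal(size=(r, d_in)) * 0.1
B = rng.normal(size=(d_out, r)) * 0.1
atoms = rng.normal(size=(K, r, r))
keys = rng.normal(size=(K, q))
W_q = rng.normal(size=(q, r + s))

delta, weights = routed_lora_delta(x, summary, A, B, atoms, keys, W_q)
```

Under these assumptions, setting every atom to the identity makes the routed operator the identity (the attention weights sum to one), so `delta` reduces to the static low-rank update `B @ A @ x`. The trainable additions over a static adapter are the shared atoms, keys, and query projection, which are small when r and K are small.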
