Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation
Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Ziqiang Cui, Dugang Liu, Yuhua Li, Xiuqiang He, Ruixuan Li

TL;DR
This paper introduces TopLoRA, a token-wise low-rank adaptation method that dynamically adjusts the LoRA weights for each input token, improving fine-tuning performance in large language models without increasing the LoRA rank.
Contribution
TopLoRA extends standard LoRA by enabling token-wise input-output projections without increasing rank, enhancing model adaptation granularity and effectiveness.
Findings
TopLoRA outperforms standard LoRA across multiple models and datasets.
Token-wise projections improve fine-tuning efficiency and accuracy.
The method does not increase the overall rank of LoRA weights.
Abstract
Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). LoRA essentially describes the projection of an input space into a low-dimensional output space, with the dimensionality determined by the LoRA rank. In standard LoRA, all input tokens share the same weights and undergo an identical input-output projection. This limits LoRA's ability to capture token-specific information due to the inherent semantic differences among tokens. To address this limitation, we propose Token-wise Projected Low-Rank Adaptation (TopLoRA), which dynamically adjusts LoRA weights according to the input token, thereby learning token-wise input-output projections in an end-to-end manner. Formally, the weights of TopLoRA can be expressed as $B\Sigma_X A$, where $A$ and $B$ are low-rank matrices (as in standard LoRA), and $\Sigma_X$ is a diagonal…
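
The token-wise idea in the abstract can be illustrated with a small NumPy sketch. It assumes the per-token diagonal matrix sits between the two low-rank factors, and it uses a hypothetical linear generator `theta` to produce the diagonal entries from each token; the abstract as shown here does not say how the diagonal is generated, so that part is only a placeholder. The names `lora_delta`, `toplora_delta`, and all dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def lora_delta(x, A, B):
    """Standard LoRA update: every token is projected by the same B @ A."""
    return x @ A.T @ B.T

def toplora_delta(x, A, B, theta):
    """Token-wise update: token x_t is projected by B @ diag(sigma_t) @ A.

    theta is a hypothetical (r, d_in) generator for the diagonal entries;
    it is an assumption of this sketch, not the paper's stated design.
    """
    h = x @ A.T               # (tokens, r): shared down-projection
    sigma = x @ theta.T       # (tokens, r): one diagonal per token
    return (sigma * h) @ B.T  # elementwise scaling == multiplying by diag(sigma_t)

rng = np.random.default_rng(0)
d_in, d_out, r, n_tokens = 8, 6, 2, 4
A = rng.normal(size=(r, d_in))       # low-rank factor, as in LoRA
B = rng.normal(size=(d_out, r))      # low-rank factor, as in LoRA
theta = rng.normal(size=(r, d_in))   # hypothetical diagonal generator
x = rng.normal(size=(n_tokens, d_in))

out = toplora_delta(x, A, B, theta)

# Explicit weight matrix for token 0: still at most rank r.
W0 = B @ np.diag(theta @ x[0]) @ A
rank_W0 = np.linalg.matrix_rank(W0)
```

Because the diagonal only rescales the `r` intermediate coordinates, each token's effective weight matrix stays at rank at most `r`, while different tokens can receive different projections. This matches the summary's claim that the rank of the LoRA weights does not increase.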
