Less is More: Resource-Efficient Low-Rank Adaptation
Chunlin Tian, Xuyang Wei, Huanrong Liu, Zhijiang Guo, Li Li

TL;DR
EffiLoRA is a novel, resource-efficient adaptation method for large models that reduces overhead by sharing parameters and selectively updating matrices, improving performance across multiple modalities.
Contribution
Introduces EffiLoRA, a lightweight, generalizable low-rank adaptation method that employs a unified matrix and dynamic updates to enhance efficiency and robustness.
Findings
Outperforms LoRA in diverse tasks and modalities
Reduces training costs and parameter interference
Demonstrates improved robustness and efficiency
Abstract
Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method for Large Language Models (LLMs), but it still incurs notable overhead and suffers from parameter interference on complex datasets. While recent works decouple LoRA update matrices to exploit matrix-wise asymmetry, training costs remain high. We revisit LoRA from the perspective of inter-matrix and intra-layer parameter redundancy and propose Resource-Efficient Low-Rank Adaptation, EffiLoRA, a lightweight and generalizable approach for language, multimodal, and diffusion models. EffiLoRA employs a unified A matrix across all transformer layers and introduces a runtime selective update of the B matrices to dynamically trade off the system resource budget and model performance. EffiLoRA consistently outperforms LoRA across diverse modalities, including commonsense reasoning, visual instruction tuning,…
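The two mechanisms named in the abstract (one A matrix shared across layers, and a per-step choice of which B matrices to update) can be illustrated with a minimal pure-Python sketch. It assumes the usual LoRA convention that the update is B·A, with A of shape rank × d_in and B of shape d_out × rank, and that every adapted layer has the same input width so a single A can be reused. The function names, the `importance` scores, and the top-k selection rule are hypothetical stand-ins; the summary above does not specify how the paper scores or selects B matrices.

```python
# Illustrative sketch only: names, shapes, and the top-k rule are assumptions,
# not the authors' implementation.

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def adapted_forward(w, shared_a, b_layer, x, scale=1.0):
    """Compute W x + scale * B_l (A x), where A is shared by every layer."""
    base = matvec(w, x)
    delta = matvec(b_layer, matvec(shared_a, x))
    return [h + scale * d for h, d in zip(base, delta)]


def lora_param_count(num_layers, d_in, d_out, rank):
    """Trainable parameters when each layer owns its own A and B."""
    return num_layers * (rank * d_in + d_out * rank)


def shared_a_param_count(num_layers, d_in, d_out, rank):
    """Trainable parameters with one shared A and one B per layer."""
    return rank * d_in + num_layers * d_out * rank


def select_trainable_b(importance, k):
    """Return the indices of the k layers whose B is updated this step.

    `importance` is a hypothetical per-layer score; higher means update first.
    """
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    print(lora_param_count(32, 4096, 4096, 8))             # 2097152
    print(shared_a_param_count(32, 4096, 4096, 8))         # 1081344
    print(select_trainable_b([0.1, 0.9, 0.4, 0.7], k=2))   # [1, 3]
```

With 32 layers, width 4096, and rank 8, sharing A roughly halves the adapter parameters in this toy count, and lowering `k` shrinks the number of B matrices that receive gradients in a given step. These figures follow from the formulas above and are not results reported by the paper.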
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
* Good comparison against existing works.
* Compared using current architectures (like Llama-3).
* The proposed method is very parameter-efficient.
* Experiments cover several domains, from LLMs to VLMs to diffusion models.
* Includes a training complexity/cost analysis, which is favorable.
* The method name "ReLoRA" is already taken by a fairly well-cited 2023 work published at ICLR.
* Limited novelty. Sharing weights across layers is not novel (e.g. HydraLoRA), and a MoE-like router has also been used in LoRAMoE and other works. What remains as novelty seems to be the "reducer" part of the method, which only aids training and does not have any effect on downstream use.
* Unclear hyperparameter selection. LoRA and all the PEFT methods are extremely dependent on correct and f
* *Clear motivation and solid empirical grounding*: The paper convincingly identifies inter-matrix and intra-layer redundancies in LoRA through well-designed empirical observations.
* *Simple yet generalizable design*: The unified $A$ + dynamic $B$ mechanism is conceptually elegant and applicable to various architectures (LLMs, MLLMs, diffusion models) without invasive changes.
* *Strong empirical validation*: Results are consistent across diverse modalities with detailed ablations (e.g., drop-ratio,
* *Incomplete resource-efficiency evaluation*: The "resource-efficient" claim mainly relies on parameter count and training time; other dimensions such as GPU memory footprint, FLOPs under varying K, or energy consumption are not analyzed. Moreover, only three models are evaluated.
* *Generalization explanation*: The paper attributes performance gains to the shared A capturing "global knowledge," but lacks deeper analysis (e.g., probing studies or representation overlap) to validate this hypothesis.
+ The general idea of *sharing one matrix across layers and pruning only a subset of adapters per step with a small router* is practical and straightforward.
+ Clear improvement over the baselines.
+ Overall, this paper is easy to follow.
+ This paper covers improvements across multiple modalities.
- The technical novelty of the proposed method remains limited. Specifically, prior work (e.g., AdaLoRA, DyLoRA) has already shared or frozen a matrix to remove inter-matrix redundancy and shown dynamic or adaptive budgets across layers.
- Comparisons with state-of-the-art PEFT methods are largely missing. Several more recent and much stronger PEFT methods are absent from the comparison, for example: [1] DoRA: Weight-Decomposed Low-Rank Adaptation. ICML 2024. [2] VeRA: Vector-based Random Matrix Ad
- Superior Parameter Efficiency via Asymmetric Architecture: The paper introduces a unified asymmetric architecture that effectively tackles parameter redundancy. By employing a single, globally shared A matrix across all layers, it captures common, generalizable knowledge, while using specialized, layer-specific B matrices to learn fine-grained task details. This design drastically reduces the total number of trainable parameters compared to standard LoRA and its variants, achieving SOTA resul
- `Limited Innovation`: The ReLoRA method is overly simplistic, and its core motivation (i.e., low-resource LoRA research [1-3]) has already been extensively explored.
- `Limited Presentation`: Figure 4 fails to adequately explain the proposed ReLoRA. It is unclear whether there is a single B matrix, multiple B matrices per layer, or a number determined by the single-task vs. multi-task experimental setup. Furthermore, the mechanism of the importance score vector is unclear; how importance is j
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
