ASLoRA: Adaptive Sharing Low-Rank Adaptation Across Layers

Junyan Hu; Xue Xiao; Mengqi Zhang; Yao Chen; Zhaochun Ren; Zhumin Chen; Pengjie Ren

arXiv:2412.10135·cs.CL·December 17, 2024

TL;DR

ASLoRA introduces a novel cross-layer parameter-sharing strategy for low-rank adaptation in large language models, improving efficiency and performance by sharing and adaptively merging parameters across layers.

Contribution

It proposes a new adaptive sharing mechanism combining global and partial sharing, enhancing parameter efficiency and model flexibility beyond existing methods like LoRA.

Findings

01 Outperforms LoRA on various NLP tasks.

02 Uses less than 25% of the parameters in the comparison with LoRA reported in the abstract.

03 Enhances model flexibility and task adaptability.

Abstract

As large language models (LLMs) grow in size, traditional full fine-tuning becomes increasingly impractical due to its high computational and storage costs. Although popular parameter-efficient fine-tuning methods, such as LoRA, have significantly reduced the number of tunable parameters, there is still room for further optimization. In this work, we propose ASLoRA, a cross-layer parameter-sharing strategy combining global sharing with partial adaptive sharing. Specifically, we share the low-rank matrix A across all layers and adaptively merge matrix B during training. This sharing mechanism not only mitigates overfitting effectively but also captures inter-layer dependencies, significantly enhancing the model's representational capability. We conduct extensive experiments on various NLP tasks, showing that ASLoRA outperforms LoRA while using less than 25% of the parameters,…
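To make the parameter savings concrete, here is a minimal counting sketch of the sharing scheme the abstract describes: one matrix A shared by all layers, and per-layer matrices B that are merged into fewer groups during training. The function names, the layer sizes, and the choice of which groups to merge are all illustrative assumptions; the abstract does not state the merging criterion or the model dimensions, so this is not the authors' implementation.

```python
def lora_params(num_layers, d_in, d_out, rank):
    """Standard LoRA: every layer owns an A (rank x d_in) and a B (d_out x rank)."""
    return num_layers * rank * (d_in + d_out)


def shared_params(num_layers, d_in, d_out, rank, num_b_groups):
    """One A shared by all layers, plus one B per group of layers."""
    if not 1 <= num_b_groups <= num_layers:
        raise ValueError("num_b_groups must be between 1 and num_layers")
    return rank * d_in + num_b_groups * d_out * rank


def merge_groups(layer_to_group, keep, drop):
    """Merge group `drop` into `keep`: layers that used `drop` now reuse `keep`'s B."""
    return [keep if g == drop else g for g in layer_to_group]


# Hypothetical sizes, chosen only for illustration.
NUM_LAYERS, D_MODEL, RANK = 12, 768, 8

# Start with one B per layer, then merge some groups.
# The pairs merged here are arbitrary; the real method picks them adaptively.
layer_to_group = list(range(NUM_LAYERS))
for keep, drop in [(0, 1), (0, 2), (3, 4)]:
    layer_to_group = merge_groups(layer_to_group, keep, drop)

num_groups = len(set(layer_to_group))
baseline = lora_params(NUM_LAYERS, D_MODEL, D_MODEL, RANK)
shared = shared_params(NUM_LAYERS, D_MODEL, D_MODEL, RANK, num_groups)
print(f"B groups: {num_groups}, LoRA: {baseline}, shared: {shared}, "
      f"ratio: {shared / baseline:.3f}")
```

Under these assumed sizes, sharing A alone roughly halves the adapter parameters, and each merge of two B groups removes one further `d_out x rank` matrix. How many merges are performed, and therefore the final ratio, depends on the training procedure described in the full paper.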

Taxonomy

Topics: Advanced Computing and Algorithms · CCD and CMOS Imaging Sensors · Embedded Systems Design Techniques