CopRA: A Progressive LoRA Training Strategy

Zhan Zhuang; Xiequn Wang; Yulong Zhang; Wei Li; Yu Zhang; Ying Wei

arXiv:2410.22911 · cs.LG · October 31, 2024

TL;DR

CopRA introduces a progressive LoRA training strategy with layer dropping and Shapley value optimization, enhancing model merging, pruning, and out-of-distribution performance.

Contribution

This work proposes a novel progressive training method for LoRA that improves model merging, pruning, and out-of-distribution robustness through layer dropping and Shapley value optimization.
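As a rough illustration of what progressive training with random layer dropping could look like, the sketch below samples which layers keep their LoRA update at each step, with a keep probability that grows over training. The linear ramp, the `warmup_fraction` parameter, and both function names are assumptions made for this example; the summary above does not specify the schedule.

```python
import random


def keep_probability(step, total_steps, warmup_fraction=0.75):
    """Hypothetical linear schedule: probability of keeping a layer's LoRA
    update rises from 0 to 1 over the first `warmup_fraction` of training."""
    ramp_steps = max(1, int(total_steps * warmup_fraction))
    return min(1.0, step / ramp_steps)


def sample_active_layers(num_layers, step, total_steps, rng):
    """Return one boolean per layer; True means the LoRA update is applied
    for this training step, False means the layer is dropped."""
    p = keep_probability(step, total_steps)
    return [rng.random() < p for _ in range(num_layers)]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 250, 500, 750, 1000):
        mask = sample_active_layers(8, step, 1000, rng)
        print(step, round(keep_probability(step, 1000), 2), sum(mask))
```

In a real training loop the mask would gate each layer's low-rank update in the forward pass; here it is only sampled and counted.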

Findings

01. Parameters trained with CopRA exhibit linear mode connectivity.

02. CopRA enables efficient model merging for federated and multi-task learning.

03. Optimizing Shapley values improves pruning performance.
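Linear mode connectivity, as named in the findings above, is commonly checked by evaluating the loss along the straight line between two trained parameter sets and measuring how far it rises above the line joining the endpoint losses. The sketch below is a minimal version of that check on flat lists of floats; `interpolate`, `loss_barrier`, and the toy losses are hypothetical names and stand-ins, not the paper's evaluation code. Merging two adapters by simple averaging corresponds to `alpha=0.5`.

```python
def interpolate(params_a, params_b, alpha):
    """Pointwise linear interpolation: alpha=0 gives params_a, alpha=1 gives params_b."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(params_a, params_b)]


def loss_barrier(params_a, params_b, loss_fn, num_points=11):
    """Largest amount by which the loss on the straight path exceeds the
    linear interpolation of the two endpoint losses (0.0 if it never does)."""
    end_a, end_b = loss_fn(params_a), loss_fn(params_b)
    barrier = 0.0
    for i in range(num_points):
        alpha = i / (num_points - 1)
        path_loss = loss_fn(interpolate(params_a, params_b, alpha))
        baseline = (1 - alpha) * end_a + alpha * end_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def convex_loss(params):
    """Toy convex loss: no barrier between any two points."""
    return sum((x - 1.0) ** 2 for x in params)


def double_well_loss(params):
    """Toy loss with two separate minima at -1 and +1."""
    return sum((x * x - 1.0) ** 2 for x in params)


if __name__ == "__main__":
    print(loss_barrier([0.0, 2.0], [2.0, 0.0], convex_loss))
    print(loss_barrier([-1.0], [1.0], double_well_loss))
```

A barrier close to zero suggests the two solutions sit in one connected low-loss region, which is the property that makes averaging them reasonable.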

Abstract

Low-Rank Adaptation (LoRA) is a parameter-efficient technique for rapidly fine-tuning foundation models. In standard LoRA training dynamics, models tend to quickly converge to a local optimum near the initialization. However, this local optimum may not be ideal for out-of-distribution data or tasks such as merging and pruning. In this work, we propose a novel progressive training strategy for LoRA with random layer dropping. This strategy also optimizes the Shapley value of LoRA parameters in each layer, treating each layer as a player in a cooperative game. We refer to this method as Cooperative LoRA (CopRA). Our experimental results demonstrate that parameters trained with CopRA exhibit linear mode connectivity, which enables efficient model merging. This also paves the way for federated learning and multi-task learning via LoRA merging. Additionally, by optimizing the Shapley value,…
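To make the cooperative-game framing in the abstract concrete, the sketch below computes exact Shapley values for a handful of players by averaging each player's marginal contribution over all join orders. Here each player stands for one layer's LoRA parameters, and `toy_value` is a made-up value function with hypothetical per-layer gains; in practice the value of a coalition would presumably be some performance measure with only those layers' LoRA updates active, which is an assumption and not something the abstract states. This computes Shapley values only; it does not reproduce how CopRA optimizes them during training.

```python
from itertools import permutations


def shapley_values(num_players, value_fn):
    """Exact Shapley values by enumerating every ordering of the players.
    Cost grows factorially, so this is only practical for a few players."""
    totals = [0.0] * num_players
    count = 0
    for order in permutations(range(num_players)):
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for player in order:
            coalition.add(player)
            cur = value_fn(frozenset(coalition))
            totals[player] += cur - prev  # marginal contribution of `player`
            prev = cur
        count += 1
    return [t / count for t in totals]


# Hypothetical per-layer gains, plus a bonus when layers 0 and 1 are both active.
GAINS = [0.1, 0.3, 0.6]


def toy_value(coalition):
    total = sum(GAINS[i] for i in coalition)
    if 0 in coalition and 1 in coalition:
        total += 0.2
    return total


if __name__ == "__main__":
    values = shapley_values(3, toy_value)
    print([round(v, 3) for v in values])
```

The interaction bonus is split evenly between layers 0 and 1, and the values sum to the worth of the full set of layers, which is the efficiency property of the Shapley value.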

Taxonomy

Topics: Robotics and Automated Systems

Methods: Pruning