Selective Aggregation for Low-Rank Adaptation in Federated Learning
Pengxin Guo, Shuang Zeng, Yanran Wang, Huijie Fan, Feifei Wang, Liangqiong Qu

TL;DR
This paper introduces FedSA-LoRA, a federated learning method that selectively shares low-rank matrices to improve efficiency and performance, based on the distinct roles of matrices in learning general versus client-specific knowledge.
Contribution
The paper proposes FedSA-LoRA, a novel approach that shares only the A matrices in LoRA during federated learning, and extends this paradigm to other LoRA variants, enhancing efficiency and understanding.
Findings
FedSA-LoRA outperforms traditional methods in natural language tasks.
Selective sharing of A matrices maintains model performance while reducing communication.
The approach generalizes well across different LoRA variants.
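As a rough illustration of the communication saving, assume the usual LoRA shapes (an assumption; the summary above does not state them): a frozen weight $W_0 \in \mathbb{R}^{d \times k}$ with update $\Delta W = BA$, where $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$. Per adapted layer and per round, a client's upload then shrinks as

$$
\underbrace{r(d+k)}_{\text{share } A \text{ and } B} \;\longrightarrow\; \underbrace{rk}_{\text{share } A \text{ only}}, \qquad \frac{rk}{r(d+k)} = \frac{k}{d+k}.
$$

For a square layer ($d = k$) the ratio is $1/2$, so the upload is halved. With hypothetical values $d = k = 768$ and $r = 8$, that is $6144$ values instead of $12288$.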
Abstract
We investigate LoRA in federated learning through the lens of the asymmetry analysis of the learned $A$ and $B$ matrices. In doing so, we uncover that $A$ matrices are responsible for learning general knowledge, while $B$ matrices focus on capturing client-specific knowledge. Based on this finding, we introduce Federated Share-A Low-Rank Adaptation (FedSA-LoRA), which employs two low-rank trainable matrices $A$ and $B$ to model the weight update, but only $A$ matrices are shared with the server for aggregation. Moreover, we delve into the relationship between the learned $A$ and $B$ matrices in other LoRA variants, such as rsLoRA and VeRA, revealing a consistent pattern. Consequently, we extend our FedSA-LoRA method to these LoRA variants, resulting in FedSA-rsLoRA and FedSA-VeRA. In this way, we establish a general paradigm for integrating LoRA with FL, offering guidance for future…
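The share-A idea described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names (`init_lora`, `local_update`, `aggregate_A`, `broadcast_A`) are hypothetical, the server is assumed to use a FedAvg-style weighted average, and local training is replaced by random pseudo-gradients.

```python
import random

def init_lora(d, k, r, rng):
    """Common LoRA init (assumed here): A small random, B zero, so B @ A starts at 0."""
    A = [[rng.gauss(0.0, 0.02) for _ in range(k)] for _ in range(r)]
    B = [[0.0] * r for _ in range(d)]
    return {"A": A, "B": B}

def local_update(client, rng, lr=0.1):
    """Stand-in for local training: random pseudo-gradients on both A and B.
    A real client would run SGD/Adam on its private data instead."""
    for name in ("A", "B"):
        for row in client[name]:
            for j in range(len(row)):
                row[j] -= lr * rng.gauss(0.0, 1.0)

def aggregate_A(clients, weights=None):
    """Server step (assumed FedAvg-style): weighted average of the A matrices only."""
    if weights is None:
        weights = [1.0] * len(clients)
    total = sum(weights)
    r, k = len(clients[0]["A"]), len(clients[0]["A"][0])
    return [[sum(w * c["A"][i][j] for w, c in zip(weights, clients)) / total
             for j in range(k)] for i in range(r)]

def broadcast_A(clients, A_global):
    """Each client overwrites its A with the global one; B never leaves the client."""
    for c in clients:
        c["A"] = [row[:] for row in A_global]

rng = random.Random(0)
clients = [init_lora(d=4, k=6, r=2, rng=rng) for _ in range(3)]
for _ in range(2):  # two communication rounds
    for c in clients:
        local_update(c, rng)
    A_global = aggregate_A(clients)
    broadcast_A(clients, A_global)

# Values uploaded per client per round: only the r x k entries of A.
uploaded = len(A_global) * len(A_global[0])
```

After each round every client holds the same `A` but its own `B`, so the effective update `B @ A` stays client-specific while only `A` crosses the network.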
Peer Reviews
Decision: ICLR 2025 Poster
This paper has the interesting motivation and shows the different roles of low-rank matrices $A$ and $B$ in federated fine-tuning.
I have the following concerns: 1. Lemma 1 is very interesting to me, and the verification experiments show that all the matrices $A_i$ ($i$ is the client index) seem to be the same, while the $B_i$ differ from each other. We know that different clients have independent initializations of $A$, yet they end up with exactly the same $A$, since the similarity of $A$ across clients is $1.0$ in Figure 2. As you mentioned in Figure 3 in the Appendix that the learned A matrices are different from the initialized A matrices, s
1. The paper is generally well-motivated and easy to follow. 2. This work introduces the shared-A LoRA technique into the FL framework, making it promising compared to previous work where the down-projection matrix (A matrix) is kept frozen. 3. Comprehensive experimental results are provided.
1. While this work introduces a shared-A LoRA framework into FL, such a technique has already been introduced in the MoE area [1] under similar motivations. Although the novelty of introducing it into FL should be recognized, the introduction of such a framework cannot be solely credited to this work. 2. On page 20, Figure 3, the authors claim that "A matrices are different from the initialized A matrices, indicating that the A matrices are updated." However, the cosine similarity between the l
(1) The paper's analysis of the distinct roles of matrices A and B clearly demonstrates its insights and motivations. (2) The proposed method improves upon existing solutions and achieves certain performance enhancements.
(1) One of the significant contributions of the paper is the asymmetric analysis of matrices A and B, concluding that matrix A is responsible for general knowledge and matrix B for domain-specific knowledge. However, this perspective has already been proposed in several works, such as HydraLoRA [1], which also designed an asymmetric LoRA structure based on this concept. The paper should properly cite relevant literature, and these existing findings somewhat diminish the paper's contribution. (2
Taxonomy
Topics: Brain Tumor Detection and Classification · Machine Learning and ELM · Traffic Prediction and Management Techniques
Methods: Focus
