Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA
Nuocheng Yang, Sihua Wang, Ouwen Huan, Mingzhe Chen, Tony Q. S. Quek, and Changchuan Yin

TL;DR
This paper introduces a decentralized federated learning method for fine-tuning large language models using sparse-and-orthogonal LoRA, addressing data heterogeneity, communication efficiency, and multi-task interference.
Contribution
It proposes a novel sparse-and-orthogonal LoRA technique, topology-aware aggregation, and an implicit MoE mechanism to improve federated multi-task LLM fine-tuning.
Findings
Reduces communication resource consumption by up to 73%
Improves average performance by 5% over traditional LoRA
Effectively mitigates knowledge interference during inference
Abstract
Decentralized federated learning (DFL) based on low-rank adaptation (LoRA) enables mobile devices with multi-task datasets to collaboratively fine-tune a large language model (LLM) by exchanging locally updated parameters with a subset of neighboring devices via wireless connections for knowledge integration. However, directly aggregating parameters fine-tuned on heterogeneous datasets induces three primary issues across the DFL life-cycle: (i) catastrophic knowledge forgetting during fine-tuning process, arising from conflicting update directions caused by data heterogeneity; (ii) inefficient communication and convergence during model aggregation process, due to bandwidth-intensive redundant model transmissions; and (iii) multi-task knowledge interference during inference process, resulting from incompatible knowledge representations coexistence during…
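The baseline the abstract starts from, in which each device averages its locally updated LoRA parameters with those of a subset of neighbors, can be sketched roughly as below. This is only an illustration of plain neighbor averaging, not the paper's sparse-and-orthogonal method or its topology-aware aggregation. The function names, the three-device line topology, and the toy rank-1 factors are all assumptions introduced for the example.

```python
from typing import Dict, List

Matrix = List[List[float]]


def mat_avg(mats: List[Matrix]) -> Matrix:
    """Element-wise mean of equally shaped matrices."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [
        [sum(m[i][j] for m in mats) / n for j in range(cols)]
        for i in range(rows)
    ]


def aggregate_round(
    factors: Dict[int, Dict[str, Matrix]],
    neighbors: Dict[int, List[int]],
) -> Dict[int, Dict[str, Matrix]]:
    """One DFL round: every device averages its LoRA factors A and B
    with those received from its neighbors (hypothetical baseline)."""
    updated = {}
    for dev, nbrs in neighbors.items():
        group = [dev] + nbrs  # the device itself plus its neighbors
        updated[dev] = {
            "A": mat_avg([factors[g]["A"] for g in group]),
            "B": mat_avg([factors[g]["B"] for g in group]),
        }
    return updated


# Toy setup (assumed): 3 devices, hidden size 2, LoRA rank 1.
# A has shape (rank, hidden); B has shape (hidden, rank).
factors = {
    0: {"A": [[1.0, 0.0]], "B": [[1.0], [0.0]]},
    1: {"A": [[0.0, 1.0]], "B": [[0.0], [1.0]]},
    2: {"A": [[1.0, 1.0]], "B": [[1.0], [1.0]]},
}
# Line topology 0 - 1 - 2: only adjacent devices exchange parameters.
neighbors = {0: [1], 1: [0, 2], 2: [1]}

updated = aggregate_round(factors, neighbors)
print(updated[0]["A"])  # device 0 mixes its factors with device 1 only
```

In this toy setup devices 0 and 1 hold factors pointing in different directions, so the averaged result matches neither local update. That is a small-scale picture of the conflicting update directions the abstract attributes to data heterogeneity.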
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
