Stabilizing Decentralized Federated Fine-Tuning via Topology-Aware Alternating LoRA
Xiaoyu Wang, Xiaotian Li, Zhixiang Zhou, Chen Li, and Yong Liu

TL;DR
This paper introduces TAD-LoRA, a topology-aware framework for decentralized federated fine-tuning that stabilizes training by managing LoRA update interactions across dynamic communication graphs.
Contribution
The paper proposes a topology-aware method for decentralized LoRA fine-tuning, with a convergence proof for non-convex objectives and experiments under a range of communication conditions.
Findings
TAD-LoRA stabilizes training in decentralized settings.
It performs well across different communication topologies.
Strong results on the MNLI dataset demonstrate effectiveness.
Abstract
Decentralized federated learning (DFL), a serverless variant of federated learning, poses unique challenges for parameter-efficient fine-tuning due to the factorized structure of low-rank adaptation (LoRA). Unlike the aggregation of linear parameters, decentralized aggregation of LoRA updates introduces topology-dependent cross terms that can destabilize training under dynamic communication graphs. We propose TAD-LoRA, a Topology-Aware Decentralized Low-Rank Adaptation framework that coordinates the updates and mixing of LoRA factors to control inter-client misalignment. We theoretically prove the convergence of TAD-LoRA under non-convex objectives, explicitly characterizing the trade-off between topology-induced cross-term error and block-coordinate representation bias governed by the switching interval of alternating training. Experiments under various communication conditions validate…
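The cross terms mentioned in the abstract can be illustrated with a small numerical sketch. It relies only on a general algebraic fact: mixing the LoRA factors B and A separately is not the same as mixing the products B·A. The two clients, the rank-1 factors, the uniform mixing weights, and the helper names (`matmul`, `mix`, `sub`) are all assumptions made for illustration; none of them come from the paper, and this is not the TAD-LoRA algorithm itself.

```python
# Illustrative sketch only: two clients, rank-1 LoRA factors, uniform mixing weights.
# All names and values are hypothetical and not taken from the paper.

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mix(mats, weights):
    """Weighted average of same-shaped matrices (one gossip/mixing step)."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * M[r][c] for w, M in zip(weights, mats)) for c in range(cols)]
            for r in range(rows)]

def sub(X, Y):
    """Element-wise difference X - Y."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Client factors: B_i is 2x1 and A_i is 1x2, so each update B_i @ A_i is 2x2.
B = [[[1.0], [0.0]], [[0.0], [1.0]]]
A = [[[1.0, 0.0]], [[0.0, 1.0]]]
w = [0.5, 0.5]  # assumed mixing weights for one node's neighborhood

# What we would like to aggregate: the average of the full updates.
mean_of_products = mix([matmul(Bi, Ai) for Bi, Ai in zip(B, A)], w)

# What factor-wise mixing actually yields: the product of the averaged factors.
product_of_means = matmul(mix(B, w), mix(A, w))

# The gap between the two is the cross term; it depends on the mixing weights.
cross_term = sub(product_of_means, mean_of_products)

# If every client holds the same A (for example, A is frozen and synchronized),
# the gap vanishes because the product is linear in B alone.
A_shared = [[1.0, 2.0]]
mean_of_products_shared = mix([matmul(Bi, A_shared) for Bi in B], w)
product_of_means_shared = matmul(mix(B, w), A_shared)
cross_term_shared = sub(product_of_means_shared, mean_of_products_shared)
```

In this toy setting the cross term is nonzero when both factors differ across clients and exactly zero when one factor is shared. That gives some intuition for why a scheme that trains one factor at a time could reduce misalignment, at the cost of restricting each phase to a single block of parameters.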
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning
