AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption
Yajie Zhou, Xiaoyi Pang, Zhibo Wang

TL;DR
AFLoRA is a novel federated fine-tuning framework for large language models that improves efficiency and accuracy by adaptively managing resource constraints and data heterogeneity across clients.
Contribution
AFLoRA introduces an adaptive, resource-aware federated fine-tuning method that decouples updates, prunes ranks, and aggregates client updates in a rank-aware manner for better performance.
Findings
Outperforms state-of-the-art methods in accuracy.
Reduces communication and computation overhead.
Enhances generalization under data heterogeneity.
Abstract
Federated fine-tuning has emerged as a promising approach to adapt foundation models to downstream tasks using decentralized data. However, real-world deployment remains challenging due to the high computational and communication demands of fine-tuning Large Language Models (LLMs) on clients with data and system resources that are heterogeneous and constrained. In such settings, the global model's performance is often bottlenecked by the weakest clients and further degraded by the non-IID nature of local data. Although existing methods leverage parameter-efficient techniques such as Low-Rank Adaptation (LoRA) to reduce communication and computation overhead, they often fail to simultaneously ensure accurate aggregation of low-rank updates and maintain low system costs, thereby hindering overall performance. To address these challenges, we propose AFLoRA, an adaptive and lightweight…
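The abstract's remark about inaccurate aggregation of low-rank updates can be illustrated with a small numerical sketch. It shows only the general issue with averaging LoRA factors across clients, not AFLoRA's own aggregation rule, which the truncated abstract does not describe. The helper names, matrix shapes, and random client updates are assumptions made for illustration.

```python
import numpy as np

def make_client_update(rng, d_out, d_in, rank):
    """Return hypothetical LoRA factors (B, A) for one client (random stand-ins)."""
    B = rng.standard_normal((d_out, rank))
    A = rng.standard_normal((rank, d_in))
    return B, A

def average_factors(updates):
    """Naive aggregation: average B and A separately, then multiply."""
    B_avg = np.mean([B for B, _ in updates], axis=0)
    A_avg = np.mean([A for _, A in updates], axis=0)
    return B_avg @ A_avg

def average_products(updates):
    """Reference aggregation: average the full weight updates B @ A."""
    return np.mean([B @ A for B, A in updates], axis=0)

rng = np.random.default_rng(0)
# Four illustrative clients, all using rank 2 on an 8 x 6 weight matrix.
updates = [make_client_update(rng, d_out=8, d_in=6, rank=2) for _ in range(4)]

naive = average_factors(updates)
exact = average_products(updates)

# mean(B_k) @ mean(A_k) generally differs from mean(B_k @ A_k),
# because the former contains cross-client terms B_i @ A_j with i != j.
gap = np.linalg.norm(naive - exact) / np.linalg.norm(exact)
print(f"relative aggregation error: {gap:.3f}")
```

With a single client the two results coincide. The gap appears once several clients contribute different factors, which is one reason factor-wise averaging is considered an inexact way to aggregate LoRA updates.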
Taxonomy
Topics: Speech Recognition and Synthesis
Methods: Pruning
