Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices
Mohamed Aboelenien Ahmed, Kilian Pfeiffer, Ramin Khalili, Heba Khdr, Jörg Henkel

TL;DR
This paper introduces a novel zero-order federated fine-tuning method for large language models on resource-constrained devices, reducing computation overhead and improving convergence speed while preserving data privacy.
Contribution
It proposes dividing the network into two blocks and applying a different number of perturbations to each block, achieving faster convergence and lower computational cost in federated LLM fine-tuning.
Findings
Achieves a 1.6-3x reduction in computation overhead compared to state-of-the-art zero-order federated learning techniques.
Enables faster convergence in federated fine-tuning.
Maintains inference-level memory requirements.
Abstract
Federated fine-tuning offers a promising approach for tuning Large Language Models (LLMs) on edge devices while preserving data privacy. However, fine-tuning these models on edge devices remains challenging due to high memory, communication, and computational demands. Zero-order optimization with task alignment provides a potential solution, enabling fine-tuning with inference-level memory requirements, but it takes longer to converge. In this paper, we propose a method that divides the network into two blocks and applies a different number of perturbations per block in a computationally efficient way, achieving faster convergence. Our evaluation shows a reduction in computation overhead compared to state-of-the-art zero-order techniques in federated learning.
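To make the idea in the abstract concrete, the following is a minimal sketch of a two-point zero-order gradient estimate applied to a model split into two blocks, with a different number of perturbations per block. It is not the authors' algorithm: the toy two-layer model, the names `block1`, `block2`, `zo_estimate`, and `two_block_zo_step`, the parameters `p1`, `p2`, `eps`, and `lr`, and the choice to cache the first block's output while perturbing the second block are all assumptions made for illustration.

```python
import numpy as np

def block1(x, w1):
    # Hypothetical first block: one tanh layer.
    return np.tanh(x @ w1)

def block2(h, w2):
    # Hypothetical second block: a linear read-out.
    return h @ w2

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def zo_estimate(loss_fn, params, num_perturbations, rng, eps=1e-3):
    """Two-point zero-order gradient estimate, averaged over random directions.

    Uses only forward evaluations of loss_fn, so no backward pass is needed.
    """
    grad = np.zeros_like(params)
    for _ in range(num_perturbations):
        z = rng.standard_normal(params.shape)
        # Central finite difference of the loss along direction z.
        diff = loss_fn(params + eps * z) - loss_fn(params - eps * z)
        grad += (diff / (2.0 * eps)) * z
    return grad / num_perturbations

def two_block_zo_step(x, y, w1, w2, p1, p2, lr, rng):
    """One update with p1 perturbations for block 1 and p2 for block 2."""
    # Block 1: each of the p1 probes needs two forward passes through both blocks.
    g1 = zo_estimate(lambda w: mse(block2(block1(x, w), w2), y), w1, p1, rng)
    # Block 2: compute block 1's output once, then the p2 probes re-run block 2 only.
    h = block1(x, w1)
    g2 = zo_estimate(lambda w: mse(block2(h, w), y), w2, p2, rng)
    return w1 - lr * g1, w2 - lr * g2

# Toy usage on synthetic data (shapes and hyperparameters are arbitrary).
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 4))
y = block2(block1(x, rng.standard_normal((4, 3))), rng.standard_normal(3))
w1 = 0.1 * rng.standard_normal((4, 3))
w2 = 0.1 * rng.standard_normal(3)
for _ in range(200):
    w1, w2 = two_block_zo_step(x, y, w1, w2, p1=2, p2=8, lr=0.02, rng=rng)
print("final loss:", mse(block2(block1(x, w1), w2), y))
```

In this sketch the second block receives more perturbations because each of its probes skips the first block entirely, which is one plausible reading of applying perturbations "in a computationally efficient way". The paper itself should be consulted for the actual split and perturbation schedule.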
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
