FeedSign: Robust Full-parameter Federated Fine-tuning of Large Models with Extremely Low Communication Overhead of One Bit
Zhijie Cai, Haolong Chen, Guangxu Zhu

TL;DR
FeedSign is a federated fine-tuning method for large models that drastically reduces communication overhead to just 1 bit per exchange, while maintaining convergence speed and robustness.
Contribution
FeedSign introduces a novel 1-bit communication-efficient federated fine-tuning algorithm using zeroth-order optimization and shared pseudo-random generators.
Findings
Achieves an exponential convergence rate of e^{-t} under standard assumptions.
Performs better than or comparably to existing methods across models from 11M to 13B parameters.
Demonstrates robustness against data heterogeneity and Byzantine attacks.
Abstract
Federated fine-tuning (FFT) attempts to fine-tune a pre-trained model with private data from distributed clients by exchanging models rather than data under the orchestration of a parameter server (PS). To overcome the bottleneck forged by the growing communication and memory overhead for clients in such systems due to the growing model sizes, we propose \textit{FeedSign}, an FFT algorithm in which the upload and download payload for an aggregation step is exactly $1$ bit per step, while the memory overhead is squeezed to the amount needed for inference. This is realized by utilizing zeroth-order (ZO) optimizers on large models and shared pseudo-random number generators (PRNG) across devices to represent the gradient estimates as seed-sign pairs. We conduct theoretical analysis on FeedSign and show that it converges at an exponential rate $e^{-t}$, where $t$ is the number…
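The seed-sign idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a two-point zeroth-order estimate along a random direction, a majority vote at the PS, a seed derived from the step counter, and a toy least-squares loss standing in for a large model. All function names, the `base_seed` parameter, and the hyperparameters are hypothetical.

```python
import numpy as np

def perturbation(seed, dim):
    # Shared PRNG: any party holding the seed regenerates the same direction,
    # so the direction itself never has to be transmitted.
    return np.random.default_rng(seed).standard_normal(dim)

def client_sign(w, X, y, seed, mu=1e-3):
    """Return one bit: the sign of a two-point zeroth-order estimate along z."""
    z = perturbation(seed, w.size)
    loss = lambda v: np.mean((X @ v - y) ** 2)  # toy loss (assumption)
    projected_grad = (loss(w + mu * z) - loss(w - mu * z)) / (2 * mu)
    return 1 if projected_grad >= 0 else -1

def server_vote(signs):
    # Assumed aggregation rule: majority vote, ties resolved to +1.
    return 1 if sum(signs) >= 0 else -1

def apply_update(w, seed, sign, lr):
    # Every party rebuilds z from the seed and steps against the voted sign.
    return w - lr * sign * perturbation(seed, w.size)

def run(steps=2000, dim=5, n_clients=4, lr=1e-2, data_seed=0, base_seed=1000):
    rng = np.random.default_rng(data_seed)
    w_true = rng.standard_normal(dim)
    clients = []
    for _ in range(n_clients):
        X = rng.standard_normal((32, dim))
        clients.append((X, X @ w_true))  # each client keeps its data local
    w = np.zeros(dim)
    for t in range(steps):
        seed = base_seed + t  # assumption: seed follows the step counter
        bit = server_vote([client_sign(w, X, y, seed) for X, y in clients])
        w = apply_update(w, seed, bit, lr)  # only `bit` crosses the network
    return w, w_true, clients

if __name__ == "__main__":
    w, w_true, _ = run()
    print("distance to target:", np.linalg.norm(w - w_true))
```

In this sketch each client uploads one sign and receives one voted sign per step, and evaluating the loss needs only forward passes, which is how the memory cost can stay near that of inference.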
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Parallel Computing and Optimization Techniques · Neural Networks and Applications
