SeedFlood: A Step Toward Scalable Decentralized Training of LLMs
Jihun Kim, Namhoon Lee

TL;DR
SeedFlood is a decentralized training method for large language models that makes communication overhead negligible and independent of model size by exchanging seed-reconstructible updates, enabling scalable training across complex network topologies.
Contribution
This paper presents SeedFlood, a new decentralized training approach that minimizes communication costs by exploiting seed-reconstructible updates, allowing scalable training of billion-parameter models.
Findings
Outperforms gossip-based methods in communication efficiency.
Achieves comparable results to first-order methods in large-scale training.
Enables training of models across hundreds of clients with minimal communication.
Abstract
This work presents a new approach to decentralized training, SeedFlood, designed to scale for large models across complex network topologies and achieve global consensus with minimal communication overhead. Traditional gossip-based methods suffer from message communication costs that grow with model size, while information decay over network hops renders global consensus inefficient. SeedFlood departs from these practices by exploiting the seed-reconstructible structure of zeroth-order updates and effectively making the messages near-zero in size, allowing them to be flooded to every client in the network. This mechanism makes communication overhead negligible and independent of model size, removing the primary scalability bottleneck in decentralized training. Consequently, SeedFlood enables training in regimes previously considered impractical, such as billion-parameter models…
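To make "seed-reconstructible" concrete, the following is a minimal sketch of a generic two-point zeroth-order update, not the paper's exact algorithm. It assumes a Gaussian perturbation drawn from a seeded generator; the function names (`perturbation`, `make_message`, `apply_message`), the toy quadratic loss, and the step sizes are all hypothetical choices for illustration. The point it shows is that a client only needs to send a seed and one scalar, and any receiver can rebuild the full update locally.

```python
import random


def perturbation(seed, dim):
    """Regenerate the random direction z from the seed alone (assumed Gaussian)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def make_message(params, loss_fn, seed, eps=1e-3):
    """Two-point zeroth-order estimate along z; the message is (seed, scalar)."""
    z = perturbation(seed, len(params))
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    g = (loss_fn(plus) - loss_fn(minus)) / (2 * eps)
    return seed, g


def apply_message(params, message, lr=0.1):
    """Rebuild z from the seed and apply the update params - lr * g * z."""
    seed, g = message
    z = perturbation(seed, len(params))
    return [p - lr * g * zi for p, zi in zip(params, z)]


def quadratic_loss(params):
    """Toy loss used only for this illustration."""
    return sum(p * p for p in params)


# Two clients start from the same parameters.
sender = [1.0, -2.0, 0.5]
receiver = list(sender)

# The sender computes its update and transmits only (seed, scalar).
msg = make_message(sender, quadratic_loss, seed=42)

# Both sides apply the same message and stay in exact agreement.
sender = apply_message(sender, msg)
receiver = apply_message(receiver, msg)
print(sender == receiver)
```

In this sketch the message is always a two-element tuple, whether `params` has three entries or billions, which is the property that makes flooding every update to every client affordable.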
Taxonomy
Topics: Software-Defined Networks and 5G · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks
