Oscars: Adaptive Semi-Synchronous Parallel Model for Distributed Deep Learning with Global View
Sheng Huang

TL;DR
This paper introduces an adaptive semi-synchronous training strategy for distributed deep learning that balances speed and accuracy, especially in heterogeneous clusters, by reducing communication overhead and improving resource utilization.
Contribution
It proposes a novel semi-synchronous training method based on local SGD that improves efficiency and accuracy in heterogeneous distributed deep learning environments.
Findings
Reduces communication overhead in distributed training.
Improves resource utilization in heterogeneous clusters.
Maintains model accuracy with faster training speed.
Abstract
Deep learning has become an indispensable part of everyday life, in applications such as face recognition and NLP, but training deep models has always been a challenge. In recent years the complexity of training data and models has grown explosively, so training has gradually shifted to distributed training. Classical synchronous strategies guarantee accuracy, but their frequent communication slows training; asynchronous strategies train faster but cannot guarantee accuracy. On heterogeneous clusters, neither approach works efficiently: both can seriously waste resources, and frequent communication further slows training. This paper therefore proposes a semi-synchronous training strategy based on local SGD that effectively improves the utilization efficiency of heterogeneous…
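The abstract builds on local SGD, in which each worker takes several gradient steps on its own data before the workers synchronize by averaging their models. The following is a minimal sketch of that generic idea only, not the paper's Oscars algorithm: the function name `local_sgd`, the toy data, the per-worker step counts, and the plain (unweighted) average are all assumptions made for illustration.

```python
import random


def local_sgd(shards, rounds, local_steps, lr=0.05, w0=0.0):
    """Generic local SGD on a 1-D least-squares model y = w * x.

    Each worker runs several SGD steps on its own shard, then all workers
    average their weights: one communication per round, not per step.
    """
    w_global = w0
    for _ in range(rounds):
        local_weights = []
        for shard, steps in zip(shards, local_steps):
            w = w_global  # start from the last synchronized model
            for _ in range(steps):
                x, y = random.choice(shard)
                grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
                w -= lr * grad
            local_weights.append(w)
        # Synchronization point: plain average of the worker models
        # (an assumption; other aggregation rules are possible).
        w_global = sum(local_weights) / len(local_weights)
    return w_global


random.seed(0)
# Hypothetical noise-free toy data for y = 3x, split across two workers.
data = [(i / 10.0, 3.0 * i / 10.0) for i in range(1, 21)]
shards = [data[:10], data[10:]]
# Assume worker 0 is faster, so it takes more local steps per round.
w = local_sgd(shards, rounds=200, local_steps=(8, 2))
print(w)
```

Letting the faster worker take more local steps between synchronizations is one simple way to keep heterogeneous workers busy; how the paper actually chooses synchronization points is not stated in the text available here.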
Taxonomy
Topics: Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
