RCD-SGD: Resource-Constrained Distributed SGD in Heterogeneous Environment via Submodular Partitioning

Haoze He; Parijat Dube

arXiv:2211.00839·cs.LG·September 20, 2023

Open Access

TL;DR

This paper introduces RCD-SGD, a novel data partitioning framework using submodular optimization to improve distributed SGD in heterogeneous environments, addressing data heterogeneity and resource imbalance for faster training.

Contribution

The paper proposes a new resource-aware data partitioning algorithm for distributed SGD that explicitly considers device heterogeneity and maintains class balance.

Findings

01

Accelerates distributed training by up to 32%.

02

Effectively handles resource heterogeneity and data imbalance.

03

Improves convergence speed in heterogeneous environments.

Abstract

The convergence of SGD-based distributed training algorithms is tied to the data distribution across workers. Standard partitioning techniques try to achieve equal-sized partitions in which the per-class population is proportional to that of the total dataset. Partitions with the same overall size, or even the same number of samples per class, may still have a non-IID distribution in the feature space. In heterogeneous computing environments, where devices have different computing capabilities, even-sized partitions across devices can lead to the straggler problem in distributed SGD. We develop a framework for distributed SGD in heterogeneous environments based on a novel data partitioning algorithm involving submodular optimization. Our data partitioning algorithm explicitly accounts for resource heterogeneity across workers while achieving similar class-level feature distribution…
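To make the straggler argument concrete, the following is a minimal sketch of a resource-aware, class-balanced split. It is not the paper's submodular algorithm: it is a simpler stand-in that sizes each worker's share of every class in proportion to an assumed relative speed, and it ignores feature-space similarity entirely. The function name `proportional_stratified_partition` and the `speeds` parameter are hypothetical, introduced only for illustration.

```python
import random
from collections import defaultdict


def proportional_stratified_partition(labels, speeds, seed=0):
    """Split sample indices across workers so that each worker's share of
    every class is proportional to its (assumed) relative speed.

    Illustrative stand-in only; not the submodular method from the paper.
    """
    rng = random.Random(seed)
    total_speed = sum(speeds)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    parts = [[] for _ in speeds]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n = len(idxs)

        # Largest-remainder rounding so the per-worker quotas sum exactly to n.
        exact = [n * s / total_speed for s in speeds]
        quotas = [int(e) for e in exact]
        leftover = n - sum(quotas)
        order = sorted(range(len(speeds)),
                       key=lambda w: exact[w] - quotas[w],
                       reverse=True)
        for w in order[:leftover]:
            quotas[w] += 1

        # Hand out consecutive slices of the shuffled class indices.
        start = 0
        for w, q in enumerate(quotas):
            parts[w].extend(idxs[start:start + q])
            start += q
    return parts


# Toy example: two classes, three workers, the middle one twice as fast.
labels = [0] * 60 + [1] * 40
speeds = [1, 2, 1]
parts = proportional_stratified_partition(labels, speeds)
sizes = [len(p) for p in parts]
```

In this toy setup the partition sizes come out as 25, 50, and 25, so the idealized time per epoch (size divided by speed) is the same on every worker, whereas an even three-way split would leave the two slower workers as stragglers. Each worker also keeps the global 60/40 class ratio.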

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM

Methods: Stochastic Gradient Descent