Train Where the Data is: A Case for Bandwidth Efficient Coded Training
Zhifeng Lin, Krishna Giri Narra, Mingchao Yu, Salman Avestimehr, Murali Annavaram

TL;DR
This paper proposes a bandwidth-efficient coded training method for mobile devices based on Random Linear Network Coding (RLNC). The method tolerates devices joining or leaving during training and reduces communication bandwidth by 50% compared to traditional erasure codes.
Contribution
It introduces a novel RLNC-based encoding strategy for mobile training that minimizes bandwidth and handles device uncertainties effectively.
Findings
Achieves 50% reduction in communication bandwidth over MDS codes.
Successfully implements gradient descent for logistic regression and SVM.
Demonstrates robustness to device join/leave uncertainties.
Abstract
Training a machine learning model is both compute- and data-intensive. Most model training is performed on high-performance compute nodes, and the training data is stored near these nodes for faster training. But there is a growing interest in enabling training near the data. For instance, mobile devices are rich sources of training data. It may not be feasible to consolidate the data from mobile devices into a cloud service, due to bandwidth and data privacy reasons. Training on mobile devices is, however, fraught with challenges. First, mobile devices may join or leave the distributed setting, either voluntarily or due to environmental uncertainties, such as lack of power. Tolerating uncertainties is critical to the success of distributed mobile training. One proactive approach to tolerate computational uncertainty is to store data in a coded format and perform training on coded…
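The idea of storing data in a coded format and computing on it can be illustrated with a minimal sketch. It is not the paper's scheme and does not reproduce its bandwidth savings. It assumes real-valued Gaussian coding coefficients (practical RLNC typically works over a finite field), and the helper names `encode_partitions` and `decode_products` are hypothetical. It shows only the core property: the matrix-vector product `X @ w`, the main building block of a gradient step, can be recovered from any `k` of `n` coded results, so up to `n - k` devices may drop out.

```python
import numpy as np

def encode_partitions(X, k, n, rng):
    """Split X into k row blocks and form n random linear combinations of them."""
    blocks = np.stack(np.split(X, k, axis=0))    # shape (k, rows/k, d)
    # Assumption: real-valued Gaussian coefficients instead of a finite field.
    G = rng.standard_normal((n, k))
    coded = np.einsum("nk,krd->nrd", G, blocks)  # coded[i] = sum_j G[i, j] * blocks[j]
    return G, coded

def decode_products(G, results, survivors, k):
    """Recover X @ w from the coded results of any k surviving devices."""
    idx = survivors[:k]
    G_s = G[idx]                                 # (k, k) coefficients of the survivors
    R = np.stack([results[i] for i in idx])      # (k, rows/k) coded products
    # Solving G_s @ P = R yields the uncoded block products P, stacked in order.
    return np.linalg.solve(G_s, R).reshape(-1)

rng = np.random.default_rng(0)
X = rng.standard_normal((12, 4))                 # toy data matrix
w = rng.standard_normal(4)                       # toy model weights

G, coded = encode_partitions(X, k=3, n=5, rng=rng)
results = {i: coded[i] @ w for i in range(5)}    # each device multiplies its coded block by w

survivors = [0, 2, 4]                            # devices 1 and 3 left mid-round
Xw = decode_products(G, results, survivors, k=3)
recovered_ok = np.allclose(Xw, X @ w)
print(recovered_ok)
```

Because a random square coefficient matrix is invertible with probability 1, any three surviving devices suffice in this toy setting; which devices leave does not need to be known in advance.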
Taxonomy
Topics: Simulation Techniques and Applications · Machine Learning and Algorithms · Algorithms and Data Compression
Methods: Logistic Regression · Support Vector Machine
