Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach
Challapalli Phanindra Revanth, Sumohana S. Channappayya, C Krishna Mohan

TL;DR
This paper introduces GradSamp, a Gaussian sampling method that reduces energy consumption in deep learning training by approximating gradients and enabling epoch skipping, validated across various models and tasks.
Contribution
GradSamp is a novel gradient approximation technique leveraging Gaussian sampling to improve training efficiency and energy savings in deep learning models.
Findings
Achieves significant energy reduction without performance loss.
Effective across CNNs, transformers, and diverse tasks.
Applicable in out-of-distribution and decentralized scenarios.
Abstract
Computing the loss gradient via backpropagation consumes considerable energy during deep learning (DL) model training. In this paper, we propose a novel approach to efficiently compute DL models' gradients to mitigate the substantial energy overhead associated with backpropagation. Exploiting the over-parameterized nature of DL models and the smoothness of their loss landscapes, we propose a method called GradSamp for sampling gradient updates from a Gaussian distribution. Specifically, we update model parameters at a given epoch (chosen periodically or randomly) by perturbing the parameters (element-wise) from the previous epoch with Gaussian "noise". The parameters of the Gaussian distribution are estimated using the error between the model parameter values from the two previous epochs. GradSamp not only streamlines gradient computation but also enables skipping entire…
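The update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name gradsamp_step is hypothetical, parameters are treated as a flat list of floats, and fitting a single Gaussian (mean and standard deviation of the element-wise difference between the two previous epochs) is an assumption, since the abstract does not say how the distribution's parameters are grouped or estimated.

```python
import random
import statistics


def gradsamp_step(theta_prev, theta_prev2, rng):
    """Sample parameters for one epoch without running backpropagation.

    theta_prev  -- parameters after the previous epoch (flat list of floats)
    theta_prev2 -- parameters after the epoch before that
    rng         -- a random.Random instance, passed in for reproducibility
    """
    # Element-wise "error" between the two previous epochs' parameters.
    deltas = [a - b for a, b in zip(theta_prev, theta_prev2)]

    # Assumption: one Gaussian for the whole parameter vector, fitted by
    # the sample mean and population standard deviation of the deltas.
    mu = statistics.fmean(deltas)
    sigma = statistics.pstdev(deltas)

    # Perturb each previous-epoch parameter with Gaussian noise.
    return [p + rng.gauss(mu, sigma) for p in theta_prev]


# Toy parameter vectors from two consecutive (ordinarily trained) epochs.
theta_e2 = [0.10, -0.20, 0.30, 0.05]
theta_e1 = [0.12, -0.18, 0.33, 0.06]

theta_new = gradsamp_step(theta_e1, theta_e2, random.Random(0))
```

In a training loop, a step like this would replace the usual forward and backward passes on the epochs selected for sampling (periodically or at random), while the remaining epochs train with backpropagation as normal. At least two ordinarily trained epochs are needed first so that both previous parameter snapshots exist.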
Taxonomy
Topics: Gaussian Processes and Bayesian Inference
Methods: Sparse Evolutionary Training
