ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation
Lujia Zhong, Shuo Huang, Yonggang Shi

TL;DR
ssProp introduces a novel energy-efficient training method for CNNs that employs scheduled sparse backpropagation, significantly reducing computation and energy consumption while maintaining or improving model performance across various tasks.
Contribution
The paper proposes a general, energy-efficient convolution module with channel-wise sparsity and gradient schedulers, enabling sparse backpropagation across diverse architectures and tasks.
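A minimal sketch of what channel-wise sparse backpropagation with a scheduler could look like, in plain Python. The summary does not specify the selection criterion or the schedule, so both are assumptions here: channels are ranked by mean absolute gradient, and the hypothetical `alternating_drop_rate` scheduler switches between dense and sparse epochs. The function names and the `sparse_rate` value are illustrative, not taken from the paper.

```python
from typing import List


def channel_scores(grad: List[List[float]]) -> List[float]:
    """Mean absolute gradient per channel (an assumed importance score)."""
    return [sum(abs(g) for g in ch) / len(ch) for ch in grad]


def sparsify_channels(grad: List[List[float]], drop_rate: float) -> List[List[float]]:
    """Keep the highest-scoring channels and zero the gradient of the rest."""
    n = len(grad)
    n_keep = max(1, n - int(drop_rate * n))  # always keep at least one channel
    scores = channel_scores(grad)
    ranked = sorted(range(n), key=lambda c: scores[c], reverse=True)
    keep = set(ranked[:n_keep])
    return [list(ch) if c in keep else [0.0] * len(ch) for c, ch in enumerate(grad)]


def alternating_drop_rate(epoch: int, sparse_rate: float = 0.5) -> float:
    """Hypothetical scheduler: dense on even epochs, sparse on odd epochs."""
    return sparse_rate if epoch % 2 == 1 else 0.0


# Toy gradient with 4 channels and 2 values per channel.
grad = [[0.1, -0.2], [1.0, -3.0], [0.0, 0.05], [0.5, 0.5]]
dense = sparsify_channels(grad, alternating_drop_rate(0))   # nothing dropped
sparse = sparsify_channels(grad, alternating_drop_rate(1))  # 2 of 4 channels dropped
```

Zeroing gradients as above only illustrates the selection step. Any real saving would come from skipping the backward computation for the dropped channels altogether, which this sketch does not attempt.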
Findings
Reduces training computation by 40%.
Potentially improves model performance by mitigating over-fitting.
Compatible with various datasets and deep learning architectures.
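The summary does not say how the 40% figure is derived. As a hedged illustration only, the sketch below shows how a per-epoch drop-rate schedule translates into an average reduction, under the simplifying assumption that backward cost scales linearly with the fraction of channels kept. The schedule values are hypothetical and chosen for the example, not reported settings.

```python
from typing import List


def average_drop_rate(schedule: List[float]) -> float:
    """Mean fraction of channels dropped across all scheduled epochs."""
    return sum(schedule) / len(schedule)


def relative_backward_cost(schedule: List[float]) -> float:
    """Backward cost relative to dense training, assuming linear scaling."""
    return 1.0 - average_drop_rate(schedule)


# Hypothetical schedule: alternate dense epochs (0.0) with sparse epochs (0.8).
schedule = [0.0, 0.8] * 5
avg = average_drop_rate(schedule)        # roughly 0.4
cost = relative_backward_cost(schedule)  # roughly 0.6
```

Under these assumptions an alternating schedule averages out to about 40% fewer backward computations, but the forward pass is unaffected, so the end-to-end saving would be smaller than the backward-only figure.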
Abstract
Recently, deep learning has made remarkable strides, especially with generative modeling, such as large language models and probabilistic diffusion models. However, training these models often involves significant computational resources, requiring billions of petaFLOPs. This high resource consumption results in substantial energy usage and a large carbon footprint, raising critical environmental concerns. Back-propagation (BP) is a major source of computational expense during training deep learning models. To advance research on energy-efficient training and allow for sparse learning on any machine and device, we propose a general, energy-efficient convolution module that can be seamlessly integrated into any deep learning architecture. Specifically, we introduce channel-wise sparsity with additional gradient selection schedulers during backward based on the assumption that BP is often…
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and ELM
Methods: Convolution · Dropout · Diffusion
