ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation

Lujia Zhong; Shuo Huang; Yonggang Shi

arXiv:2408.12561·cs.LG·December 31, 2024

TL;DR

ssProp is an energy-efficient training method for CNNs that uses scheduled sparse backpropagation, cutting training computation by about 40% while maintaining or improving model performance across various tasks.

Contribution

The paper proposes a general, energy-efficient convolution module with channel-wise sparsity and gradient schedulers, enabling sparse backpropagation across diverse architectures and tasks.
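
To make the idea of channel-wise sparsity in the backward pass concrete, here is a minimal PyTorch sketch. It is not the authors' implementation (see the official repository for that). The class name `SparseBackwardConv2d`, the `drop_rate` argument, the fixed 3x3 kernel with padding 1, and the use of mean absolute gradient as the channel importance score are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


class SparseBackwardConv2d(torch.autograd.Function):
    """Hypothetical 3x3 convolution (stride 1, padding 1) whose backward
    pass uses only the most important output channels of the gradient."""

    @staticmethod
    def forward(ctx, x, weight, drop_rate):
        # The forward pass is an ordinary dense convolution.
        ctx.save_for_backward(x, weight)
        ctx.drop_rate = drop_rate
        return F.conv2d(x, weight, padding=1)

    @staticmethod
    def backward(ctx, grad_out):
        x, weight = ctx.saved_tensors
        c_out = grad_out.shape[1]
        # Number of output channels kept in the backward pass (at least one).
        keep = max(1, int(round(c_out * (1.0 - ctx.drop_rate))))
        # Assumed importance score: mean absolute gradient per output channel.
        importance = grad_out.abs().mean(dim=(0, 2, 3))
        idx = importance.topk(keep).indices
        g = grad_out[:, idx].contiguous()  # gradient of the kept channels
        w = weight[idx]                    # filters of the kept channels
        # Both gradients are computed from the reduced tensors only.
        grad_x = torch.nn.grad.conv2d_input(x.shape, w, g, padding=1)
        grad_w = torch.zeros_like(weight)
        grad_w[idx] = torch.nn.grad.conv2d_weight(x, w.shape, g, padding=1)
        return grad_x, grad_w, None  # no gradient for drop_rate


# Usage: drop half of the 4 output channels during the backward pass.
torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8, requires_grad=True)
weight = torch.randn(4, 3, 3, 3, requires_grad=True)
out = SparseBackwardConv2d.apply(x, weight, 0.5)
(out ** 2).sum().backward()
# Count the filters that received a non-zero gradient.
updated_filters = int((weight.grad.abs().sum(dim=(1, 2, 3)) > 0).sum())
```

In this sketch the savings come from running the two gradient convolutions on `keep` channels instead of all `c_out`; dropped filters simply receive a zero gradient for that step. With `drop_rate=0.0` the backward pass reduces to the ordinary dense one.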

Findings

01 Reduces training computation by 40%.

02 Can improve model performance by mitigating over-fitting.

03 Compatible with various datasets and deep learning architectures.
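
As a rough illustration of how a schedule can translate into an average saving, the sketch below computes the mean drop rate of a hypothetical schedule that alternates dense epochs with sparse ones. The function name `drop_rate_for_epoch`, the 0.8 sparse rate, and the period of 2 are assumptions chosen for the arithmetic; the paper's actual schedulers may differ, and a backward-pass drop rate is not identical to a reduction in total training compute.

```python
def drop_rate_for_epoch(epoch, sparse_rate=0.8, period=2):
    """Hypothetical periodic schedule: the first epoch of each period is
    dense (drop rate 0.0); the remaining epochs drop `sparse_rate` of the
    output channels in the backward pass."""
    return 0.0 if epoch % period == 0 else sparse_rate


# Drop rate for each of 10 epochs and the average over the run.
rates = [drop_rate_for_epoch(epoch) for epoch in range(10)]
mean_rate = sum(rates) / len(rates)
print(rates)
print(f"average drop rate: {mean_rate:.2f}")
```

Under these assumed values, half of the epochs are dense and half drop 80% of channels, so the average drop rate over the run is 0.4.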

Abstract

Recently, deep learning has made remarkable strides, especially in generative modeling, such as large language models and probabilistic diffusion models. However, training these models often involves significant computational resources, requiring billions of petaFLOPs. This high resource consumption results in substantial energy usage and a large carbon footprint, raising critical environmental concerns. Back-propagation (BP) is a major source of computational expense when training deep learning models. To advance research on energy-efficient training and allow for sparse learning on any machine and device, we propose a general, energy-efficient convolution module that can be seamlessly integrated into any deep learning architecture. Specifically, we introduce channel-wise sparsity with additional gradient selection schedulers during the backward pass, based on the assumption that BP is often…

Code & Models

Repositories

lujiazho/ssprop (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and ELM

Methods: Convolution · Dropout · Diffusion