Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction

Bowen Lei; Dongkuan Xu; Ruqi Zhang; Shuren He; Bani K. Mallick

arXiv:2301.03573·cs.LG·December 6, 2023

TL;DR

This paper introduces an adaptive gradient correction method to accelerate and stabilize sparse neural network training, reducing training epochs and improving accuracy in resource-limited scenarios.

Contribution

We propose a novel adaptive gradient correction technique that improves the convergence speed and stability of sparse training methods and applies under both standard and adversarial setups.

Findings

1. Improves accuracy by up to 5.0% over existing sparse training methods given the same number of training epochs.

2. Reduces the number of training epochs by up to 52.1% to reach the same accuracy.

3. Demonstrates effectiveness across multiple datasets, models, and sparsity levels.

Abstract

Despite impressive performance, deep neural networks require significant memory and computation costs, prohibiting their application in resource-constrained scenarios. Sparse training is one of the most common techniques to reduce these costs; however, the sparsity constraints make the optimization more difficult, increasing training time and instability. In this work, we aim to overcome this problem and achieve space-time co-efficiency. To accelerate and stabilize the convergence of sparse training, we analyze the gradient changes and develop an adaptive gradient correction method. Specifically, we approximate the correlation between the current and previous gradients, which is used to balance the two gradients to obtain a corrected gradient. Our method can be used with the most popular sparse training pipelines under both standard and adversarial setups. Theoretically,…
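
As a rough illustration of the idea described in the abstract, the sketch below blends the current and previous gradients using a weight derived from their cosine similarity. This is not the paper's formula: the blending rule, the function names `cosine_similarity` and `corrected_gradient`, and the `max_weight` parameter are all assumptions made for illustration only.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def corrected_gradient(
    current: List[float],
    previous: List[float],
    max_weight: float = 0.5,  # hypothetical cap on the previous gradient's share
) -> List[float]:
    """Blend the current and previous gradients (illustrative rule, not the paper's).

    The previous gradient is trusted in proportion to how well it agrees with
    the current one; negative correlation gives it zero weight.
    """
    corr = cosine_similarity(current, previous)
    weight = max_weight * max(0.0, corr)
    return [(1.0 - weight) * g + weight * p for g, p in zip(current, previous)]


if __name__ == "__main__":
    # Aligned gradients: the previous gradient receives the full max_weight.
    print(corrected_gradient([2.0, 0.0], [1.0, 0.0]))
    # Orthogonal gradients: the correction falls back to the current gradient.
    print(corrected_gradient([1.0, 0.0], [0.0, 1.0]))
```

Under this assumed rule, a previous gradient that points in the same direction as the current one smooths the update, while an orthogonal or opposing one is ignored, so the update reduces to the plain current gradient.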

Code & Models

Repositories

stevenboys/agent (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications