Gradient Correction beyond Gradient Descent

Zefan Li; Bingbing Ni; Teng Li; WenJun Zhang; Wen Gao

arXiv:2203.08345·cs.LG·May 29, 2023

Open Access

TL;DR

This paper introduces GCGD, a gradient correction framework that improves the quality of the gradients used in gradient-descent training, leading to faster training and improved neural network performance.

Contribution

The paper proposes a novel gradient correction framework with two plug-in modules, GC-W and GC-ODE, to improve gradient quality beyond traditional gradient descent methods.

Findings

1. Reduces training epochs by approximately 20%
2. Improves neural network performance
3. Effectively enhances gradient quality

Abstract

The great success neural networks have achieved is inseparable from the application of gradient-descent (GD) algorithms. Based on GD, many variant algorithms have emerged to improve the GD optimization process. The gradient for back-propagation is apparently the most crucial aspect for the training of a neural network. The quality of the calculated gradient can be affected by multiple aspects, e.g., noisy data, calculation error, algorithm limitation, and so on. To reveal gradient information beyond gradient descent, we introduce a framework (GCGD) to perform gradient correction. GCGD consists of two plug-in modules: 1) inspired by the idea of gradient prediction, we propose a GC-W module for weight gradient correction; 2) based on Neural ODE, we propose a GC-ODE module for hidden-state gradient correction. Experimental results show that our gradient correction…

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and ELM