Smart Gradient -- An Adaptive Technique for Improving Gradient Estimation

Esmail Abdul Fattah; Janet Van Niekerk; Haavard Rue

arXiv:2106.07313·math.NA·June 9, 2022

TL;DR

This paper introduces Smart Gradient, an adaptive method that improves the accuracy of numerically estimated gradients in optimization algorithms by combining a coordinate transformation with the history of previously taken descent directions. The method is verified empirically on test functions and real data applications.

Contribution

The paper presents a novel limited-memory technique that improves gradient estimation accuracy by leveraging coordinate transformations and past descent directions.

Findings

01 Enhanced gradient accuracy in optimization tasks

02 Effective in both test functions and real data applications

03 Implemented in R and C++ packages for practical use

Abstract

Computing the gradient of a function provides fundamental information about its behavior. This information is essential for several applications and algorithms across various fields. One common application that requires gradients is optimization, with techniques such as stochastic gradient descent, Newton's method and trust region methods. However, these methods usually require a numerical computation of the gradient at every iteration, which is prone to numerical errors. We propose a simple limited-memory technique for improving the accuracy of a numerically computed gradient in this gradient-based optimization framework by exploiting (1) a coordinate transformation of the gradient and (2) the history of previously taken descent directions. The method is verified empirically by extensive experimentation on both test functions and on real data applications. The proposed method…
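
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' R or C++ implementation, and it rests on assumptions the abstract does not state: the coordinate transformation is taken to be an orthonormal basis built by Gram-Schmidt from recent descent directions (completed with canonical axes), directional derivatives along that basis are estimated by central differences, and the result is mapped back to the original coordinates. The function names (`orthonormal_basis`, `directional_derivative`, `smart_gradient_sketch`) and the step size `h` are hypothetical.

```python
import math


def orthonormal_basis(directions, n, rel_tol=1e-8):
    """Gram-Schmidt over past descent directions, completed with canonical axes.

    Assumption of this sketch: the transformed coordinate system is an
    orthonormal basis whose leading vectors follow the most recent directions.
    """
    canonical = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    basis = []
    for v in list(directions) + canonical:
        scale = math.sqrt(sum(vi * vi for vi in v))
        if scale == 0.0:
            continue
        w = list(v)
        for b in basis:
            # Remove the component of w that lies along an accepted vector.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > rel_tol * scale:  # skip (nearly) linearly dependent vectors
            basis.append([wi / norm for wi in w])
        if len(basis) == n:
            break
    return basis


def directional_derivative(f, x, d, h=1e-5):
    """Central-difference estimate of the derivative of f at x along unit vector d."""
    x_plus = [xi + h * di for xi, di in zip(x, d)]
    x_minus = [xi - h * di for xi, di in zip(x, d)]
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def smart_gradient_sketch(f, x, past_directions, h=1e-5):
    """Estimate the gradient of f at x in a basis aligned with past directions."""
    n = len(x)
    basis = orthonormal_basis(past_directions, n)
    coeffs = [directional_derivative(f, x, b, h) for b in basis]
    # The basis is orthonormal, so the gradient in the original coordinates
    # is the sum of each directional derivative times its basis vector.
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]


def quadratic(x):
    """Illustrative test function with known gradient (2*x0 + 3*x1, 3*x0 + 4*x1)."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2


# One previous descent direction is supplied; the canonical axes complete the basis.
estimate = smart_gradient_sketch(quadratic, [1.0, 2.0], [[1.0, 1.0]])
```

In an optimizer, `past_directions` would be refreshed each iteration from the differences between successive iterates, keeping only the most recent few, which is one way to read the "limited-memory" description above.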
