Quadratic Gradient: A Unified Framework Bridging Gradient Descent and Newton-Type Methods by Synthesizing Hessians and Gradients
John Chiang

TL;DR
This paper introduces a Quadratic Gradient framework that unifies gradient descent and Newton-type methods by synthesizing Hessian and gradient information, aiming at improved convergence and applicability to deep learning.
Contribution
The paper proposes a new variant of the Quadratic Gradient that relaxes classical convergence constraints and demonstrates superior empirical performance in optimization tasks.
Findings
The new quadratic gradient variant sometimes outperforms the original in convergence rate.
Both variants are effective in non-convex optimization landscapes.
Hutchinson's Estimator efficiently estimates Hessian diagonals for deep learning applications.
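As a rough illustration of the last finding, the sketch below estimates a Hessian diagonal from Hessian-vector products alone, using the standard Hutchinson identity diag(H) = E[z * (Hz)] for random sign vectors z. The function name `hutchinson_diag`, the `hvp` callback, and the small test matrix are assumptions made for this example; they are not taken from the paper, and a deep learning setup would supply `hvp` through automatic differentiation instead of an explicit matrix.

```python
import random


def hutchinson_diag(hvp, dim, num_samples=100, seed=None):
    """Estimate diag(H) using only Hessian-vector products.

    hvp: callable mapping a length-`dim` list v to the list H @ v
         (hypothetical interface, chosen for this sketch).
    """
    rng = random.Random(seed)
    est = [0.0] * dim
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        # z_i * (Hz)_i equals H_ii plus zero-mean cross terms.
        for i in range(dim):
            est[i] += z[i] * hz[i]
    return [e / num_samples for e in est]


# Illustrative symmetric matrix standing in for a Hessian (assumed data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]


def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


diag_est = hutchinson_diag(matvec, 3, num_samples=2000, seed=0)
```

The estimate never forms the full Hessian: each sample costs one Hessian-vector product, and the error shrinks as the number of probes grows. For a Hessian that is already diagonal, a single probe recovers the diagonal exactly.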
Abstract
Accelerating the convergence of second-order optimization, particularly Newton-type methods, remains a pivotal challenge in algorithmic research. In this paper, we extend previous work on the Quadratic Gradient (QG) and rigorously validate its applicability to general convex numerical optimization problems. We introduce a novel variant of the Quadratic Gradient that departs from the conventional fixed Hessian Newton framework. This variant does not satisfy the convergence conditions of the fixed Hessian Newton's method; however, experimental results show that it sometimes converges faster than the original. While this variant relaxes certain classical convergence constraints, it maintains a positive-definite Hessian proxy and demonstrates comparable, or in some…
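To make the idea of a positive-definite Hessian proxy concrete, here is a minimal sketch of a fixed-Hessian-style step on a small convex quadratic. It assumes a diagonal proxy B whose entries are absolute row sums of the Hessian plus a small eps, which guarantees that B minus H is positive semidefinite by diagonal dominance. Whether this matches the exact construction used in the paper is an assumption; the names `diagonal_proxy` and `preconditioned_descent` and the test problem are hypothetical.

```python
def diagonal_proxy(H, eps=1e-8):
    """Diagonal B with B_ii = eps + sum_j |H_ij| (assumed construction).

    B - H is diagonally dominant with a nonnegative diagonal, so B
    dominates H in the positive-semidefinite order.
    """
    return [eps + sum(abs(h) for h in row) for row in H]


def preconditioned_descent(grad, H, x0, steps=200, eps=1e-8):
    """Iterate x <- x - B^{-1} grad(x) with the fixed diagonal proxy B."""
    b = diagonal_proxy(H, eps)
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Scale each gradient coordinate by the inverse proxy entry.
        x = [xi - gi / bi for xi, gi, bi in zip(x, g, b)]
    return x


# Convex quadratic f(x) = 0.5 * x^T A x - c^T x with minimizer x* = (1, -2, 3).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]
c = [2.0, -3.5, 5.0]


def grad(x):
    return [sum(a * xj for a, xj in zip(row, x)) - ci for row, ci in zip(A, c)]


x_min = preconditioned_descent(grad, A, [0.0, 0.0, 0.0])
```

Because the proxy is fixed and diagonal, each step costs no more than a gradient step, yet the per-coordinate scaling carries curvature information. On this quadratic the iteration contracts toward the minimizer without any step-size tuning.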
