An Energy-Based Self-Adaptive Learning Rate for Stochastic Gradient Descent: Enhancing Unconstrained Optimization with VAV method
Jiahao Zhang, Christian Moya, and Guang Lin

TL;DR
This paper introduces the VAV method, an energy-based self-adaptive learning rate algorithm for SGD that improves stability and convergence speed by utilizing an auxiliary variable to approximate energy without backtracking.
Contribution
The VAV algorithm is a novel energy-based self-adaptive learning rate method that guarantees energy dissipation and convergence, outperforming traditional SGD in various tasks.
Findings
VAV achieves faster convergence in early training stages.
VAV demonstrates superior stability with larger learning rates.
The auxiliary variable $r$ effectively bounds training loss.
Abstract
Optimizing the learning rate remains a critical challenge in machine learning, essential for achieving model stability and efficient convergence. The Vector Auxiliary Variable (VAV) algorithm introduces a novel energy-based self-adjustable learning rate optimization method designed for unconstrained optimization problems. It incorporates an auxiliary variable to facilitate efficient energy approximation without backtracking while adhering to the unconditional energy dissipation law. Notably, VAV demonstrates superior stability with larger learning rates and achieves faster convergence in the early stage of the training process. Comparative analyses demonstrate that VAV outperforms Stochastic Gradient Descent (SGD) across various tasks. This paper also provides rigorous proof of the energy dissipation law and establishes the convergence of the algorithm under reasonable assumptions.…
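The abstract describes the update only in words, so the following is a minimal sketch of the general idea rather than the paper's VAV update. It assumes a scalar auxiliary variable scheme of the AEGD type: an energy variable r is initialized as sqrt(f + c) and rescaled at every step, and the parameter step is proportional to r. The function name `energy_adaptive_descent`, the parameters `eta` and `c`, the use of full (non-stochastic) gradients, and the quadratic test objective are all assumptions chosen for illustration.

```python
import numpy as np

def energy_adaptive_descent(f, grad, theta0, eta=0.5, c=1.0, steps=200):
    """Gradient descent with a scalar auxiliary energy variable r.

    Illustrative AEGD-style update (an assumption, not the paper's exact VAV rule):
        v       = grad f(theta) / (2 * sqrt(f(theta) + c))
        r_new   = r / (1 + 2 * eta * ||v||^2)
        theta   = theta - 2 * eta * r_new * v
    Because the denominator is at least 1, r never increases, whatever eta is.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    r = np.sqrt(f(theta) + c)          # energy variable starts at sqrt(f + c)
    r_history = [r]
    for _ in range(steps):
        v = grad(theta) / (2.0 * np.sqrt(f(theta) + c))
        r = r / (1.0 + 2.0 * eta * np.dot(v, v))   # no backtracking needed
        theta = theta - 2.0 * eta * r * v          # effective step scales with r
        r_history.append(r)
    return theta, np.array(r_history)

# Hypothetical test problem: a simple convex quadratic.
def quadratic(theta):
    return 0.5 * np.dot(theta, theta)

def quadratic_grad(theta):
    return theta

theta_start = np.array([3.0, -2.0])
theta_final, r_history = energy_adaptive_descent(quadratic, quadratic_grad, theta_start)
```

In this sketch the sequence stored in `r_history` is non-increasing for any positive `eta`, which is the sense in which such schemes dissipate energy unconditionally. Whether the loss itself decreases at every step depends on the problem and on `eta`.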
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications
