The Unified Balance Theory of Second-Moment Exponential Scaling Optimizers in Visual Tasks
Gongyue Zhang, Honghai Liu

TL;DR
This paper introduces a unified framework for first-order optimizers in visual tasks using variable Second-Moment Exponential Scaling, addressing issues like gradient vanishing and dataset sparsity through a new balance theory.
Contribution
It proposes a novel unification of SGD and adaptive optimizers via variable exponential scaling, grounded in a new balance theory for optimization.
Findings
Different balance coefficients significantly affect training dynamics
The unified approach improves optimization stability
Experimental results confirm the theory's effectiveness
Abstract
We have identified a potential method for unifying first-order optimizers through the use of variable Second-Moment Exponential Scaling (SMES). We begin with backpropagation, addressing classic phenomena such as gradient vanishing and explosion, as well as issues related to dataset sparsity, and introduce the theory of balance in optimization. Through this theory, we suggest that SGD and adaptive optimizers can be unified under a broader inference, employing variable moving exponential scaling to achieve a balanced approach within a generalized formula for first-order optimizers. We conducted tests on some classic datasets and networks to confirm the impact of different balance coefficients on the overall training process.
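The abstract does not state the generalized formula, so the following is only a minimal sketch of one way a variable second-moment exponent could interpolate between SGD-like and Adam-like updates. The function name `smes_step`, the exponent parameter `p`, and the Adam-style bias correction are all assumptions made for illustration, not the authors' definitions. Under these assumptions, `p = 0` removes the second-moment scaling (momentum SGD up to a constant factor) and `p = 0.5` recovers an Adam-style step.

```python
def smes_step(params, grads, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
              p=0.5, eps=1e-8):
    """One hypothetical update step with a variable second-moment exponent.

    params, grads, m, v are equal-length lists of floats; t is the 1-based
    step count. p is an assumed "balance" exponent applied to the
    second-moment estimate:
      p = 0.0 -> denominator is 1 + eps (momentum-SGD-like step)
      p = 0.5 -> denominator is sqrt(v_hat) + eps (Adam-like step)
    """
    new_params, new_m, new_v = [], [], []
    for x, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and squared gradient.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction, as in Adam (an assumption of this sketch).
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        # Scale the step by the second moment raised to the exponent p.
        x = x - lr * m_hat / (v_hat ** p + eps)
        new_params.append(x)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(p, steps=200, lr=0.05):
    """Run the sketch on f(x) = x^2 (gradient 2x), starting from x = 3."""
    params, m, v = [3.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grads = [2.0 * params[0]]
        params, m, v = smes_step(params, grads, m, v, t, lr=lr, p=p)
    return params[0]
```

Calling `minimize_quadratic` with several values of `p` (for example 0.0, 0.25, and 0.5) gives a quick way to see how the exponent changes the trajectory on a toy problem. It says nothing about the balance coefficients or the results reported in the paper.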
Taxonomy
Topics: Visual perception and processing mechanisms · Data Visualization and Analytics
Methods: Stochastic Gradient Descent
