A Mirror Descent Perspective of Smoothed Sign Descent
Shuyang Wang, Diego Klabjan

TL;DR
This paper extends the mirror descent framework to analyze smoothed sign descent, showing how tuning the stability constant affects the convergent solution in overparameterized regression problems.
Contribution
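It introduces a mirror map for smoothed sign descent that, under some assumptions, makes its dynamics equivalent to dual dynamics, and it characterizes the convergent solution as an approximate KKT point of minimizing a Bregman divergence style function.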
Findings
Tuning the stability constant reduces the KKT error.
Under some assumptions, the proposed mirror map establishes equivalence to dual dynamics.
The convergent solution is characterized as an approximate KKT point.
Abstract
Recent work by Woodworth et al. (2020) shows that the optimization dynamics of gradient descent for overparameterized problems can be viewed as low-dimensional dual dynamics induced by a mirror map, explaining the implicit regularization phenomenon from the mirror descent perspective. However, the methodology does not apply to algorithms where update directions deviate from true gradients, such as ADAM. We use the mirror descent framework to study the dynamics of smoothed sign descent with a stability constant for regression problems. We propose a mirror map that establishes equivalence to dual dynamics under some assumptions. By studying dual dynamics, we characterize the convergent solution as an approximate KKT point of minimizing a Bregman divergence style function, and show the benefit of tuning the stability constant to reduce the KKT error.
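To make the object of study concrete, here is a minimal sketch of a smoothed sign descent update on an overparameterized least-squares problem. The exact update rule analyzed in the paper is not given on this page, so the form g / (|g| + eps) is an assumption, chosen because it mirrors the role of the stability constant in Adam-style updates. The function names, step size, and problem sizes are all hypothetical, and the sketch does not reproduce the mirror map or the KKT analysis.

```python
import numpy as np

def smoothed_sign(grad, eps):
    """Elementwise g / (|g| + eps). Assumed form of the smoothed sign;
    it approaches sign(g) as eps -> 0 and eps acts as the stability constant."""
    return grad / (np.abs(grad) + eps)

def smoothed_sign_descent(X, y, eps, lr=1e-3, steps=20000):
    """Run the assumed smoothed sign update on the squared loss, starting from zero."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n      # gradient of 0.5 * mean squared residual
        w -= lr * smoothed_sign(grad, eps)
    return w

# Overparameterized toy problem: more parameters (20) than samples (5).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
y = rng.normal(size=5)

for eps in (1e-3, 1e-1):
    w = smoothed_sign_descent(X, y, eps)
    print(f"eps={eps:g}  train MSE={np.mean((X @ w - y) ** 2):.3e}  ||w||_1={np.abs(w).sum():.3f}")
```

Because the problem has many interpolating solutions, comparing the solutions reached for different values of eps is one informal way to see that the stability constant affects which solution the iterates approach.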
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques
