Examples of pathological dynamics of the subgradient method for Lipschitz path-differentiable functions
Rodolfo Rios-Zertuche

TL;DR
This paper demonstrates that the vanishing-stepsize subgradient method can behave erratically even under favorable assumptions on the objective function, which undercuts common expectations about its convergence in machine learning applications.
Contribution
It gives examples of Lipschitz path-differentiable functions on which the subgradient method behaves pathologically, including sequences that fail to converge and limit points that are not critical.
Findings
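Bounded subgradient sequences may fail to converge even when the objective function is Whitney stratifiable and satisfies the Kurdyka-Lojasiewicz inequality.
For path-differentiable objective functions, criticality of the limit points, convergence of the sequence, and convergence in values can all fail.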
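Further regularities can also fail: the accumulation set need not have codimension one or equal the essential accumulation set, the essential accumulation set need not be connected, and spontaneous slowdown, oscillation compensation, and perpendicularity of the oscillations to the accumulation set need not occur.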
Abstract
We show that the vanishing stepsize subgradient method -- widely adopted for machine learning applications -- can display rather messy behavior even in the presence of favorable assumptions. We establish that convergence of bounded subgradient sequences may fail even with a Whitney stratifiable objective function satisfying the Kurdyka-Lojasiewicz inequality. Moreover, when the objective function is path-differentiable we show that various properties all may fail to occur: criticality of the limit points, convergence of the sequence, convergence in values, codimension one of the accumulation set, equality of the accumulation and essential accumulation sets, connectedness of the essential accumulation set, spontaneous slowdown, oscillation compensation, and oscillation perpendicularity to the accumulation set.
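For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of a vanishing stepsize subgradient iteration, x_{k+1} = x_k - eps_k v_k with v_k a subgradient at x_k. The objective (the l1 norm), the stepsizes eps_k = 1/(k+1), the starting point, and all function names are assumptions chosen for illustration and do not come from the paper. This example is convex and well behaved, so the iterates do approach the minimizer; the paper's counterexamples are far more intricate and are not reproduced here.

```python
def sign(t):
    # Sign of a real number: -1, 0, or 1.
    return (t > 0) - (t < 0)


def l1_subgradient(x):
    # One valid subgradient of f(x) = sum_i |x_i|.
    # At a coordinate equal to 0 we pick 0, which lies in [-1, 1].
    return [sign(t) for t in x]


def subgradient_method(subgradient, x0, num_steps, stepsize):
    # Iterate x_{k+1} = x_k - stepsize(k) * v_k, with v_k = subgradient(x_k).
    x = list(x0)
    trajectory = [tuple(x)]
    for k in range(num_steps):
        v = subgradient(x)
        eps = stepsize(k)
        x = [xi - eps * vi for xi, vi in zip(x, v)]
        trajectory.append(tuple(x))
    return trajectory


# Assumed stepsizes: eps_k = 1 / (k + 1), which vanish but are not summable.
trajectory = subgradient_method(
    l1_subgradient, [1.3, -0.7], 2000, lambda k: 1.0 / (k + 1)
)
final_value = sum(abs(t) for t in trajectory[-1])
```

Even in this benign case the iterates do not settle monotonically: each coordinate hops back and forth across zero with an amplitude bounded by the current stepsize. The paper concerns what can go wrong with this kind of oscillation when the objective is nonconvex.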
