Subgradient sampling for nonsmooth nonconvex minimization

Jérôme Bolte (TSE-R); Tam Le (TSE-R); Edouard Pauwels (IRIT-ADRIA)

arXiv:2202.13744 · math.OC · July 24, 2024 · SIAM J. Optim.

Open Access

TL;DR

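This paper proves convergence of first-order subgradient sampling (stochastic subgradient descent, by an abuse of terminology) for nonsmooth nonconvex risk minimization in the path-differentiable case. In the definable case, it shows that the method avoids artificial critical points with probability one and covers a large range of deep learning risk minimization problems based on the backpropagation oracle.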

Contribution

The convergence analysis relies on conservative calculus and the ODE method. It recovers and improves the results of Ermoliev and Norkin (1998) and extends them to deep learning risk minimization with the backpropagation oracle.

Findings

01 Convergence of subgradient sampling is established in the path-differentiable case.

02 In the definable case, subgradient sampling avoids artificial critical points with probability one.

03 The results apply to a large range of deep learning risk minimization problems based on the backpropagation oracle.

Abstract

Risk minimization for nonsmooth nonconvex problems naturally leads to first-order sampling or, by an abuse of terminology, to stochastic subgradient descent. We establish the convergence of this method in the path-differentiable case and describe more precise results under additional geometric assumptions. We recover and improve results from Ermoliev and Norkin [Cybern. Syst. Anal., 34 (1998), pp. 196–215] by using a different approach: conservative calculus and the ODE method. In the definable case, we show that first-order subgradient sampling avoids artificial critical points with probability one and applies moreover to a large range of risk minimization problems in deep learning, based on the backpropagation oracle. As byproducts of our approach, we obtain several results on integration of independent interest, such as an interchange result for conservative derivatives and…
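
To make the method named in the abstract concrete, here is a minimal sketch of first-order subgradient sampling on a toy risk. Everything in it is an illustrative assumption rather than the paper's setup: the one-parameter model relu(w*x), the absolute loss, the synthetic data, the step sizes c/(k+1)^0.75, and the conventions relu'(0) = 0 and sign(0) = 0 standing in for a backpropagation oracle. The function names are hypothetical.

```python
import random


def relu(t):
    return t if t > 0.0 else 0.0


def backprop_grad(w, x, y):
    """Chain-rule output for d/dw |relu(w*x) - y|.

    Uses the assumed conventions relu'(0) = 0 and sign(0) = 0, so the
    returned value is one element of a generalized derivative, not
    necessarily a classical gradient.
    """
    pre = w * x
    residual = relu(pre) - y
    sign = (residual > 0) - (residual < 0)  # -1, 0 or 1
    drelu = 1.0 if pre > 0.0 else 0.0
    return sign * drelu * x


def subgradient_sampling(data, w0, n_steps, rng, c=0.5, power=0.75):
    """Iterate w <- w - alpha_k * v_k with one sampled data point per step."""
    w = w0
    for k in range(n_steps):
        x, y = data[rng.randrange(len(data))]  # sample one term of the risk
        alpha = c / (k + 1) ** power           # vanishing, non-summable steps
        w -= alpha * backprop_grad(w, x, y)
    return w


# Synthetic data generated by the target parameter w = 2 (an assumption).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
data = [(x, relu(2.0 * x)) for x in xs]

w_final = subgradient_sampling(data, w0=0.5, n_steps=5000, rng=rng)
print(w_final)
```

The step sizes are chosen so that their sum diverges while the sum of their squares stays finite, a common choice for analyses based on the ODE method. On this toy problem the iterate should settle near w = 2.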

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization · Optimization and Variational Analysis