Stochastic Bregman Subgradient Methods for Nonsmooth Nonconvex Optimization Problems
Kuangyu Ding, Kim-Chuan Toh

TL;DR
This paper introduces stochastic Bregman subgradient methods for nonsmooth nonconvex optimization, with a convergence analysis, a variant that solves its subproblems inexactly, a single-timescale momentum extension, and applications to neural network training.
Contribution
The paper develops a general framework for stochastic Bregman subgradient methods that covers inexact subproblem solutions, momentum, and convergence guarantees for nonsmooth nonconvex problems.
Findings
The proposed methods train nonsmooth neural networks effectively
Convergence is established through a differential inclusion approach
Numerical experiments support the effectiveness of the approach
Abstract
This paper focuses on the problem of minimizing a locally Lipschitz continuous function. Motivated by the effectiveness of Bregman gradient methods in training nonsmooth deep neural networks and the recent progress in stochastic subgradient methods for nonsmooth nonconvex optimization problems \cite{bolte2021conservative,bolte2022subgradient,xiao2023adam}, we investigate the long-term behavior of stochastic Bregman subgradient methods in this setting, especially when the objective function lacks Clarke regularity. We begin by exploring a general framework for Bregman-type methods and establish their convergence through a differential inclusion approach. For practical applications, we develop a stochastic Bregman subgradient method that allows the subproblems to be solved inexactly. Furthermore, we demonstrate how a single timescale momentum can be integrated into the Bregman subgradient…
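To make the idea of a stochastic Bregman subgradient step concrete, here is a minimal sketch in Python. It is not the paper's algorithm or experimental setup. Everything specific in it is an assumption chosen for illustration: the kernel phi(x) = 0.5*||x||^2 + (SIGMA/4)*||x||^4, the toy objective f(x) = | ||x||^2 - 1 | (nonsmooth on the unit sphere and nonconvex), the Gaussian noise model, and the step sizes eta_k = eta0 / sqrt(k + 1). Each iteration uses the mirror-descent form of the Bregman step, grad_phi(x_next) = grad_phi(x) - eta * g, where g is a noisy subgradient. The inverse mirror map is computed by a few Newton iterations, a simple stand-in for solving the subproblem only approximately.

```python
import math
import random

SIGMA = 1.0  # weight of the quartic kernel term (illustrative assumption)


def sq_norm(v):
    """Squared Euclidean norm of a list of floats."""
    return sum(vi * vi for vi in v)


def grad_phi(x):
    """Mirror map for the assumed kernel phi(x) = 0.5*||x||^2 + (SIGMA/4)*||x||^4."""
    scale = 1.0 + SIGMA * sq_norm(x)
    return [scale * xi for xi in x]


def grad_phi_inverse(y, newton_steps=50, tol=1e-12):
    """Solve grad_phi(x) = y approximately.

    The solution has the form x = c*y, where c in (0, 1] is the unique
    root of a*c^3 + c - 1 = 0 with a = SIGMA*||y||^2. Newton's method
    started at c = 1 decreases monotonically to that root.
    """
    a = SIGMA * sq_norm(y)
    c = 1.0
    for _ in range(newton_steps):
        residual = a * c ** 3 + c - 1.0
        if abs(residual) < tol:
            break
        c -= residual / (3.0 * a * c * c + 1.0)
    return [c * yi for yi in y]


def objective(x):
    """Toy nonsmooth nonconvex objective f(x) = | ||x||^2 - 1 |."""
    return abs(sq_norm(x) - 1.0)


def noisy_subgradient(x, rng, noise_scale=0.1):
    """One element of the Clarke subdifferential of f plus Gaussian noise."""
    s = sq_norm(x) - 1.0
    sign = (s > 0) - (s < 0)  # equals 0 on the unit sphere, a valid choice
    return [2.0 * sign * xi + noise_scale * rng.gauss(0.0, 1.0) for xi in x]


def stochastic_bregman_subgradient(x0, steps=2000, eta0=0.5, seed=0):
    """Run the sketch with diminishing step sizes eta0 / sqrt(k + 1)."""
    rng = random.Random(seed)
    x = [float(xi) for xi in x0]
    for k in range(steps):
        eta = eta0 / math.sqrt(k + 1.0)
        g = noisy_subgradient(x, rng)
        # Step in the dual (mirror) space, then map back to the primal space.
        dual = [p - eta * gi for p, gi in zip(grad_phi(x), g)]
        x = grad_phi_inverse(dual)
    return x


x_final = stochastic_bregman_subgradient([2.0, 2.0])
print("final objective:", objective(x_final))
```

With SIGMA = 0 the kernel reduces to the squared Euclidean norm and the update becomes a plain stochastic subgradient step. The quartic term shortens steps when the iterate is far from the origin, which is one common motivation for non-Euclidean kernels.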
Taxonomy
Topics: Optimization and Variational Analysis · Stochastic Gradient Optimization Techniques
