Non-smooth stochastic gradient descent using smoothing functions

Tommaso Giovannelli; Jingfu Tan; Luis Nunes Vicente

arXiv:2507.10901·math.OC·May 15, 2026

TL;DR

This paper introduces a smoothing stochastic gradient method for non-smooth compositional optimization, providing convergence guarantees and rates across convex, strongly convex, and non-convex settings.

Contribution

It proposes a novel smoothing-based stochastic gradient approach with proven convergence rates for non-smooth compositional problems in machine learning.

Findings

1. Achieves a 1/T^(1/4) convergence rate for convex objectives.

2. Provides convergence guarantees in strongly convex and non-convex settings.

3. Preliminary results suggest competitiveness of the method on certain problems.
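
As a rough reading of the first finding, suppose the rate is a bound of the form C/T^(1/4) on some optimality measure, where T is the budget the rate is stated in. The constant C and the target accuracy ε are illustrative symbols, not quantities given in this summary. Under that assumption, the budget needed to reach accuracy ε follows directly:

```latex
\frac{C}{T^{1/4}} \le \varepsilon
\quad\Longleftrightarrow\quad
T \ge \left(\frac{C}{\varepsilon}\right)^{4}
```

Under the same assumption, halving the target error multiplies the required T by 2^4 = 16.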

Abstract

In this paper, we address stochastic optimization problems involving a composition of a non-smooth outer function and a smooth inner function, a formulation frequently encountered in machine learning and operations research. To deal with the non-differentiability of the outer function, we approximate the original non-smooth function using smoothing functions, which are continuously differentiable and approach the original function as a smoothing parameter goes to zero (at the price of increasingly higher Lipschitz constants). The proposed smoothing stochastic gradient method iteratively drives the smoothing parameter to zero at a designated rate. We establish convergence guarantees under strongly convex, convex, and non-convex settings, proving convergence rates that match known results for non-smooth stochastic compositional optimization. In particular, for convex objectives, smoothing…
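
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the outer function |t|, the surrogate sqrt(t^2 + mu^2), the linear inner function, the schedules for the step size and smoothing parameter, and all names below are assumptions chosen for illustration. The surrogate is continuously differentiable, stays within mu of |t|, and has a derivative with Lipschitz constant 1/mu, so shrinking mu tightens the approximation while raising the Lipschitz constant.

```python
import numpy as np


def smoothed_abs(t, mu):
    """Smooth surrogate for |t|; satisfies 0 <= value - |t| <= mu."""
    return np.sqrt(t * t + mu * mu)


def smoothed_abs_grad(t, mu):
    """Derivative of the surrogate in t; its Lipschitz constant is 1/mu."""
    return t / np.sqrt(t * t + mu * mu)


def smoothing_sgd(A, b, steps=5000, alpha0=0.5, mu0=1.0, seed=0):
    """Run SGD on a smoothed version of mean_i |a_i^T x - b_i|.

    The schedules below are illustrative assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for k in range(1, steps + 1):
        i = rng.integers(n)            # sample one data point
        mu_k = mu0 / k ** 0.25         # smoothing parameter driven to zero
        alpha_k = alpha0 / k ** 0.5    # diminishing step size
        r = A[i] @ x - b[i]            # smooth inner function
        g = smoothed_abs_grad(r, mu_k) * A[i]  # chain rule through the surrogate
        x = x - alpha_k * g
    return x


# Synthetic, noiseless test problem (hypothetical data).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true

x_hat = smoothing_sgd(A, b)
err = np.linalg.norm(x_hat - x_true)
print("distance to x_true:", err)
```

In this sketch the smoothing parameter decays more slowly than the step size, so the product of step size and surrogate Lipschitz constant still shrinks over time. That choice is a design assumption made here for stability, not a schedule taken from the paper.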
