Optimization-Induced Dynamics of Lipschitz Continuity in Neural Networks

Róisín Luo; James McDermott; Christian Gagné; Qiang Sun; Colm O'Riordan

arXiv:2506.18588 · cs.LG · November 17, 2025

TL;DR

This paper develops a mathematical framework using stochastic differential equations to analyze how Lipschitz continuity in neural networks evolves during training, influenced by optimization dynamics and stochastic noise.

Contribution

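It introduces an SDE-based model of how Lipschitz continuity evolves during SGD training and identifies three driving factors: the projections of the gradient flow and of the gradient noise onto the operator-norm Jacobian of the parameter matrices, and the projection of the gradient noise onto the operator-norm Hessian.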

Findings

01

The framework accurately predicts Lipschitz evolution during training.

02

The projection of the gradient flow onto the operator-norm Jacobian of the parameter matrices drives changes in Lipschitz continuity.

03

Gradient noise from mini-batch sampling affects Lipschitz continuity through its projection onto the operator-norm Hessian.

Abstract

Lipschitz continuity characterizes the worst-case sensitivity of neural networks to small input perturbations; yet its dynamics (i.e., temporal evolution) during training remain under-explored. We present a rigorous mathematical framework to model the temporal evolution of Lipschitz continuity during training with stochastic gradient descent (SGD). This framework leverages a system of stochastic differential equations (SDEs) to capture both deterministic and stochastic forces. Our theoretical analysis identifies three principal factors driving the evolution: (i) the projection of gradient flows, induced by the optimization dynamics, onto the operator-norm Jacobian of parameter matrices; (ii) the projection of gradient noise, arising from the randomness in mini-batch sampling, onto the operator-norm Jacobian; and (iii) the projection of the gradient noise onto the operator-norm Hessian…
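
As a rough illustration of factor (i), and not the paper's SDE system, the sketch below uses the standard fact that when the largest singular value of a matrix W is simple, the derivative of the operator norm with respect to W is the outer product of the top left and right singular vectors. Projecting a gradient step onto that matrix gives the first-order change in the operator norm. The names `W`, `G`, `lr`, `spectral_norm_and_jacobian`, and `predicted_drift` are hypothetical and chosen only for this example, and the matrices are toy values rather than anything from the paper.

```python
import numpy as np

def spectral_norm_and_jacobian(W):
    """Return the operator (spectral) norm of W and its derivative w.r.t. W.

    Assumes the top singular value is simple, in which case
    d sigma_1 / dW = u_1 v_1^T.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return S[0], np.outer(U[:, 0], Vt[0, :])

def predicted_drift(W, G, lr):
    """First-order change in the operator norm after the step W <- W - lr * G."""
    _, J = spectral_norm_and_jacobian(W)
    # Frobenius inner product of the Jacobian with the update -lr * G.
    return -lr * np.sum(J * G)

# Hypothetical 4x3 parameter matrix with well-separated singular values 3, 2, 1.
W = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
# Hypothetical loss gradient with respect to W.
G = np.arange(12, dtype=float).reshape(4, 3) / 10.0 - 0.5
lr = 1e-4  # hypothetical learning rate

sigma_before, J = spectral_norm_and_jacobian(W)
sigma_after, _ = spectral_norm_and_jacobian(W - lr * G)
actual = sigma_after - sigma_before
predicted = predicted_drift(W, G, lr)
print("actual change:   ", actual)
print("predicted change:", predicted)
```

For a small step the two printed values should agree up to second-order terms in `lr`. The noise and Hessian terms named in factors (ii) and (iii) are not modelled here.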
