Why Smooth Stability Assumptions Fail for ReLU Learning
Ronald Katende

TL;DR
This paper demonstrates that classical smoothness-based stability analyses fail for ReLU neural networks, providing counterexamples and proposing a generalized derivative condition to better understand stability in nonsmooth settings.
Contribution
It identifies the fundamental limitations of smoothness assumptions in ReLU networks and introduces a minimal generalized derivative condition to restore meaningful stability analysis.
Findings
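Classical smoothness-based stability bounds do not hold globally for ReLU networks.
A concrete counterexample shows that gradient Lipschitzness fails even in simple settings where training appears empirically stable.
A minimal generalized derivative condition restores meaningful stability statements.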
Abstract
Stability analyses of modern learning systems are frequently derived under smoothness assumptions that are violated by ReLU-type nonlinearities. In this note, we isolate a minimal obstruction by showing that no uniform smoothness-based stability proxy such as gradient Lipschitzness or Hessian control can hold globally for ReLU networks, even in simple settings where training trajectories appear empirically stable. We give a concrete counterexample demonstrating the failure of classical stability bounds and identify a minimal generalized derivative condition under which stability statements can be meaningfully restored. The result clarifies why smooth approximations of ReLU can be misleading and motivates nonsmooth-aware stability frameworks.
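The obstruction described in the abstract can be illustrated numerically. The sketch below is an assumption-laden toy, not the paper's counterexample: it uses a single scalar ReLU, picks the subgradient 0 at the kink, and compares gradient difference quotients across the kink with the curvature of a softplus surrogate log(1 + exp(beta*w)) / beta. The function names and the sharpness parameter beta are hypothetical and chosen only for illustration.

```python
import math


def relu_grad(w):
    """One subgradient selection for ReLU: 1 for w > 0, else 0 (assumed convention)."""
    return 1.0 if w > 0 else 0.0


def grad_difference_quotient(eps):
    """|relu'(eps) - relu'(-eps)| / (2 * eps).

    Any global gradient Lipschitz constant would have to be at least this large.
    """
    return abs(relu_grad(eps) - relu_grad(-eps)) / (2.0 * eps)


def softplus_curvature_at_zero(beta):
    """Second derivative of softplus_beta(w) = log(1 + exp(beta * w)) / beta at w = 0.

    The first derivative is sigmoid(beta * w), so the second derivative is
    beta * s * (1 - s) with s = sigmoid(0) = 0.5, which equals beta / 4.
    """
    s = 1.0 / (1.0 + math.exp(0.0))
    return beta * s * (1.0 - s)


if __name__ == "__main__":
    # The quotient grows like 1 / (2 * eps): no finite constant bounds it.
    for eps in (1e-1, 1e-3, 1e-6):
        print(f"eps={eps:g}  quotient={grad_difference_quotient(eps):g}")

    # The smooth surrogate is gradient Lipschitz, but its constant grows
    # linearly in beta as the surrogate approaches ReLU.
    for beta in (1.0, 100.0, 10000.0):
        print(f"beta={beta:g}  curvature={softplus_curvature_at_zero(beta):g}")
```

Under these assumptions, the difference quotient is unbounded as eps shrinks, and the surrogate's smoothness constant beta / 4 diverges as the approximation tightens. This is one way to see why a bound proved for a smooth surrogate need not carry over to ReLU itself.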
Taxonomy
Topics: Neural Networks and Reservoir Computing · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks
