Why Smooth Stability Assumptions Fail for ReLU Learning

Ronald Katende

arXiv:2512.22055 · cs.LG · December 29, 2025

TL;DR

This paper shows that classical smoothness-based stability proxies, such as gradient Lipschitzness or Hessian control, cannot hold globally for ReLU networks. It gives a concrete counterexample and identifies a minimal generalized derivative condition under which stability statements can be meaningfully restored.

Contribution

The paper isolates a minimal obstruction to smoothness-based stability analysis of ReLU networks and identifies a minimal generalized derivative condition that restores meaningful stability statements.

Findings

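01. No uniform smoothness-based stability proxy (gradient Lipschitzness, Hessian control) holds globally for ReLU networks, so classical stability bounds fail.

02. A concrete counterexample shows this failure even in simple settings where training trajectories appear empirically stable.

03. A minimal generalized derivative condition restores meaningful stability statements.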

Abstract

Stability analyses of modern learning systems are frequently derived under smoothness assumptions that are violated by ReLU-type nonlinearities. In this note, we isolate a minimal obstruction by showing that no uniform smoothness-based stability proxy such as gradient Lipschitzness or Hessian control can hold globally for ReLU networks, even in simple settings where training trajectories appear empirically stable. We give a concrete counterexample demonstrating the failure of classical stability bounds and identify a minimal generalized derivative condition under which stability statements can be meaningfully restored. The result clarifies why smooth approximations of ReLU can be misleading and motivates nonsmooth-aware stability frameworks.
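The failure of gradient Lipschitzness can be seen numerically in a toy case. The sketch below is an illustration chosen for this page, not the counterexample from the paper: it assumes a hypothetical one-parameter model f(w) = relu(w * x) with squared loss, and the names `loss`, `grad`, and `lipschitz_ratio` are made up for the example. It evaluates the gradient on either side of the kink at w = 0 and reports the ratio |grad(eps) - grad(-eps)| / (2 * eps), which is a lower bound on any global gradient-Lipschitz constant.

```python
# Illustrative sketch only: a one-parameter ReLU model assumed for this
# example, not the construction used in the paper.

def relu(z):
    return max(z, 0.0)

def loss(w, x=1.0, y=1.0):
    """Squared loss of the scalar model f(w) = relu(w * x) against target y."""
    return 0.5 * (relu(w * x) - y) ** 2

def grad(w, x=1.0, y=1.0):
    """Derivative of loss in w; at the kink (w * x == 0) we pick the value 0."""
    if w * x > 0:
        return (w * x - y) * x
    return 0.0

def lipschitz_ratio(eps, x=1.0, y=1.0):
    """Gradient difference across the kink divided by the distance 2 * eps.

    Any constant L with |grad(a) - grad(b)| <= L * |a - b| must be at least
    this large, so an unbounded ratio rules out a global L.
    """
    return abs(grad(eps, x, y) - grad(-eps, x, y)) / (2.0 * eps)

# Shrink the gap around the kink and watch the required constant grow.
exponents = range(1, 7)
ratios = [lipschitz_ratio(10.0 ** -k) for k in exponents]
for k, r in zip(exponents, ratios):
    print(f"eps=1e-{k}: ratio={r:.3e}")
```

In this toy case the gradient jumps from 0 to roughly -1 across w = 0, so the ratio scales like 1 / (2 * eps) and no finite constant works on any neighborhood of the kink. A smooth surrogate for ReLU would make the ratio finite, but only by introducing a constant that grows as the surrogate sharpens, which is one way to read the abstract's warning about smooth approximations.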

Taxonomy

Topics: Neural Networks and Reservoir Computing · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks