On the Rate of Convergence of GD in Non-linear Neural Networks: An Adversarial Robustness Perspective
Guy Smorodinsky, Sveta Gimpleson, Itay Safran

TL;DR
This paper analyzes the convergence rate of Gradient Descent in a simple non-linear neural network, revealing a slow logarithmic rate for robustness margin maximization, supported by theoretical proofs and empirical validation.
Contribution
It provides what is, to the best of the authors' knowledge, the first explicit lower bound on the convergence rate of the robustness margin in a non-linear neural network setting.
Findings
GD converges to the optimal robustness margin, but only at a rate of Θ(1/ln(t))
In simulations, the same tight rate appears across multiple natural network initializations
The analysis applies to a minimal two-neuron ReLU network with two training points
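To give a sense of how slow a Θ(1/ln(t)) rate is, here is a short worked calculation. The symbols are notation assumed for this illustration and do not come from the paper: γ* is the optimal margin, γ(t) the margin after t steps, c an unspecified positive constant, and ε a target gap. The Θ bound hides constants, so the equality below is an idealization.

```latex
\gamma^\ast - \gamma(t) = \frac{c}{\ln t}
\;\Longrightarrow\;
t(\varepsilon) = \exp\!\left(\frac{c}{\varepsilon}\right),
\qquad
t(\varepsilon/2) = \exp\!\left(\frac{2c}{\varepsilon}\right) = t(\varepsilon)^{2}.
```

Under this idealization, halving the remaining gap requires squaring the number of iterations, so the iteration count grows exponentially in 1/ε.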
Abstract
We study the convergence dynamics of Gradient Descent (GD) in a minimal binary classification setting, consisting of a two-neuron ReLU network and two training instances. We prove that even under these strong simplifying assumptions, while GD successfully converges to an optimal robustness margin, effectively maximizing the distance between the decision boundary and the training points, this convergence occurs at a prohibitively slow rate, scaling strictly as Θ(1/ln(t)). To the best of our knowledge, this establishes the first explicit lower bound on the convergence rate of the robustness margin in a non-linear model. Through empirical simulations, we further demonstrate that this inherent failure mode is pervasive, exhibiting the exact same tight convergence rate across multiple natural network initializations. Our theoretical guarantees are derived via a rigorous analysis of…
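The kind of experiment the abstract describes can be sketched roughly as follows. This is a minimal illustration and not the authors' setup: the two data points, the initialization, the logistic loss, the step size, and the network form f(x) = relu(w_pos·x) - relu(w_neg·x) are all assumptions chosen here. The tracked quantity is a parameter-normalized margin, used as a simple proxy; it is not necessarily the robustness margin the paper analyzes.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def normalized_margin(w_pos, w_neg, X, y):
    """Smallest y_i * f(x_i), divided by the parameter norm (a proxy margin)."""
    out = relu(X @ w_pos) - relu(X @ w_neg)
    norm = np.sqrt(w_pos @ w_pos + w_neg @ w_neg)
    return float(np.min(y * out) / norm)


def run_gd(steps=10_000, lr=0.1):
    # Hypothetical data: two unit-norm points with opposite labels.
    X = np.array([[1.0, 0.0], [0.6, 0.8]])
    y = np.array([1.0, -1.0])
    # Hypothetical small initialization, one weight vector per neuron.
    w_pos = np.array([0.1, 0.0])
    w_neg = np.array([0.0, 0.1])

    checkpoints = {1, 10, 100, 1_000, 10_000}
    history = []
    for t in range(1, steps + 1):
        pre_pos, pre_neg = X @ w_pos, X @ w_neg
        out = relu(pre_pos) - relu(pre_neg)
        # Derivative of mean log(1 + exp(-y * f)) with respect to f.
        g = -y / (1.0 + np.exp(y * out)) / len(y)
        # ReLU subgradient: only active units receive gradient.
        grad_pos = X.T @ (g * (pre_pos > 0))
        grad_neg = -(X.T @ (g * (pre_neg > 0)))
        w_pos = w_pos - lr * grad_pos
        w_neg = w_neg - lr * grad_neg
        if t in checkpoints:
            history.append((t, normalized_margin(w_pos, w_neg, X, y)))
    return history


if __name__ == "__main__":
    for t, margin in run_gd():
        print(f"t={t:>6d}  normalized margin={margin:.4f}")
```

Logging at powers of ten makes it easy to see how the margin changes per decade of iterations, which is the natural scale for a rate expressed in ln(t). Whether this toy configuration shows the same rate as the paper's setting is not something the sketch establishes.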
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
