Super-fast Rates of Convergence for Neural Network Classifiers under the Hard Margin Condition
Nathanael Tepakbong, Xiang Zhou, Ding-Xuan Zhou

TL;DR
This paper proves that deep neural network classifiers can achieve near-optimal fast convergence rates under low-noise and hard-margin conditions, with rates depending on the smoothness of the regression function.
Contribution
It establishes new excess risk bounds for DNN classifiers under Tsybakov's low-noise and hard-margin conditions, including a novel risk decomposition technique.
Findings
Achieves excess risk bounds of order n^{-eta} with eta close to 1 under low-noise conditions.
Attains arbitrarily fast rates under the hard-margin condition for certain activation functions.
Provides minimax lower bounds showing the optimality of these rates for q ≥ 2.
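For reference, the two noise conditions named in the findings above are commonly stated as follows. This is the standard textbook formulation, written for labels Y in {-1, +1} and regression function f^*(x) = E[Y | X = x]. The symbols f^*, C, t, and \delta are notation assumed here for illustration, and the paper's own normalization and constants may differ.

```latex
% Tsybakov low-noise condition with exponent q > 0 and some constant C > 0:
\mathbb{P}\bigl( |f^{*}(X)| \le t \bigr) \le C\, t^{q} \qquad \text{for all } t > 0 .

% Hard-margin condition (the formal limit q \to \infty):
|f^{*}(X)| \ge \delta \quad \text{almost surely, for some } \delta > 0 .
```

Larger q means less probability mass near the decision boundary f^* = 0; under the hard margin there is none within distance \delta.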
Abstract
We study the classical binary classification problem for hypothesis spaces of Deep Neural Networks (DNNs) under Tsybakov's low-noise condition with exponent $q$, as well as its limit case $q \to \infty$, which we refer to as the \emph{hard margin condition}. We demonstrate that, for a wide range of commonly used activation functions (including but not limited to ReLU, LeakyReLU, ELU, CELU, SELU, Softplus, GELU, SiLU, Swish, Mish, and Softmax), DNN solutions to the empirical risk minimization (ERM) problem with square loss surrogate and penalty on the weights can achieve excess risk bounds of order $n^{-\eta}$ for $\eta$ close to $1$ under the low-noise condition, and for arbitrarily large $\eta$ under the hard-margin condition, provided that the Bayes regression function satisfies a \emph{distribution-adapted smoothness}…
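The estimator described in the abstract (empirical risk minimization with a square loss surrogate and a weight penalty, followed by classification with the sign of the fitted function) can be sketched on toy data. The following is only an illustration under stated assumptions, not the authors' setup: the one-hidden-layer ReLU architecture, the squared l2 penalty, the plain gradient-descent optimizer, the toy distribution, and all hyperparameter values (`width`, `lam`, `lr`, `steps`) are hypothetical choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy distribution (assumption): labels are the deterministic sign of x, so
# |E[Y | X]| = 1 and a hard margin holds. Inputs also avoid the band |x| < 0.2,
# which keeps the +/-1 target continuous on the support.
def sample(m):
    x = rng.uniform(0.2, 1.0, size=m) * rng.choice([-1.0, 1.0], size=m)
    return x[:, None], np.sign(x)

n = 200
X, y = sample(n)

# One-hidden-layer ReLU network f(x) = relu(x W1 + b1) W2 + b2 (assumption).
width = 16
W1 = rng.normal(0.0, 1.0, size=(1, width))
b1 = np.zeros(width)
W2 = rng.normal(0.0, 0.1, size=(width, 1))
b2 = np.zeros(1)

lam, lr, steps = 1e-4, 0.02, 3000  # hypothetical hyperparameters

def forward(inputs):
    pre = inputs @ W1 + b1
    hidden = np.maximum(pre, 0.0)
    out = (hidden @ W2 + b2).ravel()
    return pre, hidden, out

# Full-batch gradient descent on
#   mean((f(x_i) - y_i)^2) + lam * (||W1||^2 + ||W2||^2).
for _ in range(steps):
    pre, hidden, out = forward(X)
    g_out = (2.0 / n) * (out - y)[:, None]   # d(loss)/d(output), shape (n, 1)
    g_pre = (g_out @ W2.T) * (pre > 0)       # backprop through the ReLU
    gW2 = hidden.T @ g_out + 2.0 * lam * W2
    gb2 = g_out.sum(axis=0)
    gW1 = X.T @ g_pre + 2.0 * lam * W1
    gb1 = g_pre.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

# Plug-in classifier: predict the sign of the fitted regression function.
train_acc = float(np.mean(np.sign(forward(X)[2]) == y))

# The Bayes risk is zero on this toy distribution, so the misclassification
# rate on a fresh sample is an estimate of the excess risk.
X_new, y_new = sample(2000)
test_err = float(np.mean(np.sign(forward(X_new)[2]) != y_new))
print("train accuracy:", train_acc, "estimated excess risk:", test_err)
```

The square loss is only a surrogate here: the network is fitted as a regressor toward the labels, and the classification decision comes from thresholding its output at zero.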
