Wide and Deep Neural Networks Achieve Optimality for Classification
Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

TL;DR
This paper demonstrates that certain wide and deep neural networks can be explicitly constructed to achieve optimal classification, connecting neural tangent kernels, activation functions, and classical classifiers.
Contribution
The authors identify explicit activation functions for infinitely wide and deep neural networks that achieve optimal classification, and provide a taxonomy linking the choice of activation function to the classical classifier the network implements.
Findings
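Depending on the activation function, infinitely wide and deep networks implement one of three classifiers: 1-nearest neighbor, majority vote, or singular kernel classifiers (the set that contains the classifiers achieving optimality).
The activation functions that yield optimality are simple to implement but differ from common ones such as ReLU or sigmoid.
Depth benefits classification, in contrast to regression, where excessive depth is harmful.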
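To make the three classical classifiers named in the findings concrete, here is a minimal NumPy sketch. It is an illustration, not the authors' construction: the function names, the toy data, and the exponent `alpha` are all hypothetical, and the third function assumes a kernel classifier of the form sign(sum_i y_i / ||x - x_i||^alpha) with labels in {-1, +1}, whose weights blow up as the query approaches a training point.

```python
import numpy as np

def one_nearest_neighbor(X_train, y_train, x):
    """Return the label of the training point closest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

def majority_vote(y_train):
    """Return the most frequent training label; the query is ignored."""
    labels, counts = np.unique(y_train, return_counts=True)
    return labels[np.argmax(counts)]

def singular_kernel_classifier(X_train, y_train, x, alpha):
    """Sign of a vote weighted by ||x - x_i||^(-alpha), labels in {-1, +1}.

    The kernel is singular at distance zero, so an exact match with a
    training point returns that point's label. A tie in the weighted
    vote returns 0.
    """
    dists = np.linalg.norm(X_train - x, axis=1)
    if np.any(dists == 0):
        return int(y_train[np.argmin(dists)])
    weights = dists ** (-alpha)
    return int(np.sign(np.sum(weights * y_train)))

# Hypothetical toy data: three +1 points near the origin, one -1 point far away.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
y_train = np.array([1, 1, 1, -1])
query = np.array([2.5, 2.5])

print(one_nearest_neighbor(X_train, y_train, query))
print(majority_vote(y_train))
print(singular_kernel_classifier(X_train, y_train, query, alpha=2.0))
print(singular_kernel_classifier(X_train, y_train, query, alpha=0.1))
```

On this toy data, a large `alpha` lets the nearest point dominate the vote, while a small `alpha` weights all points almost equally, so the prediction moves toward the majority label.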
Abstract
While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are optimal for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that achieve optimality. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and Neural Tangent Kernels, we provide explicit activation functions that can be used to construct networks that achieve optimality. Interestingly, these activation functions are simple and easy to implement, yet differ from…
Taxonomy
Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
