Wide and Deep Neural Networks Achieve Optimality for Classification

Adityanarayanan Radhakrishnan; Mikhail Belkin; Caroline Uhler

arXiv:2204.14126·cs.LG·May 3, 2023

TL;DR

This paper demonstrates that certain wide and deep neural networks can be explicitly constructed to achieve optimal classification, connecting neural tangent kernels, activation functions, and classical classifiers.

Contribution

The authors identify explicit activation functions for infinitely wide and deep neural networks that achieve optimal classification, providing a taxonomy linking network architecture to classical classifiers.

Findings

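01. Depending on the activation function, infinitely wide and deep networks implement one of three classical classifiers: 1-nearest neighbor, majority vote, or kernel classifiers that achieve optimality.

02. The explicit activation functions that yield optimality are simple to implement, yet differ from common choices such as ReLU or sigmoid.

03. Depth benefits classification tasks, in contrast to regression tasks.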
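To make the three classical classifiers named in the findings concrete, here is a minimal NumPy sketch of each for binary labels in {-1, +1}. It is an illustration under stated assumptions, not the paper's construction: the function names, the toy data, and the exponent `alpha` are hypothetical, and the kernel classifier is written as a weighted vote with weights 1 / ||x - x_i||^alpha, one common form of a kernel that is singular at the training points.

```python
import numpy as np


def one_nearest_neighbor(X_train, y_train, x):
    """Return the label of the training point closest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return int(y_train[np.argmin(dists)])


def majority_vote(y_train):
    """Return the most common training label; the query point is ignored."""
    return 1 if np.sum(y_train) >= 0 else -1


def singular_kernel_classifier(X_train, y_train, x, alpha=2.0):
    """Sign of a vote weighted by 1 / ||x - x_i||^alpha (alpha is an assumption)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    if np.any(dists == 0):
        # The weight diverges at a training point, so that point's label wins.
        return int(y_train[np.argmin(dists)])
    weights = dists ** (-alpha)
    return 1 if np.dot(weights, y_train) >= 0 else -1


# Hypothetical toy data: three positive points near the origin, one negative point.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
y_train = np.array([1, 1, 1, -1])
x_query = np.array([1.9, 1.9])

pred_1nn = one_nearest_neighbor(X_train, y_train, x_query)
pred_majority = majority_vote(y_train)
pred_kernel_soft = singular_kernel_classifier(X_train, y_train, x_query, alpha=2.0)
pred_kernel_sharp = singular_kernel_classifier(X_train, y_train, x_query, alpha=8.0)

print(pred_1nn, pred_majority, pred_kernel_soft, pred_kernel_sharp)
```

On this toy data the query is closest to the single negative point, so 1-nearest neighbor predicts -1 while majority vote predicts +1. The weighted vote sides with the majority at `alpha=2.0` and with the nearest neighbor at `alpha=8.0`, which shows how the kernel classifier sits between the two extremes in this example.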

Abstract

While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are optimal for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that achieve optimality. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and Neural Tangent Kernels, we provide explicit activation functions that can be used to construct networks that achieve optimality. Interestingly, these activation functions are simple and easy to implement, yet differ from…

Taxonomy

Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification