On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang

TL;DR
This paper introduces an efficient exact algorithm, with a GPU implementation, for computing the Convolutional Neural Tangent Kernel (CNTK). This makes it practical to classify with the kernel of an infinite-width CNN, and the resulting accuracy on CIFAR-10 is close to that of finite deep nets.
Contribution
It provides the first efficient exact computation method for CNTK and demonstrates its high performance as a kernel method, bridging deep learning and kernel theory.
Findings
On CIFAR-10, CNTK scores 10% higher than the kernel methods reported in [Novak et al., 2019].
CNTK is only 6% below the corresponding finite deep CNN architecture once batch normalization and similar techniques are turned off.
The paper gives the first non-asymptotic proof that a fully trained, sufficiently wide net is equivalent to the kernel regression predictor using the NTK.
Abstract
How well does a classic deep net architecture like AlexNet or VGG19 classify on a standard dataset such as CIFAR-10 when its width --- namely, number of channels in convolutional layers, and number of nodes in fully-connected internal layers --- is allowed to increase to infinity? Such questions have come to the forefront in the quest to theoretically understand deep learning and its mysteries about optimization and generalization. They also connect deep learning to notions such as Gaussian processes and kernels. A recent paper [Jacot et al., 2018] introduced the Neural Tangent Kernel (NTK) which captures the behavior of fully-connected deep nets in the infinite width limit trained by gradient descent; this object was implicit in some other recent papers. An attraction of such ideas is that a pure kernel-based method is used to capture the power of a fully-trained deep net of infinite…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications
Methods: Neural Tangent Kernel · 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax
