Can neural networks extrapolate? Discussion of a theorem by Pedro Domingos

Adrien Courtois; Jean-Michel Morel; Pablo Arias

arXiv:2211.03566·cs.CV·November 8, 2022

Open Access

TL;DR

This paper discusses a theorem by Pedro Domingos that suggests neural networks trained with gradient descent are essentially kernel machines, limiting their extrapolation abilities, especially as task complexity increases.

Contribution

The paper extends Domingos' theorem to the discrete case and to networks with vector-valued outputs, then analyzes what the result implies for the interpolation capabilities of neural networks.
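To make the kernel-machine claim concrete, here is a minimal sketch of the simplest discrete case. It is not taken from the paper: the linear model, squared loss, random toy data, and all names (`w0`, `a`, `predict_kernel`, and so on) are assumptions chosen for illustration. For a linear model the tangent kernel is constant, K(x, x_i) = x · x_i, so after discrete gradient descent the prediction equals the initial prediction plus a weighted sum of kernel evaluations at the training points. For nonlinear networks the kernel changes along the training path, which is why the general statement is only approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression data, for illustration only.
n, d = 8, 3
X = rng.normal(size=(n, d))      # training inputs x_i
y_true = rng.normal(size=n)      # training targets y*_i

w0 = rng.normal(size=d)          # initial weights
lr, steps = 0.01, 200

w = w0.copy()
a = np.zeros(n)                  # one kernel coefficient per training point

for _ in range(steps):
    # dL/dy_i for the squared loss L = 0.5 * (y_i - y*_i)^2
    residual = X @ w - y_true
    # One discrete gradient-descent step on the weights.
    w -= lr * X.T @ residual
    # Accumulate -lr * dL/dy_i along the training path.
    a -= lr * residual

def predict_model(x):
    """Prediction of the trained linear model f(x) = w . x."""
    return x @ w

def predict_kernel(x):
    """Same prediction written as a kernel machine.

    The tangent kernel of a linear model is
    K(x, x_i) = <grad_w f(x), grad_w f(x_i)> = x . x_i,
    and the bias term is the prediction of the initial model.
    """
    return x @ w0 + (X @ x) @ a

x_query = rng.normal(size=d)
print(predict_model(x_query), predict_kernel(x_query))
```

The two printed values agree up to floating-point error, because each gradient step adds a multiple of the training inputs to the weights, so the trained weights are `w0 + X.T @ a`.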

Findings

01. Kernel interpretation explains neural network predictions in simple cases.

02. Network extrapolation is limited by the kernel nature as task complexity grows.

03. The theorem's relevance is demonstrated on shape recovery from boundary data.

Abstract

Neural networks trained on large datasets by minimizing a loss have become the state-of-the-art approach for resolving data science problems, particularly in computer vision, image processing and natural language processing. In spite of their striking results, our theoretical understanding of how neural networks operate is limited. In particular, what are the interpolation capabilities of trained neural networks? In this paper we discuss a theorem of Domingos stating that "every machine learned by continuous gradient descent is approximately a kernel machine". According to Domingos, this fact leads to the conclusion that all machines trained on data are mere kernel machines. We first extend Domingos' result to the discrete case and to networks with vector-valued output. We then study its relevance and significance on simple examples. We find that in simple cases, the "neural tangent…

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference