Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, Jerome Pesenti, Yann LeCun

TL;DR
This paper argues that in high-dimensional datasets, true interpolation is almost impossible, challenging common beliefs about how algorithms generalize and perform in such settings.
Contribution
It provides empirical and theoretical evidence that interpolation rarely occurs in high-dimensional data, questioning existing notions of generalization.
Findings
Interpolation almost never occurs in datasets with more than 100 dimensions.
Current definitions of interpolation/extrapolation may not accurately reflect generalization.
Theoretical models show high-dimensional data behave differently from low-dimensional intuition.
Abstract
The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample x whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when x falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets; in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional (>100) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of…
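The convex-hull definition in the abstract can be checked numerically. The sketch below is an illustration, not the authors' code: it decides whether a sample x lies in the convex hull of a set of points using Frank-Wolfe (Gilbert-style) iterations, a technique chosen here only because it needs nothing beyond the standard library. The names `in_convex_hull`, `fraction_inside`, `tol`, and `max_iter` are hypothetical. An "outside" answer comes from a separating hyperplane; an "inside" answer is approximate (distance to the hull below `tol`), and a sample still undecided after `max_iter` steps is reported as outside.

```python
import random


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def in_convex_hull(x, points, tol=1e-6, max_iter=10_000):
    """Approximate test of whether x lies in the convex hull of `points`.

    Assumption: this uses Frank-Wolfe iterations toward the hull point
    nearest to x; it is a sketch, not an exact linear-programming test.
    """
    y = list(points[0])  # current hull point, starts at a vertex
    for _ in range(max_iter):
        d = [xi - yi for xi, yi in zip(x, y)]  # direction from y to x
        if dot(d, d) <= tol * tol:
            return True  # y is within tol of x: treat x as interpolated
        # Vertex that reaches furthest along direction d.
        s = max(points, key=lambda p: dot(p, d))
        if dot(s, d) < dot(x, d):
            # Every hull point satisfies <p, d> <= <s, d> < <x, d>,
            # so a hyperplane separates x from the hull: extrapolation.
            return False
        # Exact line search on the segment from y to s, clipped at s.
        sy = [si - yi for si, yi in zip(s, y)]
        t = min(1.0, dot(d, sy) / dot(sy, sy))
        y = [yi + t * syi for yi, syi in zip(y, sy)]
    return False  # undecided within max_iter: reported as outside


def fraction_inside(dim, n_train, n_test, seed=0):
    """Share of fresh Gaussian samples falling in the hull of n_train
    Gaussian training samples (a hypothetical experiment setup)."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    train = [draw() for _ in range(n_train)]
    hits = sum(in_convex_hull(draw(), train) for _ in range(n_test))
    return hits / n_test


if __name__ == "__main__":
    square = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(in_convex_hull([0.5, 0.5], square))  # center of the square
    print(in_convex_hull([2.0, 2.0], square))  # beyond the corner
    for dim in (2, 8, 32):
        print(dim, fraction_inside(dim, n_train=200, n_test=50))
```

Holding `n_train` fixed while increasing `dim` in `fraction_inside` gives a rough way to probe the abstract's claim on synthetic Gaussian data; the exact fractions depend on the seed and sample sizes, so none are quoted here.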
Videos
Generalization – Interpolation – Extrapolation in Machine Learning: Which is it now!? (YouTube)
Why LeCun Thinks Deep Learning Isn't Enough, Yann LeCun (YouTube)
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks
