Learning in High Dimension Always Amounts to Extrapolation

Randall Balestriero; Jerome Pesenti; Yann LeCun

arXiv:2110.09485 · cs.LG · November 2, 2021 · 26 cites

TL;DR

This paper argues that in high-dimensional datasets, true interpolation is almost impossible, challenging common beliefs about how algorithms generalize and perform in such settings.

Contribution

It provides empirical and theoretical evidence that interpolation rarely occurs in high-dimensional data, questioning existing notions of generalization.

Findings

01. Interpolation almost never occurs in datasets with more than 100 dimensions.

02. Current definitions of interpolation/extrapolation may not accurately reflect generalization.

03. Theoretical models show high-dimensional data behave differently from low-dimensional intuition.

Abstract

The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample $x$ whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when $x$ falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets; in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional ($>100$) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of…
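
The convex-hull definition above can be checked numerically. The following is a minimal sketch, not the authors' code: it tests hull membership with a Frank-Wolfe iteration (a swapped-in technique chosen here because it needs only NumPy; an exact linear-programming check would be the usual alternative). The function names `in_convex_hull` and `interpolation_fraction`, the Gaussian toy data, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np


def in_convex_hull(x, X, tol=1e-10, max_iter=1000):
    """Heuristic test of whether x lies in the convex hull of the rows of X.

    Runs Frank-Wolfe on min_y ||y - x||^2 over y in conv(X).
    Returns False only when a separating hyperplane is found, which is an
    exact certificate that x is outside the hull (extrapolation).
    Returns True when the iterate reaches x, or when no certificate is found
    within max_iter steps (boundary or slowly converging cases).
    """
    x = np.asarray(x, dtype=float)
    X = np.asarray(X, dtype=float)
    y = X[0].copy()  # start from a point that is trivially in the hull
    for _ in range(max_iter):
        g = y - x  # half the gradient of the squared distance at y
        if g @ g <= tol:
            return True  # y has (numerically) reached x
        scores = X @ g
        i = int(np.argmin(scores))  # hull vertex minimizing the linearization
        if scores[i] - x @ g > 0:
            # every data point satisfies <X_j - x, g> > 0: x is separated
            return False
        d = X[i] - y
        denom = d @ d
        if denom == 0.0:
            break
        # exact line search for the quadratic objective, clipped to [0, 1]
        gamma = np.clip(-(g @ d) / denom, 0.0, 1.0)
        y = y + gamma * d
    return True


def interpolation_fraction(dim, n_train=500, n_test=50, seed=0):
    """Fraction of fresh Gaussian samples that fall in the training hull."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_train, dim))
    tests = rng.standard_normal((n_test, dim))
    return float(np.mean([in_convex_hull(t, X) for t in tests]))


if __name__ == "__main__":
    for dim in (2, 10, 50):
        print(dim, interpolation_fraction(dim))
```

On this toy Gaussian setup, the fraction of test samples inside the training hull should fall sharply as `dim` grows with `n_train` held fixed. That is the qualitative behavior the abstract describes, although this sketch does not reproduce the paper's experiments.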

Videos

Generalization – Interpolation – Extrapolation in Machine Learning: Which is it now!? · youtube

Why LeCun Thinks Deep Learning Isn't Enough — Yann LeCun · youtube

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks