Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation

Mikhail Belkin

arXiv:2105.14368·stat.ML·June 1, 2021

TL;DR

This paper explores the mathematical phenomena of deep learning, focusing on interpolation and over-parameterization, to better understand generalization and optimization in neural networks.

Contribution

It synthesizes emerging mathematical insights on interpolation and over-parameterization, advancing the theoretical understanding of deep learning.

Findings

01

Interpolation means fitting the training data exactly, even when the data are noisy.

02

Over-parameterization makes interpolation possible and leaves freedom to choose among the many interpolating models.

03

Viewing deep learning through the prism of interpolation helps disentangle its generalization and optimization properties.

Abstract

In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation, and its sibling, over-parameterization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parameterization enables interpolation and provides flexibility to select a right interpolating model. As we will see, just as a physical prism separates colors mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern…
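The two themes of the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a plain linear model with more parameters than samples, synthetic Gaussian data, and NumPy's pseudoinverse to pick the minimum-norm interpolant. All names (`X`, `y`, `w_min`, `w_other`) and the noise level are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-parameterized setting (assumed for illustration): n samples, d > n parameters.
n, d = 20, 100
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)  # labels include noise

# Minimum-norm interpolant: the smallest-norm w satisfying X @ w = y.
X_pinv = np.linalg.pinv(X)
w_min = X_pinv @ y

# A second interpolant: add any vector from the null space of X.
v = rng.standard_normal(d)
v_null = v - X_pinv @ (X @ v)  # remove the row-space part of v
w_other = w_min + v_null

# Both models fit the noisy training labels exactly (up to round-off).
train_err_min = np.max(np.abs(X @ w_min - y))
train_err_other = np.max(np.abs(X @ w_other - y))

norm_min = np.linalg.norm(w_min)
norm_other = np.linalg.norm(w_other)

print(f"max train error (min-norm): {train_err_min:.2e}")
print(f"max train error (other):    {train_err_other:.2e}")
print(f"norms: {norm_min:.3f} < {norm_other:.3f}")
```

Because `d > n`, infinitely many weight vectors interpolate the same noisy data. Taking the minimum-norm one is just one possible selection rule, used here only to make "select a right interpolating model" concrete.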

Code & Models

Repositories

aradha5772/deep_learning_theory_tutorial (PyTorch)
