Fitting Elephants

Partha P Mitra

arXiv:2104.00526·cs.LG·April 2, 2021

TL;DR

This paper explores how modern machine learning models, especially deep neural networks, can successfully interpolate noisy data and generalize well, challenging traditional wisdom and connecting empirical and theoretical modeling approaches.

Contribution

The paper introduces the concept of Statistically Consistent Interpolation (SCI) and the wiNN algorithm, demonstrating that data interpolation is a viable ML strategy for large datasets.

Findings

01. SCI clarifies the relation between physical and empirical modeling.

02. The wiNN algorithm interpolates noisy data while still generalizing well.

03. Modern ML differs from physical theories and animal brains in its epistemological approach.

Abstract

Textbook wisdom advocates for smooth function fits and implies that interpolation of noisy data should lead to poor generalization. A related heuristic is that fitting parameters should be fewer than measurements (Occam's Razor). Surprisingly, contemporary machine learning (ML) approaches, e.g. deep nets (DNNs), generalize well despite interpolating noisy data. This may be understood via Statistically Consistent Interpolation (SCI), i.e. data interpolation techniques that generalize optimally for big data. In this article we elucidate SCI using the weighted interpolating nearest neighbors (wiNN) algorithm, which adds singular weight functions to kNN (k-nearest neighbors). This shows that data interpolation can be a valid ML strategy for big data. SCI clarifies the relation between two ways of modeling natural phenomena: the rationalist approach (strong priors) of theoretical physics with…
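To make the idea of adding singular weight functions to kNN concrete, here is a minimal sketch in Python. It is not taken from the paper: the inverse-power weight `distance ** (-delta)`, the function name `winn_predict`, and the parameters `k` and `delta` are illustrative assumptions. The point it shows is that a weight that diverges at zero distance forces the prediction to reproduce each training label exactly (interpolation), while away from the training points the prediction is still a local weighted average.

```python
import numpy as np

def winn_predict(X_train, y_train, X_query, k=5, delta=1.0):
    """Sketch of a weighted interpolating nearest-neighbor regressor.

    Assumption (not from the source): weights are distance ** (-delta),
    which is singular at distance zero.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)

    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)

    # Indices and distances of the k nearest training points, ascending.
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)
    nn_y = y_train[nn_idx]

    preds = np.empty(len(X_query))
    for i in range(len(X_query)):
        exact = nn_dist[i] == 0.0
        if exact.any():
            # The singular weight dominates: return the coinciding label(s).
            preds[i] = nn_y[i][exact].mean()
        else:
            # Rescale by the smallest distance so the weights stay finite;
            # the common factor cancels in the normalized average.
            w = (nn_dist[i][0] / nn_dist[i]) ** delta
            preds[i] = np.sum(w * nn_y[i]) / np.sum(w)
    return preds

# Usage: noisy labels are reproduced exactly at the training points.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0]) + 0.3 * rng.normal(size=200)
train_fit = winn_predict(X, y, X, k=10)
print(np.max(np.abs(train_fit - y)))
```

With a bounded weight function (ordinary kNN averaging), the fit at a training point would mix in neighboring labels and would not interpolate; the singularity is what changes that behavior in this sketch.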

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Neural Networks and Applications · Cell Image Analysis Techniques