
TL;DR
This paper examines how modern machine learning models, especially deep neural networks, can interpolate noisy data and still generalize well, contrary to the textbook view that interpolating noise harms generalization, and relates this to the contrast between empirical and theory-driven modeling.
Contribution
It introduces the concept of Statistically Consistent Interpolation (SCI) and the wiNN algorithm, demonstrating data interpolation as a viable ML strategy for large datasets.
Findings
SCI clarifies the relation between physical and empirical modeling.
The wiNN algorithm interpolates noisy data while still generalizing well.
Modern ML differs from physical theories and animal brains in its epistemological approach.
Abstract
Textbook wisdom advocates for smooth function fits and implies that interpolation of noisy data should lead to poor generalization. A related heuristic is that fitting parameters should be fewer than measurements (Occam's Razor). Surprisingly, contemporary machine learning (ML) approaches, cf. deep nets (DNNs), generalize well despite interpolating noisy data. This may be understood via Statistically Consistent Interpolation (SCI), i.e. data interpolation techniques that generalize optimally for big data. In this article we elucidate SCI using the weighted interpolating nearest neighbors (wiNN) algorithm, which adds singular weight functions to kNN (k-nearest neighbors). This shows that data interpolation can be a valid ML strategy for big data. SCI clarifies the relation between two ways of modeling natural phenomena: the rationalist approach (strong priors) of theoretical physics with…
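As a rough illustration of the idea described in the abstract, the sketch below adds a singular weight to a plain kNN average. It is a minimal sketch under stated assumptions, not the paper's implementation: the inverse-power weight `distance ** (-delta)`, the function name `winn_predict`, and the parameters `k` and `delta` are all illustrative choices that do not come from the text above. Because the weight diverges as the query approaches a training point, the prediction at a training point equals its (possibly noisy) label, so the fit interpolates the data.

```python
import math

def winn_predict(train_x, train_y, query, k=3, delta=2.0):
    """Hypothetical helper: kNN average with singular inverse-power weights.

    Assumption (not from the source): weight_i = dist(query, x_i) ** (-delta).
    """
    dists = [math.dist(x, query) for x in train_x]
    # Indices of the k nearest training points.
    neighbors = sorted(range(len(train_x)), key=lambda i: dists[i])[:k]

    # The weight is singular at distance zero: return the stored label(s)
    # directly, which is the limit of the weighted average.
    exact = [i for i in neighbors if dists[i] == 0.0]
    if exact:
        return sum(train_y[i] for i in exact) / len(exact)

    weights = [dists[i] ** (-delta) for i in neighbors]
    weighted_sum = sum(w * train_y[i] for w, i in zip(weights, neighbors))
    return weighted_sum / sum(weights)

# Toy 1-D data with noisy labels (made up for illustration).
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.1, 0.9, 2.2, 2.8]

at_training_point = winn_predict(train_x, train_y, (1.0,))   # reproduces the label 0.9
between_points = winn_predict(train_x, train_y, (1.5,), k=2)  # weighted mix of two neighbors
```

Away from the training points the estimate behaves like an ordinary distance-weighted kNN average, so it always lies between the smallest and largest neighbor labels. Whether such a scheme is statistically consistent depends on how `k` and the weight exponent are chosen relative to the sample size and dimension, which this sketch does not address.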
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Neural Networks and Applications · Cell Image Analysis Techniques
