Benefit of Interpolation in Nearest Neighbor Algorithms

Yue Xing; Qifan Song; Guang Cheng

arXiv:2202.11817 · stat.ML · February 25, 2022 · 1 citation

TL;DR

This paper demonstrates that in nearest neighbor algorithms, a mild degree of data interpolation can improve prediction accuracy and stability, challenging the notion that zero training error always harms generalization.

Contribution

It introduces a class of interpolated weighting schemes in nearest neighbors, revealing a U-shaped performance curve and showing that mild interpolation can enhance predictive performance.

Findings

01 Mild data interpolation improves NN prediction accuracy.

02 Zero training error does not necessarily harm generalization.

03 The results hold across different distance measures and under corrupted data.

Abstract

In some studies \citep[e.g.,][]{zhang2016understanding} of deep learning, it is observed that over-parametrized deep neural networks achieve a small testing error even when the training error is almost zero. Despite numerous works towards understanding this so-called "double descent" phenomenon \citep[e.g.,][]{belkin2018reconciling,belkin2019two}, in this paper we turn to another way of enforcing zero training error (without over-parametrization): a data interpolation mechanism. Specifically, we consider a class of interpolated weighting schemes in the nearest neighbors (NN) algorithms. By carefully characterizing the multiplicative constant in the statistical risk, we reveal a U-shaped performance curve for the level of data interpolation in both classification and regression setups. This sharpens the existing result \citep{belkin2018does} that zero training error does not…
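
As a rough illustration of what an interpolated weighting scheme in k-NN can look like, the sketch below weights each of the k nearest neighbors proportionally to distance^(-gamma). This power-law form, the function name `interpolated_nn_predict`, and the parameter names `k` and `gamma` are assumptions made for illustration; the abstract above does not spell out the exact weights. Under this assumption, gamma = 0 gives ordinary unweighted k-NN, while any gamma > 0 forces the prediction at a training point to equal its label, that is, zero training error.

```python
import numpy as np


def interpolated_nn_predict(X_train, y_train, x, k=5, gamma=1.0):
    """Weighted k-NN regression at a single query point x.

    Assumed (hypothetical) weighting: w_i proportional to dist_i ** (-gamma)
    over the k nearest neighbors. gamma = 0 gives plain k-NN averaging.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x = np.asarray(x, dtype=float)

    # Euclidean distance from the query to every training point.
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    d = dists[idx]
    y = y_train[idx]

    # With gamma > 0 the weight diverges at distance 0, so the prediction
    # collapses onto the label(s) of the exactly matching training point(s).
    if gamma > 0 and np.any(d == 0):
        return float(y[d == 0].mean())

    w = d ** (-gamma)  # all ones when gamma == 0
    w /= w.sum()
    return float(w @ y)


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.0, 1.0, 4.0, 9.0]
    for g in (0.0, 1.0, 2.0):
        fitted = [interpolated_nn_predict(X, y, xi, k=3, gamma=g) for xi in X]
        train_mse = float(np.mean((np.array(fitted) - np.array(y)) ** 2))
        print(f"gamma={g}: training MSE = {train_mse:.4f}")
```

Sweeping `gamma` on held-out data is one way to look for the U-shaped risk curve the abstract describes, though this toy sketch makes no claim about where the minimum falls.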

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Neural Networks and Applications