The Missing Margin: How Sample Corruption Affects Distance to the Boundary in ANNs
Marthinus W. Theunissen, Coenraad Mouton, Marelie H. Davel

TL;DR
This paper studies how sample corruption affects classification margins in neural networks. It shows that some types of training samples are consistently modelled with small margins yet affect generalization in different ways, based on experiments with noise-corrupted MNIST and CIFAR10 data.
Contribution
It identifies seldom considered nuances in how sample corruption affects margins and generalization, and links margin size to the remoteness of samples from one another and to the minimum distance to a different-target sample.
Findings
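Some types of training samples are modelled with consistently small margins, yet they affect generalization in different ways.
Margin size is linked to the remoteness of samples from one another and to the minimum distance to a different-target sample.
The evidence comes from fully-connected networks trained on noise-corrupted MNIST data and convolutional networks trained on noise-corrupted CIFAR10 data.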
Abstract
Classification margins are commonly used to estimate the generalization ability of machine learning models. We present an empirical study of these margins in artificial neural networks. A global estimate of margin size is usually used in the literature. In this work, we point out seldom considered nuances regarding classification margins. Notably, we demonstrate that some types of training samples are modelled with consistently small margins while affecting generalization in different ways. By showing a link with the minimum distance to a different-target sample and the remoteness of samples from one another, we provide a plausible explanation for this observation. We support our findings with an analysis of fully-connected networks trained on noise-corrupted MNIST data, as well as convolutional networks trained on noise-corrupted CIFAR10 data.
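The abstract links margin size to the minimum distance to a different-target sample. As a rough illustration of that quantity, the sketch below computes, for each sample, the distance to its nearest neighbour carrying a different label. It assumes Euclidean distance in flattened input space, which this summary does not confirm as the paper's metric, and the function name `min_distance_to_different_target` is hypothetical. It is not the authors' code.

```python
import numpy as np

def min_distance_to_different_target(X, y):
    """For each sample, return the distance to the closest sample with a different label.

    Assumption: Euclidean distance on flattened inputs (not confirmed by the source).
    """
    X = np.asarray(X, dtype=float).reshape(len(X), -1)  # flatten images to vectors
    y = np.asarray(y)

    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

    # Ignore pairs that share a label (this also masks each sample against itself)
    same_label = y[:, None] == y[None, :]
    d2[same_label] = np.inf

    return np.sqrt(d2.min(axis=1))

# Toy data: four 2-D points, two per class
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 0.0]])
y = np.array([0, 0, 1, 1])
dists = min_distance_to_different_target(X, y)
print(dists)
```

If the data contain only one label, every entry is `inf`, since no different-target sample exists. The full pairwise matrix needs memory quadratic in the number of samples, so a dataset the size of MNIST or CIFAR10 would need batching or a nearest-neighbour index.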
