Leveraging Local Variation in Data: Sampling and Weighting Schemes for Supervised Deep Learning
Paul Novello, Gaël Poëtte, David Lugato, Pietro Congedo

TL;DR
This paper introduces a novel data sampling and weighting method called Variance Based Samples Weighting (VBSW) that improves neural network performance by emphasizing data regions where the target function varies steeply, validated across multiple tasks.
Contribution
The paper proposes VBSW, a scalable and cost-effective methodology that leverages local variance to enhance training data weighting for neural networks.
Findings
VBSW significantly improves neural network accuracy across tasks.
Focusing on steep regions of the function enhances learning.
Applicable to diverse models like ResNet and BERT.
Abstract
In the context of supervised learning of a function by a neural network, we claim and empirically verify that the neural network yields better results when the distribution of the data set focuses on regions where the function to learn is steep. We first translate this assumption into a mathematically workable form using Taylor expansion and derive a new training distribution based on the derivatives of the function to learn. Then, theoretical derivations allow constructing a methodology that we call Variance Based Samples Weighting (VBSW). VBSW uses the local variance of the labels to weight the training points. This methodology is general, scalable, cost-effective, and significantly increases the performance of a large class of neural networks for various classification and regression tasks on image, text, and multivariate data. We highlight its benefits with experiments involving neural networks…
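The abstract's core idea, weighting each training point by the local variance of its labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `local_variance_weights`, the choice of a brute-force k-nearest-neighbor neighborhood, the default `k=5`, the `eps` floor, and the normalization to mean 1 are all assumptions made for illustration.

```python
import numpy as np

def local_variance_weights(X, y, k=5, eps=1e-8):
    """Hypothetical helper: weight each sample by the variance of the
    labels among its k nearest points in input space (assumed neighborhood)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    # Brute-force pairwise squared Euclidean distances, shape (n, n).
    # Fine for a small illustration; memory grows as O(n^2 * d).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Indices of the k closest points to each sample
    # (the sample itself is at distance 0, so it is normally among them).
    idx = np.argsort(d2, axis=1)[:, :k]
    # Variance of the labels inside each neighborhood.
    local_var = y[idx].var(axis=1)
    # Assumed post-processing: small floor so no weight is exactly zero,
    # then rescale so the weights average to 1.
    w = local_var + eps
    return w * n / w.sum()

# Toy regression target that is steep near x = 0 and flat elsewhere.
x = np.linspace(-1.0, 1.0, 201).reshape(-1, 1)
y = np.tanh(10.0 * x[:, 0])
weights = local_variance_weights(x, y, k=5)
```

In this toy example the points near `x = 0`, where the target changes quickly, receive larger weights than those in the flat tails. The resulting array could then be used as per-sample weights in a training loss, or as sampling probabilities after normalization; which of the two the paper favors in a given experiment is not stated in the text above.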
Taxonomy
Methods: Average Pooling · 1x1 Convolution · Max Pooling · Kaiming Initialization · Residual Connection · Residual Block · Global Average Pooling · Convolution · Batch Normalization
