LossVal: Efficient Data Valuation for Neural Networks

Tim Wibiral; Mohamed Karim Belaid; Maximilian Rabus; Ansgar Scherp

arXiv:2412.04158·cs.LG·December 18, 2024

TL;DR

LossVal is a novel, efficient data valuation method that computes sample importance during neural network training, enabling effective identification of noisy and harmful data points with reduced computational costs.

Contribution

The paper introduces LossVal, a self-weighting mechanism embedded in the loss function that estimates the importance of each training sample while the network trains.

Findings

01 Effectively identifies noisy samples in datasets.

02 Distinguishes helpful from harmful training data.

03 Reduces computational costs compared to traditional methods.

Abstract

Assessing the importance of individual training samples is a key challenge in machine learning. Traditional approaches retrain models with and without specific samples, which is computationally expensive and ignores dependencies between data points. We introduce LossVal, an efficient data valuation method that computes importance scores during neural network training by embedding a self-weighting mechanism into loss functions like cross-entropy and mean squared error. LossVal reduces computational costs, making it suitable for large datasets and practical applications. Experiments on classification and regression tasks across multiple datasets show that LossVal effectively identifies noisy samples and is able to distinguish helpful from harmful samples. We examine the gradient calculation of LossVal to highlight its advantages. The source code is available at:…
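
The idea of embedding a self-weighting mechanism into a loss can be illustrated with a small sketch. The listing does not give LossVal's actual formula, so this is not the authors' method: it assumes one learnable weight per training sample, normalized with a softmax (an assumption made here so the weights cannot all collapse to zero), and a weighted squared error on a one-dimensional linear model. The names `train_self_weighted`, `logits`, and the toy data are hypothetical. The official PyTorch repository listed under Code & Models is the place to look for the real implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_self_weighted(xs, ys, steps=500, lr_model=0.05, lr_weights=0.5):
    """Fit y ~ a*x + b while learning one weight per training sample.

    Illustrative only: softmax-normalized weights are an assumption of
    this sketch, not the weighting scheme from the paper.
    """
    a, b = 0.0, 0.0
    logits = [0.0] * len(xs)  # hypothetical per-sample parameters
    for _ in range(steps):
        w = softmax(logits)
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        losses = [r * r for r in residuals]
        weighted_loss = sum(wi * li for wi, li in zip(w, losses))

        # Gradients of the weighted loss with respect to the model.
        grad_a = sum(2 * wi * r * x for wi, r, x in zip(w, residuals, xs))
        grad_b = sum(2 * wi * r for wi, r in zip(w, residuals))
        # Gradient with respect to each logit (softmax chain rule):
        # samples whose loss exceeds the weighted average lose weight.
        grad_logits = [wi * (li - weighted_loss) for wi, li in zip(w, losses)]

        a -= lr_model * grad_a
        b -= lr_model * grad_b
        logits = [z - lr_weights * g for z, g in zip(logits, grad_logits)]
    return a, b, softmax(logits)


# Toy data on the line y = 2x + 1, with the label at index 3 corrupted.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [1.0, 1.4, 1.8, 5.0, 2.6, 3.0]

a, b, weights = train_self_weighted(xs, ys)
# The sample with the smallest learned weight is the suspected noisy one.
noisiest = min(range(len(weights)), key=weights.__getitem__)
print("learned weights:", [round(w, 4) for w in weights])
print("lowest-weight sample index:", noisiest)
```

In this toy setting the weights are learned in the same loop as the model parameters, so no retraining with and without individual samples is needed; the corrupted sample keeps a high loss and its weight shrinks accordingly.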

Code & Models

Repositories

twibiral/lossval (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)