Data Pruning via Moving-one-Sample-out
Haoru Tan, Sitong Wu, Fei Du, Yukang Chen, Zhibin Wang, Fan Wang, Xiaojuan Qi

TL;DR
This paper introduces MoSo, a data-pruning method that efficiently identifies and removes less informative samples by approximating their impact on empirical risk, improving training efficiency and robustness.
Contribution
The paper presents a novel first-order approximation method for data pruning that reduces computational cost compared to traditional leave-one-out approaches.
Findings
MoSo effectively maintains performance at high pruning ratios.
The method avoids retraining the model for each removed sample: its first-order approximator needs only gradient information collected at different training stages.
Experimental results show improved robustness and efficiency.
Abstract
In this paper, we propose a novel data-pruning approach called moving-one-sample-out (MoSo), which aims to identify and remove the least informative samples from the training set. The core insight behind MoSo is to determine the importance of each sample by assessing its impact on the optimal empirical risk. This is achieved by measuring the extent to which the empirical risk changes when a particular sample is excluded from the training set. Instead of using the computationally expensive leaving-one-out-retraining procedure, we propose an efficient first-order approximator that only requires gradient information from different training stages. The key idea behind our approximation is that samples with gradients that are consistently aligned with the average gradient of the training set are more informative and should receive higher scores, which could be intuitively understood as…
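The gradient-alignment idea in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: the function names `alignment_scores` and `prune_indices`, the `(T, N, D)` array layout, and the unweighted average over training stages are all assumptions made for illustration, since the summary above does not give the exact formula. The sketch scores each sample by how well its gradient agrees (inner product) with the mean training-set gradient, averaged over several checkpoints, then drops the lowest-scoring fraction.

```python
import numpy as np


def alignment_scores(per_sample_grads):
    """Score samples by agreement with the mean gradient (illustrative only).

    per_sample_grads: array-like of shape (T, N, D), holding flattened
    per-sample gradients for N samples at T training stages (assumed layout).
    Returns an array of N scores; higher means more consistently aligned.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    # Mean gradient of the training set at each stage: shape (T, 1, D).
    mean_grad = grads.mean(axis=1, keepdims=True)
    # Inner product of each sample gradient with the stage mean: shape (T, N).
    agreement = (grads * mean_grad).sum(axis=2)
    # Unweighted average over stages (an assumption, not the paper's weighting).
    return agreement.mean(axis=0)


def prune_indices(scores, prune_ratio):
    """Return sorted indices of the samples kept after removing the lowest scores."""
    n_remove = int(len(scores) * prune_ratio)
    order = np.argsort(scores, kind="stable")  # ascending: least aligned first
    return np.sort(order[n_remove:])


# Toy usage: three samples point along the mean gradient, one points against it.
toy_grads = np.array([[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]] * 2)
toy_scores = alignment_scores(toy_grads)
kept = prune_indices(toy_scores, prune_ratio=0.25)
```

In the toy data the fourth sample opposes the mean gradient at both stages, so it receives the lowest score and is the one removed at a 25% pruning ratio. In a real setting the per-sample gradients would come from saved checkpoints of a surrogate training run, which this sketch does not cover.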
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM
Methods: Sparse Evolutionary Training · Pruning
