PUMA: margin-based data pruning
Javier Maroto, Pascal Frossard

TL;DR
PUMA is a data pruning method that uses DeepFool to estimate each training sample's margin and prunes on that basis, reducing the data needed for adversarial training while improving the robustness-accuracy trade-off in deep learning models.
Contribution
The paper introduces PUMA, a new margin-based data pruning strategy that effectively improves robustness and accuracy by jointly adjusting training attack norms, outperforming existing pruning methods.
Findings
PUMA achieves similar robustness with less data.
It significantly improves model accuracy over existing methods.
PUMA enhances the robustness-accuracy trade-off in adversarial training.
Abstract
Deep learning has been able to outperform humans in terms of classification accuracy in many tasks. However, to achieve robustness to adversarial perturbations, the best methodologies require adversarial training on a much larger training set, typically augmented using generative models (e.g., diffusion models). Our main objective in this work is to reduce these data requirements while achieving the same or better accuracy-robustness trade-offs. We focus on data pruning, where some training samples are removed based on their distance to the model's classification boundary (i.e., margin). We find that existing approaches that prune samples with low margin fail to increase robustness when a lot of synthetic data is added, and we explain this situation with a perceptron learning task. Moreover, we find that pruning high margin samples for better accuracy increases the…
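To make the idea of pruning by margin concrete, here is a minimal sketch, not the authors' implementation. It assumes a fixed linear binary classifier, for which the L2 distance to the decision boundary has the closed form |w·x + b| / ||w|| (the case where a single DeepFool step is exact). The function names `linear_margins` and `prune_by_margin`, the `keep_fraction` parameter, and the toy data are all hypothetical, chosen only for illustration.

```python
import numpy as np


def linear_margins(X, w, b):
    """L2 distance of each row of X to the hyperplane w.x + b = 0."""
    return np.abs(X @ w + b) / np.linalg.norm(w)


def prune_by_margin(X, y, margins, keep_fraction, drop="low"):
    """Keep a fraction of the samples, dropping the lowest- or highest-margin ones.

    Returns the kept samples, their labels, and their original indices.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if drop not in ("low", "high"):
        raise ValueError("drop must be 'low' or 'high'")
    n_keep = max(1, int(round(keep_fraction * len(X))))
    order = np.argsort(margins)  # indices sorted by ascending margin
    keep = order[-n_keep:] if drop == "low" else order[:n_keep]
    keep = np.sort(keep)  # restore the original sample order
    return X[keep], y[keep], keep


# Toy data: 200 Gaussian points labelled by a hypothetical linear classifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
w = np.array([1.0, -2.0])
b = 0.5
y = (X @ w + b > 0).astype(int)

margins = linear_margins(X, w, b)

# Drop the 25% of samples closest to the boundary (low-margin pruning).
X_low, y_low, idx_low = prune_by_margin(X, y, margins, 0.75, drop="low")

# Drop the 25% of samples farthest from the boundary (high-margin pruning).
X_high, y_high, idx_high = prune_by_margin(X, y, margins, 0.75, drop="high")
```

For a deep network the margin has no closed form, so it would have to be estimated, for example with an iterative attack such as DeepFool; the pruning step itself would stay the same once a per-sample margin estimate is available.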
Taxonomy
Topics: Data Mining Algorithms and Applications · Advanced Clustering Algorithms Research · Data Management and Algorithms
Methods: Sparse Evolutionary Training · Pruning · Diffusion · Focus
