Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation

Björn Nieth; Thomas Altstidl; Leo Schwinn; Björn Eskofier

arXiv:2406.13283 · cs.LG · July 12, 2024 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a data pruning method for adversarial training that extrapolates data importance from small samples to large datasets, reducing training data without sacrificing robustness.

Contribution

It presents a novel data importance extrapolation technique specifically designed for adversarial training to efficiently prune datasets.

Findings

01 Reduces dataset size while maintaining robustness

02 Efficiently extrapolates importance scores from small to large datasets

03 Improves training efficiency in adversarial settings

Abstract

The vulnerability of deep learning models to small, imperceptible attacks limits their adoption in real-world systems. Adversarial training has proven to be one of the most promising strategies against these attacks, at the expense of a substantial increase in training time. With the ongoing trend of integrating large-scale synthetic data, this cost is only expected to increase further. Thus, there is a need for data-centric approaches that reduce the number of training samples while maintaining accuracy and robustness. While data pruning and active learning are prominent research topics in deep learning, they are as of now largely unexplored in the adversarial training literature. We address this gap and propose a new data pruning strategy based on extrapolating data importance scores from a small set of data to a larger set. In an empirical evaluation, we demonstrate that…
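
The idea of extrapolating importance scores from a small scored set to a larger unscored set can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the use of feature embeddings, the k-nearest-neighbour averaging rule, the convention that higher scores mean more important samples, and the function names `extrapolate_scores` and `prune` are all hypothetical. The authors' actual scoring and extrapolation procedure may differ.

```python
import numpy as np


def extrapolate_scores(small_feats, small_scores, large_feats, k=5):
    """Assign each large-set sample the mean score of its k nearest
    small-set neighbours (hypothetical extrapolation rule)."""
    k = min(k, len(small_feats))
    # Pairwise squared Euclidean distances, shape (n_large, n_small).
    diffs = large_feats[:, None, :] - small_feats[None, :, :]
    dist2 = (diffs ** 2).sum(axis=-1)
    # Indices of the k closest scored samples for every unscored sample.
    nearest = np.argsort(dist2, axis=1)[:, :k]
    return small_scores[nearest].mean(axis=1)


def prune(scores, keep_fraction):
    """Return sorted indices of the highest-scoring samples to keep
    (assumes higher score = more important)."""
    n_keep = max(1, int(round(keep_fraction * len(scores))))
    order = np.argsort(-scores, kind="stable")
    return np.sort(order[:n_keep])
```

In this sketch, only the small set needs expensive importance scoring; the larger set receives approximate scores from its neighbours in feature space, and `prune` then selects the subset to train on.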

Code & Models

Repositories

BjoernNieth/LS-Dataset-pruning-in-AT · PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Digital Media Forensic Detection

Methods: Sparse Evolutionary Training · Pruning