Target Training Does Adversarial Training Without Adversarial Samples
Blerta Lindqvist

TL;DR
Target Training is a novel method that enhances neural network robustness by replacing adversarial samples with duplicated original samples labeled differently, eliminating the need for adversarial sample generation.
Contribution
It introduces a training approach that uses only original samples, duplicated and labeled differently for training, and reports accuracy above that of existing defenses without generating adversarial samples.
Findings
Exceeds both default CIFAR10 accuracy and the accuracy of the current best defense in low-capacity classifiers.
Effectively defends against CW-L2 and DeepFool attacks.
Eliminates the need for adversarial sample generation during training.
Abstract
Neural network classifiers are vulnerable to misclassification of adversarial samples, for which the current best defense trains classifiers with adversarial samples. However, adversarial samples are not optimal for steering attack convergence, based on the minimization at the core of adversarial attacks. The perturbation term of that minimization can be driven towards zero by replacing adversarial samples in training with duplicated original samples, labeled differently only for training. Using only original samples, Target Training eliminates the need to generate adversarial samples for training against all attacks that minimize perturbation. In low-capacity classifiers and without using adversarial samples, Target Training exceeds both the default CIFAR10 accuracy and the current best defense accuracy under the CW-L2 attack, and against…
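The duplication step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: the abstract only says duplicates are "labeled differently only for training", so the choice here to give each duplicate the label `y + num_classes` (doubling the output layer) and to fold the extra classes back at inference is an assumption. The function names `make_target_training_set` and `fold_predictions` are hypothetical.

```python
import numpy as np


def make_target_training_set(x, y, num_classes):
    """Return the original samples plus one duplicate of each.

    Assumption: a duplicate of a sample with label y is given the
    label y + num_classes, so the classifier is trained with
    2 * num_classes outputs. No adversarial samples are generated.
    """
    x_aug = np.concatenate([x, x.copy()], axis=0)
    y_aug = np.concatenate([y, y + num_classes], axis=0)
    return x_aug, y_aug


def fold_predictions(probs, num_classes):
    """Map 2k-way probabilities back to the k original classes.

    Assumption: class i and its duplicate class i + num_classes are
    treated as the same class at inference, so their probabilities
    are summed.
    """
    return probs[:, :num_classes] + probs[:, num_classes:]


# Tiny usage example with three samples and three classes.
x = np.arange(6, dtype=np.float32).reshape(3, 2)
y = np.array([0, 2, 1])
x_aug, y_aug = make_target_training_set(x, y, num_classes=3)

# Pretend output of a 6-way softmax for a single input.
probs = np.array([[0.1, 0.2, 0.1, 0.3, 0.2, 0.1]])
folded = fold_predictions(probs, num_classes=3)
```

The relabeling applies only to the training set; evaluation still uses the original labels, which is why the folding step (or an equivalent mapping) is needed in this sketch.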
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Bacillus and Francisella bacterial research
