Towards Adversarial Robustness via Debiased High-Confidence Logit Alignment
Kejia Zhang, Juanjuan Weng, Shaozi Li, Zhiming Luo

TL;DR
This paper introduces DHAT, a novel adversarial training method that reduces background feature bias in DNNs, significantly improving adversarial robustness and generalization on CIFAR and ImageNet-1K benchmarks.
Contribution
The paper proposes Debiased High-Confidence Adversarial Training (DHAT), a debiased high-confidence logit alignment method that mitigates background feature bias in adversarial training, improving robustness and generalization.
Findings
DHAT achieves state-of-the-art robustness on CIFAR and ImageNet-1K.
DHAT significantly reduces reliance on background features during training.
DHAT improves model generalization by mitigating feature bias.
Abstract
Despite the remarkable progress of deep neural networks (DNNs) in various visual tasks, their vulnerability to adversarial examples raises significant security concerns. Recent adversarial training methods leverage inverse adversarial attacks to generate high-confidence examples, aiming to align adversarial distributions with high-confidence class regions. However, our investigation reveals that under inverse adversarial attacks, high-confidence outputs are influenced by biased feature activations, causing models to rely on background features that lack a causal relationship with the labels. This spurious correlation bias leads to overfitting irrelevant background features during adversarial training, thereby degrading the model's robust performance and generalization capabilities. To address this issue, we propose Debiased High-Confidence Adversarial Training (DHAT), a novel approach…
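To make the notion of an inverse adversarial attack concrete, here is a minimal sketch of the general idea the abstract refers to: instead of ascending the loss as a standard PGD attack does, the perturbation descends it, pushing the input toward a high-confidence region of its own class while staying inside an L-infinity ball. This is not the DHAT method itself. The toy linear softmax classifier, the function names (`inverse_attack`, `input_gradient`, and so on), and the values of `eps`, `step`, and `iters` are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_of(W, b, x):
    # Toy linear classifier (an assumption): logits = W x + b.
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def input_gradient(W, b, x, label):
    # Gradient of cross-entropy w.r.t. the input for a linear model:
    # W^T (softmax(Wx + b) - onehot(label)).
    p = softmax(logits_of(W, b, x))
    diff = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    return [sum(W[i][j] * diff[i] for i in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def inverse_attack(W, b, x, label, eps=0.1, step=0.02, iters=10):
    # Signed-gradient steps that DECREASE the loss (a standard attack
    # would add the step instead), then project back into the
    # L-infinity ball of radius eps around the original input.
    x_inv = list(x)
    for _ in range(iters):
        g = input_gradient(W, b, x_inv, label)
        x_inv = [xi - step * sign(gi) for xi, gi in zip(x_inv, g)]
        x_inv = [min(max(xi, x0 - eps), x0 + eps) for xi, x0 in zip(x_inv, x)]
    return x_inv

# Hypothetical two-class, two-feature example.
W = [[1.0, -0.5], [-1.0, 0.5]]
b = [0.0, 0.0]
x = [0.2, 0.1]
x_inv = inverse_attack(W, b, x, label=0)
conf_before = softmax(logits_of(W, b, x))[0]
conf_after = softmax(logits_of(W, b, x_inv))[0]
print(conf_before, conf_after)
```

In this toy setting the confidence for the true class rises after the inverse attack. The abstract's concern is that, in real networks, such a rise can come from background features rather than from the object itself.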
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Forensic and Genetic Research
Methods: Softmax · Attention Is All You Need · ALIGN
