Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning
Kaijie Zhu, Jindong Wang, Xixu Hu, Xing Xie, Ge Yang

TL;DR
This paper introduces RiFT, a novel fine-tuning method that improves the generalization and out-of-distribution robustness of adversarially trained neural networks without sacrificing their adversarial robustness.
Contribution
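RiFT uses module robust criticality (MRC) to identify the non-robust-critical module of an adversarially trained model and fine-tunes only that module, improving generalization and out-of-distribution robustness while preserving adversarial robustness.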
Findings
Improves generalization and out-of-distribution robustness by around 1.5%.
Maintains or slightly enhances adversarial robustness.
Effective on multiple ResNet architectures and datasets.
Abstract
Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications. Adversarial Training (AT) is a well-established technique to enhance adversarial robustness, but it often comes at the cost of decreased generalization ability. This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness. The core idea of RiFT is to exploit the redundant capacity for robustness by fine-tuning the adversarially trained model on its non-robust-critical module. To do so, we introduce module robust criticality (MRC), a measure that evaluates the significance of a given module to model robustness under worst-case weight perturbations. Using this measure, we identify the module with the lowest MRC value as the non-robust-critical module and fine-tune its weights to…
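The selection step described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. It assumes a toy "robust loss" (a weighted sum of squared weights with a hypothetical `SENSITIVITY` table), a perturbation budget defined relative to each module's weight norm, and a simple projected gradient ascent over the weight perturbation. The names `module_robust_criticality`, `toy_robust_loss`, `eps`, `steps`, and `lr` are all illustrative assumptions.

```python
import math

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def numerical_grad(f, x, h=1e-5):
    """Central-difference gradient of f at the point x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def module_robust_criticality(robust_loss, weights, name, eps=0.1, steps=50, lr=0.05):
    """Estimate the largest rise in robust loss when only module `name` is perturbed.

    Assumption: the perturbation is limited to an L2 ball of radius
    eps * ||w_name||, searched with projected gradient ascent.
    """
    base = robust_loss(weights)
    original = weights[name]
    radius = eps * l2_norm(original)

    def loss_at(delta):
        trial = dict(weights)
        trial[name] = [w + d for w, d in zip(original, delta)]
        return robust_loss(trial)

    delta = [0.0] * len(original)
    worst = base
    for _ in range(steps):
        grad = numerical_grad(loss_at, delta)
        g_norm = l2_norm(grad)
        if g_norm == 0.0:
            break  # no ascent direction available from this point
        # Normalized ascent step, then projection back onto the ball.
        delta = [d + lr * radius * g / g_norm for d, g in zip(delta, grad)]
        d_norm = l2_norm(delta)
        if d_norm > radius:
            delta = [d * radius / d_norm for d in delta]
        worst = max(worst, loss_at(delta))
    return worst - base

# Hypothetical stand-in for the adversarial loss of a trained model:
# each module contributes its squared weight norm scaled by a sensitivity.
SENSITIVITY = {"layer1": 3.0, "layer2": 0.2, "layer3": 1.5}

def toy_robust_loss(weights):
    return sum(SENSITIVITY[n] * sum(w * w for w in ws) for n, ws in weights.items())

weights = {"layer1": [1.0, 2.0], "layer2": [2.0, 1.0], "layer3": [1.0, -2.0]}

# Score every module, then pick the one with the lowest MRC.
mrc_scores = {n: module_robust_criticality(toy_robust_loss, weights, n) for n in weights}
chosen = min(mrc_scores, key=mrc_scores.get)

# Only the chosen module would be unfrozen for fine-tuning.
trainable = {n: n == chosen for n in weights}
```

In this toy setup all three modules have the same weight norm, so the module with the smallest sensitivity (`layer2`) receives the lowest score and is the only one marked trainable. With a real network, `robust_loss` would be the loss on adversarial examples and the ascent would use backpropagated gradients rather than finite differences.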
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · Anomaly Detection Techniques and Applications
