Adapting to Evolving Adversaries with Regularized Continual Robust Training

Sihui Dai; Christian Cianfarani; Arjun Bhagoji; Vikash Sehwag; Prateek Mittal

arXiv:2502.04248·cs.LG·February 7, 2025

TL;DR

This paper introduces regularized continual robust training, which fine-tunes a defended model against new adversarial attacks as they arise while using logit-space regularization to preserve robustness against earlier attacks.

Contribution

The paper proposes a regularization technique based on logit-space distance to improve robustness in continual adversarial training, supported by theoretical analysis and extensive experiments.

Findings

01 Regularization based on logit-space distance improves robustness against multiple attacks.

02 The method maintains robustness against previous attacks while adapting to new ones.

03 The method achieves better robust accuracy with minimal training overhead.

Abstract

Robust training methods typically defend against specific attack types, such as Lp attacks with fixed budgets, and rarely account for the fact that defenders may encounter new attacks over time. A natural solution is to adapt the defended model to new adversaries as they arise via fine-tuning, a method which we call continual robust training (CRT). However, when implemented naively, fine-tuning on new attacks degrades robustness on previous attacks. This raises the question: how can we improve the initial training and fine-tuning of the model to simultaneously achieve robustness against previous and new attacks? We present theoretical results which show that the gap in a model's robustness against different attacks is bounded by how far each attack perturbs a sample in the model's logit space, suggesting that regularizing with respect to this logit space distance can help maintain…
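The idea of regularizing with respect to logit-space distance can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of Euclidean distance, the helper names, and the `reg_strength` weight are all assumptions made for illustration, and the perturbed logits are simply given as inputs rather than produced by a real attack.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def logit_distance(clean_logits, adv_logits):
    """Euclidean distance between clean and perturbed logits (assumed norm)."""
    return math.sqrt(sum((a - c) ** 2 for c, a in zip(clean_logits, adv_logits)))

def regularized_loss(clean_logits, adv_logits, label, reg_strength=0.5):
    """Adversarial cross-entropy plus a logit-space distance penalty.

    reg_strength is a hypothetical hyperparameter name for the penalty weight.
    """
    return (softmax_cross_entropy(adv_logits, label)
            + reg_strength * logit_distance(clean_logits, adv_logits))

# Hypothetical logits for one sample before and after an attack.
clean = [2.0, 0.5, -1.0]
adv = [1.0, 1.5, -1.0]
print(regularized_loss(clean, adv, label=0))
```

The penalty term grows with how far the attack moves the sample in logit space, so minimizing it alongside the adversarial loss pushes the model toward outputs that change little under perturbation. In a continual setting, the same term would be applied both during initial training and during fine-tuning on each new attack.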

Code & Models

Repositories

inspire-group/continual_robust_training
PyTorch · Official

Videos

Adapting to Evolving Adversaries with Regularized Continual Robust Training · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis