Understanding Adversarial Training with Energy-based Models
Mujtaba Hussain Mirza, Maria Rosaria Briglia, Filippo Bartolucci, Senad Beadini, Giuseppe Lisanti, Iacopo Masi

TL;DR
This paper uses an energy-based model framework to analyze adversarial training, revealing energy dynamics related to overfitting, and proposes a regularizer to improve robustness and generative capabilities of classifiers.
Contribution
It introduces the Delta Energy Regularizer (DER) to mitigate overfitting in adversarial training and enhances class-specific generative sampling using energy-based guidance and PCA.
Findings
DER effectively reduces catastrophic and robust overfitting.
Energy analysis reveals key dynamics during adversarial training.
Improved generative sampling achieves competitive IS and FID scores.
Abstract
We aim to use the Energy-based Model (EBM) framework to better understand adversarial training (AT) in classifiers, and additionally to analyze the intrinsic generative capabilities of robust classifiers. By viewing standard classifiers through an energy lens, we begin by analyzing how the energies of adversarial examples, generated by various attacks, differ from those of the natural samples. The central focus of our work is to understand the critical phenomena of Catastrophic Overfitting (CO) and Robust Overfitting (RO) in AT from an energy perspective. We analyze the impact of existing AT approaches on the energy of samples during training and observe that the behavior of the "delta energy" (the change in energy between the original sample and its adversarial counterpart) diverges significantly when CO or RO occurs. After a thorough analysis of these energy dynamics and their relationship…
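To make the "delta energy" quantity concrete, here is a minimal sketch in plain Python. It assumes the common convention for reading a classifier as an EBM, where the marginal energy of an input is the negative log-sum-exp of its logits; the abstract does not state the exact formula or the sign of the difference, so both are assumptions here, and the function names and logit values are purely illustrative.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_energy(logits):
    # Assumed convention (not stated in the abstract):
    # E(x) = -log sum_k exp(f(x)[k]), where f(x) are the classifier logits.
    return -logsumexp(logits)


def delta_energy(logits_natural, logits_adversarial):
    # Assumed sign: energy of the adversarial input minus energy of the
    # natural input. The paper may define the difference the other way round.
    return marginal_energy(logits_adversarial) - marginal_energy(logits_natural)


# Hypothetical logits for one sample before and after an attack.
natural_logits = [4.0, 0.5, -1.0]
adversarial_logits = [1.0, 1.2, 0.8]

print(delta_energy(natural_logits, adversarial_logits))
```

Under these assumptions, tracking this value over training epochs would be one way to look for the divergence the abstract associates with CO or RO.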
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)
