Understanding the Role of Adversarial Regularization in Supervised Learning
Litu Rout

TL;DR
This paper investigates the theoretical foundations of adversarial regularization in supervised learning, analyzing its convergence, gradient flow, and generalization, supported by empirical evidence of its acceleration effects.
Contribution
It provides a theoretical analysis of adversarial regularization's performance, including convergence and generalization, and questions existing capacity-based bounds in this context.
Findings
Adversarial regularization accelerates gradient descent.
Existing capacity bounds may not fully explain generalization in adversarial learning.
Neural embedded vector spaces exhibit distinct behaviors under adversarial training.
Abstract
Despite numerous attempts to provide empirical evidence that adversarial regularization outperforms sole supervision, the theoretical understanding of this phenomenon remains elusive. In this study, we aim to resolve whether adversarial regularization indeed performs better than sole supervision at a fundamental level. To bring this insight to fruition, we study the vanishing gradient issue, asymptotic iteration complexity, gradient flow, and provable convergence in the context of sole supervision and adversarial regularization. The key ingredient is a theoretical justification, supported by empirical evidence, of adversarial acceleration in gradient descent. In addition, motivated by a recently introduced unit-wise capacity-based generalization bound, we analyze the generalization error in the adversarial framework. Guided by our observation, we cast doubt on the ability of this measure…
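The abstract contrasts sole supervision with adversarial regularization but does not spell out an objective. As a rough illustration only, the sketch below assumes a one-parameter linear predictor, a mean squared error supervised loss, and a fixed logistic discriminator whose score is added as a weighted penalty. The names `supervised_grad`, `adversarial_grad`, `regularized_step`, the discriminator parameters `a` and `b`, and the weight `lam` are all hypothetical and not taken from the paper, where the discriminator would normally be trained jointly. The sketch shows only how the adversarial term contributes an extra gradient component to each descent step; it does not demonstrate the paper's convergence results.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(w, x, y, a, b, lam):
    """Assumed objective: 0.5 * MSE plus lam times a discriminator-based penalty."""
    sup = 0.5 * np.mean((w * x - y) ** 2)
    adv = -np.mean(np.log(sigmoid(a * w * x + b)))  # -log D(f(x)), D is fixed here
    return sup + lam * adv

def supervised_grad(w, x, y):
    """Gradient of 0.5 * mean((w*x - y)^2) with respect to w."""
    return np.mean((w * x - y) * x)

def adversarial_grad(w, x, a, b):
    """Gradient of -mean(log sigmoid(a*w*x + b)) with respect to w."""
    d = sigmoid(a * w * x + b)
    return np.mean(-(1.0 - d) * a * x)

def regularized_step(w, x, y, a, b, lam=0.1, lr=0.05):
    """One gradient descent step on the assumed combined objective."""
    g = supervised_grad(w, x, y) + lam * adversarial_grad(w, x, a, b)
    return w - lr * g

# Toy data for illustration: targets generated by a slope of 3.
x = np.array([0.5, -1.0, 2.0])
y = 3.0 * x
w = 0.2
for _ in range(5):
    w = regularized_step(w, x, y, a=1.5, b=-0.3)
print(w)
```

Setting `lam=0` recovers plain supervised gradient descent, which makes it easy to compare the two update rules on the same data.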
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques
