Gradient Adversarial Training of Neural Networks
Ayan Sinha, Zhao Chen, Vijay Badrinarayanan, Andrew Rabinovich

TL;DR
Gradient adversarial training is a versatile framework that enforces gradient similarity across classes, tasks, or models, improving robustness, knowledge transfer, and multi-task learning by training the main network as an adversary to an auxiliary network that classifies the origin of each gradient tensor.
Contribution
This paper introduces gradient adversarial training, a novel method that uses an auxiliary network to make gradient tensors from different sources (classes, models, or tasks) statistically indistinguishable.
Findings
Increases network robustness to adversarial attacks.
Enhances knowledge distillation effectiveness.
Improves multi-task learning by aligning gradients.
Abstract
We propose gradient adversarial training, an auxiliary deep learning framework applicable to different machine learning problems. In gradient adversarial training, we leverage a prior belief that in many contexts, simultaneous gradient updates should be statistically indistinguishable from each other. We enforce this consistency using an auxiliary network that classifies the origin of the gradient tensor, and the main network serves as an adversary to the auxiliary network in addition to performing standard task-based training. We demonstrate gradient adversarial training for three different scenarios: (1) as a defense to adversarial examples, we classify gradient tensors and tune them to be agnostic to the class of their corresponding example, (2) for knowledge distillation, we do binary classification of gradient tensors derived from the student or teacher network and tune the student…
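The two-player setup described in the abstract can be illustrated with a small toy sketch of the knowledge distillation scenario. This is not the authors' implementation: it assumes the gradient tensor is the gradient of a softmax cross-entropy loss with respect to the logits, it uses a hypothetical logistic-regression auxiliary classifier with hand-picked fixed parameters (`aux_w`, `aux_b`), and it assumes a GAN-style flipped-label loss for the student. The summary does not specify the architectures, the layer at which gradients are taken, or how the losses are weighted.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logit_gradient(logits, label):
    """Gradient of softmax cross-entropy w.r.t. the logits: p - onehot(label)."""
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

def aux_prob_teacher(weights, bias, grad):
    """Hypothetical auxiliary classifier: P(gradient came from the teacher)."""
    score = sum(w * g for w, g in zip(weights, grad)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def bce(prob, target):
    """Binary cross-entropy for a single prediction."""
    eps = 1e-12
    return -(target * math.log(prob + eps) + (1.0 - target) * math.log(1.0 - prob + eps))

# Toy logits for one example whose true class is 0 (illustrative values only).
label = 0
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [0.3, 0.2, 0.1]

g_teacher = logit_gradient(teacher_logits, label)
g_student = logit_gradient(student_logits, label)

# Assumed fixed auxiliary parameters; in practice these would be learned.
aux_w, aux_b = [4.0, 0.0, 0.0], 1.6

p_t = aux_prob_teacher(aux_w, aux_b, g_teacher)
p_s = aux_prob_teacher(aux_w, aux_b, g_student)

# Auxiliary network objective: label teacher gradients 1 and student gradients 0.
aux_loss = bce(p_t, 1.0) + bce(p_s, 0.0)

# Student (adversary) objective, assumed form: make its gradient look like a teacher's.
adv_loss = bce(p_s, 1.0)

# If the student reproduced the teacher's logits, its gradient would be
# indistinguishable from the teacher's and the adversarial loss would drop.
g_matched = logit_gradient(teacher_logits, label)
adv_loss_matched = bce(aux_prob_teacher(aux_w, aux_b, g_matched), 1.0)

print(f"aux_loss={aux_loss:.3f} adv_loss={adv_loss:.3f} matched={adv_loss_matched:.3f}")
```

In a full training loop the auxiliary parameters would be updated to lower `aux_loss` while the student is updated to lower its task loss plus a weighted `adv_loss`. The class-agnostic defense in scenario (1) would replace the binary origin label with the example's class.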
Taxonomy
Topics: Adversarial Robustness in Machine Learning
