TL;DR
This paper introduces an adversarial defense based on optimal transport theory. The resulting classifier is more robust across a range of perturbation sizes, as validated through extensive experiments on CIFAR datasets.
Contribution
The paper proposes Sinkhorn Adversarial Training (SAT), which uses optimal transport to improve robustness against adversarial attacks, and introduces the AUAC metric for comparing the robustness of defended models across perturbation sizes.
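To make the optimal transport idea concrete, here is a minimal sketch of an entropy-regularized transport cost between two batches of feature vectors, computed with Sinkhorn iterations. It is not the authors' implementation: the function name `sinkhorn_cost`, the squared Euclidean ground cost, the uniform sample weights, and the defaults for `eps` and `n_iters` are all assumptions chosen for illustration. It also returns the plain transport cost rather than any debiased divergence the paper may use.

```python
import numpy as np


def sinkhorn_cost(x, y, eps=1.0, n_iters=200):
    """Entropy-regularized OT cost between point clouds x (n, d) and y (m, d).

    Hypothetical helper for illustration: uniform weights and a squared
    Euclidean ground cost are assumptions, not taken from the paper.
    Returns (transport_cost, transport_plan).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = x.shape[0], y.shape[0]

    # Uniform marginals: every sample carries the same mass.
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)

    # Pairwise squared Euclidean distances, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # Gibbs kernel; a smaller eps gives a sharper plan but may underflow.
    kernel = np.exp(-cost / eps)

    # Alternately rescale rows and columns to match the marginals.
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan


# Example: features of clean inputs versus a shifted copy standing in
# for adversarial features (synthetic data, for illustration only).
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 3))
shifted = clean + 0.5
cost_value, plan = sinkhorn_cost(clean, shifted)
```

In a training loop, a term of this kind would be computed on the network's representations of original and adversarial images and added to the classification loss. For small `eps`, a log-domain variant of the iterations is the usual way to avoid numerical underflow.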
Findings
SAT outperforms existing defenses on CIFAR datasets.
AUAC measures robustness across a range of perturbation sizes rather than at a single fixed size.
Optimal transport-based loss enhances model resilience.
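One way to picture a metric that summarizes robustness over many perturbation sizes is the area under an accuracy-versus-perturbation-size curve. The sketch below assumes AUAC is an area of this kind. The trapezoidal rule, the normalization by the width of the perturbation range, and the function name `area_under_accuracy_curve` are illustrative assumptions rather than the paper's definition, and the numbers in the example are made up.

```python
import numpy as np


def area_under_accuracy_curve(eps_values, accuracies, normalize=True):
    """Trapezoidal area under an accuracy-vs-perturbation-size curve.

    Hypothetical helper: the exact AUAC definition is not given in this
    summary, so the integration rule and normalization are assumptions.
    """
    eps = np.asarray(eps_values, dtype=float)
    acc = np.asarray(accuracies, dtype=float)
    if eps.ndim != 1 or eps.shape != acc.shape or eps.size < 2:
        raise ValueError("need two 1-D sequences of equal length >= 2")
    if np.any(np.diff(eps) <= 0):
        raise ValueError("perturbation sizes must be strictly increasing")

    # Trapezoidal rule: interval width times mean accuracy on the interval.
    area = float(np.sum(np.diff(eps) * (acc[:-1] + acc[1:]) / 2.0))

    if normalize:
        # Dividing by the range keeps the score in [0, 1] for accuracies in [0, 1].
        area /= eps[-1] - eps[0]
    return area


# Made-up accuracies at increasing L-infinity budgets (illustration only).
budgets = [0.0, 2 / 255, 4 / 255, 8 / 255]
accuracy = [0.90, 0.70, 0.50, 0.30]
score = area_under_accuracy_curve(budgets, accuracy)
```

A single score of this kind lets two defended models be compared even when one is stronger at small perturbations and the other at large ones, which a fixed perturbation size cannot show.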
Abstract
Deep learning classifiers are now known to have flaws in the representations of their classes. Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model. The most effective methods to defend against such attacks train on generated adversarial examples to learn their distribution. Previous work aimed to align original and adversarial image representations, in the same way as domain adaptation, to improve robustness. Yet these methods only partially align the representations, using approaches that do not reflect the geometry of the space and the distribution. In addition, it is difficult to accurately compare robustness between defended models. Until now, they have been evaluated using a fixed perturbation size. However, defended models may react differently to variations of this perturbation size. In this paper, the analogy of domain adaptation is taken a…
