Adversarially Robust Classification based on GLRT
Bhagyashree Puranik, Upamanyu Madhow, Ramtin Pedarsani

TL;DR
This paper proposes a defense against adversarial attacks on machine learning models based on the generalized likelihood ratio test (GLRT). In binary Gaussian hypothesis testing, it performs competitively with the minimax classifier under the worst-case attack and gives a better robustness-accuracy trade-off under weaker attacks.
Contribution
It introduces a GLRT-based method for adversarial defense that generalizes to complex models where optimal minimax classifiers are unknown.
Findings
GLRT performs comparably to minimax strategies under worst-case attacks.
GLRT offers improved robustness-accuracy trade-offs under weaker attacks.
The approach naturally extends to more complex models without known optimal classifiers.
Abstract
Machine learning models are vulnerable to adversarial attacks that can often cause misclassification by introducing small but well-designed perturbations. In this paper, we explore, in the setting of classical composite hypothesis testing, a defense strategy based on the generalized likelihood ratio test (GLRT), which jointly estimates the class of interest and the adversarial perturbation. We evaluate the GLRT approach for the special case of binary hypothesis testing in white Gaussian noise under norm-bounded adversarial perturbations, a setting for which a minimax strategy optimizing for the worst-case attack is known. We show that the GLRT approach yields performance competitive with that of the minimax approach under the worst-case attack, and observe that it yields a better robustness-accuracy trade-off under weaker attacks, depending on the values of signal…
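The joint estimation described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the two classes are taken to be the antipodal signals +mu and -mu, the perturbation is taken to be bounded in the ℓ∞ norm by eps, and the function names are hypothetical. Under those assumptions, maximizing the Gaussian likelihood over the perturbation for a given class amounts to shrinking each component of the residual toward zero by eps, and the GLRT picks the class with the smaller remaining squared residual. This is not presented as the authors' implementation.

```python
import numpy as np


def glrt_cost(y, signal, eps):
    """Smallest squared distance between y and signal + e over all |e_i| <= eps.

    Each residual component can be reduced in magnitude by at most eps,
    so the minimum is the squared norm of the soft-thresholded residual.
    """
    residual = np.abs(y - signal)
    return float(np.sum(np.maximum(residual - eps, 0.0) ** 2))


def glrt_classify(y, mu, eps):
    """Return +1 if the class +mu explains y at least as well as -mu, else -1.

    Assumption: equal priors and white Gaussian noise, so comparing the
    maximized likelihoods reduces to comparing the two costs.
    """
    return 1 if glrt_cost(y, mu, eps) <= glrt_cost(y, -mu, eps) else -1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = np.full(10, 1.0)   # hypothetical class signal
    eps = 0.5               # hypothetical attack budget
    sigma = 0.5             # hypothetical noise standard deviation

    # An attack that pushes every component toward the opposite class.
    y = mu - eps * np.sign(mu) + sigma * rng.standard_normal(mu.shape)
    print("decision:", glrt_classify(y, mu, eps))
```

Setting eps to 0 reduces the rule to the ordinary minimum-distance classifier, which is one way to sanity-check the sketch.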
