Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning
Jirong Yi, Raghu Mudumbai, Weiyu Xu

TL;DR
This paper develops a theoretical framework for designing optimal adversarial attacks on decision systems using information theory, revealing fundamental vulnerabilities and the impact of redundancy on attack difficulty.
Contribution
It introduces a method to derive optimal adversarial perturbations based on mutual information minimization, advancing understanding of adversarial vulnerability from an information-theoretic perspective.
Findings
Optimal attacks minimize mutual information between signals and labels.
Redundancy in input signals makes adversarial attacks more difficult.
Experimental results support theoretical predictions.
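The redundancy finding can be illustrated with a toy calculation. The sketch below is not the paper's derivation: it assumes a uniform binary label repeated n times, with each copy flipped independently at a fixed probability (a simple random perturbation standing in for an attack). The function name `redundancy_mi` and the flip model are assumptions made for illustration only.

```python
import math

def redundancy_mi(n, flip_prob):
    """I(Y; Z) in bits for a uniform binary label Y repeated n times.

    Each copy is flipped independently with probability flip_prob, and Z is
    the number of ones observed, which is a sufficient statistic for Y
    under this (assumed) flip model.
    """
    mi = 0.0
    for k in range(n + 1):
        # Probability of seeing k ones when the label is 0 or 1.
        p_k_given_0 = math.comb(n, k) * flip_prob**k * (1 - flip_prob)**(n - k)
        p_k_given_1 = math.comb(n, k) * (1 - flip_prob)**k * flip_prob**(n - k)
        p_k = 0.5 * (p_k_given_0 + p_k_given_1)
        for cond in (p_k_given_0, p_k_given_1):
            if cond > 0:
                mi += 0.5 * cond * math.log2(cond / p_k)
    return mi

if __name__ == "__main__":
    for n in (1, 3, 5, 9):
        print(n, redundancy_mi(n, 0.2))
```

Under this assumed model, the information retained about the label grows with the number of copies at a fixed per-copy flip rate, so a perturbation of the same per-coordinate strength degrades a redundant signal less.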
Abstract
We consider the theoretical problem of designing an optimal adversarial attack on a decision system that maximally degrades the achievable performance of the system as measured by the mutual information between the degraded signal and the label of interest. This problem is motivated by the existence of adversarial examples for machine learning classifiers. By adopting an information-theoretic perspective, we seek to identify conditions under which adversarial vulnerability is unavoidable, i.e., even optimally designed classifiers will be vulnerable to small adversarial perturbations. We present derivations of the optimal adversarial attacks for discrete and continuous signals of interest, i.e., finding the optimal perturbation distributions to minimize the mutual information between the degraded signal and a signal following a continuous or discrete distribution. In addition, we show that…
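As a minimal sketch of the performance measure the abstract names, the code below computes the mutual information between a label and a degraded signal from their joint distribution. The setup is an assumption chosen for illustration and is not the paper's optimal attack: a uniform binary label, a signal equal to the label, and a perturbation that flips the signal with probability `flip_prob`. The helper names are hypothetical.

```python
import math

def mutual_information(joint):
    """I(Y; Z) in bits for a joint distribution given as a 2-D list joint[y][z]."""
    p_y = [sum(row) for row in joint]
    p_z = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for y, row in enumerate(joint):
        for z, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (p_y[y] * p_z[z]))
    return mi

def flipped_joint(flip_prob):
    """Joint of a uniform binary label Y and Z = Y flipped with probability flip_prob."""
    return [
        [0.5 * (1 - flip_prob), 0.5 * flip_prob],
        [0.5 * flip_prob, 0.5 * (1 - flip_prob)],
    ]

if __name__ == "__main__":
    for q in (0.0, 0.1, 0.25, 0.5):
        print(q, mutual_information(flipped_joint(q)))
```

In this toy model the mutual information falls from one bit with no perturbation to zero when the flip probability reaches one half, at which point no classifier can recover the label from the degraded signal.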
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Physical Unclonable Functions (PUFs) and Hardware Security
