
TL;DR
This paper reviews challenges and developments in AI safety, focusing on adversarial attacks and defenses and their implications for reliability and interpretability in high-risk applications.
Contribution
It gives an overview of adversarial attack algorithms, defense strategies, and theoretical questions about the existence and computability of attacks, aimed at researchers in applied and computational mathematics.
Findings
Adversarial attacks expose vulnerabilities in deep learning models.
Attack and defense strategies are locked in an ongoing escalation.
Theoretical work addresses the existence and computability of attacks.
Abstract
Over the last decade, adversarial attack algorithms have revealed instabilities in deep learning tools. These algorithms raise issues regarding safety, reliability and interpretability in artificial intelligence, especially in high-risk settings. From a practical perspective, there has been a war of escalation between those developing attack and defence strategies. At a more theoretical level, researchers have also studied bigger-picture questions concerning the existence and computability of attacks. Here we give a brief overview of the topic, focusing on aspects that are likely to be of interest to researchers in applied and computational mathematics.
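The abstract does not single out a particular attack, so the sketch below is only an illustration of the kind of instability it refers to. It applies a one-step sign-gradient perturbation (in the spirit of the fast gradient sign method) to a toy logistic classifier. The model, the weights `w`, the input `x`, the budget `eps`, and the helper names `predict` and `sign_gradient_attack` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b):
    """Return the predicted class (0 or 1) of a logistic model."""
    return int(sigmoid(w @ x + b) >= 0.5)

def sign_gradient_attack(x, y, w, b, eps):
    """One-step sign-gradient perturbation (hypothetical helper).

    For the cross-entropy loss of p = sigmoid(w.x + b) with label y,
    the gradient with respect to the input is (p - y) * w. Stepping by
    eps along its sign increases the loss while changing each
    coordinate of x by at most eps.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Toy setup (assumed): 100 features, weights of magnitude 1 with alternating sign.
d = 100
w = np.where(np.arange(d) % 2 == 0, 1.0, -1.0)
b = 0.0

# A clean input aligned with w: the logit is 0.02 * d = 2, so the class is 1.
x = 0.02 * w
y = 1

# Each coordinate moves by at most eps, yet the logit shifts by eps * ||w||_1 = 5.
eps = 0.05
x_adv = sign_gradient_attack(x, y, w, b, eps)

clean_label = predict(x, w, b)
adv_label = predict(x_adv, w, b)
max_change = float(np.max(np.abs(x_adv - x)))
```

In this toy setting the predicted label flips from 1 to 0 although no coordinate changes by more than `eps`. The effect grows with the input dimension because the logit shift scales with the 1-norm of `w`, which is one simple way to see why high-dimensional models can be sensitive to small perturbations.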
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
