Attack Strength vs. Detectability Dilemma in Adversarial Machine Learning
Christopher Frederickson, Michael Moore, Glenn Dawson, Robi Polikar

TL;DR
This paper explores the balance between attack strength and detectability in adversarial machine learning, showing that stronger attacks are more detectable and proposing a regularization method to reduce detectability.
Contribution
The paper demonstrates a trade-off between attack effectiveness and detectability, and introduces a regularization approach for crafting less detectable adversarial attacks.
Findings
Stronger attacks are more easily detected.
Adding a regularization term reduces attack detectability.
Simple detection algorithms can identify highly effective attacks.
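The trade-off behind these findings can be illustrated with a toy sketch. This is not the paper's formulation, which this summary does not give. The ridge-regression victim, the distance-from-mean detectability proxy, and the names `damage`, `detectability`, `best_poison`, and `lam` are all assumptions made for illustration. The attacker picks, from a fixed candidate set, the poison point that maximizes damage minus `lam` times the detectability proxy.

```python
import numpy as np


def damage(x_poison, y_poison, X, y, ridge=1e-3):
    """Clean-data MSE of a ridge model trained on clean data plus one poison point."""
    X_train = np.vstack([X, x_poison])   # append the poison row
    y_train = np.append(y, y_poison)     # append the poison label
    d = X.shape[1]
    # Closed-form ridge solution on the poisoned training set.
    w = np.linalg.solve(X_train.T @ X_train + ridge * np.eye(d),
                        X_train.T @ y_train)
    return float(np.mean((X @ w - y) ** 2))


def detectability(x_poison, X):
    """Hypothetical outlier score: distance from the clean-data mean."""
    return float(np.linalg.norm(x_poison - X.mean(axis=0)))


def best_poison(candidates, y_poison, X, y, lam):
    """Pick the candidate maximizing damage - lam * detectability."""
    scores = [damage(c, y_poison, X, y) - lam * detectability(c, X)
              for c in candidates]
    return candidates[int(np.argmax(scores))]


# Toy data (assumed): a noiseless linear task with three features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])

# Candidate poison points at increasing distance along one direction.
direction = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
candidates = [s * direction for s in (0.5, 1.0, 2.0, 4.0, 8.0)]

strong = best_poison(candidates, -20.0, X, y, lam=0.0)    # damage only
stealthy = best_poison(candidates, -20.0, X, y, lam=5.0)  # regularized
print("strong:  ", damage(strong, -20.0, X, y), detectability(strong, X))
print("stealthy:", damage(stealthy, -20.0, X, y), detectability(stealthy, X))
```

For any `lam > 0`, the regularized choice can never be farther from the data mean than the unregularized one, and it can never do more damage. That is the same trade-off the findings describe, in miniature.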
Abstract
As the prevalence and everyday use of machine learning algorithms, along with our reliance on these algorithms, grow dramatically, so do the efforts to attack and undermine these algorithms with malicious intent, resulting in a growing interest in adversarial machine learning. A number of approaches have been developed that can render a machine learning algorithm ineffective through poisoning or other types of attacks. Most attack algorithms use sophisticated optimization approaches, whose objective function is designed to cause maximum damage to the accuracy and performance of the algorithm on some task. In this effort, we show that while such an objective function is indeed brutally effective in causing maximum damage on an embedded feature selection task, it often results in an attack mechanism that can be easily detected with an embarrassingly simple…
