TL;DR
This paper systematically studies the interaction between adversarial inputs and poisoned models in deep learning, revealing mutual reinforcement effects that amplify attack effectiveness and discussing countermeasures.
Contribution
It introduces a unified framework for jointly optimizing adversarial inputs and poisoned models, uncovering their mutual reinforcement and implications for attack enhancement.
Findings
Mutual reinforcement amplifies attack effectiveness.
Joint optimization enables more evasive attacks.
Potential countermeasures against the enhanced attacks are discussed, along with the challenges they face.
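The mutual-reinforcement finding can be illustrated with a toy sketch. This is not the paper's framework or code: it assumes a two-feature linear classifier, an attack loss of `-target * score`, and L-infinity budgets `eps_x` (input perturbation) and `eps_w` (weight perturbation, standing in for how far a poisoned model may drift from the clean one). All names and numbers are hypothetical and chosen for illustration only.

```python
def score(w, b, x):
    """Linear classifier score; the predicted class is the sign of the score."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    return (v > 0) - (v < 0)


def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def joint_attack(w0, b, x0, target, eps_x, eps_w, step=0.1, iters=20):
    """Projected signed-gradient descent on loss = -target * score,
    updating the input x and the weights w together.
    Setting eps_x = 0 or eps_w = 0 freezes that attack vector."""
    x, w = list(x0), list(w0)
    for _ in range(iters):
        # d(loss)/dx = -target * w, so a descent step moves x along target * sign(w).
        x_new = [clamp(xi + step * target * sign(wi), x0i - eps_x, x0i + eps_x)
                 for xi, wi, x0i in zip(x, w, x0)]
        # d(loss)/dw = -target * x, so a descent step moves w along target * sign(x).
        w_new = [clamp(wi + step * target * sign(xi), w0i - eps_w, w0i + eps_w)
                 for wi, xi, w0i in zip(w, x, w0)]
        x, w = x_new, w_new
    return x, w


# Hypothetical clean model and input: the clean score is positive (class +1).
w0, b, x0 = [0.6, -0.6], -0.35, [1.0, -1.0]
target = -1  # the attacker wants a negative score

x_adv, _ = joint_attack(w0, b, x0, target, eps_x=0.5, eps_w=0.0)
_, w_poison = joint_attack(w0, b, x0, target, eps_x=0.0, eps_w=0.3)
x_joint, w_joint = joint_attack(w0, b, x0, target, eps_x=0.5, eps_w=0.3)

s_input_only = score(w0, b, x_adv)        # adversarial input, clean model
s_model_only = score(w_poison, b, x0)     # clean input, perturbed model
s_joint = score(w_joint, b, x_joint)      # both vectors combined

print("input-only flips prediction:", s_input_only < 0)
print("model-only flips prediction:", s_model_only < 0)
print("joint attack flips prediction:", s_joint < 0)
```

In this toy setting neither attack vector flips the prediction within its own budget, while the two together do. That is the kind of amplification the findings describe, though the paper studies it on deep networks rather than a linear model.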
Abstract
Despite their tremendous success in a range of domains, deep learning systems are inherently susceptible to two types of manipulations: adversarial inputs -- maliciously crafted samples that deceive target deep neural network (DNN) models, and poisoned models -- adversely forged DNNs that misbehave on pre-defined inputs. While prior work has intensively studied the two attack vectors in parallel, there is still a lack of understanding about their fundamental connections: What are the dynamic interactions between the two attack vectors? What are the implications of such interactions for optimizing existing attacks? What are the potential countermeasures against the enhanced attacks? Answering these key questions is crucial for assessing and mitigating the holistic vulnerabilities of DNNs deployed in realistic settings. Here we take a solid step towards this goal by conducting the first…
