Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap?
Nathan Inkawhich, Kevin J Liang, Jingyang Zhang, Huanrui Yang, Hai Li, Yiran Chen

TL;DR
This paper introduces a novel method for targeted adversarial attacks that transfer between models with disjoint label spaces, demonstrating effectiveness and improved efficiency in complex blackbox scenarios.
Contribution
It presents a new approach using class correspondence matrices and proxy class representations to enable transfer attacks across models with no label overlap.
Findings
Transfer attacks succeed despite label space disjointness.
Attack success depends on data properties.
Combining transfer and query-based methods improves efficiency.
Abstract
We design blackbox transfer-based targeted adversarial attacks for an environment where the attacker's source model and the target blackbox model may have disjoint label spaces and training datasets. This scenario significantly differs from the "standard" blackbox setting, and warrants a unique approach to the attacking process. Our methodology begins with the construction of a class correspondence matrix between the whitebox and blackbox label sets. During the online phase of the attack, we then leverage representations of highly related proxy classes from the whitebox distribution to fool the blackbox model into predicting the desired target class. Our attacks are evaluated in three complex and challenging test environments where the source and target models have varying degrees of conceptual overlap amongst their unique categories. Ultimately, we find that it is indeed possible to…
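The two steps named in the abstract (building a class correspondence matrix, then choosing a related proxy class) can be illustrated with a minimal sketch. It assumes the matrix is estimated by sending labeled whitebox-domain images to the blackbox model and counting which blackbox class each whitebox class is mapped to; the abstract does not spell out the construction, so this is an assumption, and the function names `correspondence_matrix` and `proxy_class` and the toy data are hypothetical.

```python
import numpy as np

def correspondence_matrix(source_labels, blackbox_preds, n_source, n_target):
    """Row-normalized counts: entry [i, j] is the fraction of whitebox
    class i samples that the blackbox model predicted as its class j."""
    counts = np.zeros((n_source, n_target))
    # Accumulate one count per (whitebox label, blackbox prediction) pair.
    np.add.at(counts, (source_labels, blackbox_preds), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Leave rows with no samples as all zeros instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def proxy_class(matrix, target_class):
    """Whitebox class most consistently mapped to the blackbox target class."""
    return int(np.argmax(matrix[:, target_class]))

# Toy data (hypothetical): 3 whitebox classes, 2 blackbox classes.
source_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2])
blackbox_preds = np.array([1, 1, 0, 0, 0, 0, 1, 0])

M = correspondence_matrix(source_labels, blackbox_preds, n_source=3, n_target=2)
print(M)
print("proxy for blackbox class 1:", proxy_class(M, 1))
```

In a sketch like this, the attacker would then craft a targeted perturbation on the whitebox model toward the selected proxy class, in the hope that the blackbox model predicts the corresponding target class.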
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications
