Towards Predicting the Success of Transfer-based Attacks by Quantifying Shared Feature Representations
Ashley S. Dale, Mei Qiu, Foo Bin Che, Thomas Bsaibes, Lauren, Christopher, Paul Salama

TL;DR
This paper investigates whether shared feature representations between models can predict the success of transfer-based attacks in black-box settings, using a novel methodology to quantify feature similarity and its correlation with attack transferability.
Contribution
It introduces a new method to predict attack success by measuring shared feature representations, confirming their correlation with transfer attack effectiveness.
Findings
Shared feature representations moderately correlate with attack success (correlation coefficient = 0.56)
Shared features exist across models of different sizes and complexities
Datasets from various domains can be used to interpret black-box feature representations
Abstract
Much effort has been made to explain and improve the success of transfer-based attacks (TBA) on black-box computer vision models. This work provides the first attempt at a priori prediction of attack success by identifying the presence of vulnerable features within target models. Recent work by Chen and Liu (2024) proposed the manifold attack model, a unifying framework proposing that successful TBA exist in a common manifold space. Our work experimentally tests the common manifold space hypothesis using a new methodology: first, projecting feature vectors from surrogate and target feature extractors trained on ImageNet onto the same low-dimensional manifold; second, quantifying any observed structure similarities on the manifold; and finally, relating these observed similarities to the success of the TBA. We find that shared feature representation moderately correlates with increased…
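The three steps named in the abstract (project, quantify, relate) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the projection method, the similarity measure, or the correlation statistic, so PCA, linear centered kernel alignment (CKA), and Pearson's r are stand-ins chosen here, and the function names `pca_project`, `linear_cka`, and `similarity_success_correlation` are hypothetical. The features in the demo are synthetic, not outputs of any real model.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project row-vector features onto their top principal components.

    Assumption: PCA stands in for the paper's (unspecified here) manifold projection.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def linear_cka(x, y):
    """Linear CKA between two embeddings of the same inputs (rows aligned).

    Assumption: CKA stands in for the paper's structure-similarity measure.
    """
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def similarity_success_correlation(similarities, success_rates):
    """Pearson correlation between per-pair similarity and attack success rate."""
    return float(np.corrcoef(similarities, success_rates)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: both "models" see the same 200 inputs through a shared latent.
    latent = rng.normal(size=(200, 4))
    surrogate = latent @ rng.normal(size=(4, 64))
    for noise in (0.1, 1.0, 10.0):
        target = latent @ rng.normal(size=(4, 32)) + noise * rng.normal(size=(200, 32))
        score = linear_cka(pca_project(surrogate, 2), pca_project(target, 2))
        print(f"noise={noise}: CKA={score:.3f}")
    # Success rates must come from real attack runs; none are invented here.
```

Linear CKA is invariant to rotations of either embedding, which is why the two feature sets can be projected separately in this sketch even when the surrogate and target extractors have different feature widths. With measured success rates for several surrogate-target pairs, `similarity_success_correlation` would give the kind of coefficient reported in the findings.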
Taxonomy
Topics: Suicide and Self-Harm Studies · Forensic and Genetic Research · Spam and Phishing Detection
