Target Model Agnostic Adversarial Attacks with Query Budgets on Language Understanding Models
Jatin Chauhan, Karan Bhukar, Manohar Kaul

TL;DR
This paper introduces a target-model-agnostic adversarial attack on language understanding models. The attack transfers well across pre-trained models and remains effective under a limited query budget, two constraints that real-world black-box attackers face.
Contribution
The paper proposes a black-box adversarial attack that is agnostic to the target model, transfers across different pre-trained models, and remains effective when the number of queries is limited.
Findings
Achieves high transferability across different models.
Effective under strict query budget constraints.
Generates more transferable adversarial sentences than baseline attack methods under limited query budgets.
Abstract
Despite significant improvements in natural language understanding models with the advent of models like BERT and XLNet, these neural-network based classifiers are vulnerable to blackbox adversarial attacks, where the attacker is only allowed to query the target model outputs. We add two more realistic restrictions on the attack methods, namely limiting the number of queries allowed (query budget) and crafting attacks that easily transfer across different pre-trained models (transferability), which render previous attack models impractical and ineffective. Here, we propose a target model agnostic adversarial attack method with a high degree of attack transferability across the attacked models. Our empirical studies show that in comparison to baseline methods, our method generates highly transferable adversarial sentences under the restriction of limited query budgets.
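The two restrictions named in the abstract can be made concrete with a small sketch. The code below is not the authors' method; it only illustrates the threat model, assuming a classifier is a function from a sentence to an integer label. The names `BudgetedModel`, `QueryBudgetExceeded`, and `transfer_rate` are hypothetical and chosen for illustration.

```python
from typing import Callable, Sequence


class QueryBudgetExceeded(RuntimeError):
    """Raised when the attacker asks for more queries than allowed."""


class BudgetedModel:
    """Wraps a black-box classifier and counts every query made to it.

    Hypothetical helper: the attacker sees only the predicted label and
    may call the model at most `budget` times.
    """

    def __init__(self, predict: Callable[[str], int], budget: int) -> None:
        self._predict = predict
        self.budget = budget
        self.queries_used = 0

    def __call__(self, text: str) -> int:
        if self.queries_used >= self.budget:
            raise QueryBudgetExceeded(f"budget of {self.budget} queries exhausted")
        self.queries_used += 1
        return self._predict(text)


def transfer_rate(
    adversarial: Sequence[str],
    true_labels: Sequence[int],
    other_model: Callable[[str], int],
) -> float:
    """Fraction of adversarial sentences that a second model, not used
    while crafting them, also misclassifies."""
    if not adversarial:
        return 0.0
    fooled = sum(other_model(t) != y for t, y in zip(adversarial, true_labels))
    return fooled / len(adversarial)


# Toy stand-in classifier (assumption): label 1 if the text contains "good".
def toy_model(text: str) -> int:
    return int("good" in text)


target = BudgetedModel(toy_model, budget=2)
target("a good film")   # first query
target("a fine film")   # second query; a third call would raise

rate = transfer_rate(["a fine film", "a good film"], [1, 1], toy_model)
```

Under this framing, an attack that needs many queries per sentence fails as soon as the wrapper raises, and an attack that overfits to one target scores a low `transfer_rate` on any other model.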
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Linear Layer · Byte Pair Encoding · Attention Is All You Need · Adam · Linear Warmup With Linear Decay · SentencePiece · Residual Connection · WordPiece
