A Strong Baseline for Query Efficient Attacks in a Black Box Setting

Rishabh Maheshwary; Saket Maheshwary; Vikram Pudi

arXiv:2109.04775·cs.CL·September 13, 2021

TL;DR

This paper introduces a query-efficient black box attack on NLP models that combines an attention mechanism with locality sensitive hashing (LSH), reducing the query count by 75% on average and improving the success rate when queries are limited.

Contribution

The paper presents an attack strategy that jointly uses an attention mechanism and locality sensitive hashing (LSH) to improve query efficiency in black box NLP adversarial attacks, and it compares search methods within a consistent search space.

Findings

01 Reduces query count by 75% on average

02 Achieves higher success rate in limited query scenarios

03 Demonstrates effectiveness across multiple datasets and models

Abstract

Existing black box search methods have achieved high success rates in generating adversarial attacks against NLP models. However, such search methods are inefficient, as they do not consider the number of queries required to generate adversarial attacks. Also, prior attacks do not maintain a consistent search space when comparing different search methods. In this paper, we propose a query efficient attack strategy to generate plausible adversarial examples on text classification and entailment tasks. Our attack jointly leverages an attention mechanism and locality sensitive hashing (LSH) to reduce the query count. We demonstrate the efficacy of our approach by comparing our attack with four baselines across three different search spaces. Further, we benchmark our results across the same search space used in prior attacks. In comparison to the attacks proposed, on average, we are able to…
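
The abstract does not spell out how LSH is applied, so the following is only a general illustration of why hashing can cut the number of model queries, not the authors' implementation. It is a minimal sketch, assuming each candidate perturbed input is already represented as an embedding vector: random-hyperplane LSH groups similar candidates into buckets, and only one representative per bucket would be sent to the target model. The function names `lsh_buckets` and `pick_representatives` and the parameters `num_bits` and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def lsh_buckets(vectors, num_bits=8, seed=0):
    """Group vectors by the sign pattern of random-hyperplane projections.

    Hypothetical helper: vectors that point in similar directions tend to
    share a signature and therefore land in the same bucket.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One random Gaussian hyperplane per signature bit.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        # Each bit records which side of a hyperplane the vector falls on.
        signature = tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )
        buckets[signature].append(idx)
    return list(buckets.values())


def pick_representatives(vectors, num_bits=8, seed=0):
    """Return one candidate index per bucket.

    Assumption for illustration: only these indices would be queried against
    the black box model, instead of every candidate.
    """
    return [bucket[0] for bucket in lsh_buckets(vectors, num_bits, seed)]


# Toy candidates: the first two are identical, the third points the other way.
candidates = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
representatives = pick_representatives(candidates)
print(f"{len(representatives)} queries instead of {len(candidates)}")
```

With more signature bits the buckets become finer, so more representatives are queried; fewer bits merge more candidates and save more queries at the cost of coarser grouping.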

Code & Models

Repositories

rishabhmaheshwary/query-attack
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Spam and Phishing Detection