Adv-OLM: Generating Textual Adversaries via OLM

Vijit Malik; Ashwani Bhat; Ashutosh Modi

arXiv:2101.08523·cs.CL·January 22, 2021

TL;DR

Adv-OLM is a black-box attack method that uses Occlusion and Language Models (OLM) to rank the words of a sentence and then substitutes the top-ranked words to produce adversarial examples against NLP text classifiers.

Contribution

This paper presents Adv-OLM, a black-box attack that adapts OLM to current state-of-the-art attack methods by using it as the word-ranking step. The authors report that it outperforms other attack methods on several text classification tasks.

Findings

01. Adv-OLM outperforms other attack methods on several text classification tasks.

02. OLM is used to rank the words of a sentence; the ranked words are then replaced using word replacement strategies.

03. Analyzing such attacks on state-of-the-art NLP transformers can help improve their robustness to adversarial inputs.

Abstract

Deep learning models are susceptible to adversarial examples: inputs that carry imperceptible perturbations of the original and can be used to attack these models. Analysis of these attacks on state-of-the-art transformers in NLP can help improve the robustness of these models against such adversarial inputs. In this paper, we present Adv-OLM, a black-box attack method that adapts the idea of Occlusion and Language Models (OLM) to the current state-of-the-art attack methods. OLM is used to rank the words of a sentence, which are later substituted using word replacement strategies. We experimentally show that our approach outperforms other attack methods on several text classification tasks.
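The rank-then-substitute pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the usual OLM formulation, in which a word's relevance is the model's score on the original input minus the average score when that word is replaced by language-model samples. The classifier, the candidate generator standing in for a language model, and the synonym table are all hypothetical toys chosen so the example runs without dependencies.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical toy sentiment model: returns P(positive) from word weights.
WEIGHTS: Dict[str, float] = {"great": 2.0, "good": 1.0, "fine": 0.2, "bad": -1.0}
BIAS = -0.5

def toy_classifier(tokens: Sequence[str]) -> float:
    score = BIAS + sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))

def toy_lm_candidates(tokens: Sequence[str], i: int) -> List[str]:
    # Stand-in for sampling replacements from a language model given the
    # context around position i. This toy version ignores the context.
    return ["fine", "movie", "the", "was"]

def olm_relevance(tokens: Sequence[str],
                  predict: Callable[[Sequence[str]], float],
                  candidates_fn: Callable[[Sequence[str], int], List[str]]) -> List[float]:
    """Score on the original input minus the mean score over replacements."""
    tokens = list(tokens)
    base = predict(tokens)
    scores = []
    for i in range(len(tokens)):
        cands = candidates_fn(tokens, i)
        avg = sum(predict(tokens[:i] + [c] + tokens[i + 1:]) for c in cands) / len(cands)
        scores.append(base - avg)
    return scores

def rank_words(tokens, predict, candidates_fn) -> List[int]:
    """Word indices ordered from most to least relevant."""
    scores = olm_relevance(tokens, predict, candidates_fn)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def greedy_attack(tokens, predict, candidates_fn,
                  synonyms: Dict[str, List[str]],
                  threshold: float = 0.5) -> Optional[List[str]]:
    """Substitute words in ranked order until the score drops below threshold.

    Assumes predict returns the probability of the original (positive) label.
    Returns the perturbed tokens, or None if no substitution sequence succeeds.
    """
    tokens = list(tokens)
    for i in rank_words(tokens, predict, candidates_fn):
        best = min(synonyms.get(tokens[i], []),
                   key=lambda s: predict(tokens[:i] + [s] + tokens[i + 1:]),
                   default=None)
        if best is None:
            continue
        if predict(tokens[:i] + [best] + tokens[i + 1:]) < predict(tokens):
            tokens[i] = best
        if predict(tokens) < threshold:
            return tokens
    return None

# Hypothetical synonym table used as the word replacement strategy.
SYNONYMS = {"great": ["fine", "good"]}
sentence = ["the", "movie", "was", "great"]
adversarial = greedy_attack(sentence, toy_classifier, toy_lm_candidates, SYNONYMS)
print(adversarial)  # ['the', 'movie', 'was', 'fine']
```

The attack only calls `predict`, so it needs no gradients or internals of the target model, which is what makes the setting black-box. In a real setup the candidate generator would sample from an actual language model and the replacement strategy would come from an existing attack method.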

Code & Models

Repositories

vijit-m/Adv-OLM (PyTorch, official)
