White-Box Multi-Objective Adversarial Attack on Dialogue Generation

Yufei Li; Zexin Li; Yingfan Gao; Cong Liu

arXiv:2305.03655 · cs.CL · May 9, 2023 · 1 citation

TL;DR

This paper introduces DGSlow, a white-box multi-objective adversarial attack that uses gradient-based optimization to craft adversarial inputs, pushing dialogue generation models toward irrelevant, lengthy, and repetitive responses.

Contribution

The paper presents a novel multi-objective attack approach for dialogue systems that balances generation accuracy and length, improving attack success and transferability.

Findings

01. DGSlow significantly reduces dialogue model performance.

02. The attack achieves higher success rates than traditional methods.

03. Crafted adversarial samples transfer well across models.

Abstract

Pre-trained transformers are popular in state-of-the-art dialogue generation (DG) systems. Such language models are, however, vulnerable to various adversarial samples as studied in traditional tasks such as text classification, which inspires our curiosity about their robustness in DG systems. One main challenge of attacking DG models is that perturbations on the current sentence can hardly degrade the response accuracy because the unchanged chat histories are also considered for decision-making. Instead of merely pursuing pitfalls of performance metrics such as BLEU and ROUGE, we observe that crafting adversarial samples to force longer generation outputs benefits attack effectiveness -- the generated responses are typically irrelevant, lengthy, and repetitive. To this end, we propose a white-box multi-objective attack method called DGSlow. Specifically, DGSlow balances two objectives --…
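
To make the idea of trading off response accuracy against response length more concrete, here is a minimal sketch, not the authors' implementation. It assumes we already have, for each candidate adversarial input, the model's per-token probabilities for the reference response and its per-step end-of-sequence (EOS) probabilities. The weighted-sum combination, the `weight` parameter, and all function names are illustrative assumptions. The sketch also only scores candidates; it does not perform the gradient-based search the abstract refers to.

```python
import math

def accuracy_loss(ref_token_probs):
    """Mean negative log-likelihood of the reference response tokens.
    A larger value means the model is less likely to give the reference reply."""
    return -sum(math.log(p) for p in ref_token_probs) / len(ref_token_probs)

def eos_loss(eos_probs):
    """Mean probability the model assigns to EOS across decoding steps.
    A smaller value means the model is less inclined to stop generating."""
    return sum(eos_probs) / len(eos_probs)

def attack_score(ref_token_probs, eos_probs, weight=0.5):
    """Hypothetical combined score (assumption: a simple weighted sum).
    Higher is better for the attacker: lower accuracy, lower EOS probability."""
    return weight * accuracy_loss(ref_token_probs) - (1.0 - weight) * eos_loss(eos_probs)

def pick_candidate(candidates, weight=0.5):
    """Return the candidate perturbation with the highest combined score."""
    return max(
        candidates,
        key=lambda c: attack_score(c["ref_token_probs"], c["eos_probs"], weight),
    )

# Toy, made-up probabilities for two candidate perturbations.
candidates = [
    {"name": "A", "ref_token_probs": [0.9, 0.9], "eos_probs": [0.5, 0.5]},
    {"name": "B", "ref_token_probs": [0.1, 0.1], "eos_probs": [0.01, 0.01]},
]
best = pick_candidate(candidates)
print(best["name"])
```

With these toy numbers, candidate B wins on both terms: the reference tokens are less likely and the model is less likely to emit EOS. A real white-box attack would instead use model gradients to decide which tokens to perturb, which this sketch leaves out.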

Code & Models

Repositories

yul091/dgslow (PyTorch, Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications