PromptAttack: Probing Dialogue State Trackers with Adversarial Prompts

Xiangjue Dong; Yun He; Ziwei Zhu; James Caverlee

arXiv:2306.04535 · cs.CL · June 8, 2023 · 1 citation

TL;DR

This paper introduces a prompt-based adversarial attack for evaluating and improving dialogue state trackers (DSTs). In the paper's experiments, the attack yields the greatest reduction in accuracy and the best attack success rate, and the generated examples can be used for adversarial training to strengthen a DST.

Contribution

It presents a novel prompt-based approach for generating natural-language adversarial examples that target DSTs using only the tracker's output, with no access to model parameters.
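The output-only setting described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_tracker` is a hypothetical keyword-based stand-in for a DST, the candidate utterances are hand-written rather than produced by the paper's prompt-based generator (which is not reproduced here), and `black_box_probe` is an illustrative name. The point is only that the probe calls the tracker and compares its predicted state with the gold state, never touching weights or gradients.

```python
from typing import Callable, Dict, Iterable, Optional

DialogueState = Dict[str, str]


def toy_tracker(utterance: str) -> DialogueState:
    """Hypothetical stand-in DST: fills one slot by exact keyword match."""
    state: DialogueState = {}
    tokens = utterance.lower().split()
    for area in ("north", "south", "centre"):
        if area in tokens:
            state["hotel-area"] = area
    return state


def black_box_probe(
    tracker: Callable[[str], DialogueState],
    gold_state: DialogueState,
    candidates: Iterable[str],
) -> Optional[str]:
    """Return the first candidate whose predicted state differs from gold.

    Only the tracker's output is inspected; no parameters are accessed.
    """
    for candidate in candidates:
        if tracker(candidate) != gold_state:
            return candidate
    return None


if __name__ == "__main__":
    gold = {"hotel-area": "north"}
    # Hand-written candidates (assumption); a real attack would generate them.
    candidates = [
        "i need a hotel in the north please",
        "i need a hotel in the north, thanks",
    ]
    print(black_box_probe(toy_tracker, gold, candidates))
```

In this toy setup the second candidate succeeds because the attached comma defeats the exact-match lookup, while a human reader would still infer the same goal.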

Findings

01. Achieves high attack success rate on state-of-the-art DSTs

02. Maintains fluency and low perturbation in adversarial examples

03. Adversarial training improves DST robustness

Abstract

A key component of modern conversational systems is the Dialogue State Tracker (or DST), which models a user's goals and needs. Toward building more robust and reliable DSTs, we introduce a prompt-based learning approach to automatically generate effective adversarial examples to probe DST models. Two key characteristics of this approach are: (i) it only needs the output of the DST with no need for model parameters, and (ii) it can learn to generate natural language utterances that can target any DST. Through experiments over state-of-the-art DSTs, the proposed framework leads to the greatest reduction in accuracy and the best attack success rate while maintaining good fluency and a low perturbation ratio. We also show how much the generated adversarial examples can bolster a DST through adversarial training. These results indicate the strength of prompt-based attacks on DSTs and leave…
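The abstract reports attack success rate and perturbation ratio without defining them on this page. The sketch below shows one common way such quantities are computed; the definitions are assumptions, not taken from the paper. Here, attack success rate is the share of examples the tracker originally got right that it gets wrong after the attack, and perturbation ratio is the token-level edit distance between the original and adversarial utterance divided by the original length. All function names are illustrative.

```python
from typing import Sequence


def token_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over tokens (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def perturbation_ratio(original: str, adversarial: str) -> float:
    """Assumed definition: token edits divided by original token count."""
    orig_tokens = original.split()
    if not orig_tokens:
        raise ValueError("original utterance must contain at least one token")
    return token_edit_distance(orig_tokens, adversarial.split()) / len(orig_tokens)


def attack_success_rate(
    correct_before: Sequence[bool], correct_after: Sequence[bool]
) -> float:
    """Assumed definition: fraction of initially correct examples that flip."""
    attackable = [after for before, after in zip(correct_before, correct_after) if before]
    if not attackable:
        return 0.0
    return sum(not after for after in attackable) / len(attackable)


if __name__ == "__main__":
    print(perturbation_ratio("book a cheap hotel", "book a cheap nice hotel"))
    print(attack_success_rate([True, True, False, True], [False, True, False, False]))
```

Restricting the success rate to examples that were correct before the attack keeps pre-existing tracker errors from being counted as attack wins.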

Code & Models

Repositories

dongxiangjue/promptattack (Official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems

Methods: Dialogue State Tracking