Prompt Perturbation Consistency Learning for Robust Language Models
Yao Qiang, Subhrangshu Nandi, Ninareh Mehrabi, Greg Ver Steeg, Anoop Kumar, Anna Rumshisky, Aram Galstyan

TL;DR
This paper introduces Prompt Perturbation Consistency Learning (PPCL), a method that improves the robustness of large language models on sequence labeling tasks by regularizing their responses to input perturbations. It recovers 59% of the perturbation-induced performance drop on intent classification (IC) and 69% on slot filling (SF).
Contribution
The paper demonstrates that fine-tuned large language models can match discriminative models on IC-SF tasks, analyzes how their performance degrades under input perturbations, and proposes PPCL to improve perturbation resilience with less data.
Findings
PPCL recovers 59% of the performance drop on IC tasks.
PPCL recovers 69% of the performance drop on SF tasks.
PPCL outperforms data augmentation while using fewer samples.
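
A "recovered share of the performance drop" can be read as the fraction of the gap between clean and perturbed scores that a method closes. The sketch below shows one plausible way to compute it. The formula is an assumption about how such figures are typically derived, not a definition taken from this summary, and the function name and the scores in the usage note are hypothetical.

```python
def recovered_fraction(clean: float, perturbed_baseline: float,
                       perturbed_with_method: float) -> float:
    """Fraction of the clean-to-perturbed drop that a method wins back.

    All three arguments are scores on the same metric (for example accuracy
    or F1). This definition is an assumption made for illustration.
    """
    drop = clean - perturbed_baseline
    if drop <= 0:
        raise ValueError("baseline shows no performance drop to recover")
    return (perturbed_with_method - perturbed_baseline) / drop
```

With made-up scores of 90.0 on clean inputs, 80.0 on perturbed inputs for the baseline, and 85.9 on perturbed inputs after applying the method, `recovered_fraction(90.0, 80.0, 85.9)` is approximately 0.59, which would be reported as a 59% recovery.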
Abstract
Large language models (LLMs) have demonstrated impressive performance on a number of natural language processing tasks, such as question answering and text summarization. However, their performance on sequence labeling tasks such as intent classification and slot filling (IC-SF), which is a central component in personal assistant systems, lags significantly behind discriminative models. Furthermore, there is a lack of substantive research on the robustness of LLMs to various perturbations in the input prompts. The contributions of this paper are three-fold. First, we show that fine-tuning sufficiently large LLMs can produce IC-SF performance comparable to discriminative models. Next, we systematically analyze the performance deterioration of those fine-tuned models due to three distinct yet relevant types of input perturbations - oronyms, synonyms, and paraphrasing. Finally, we propose…
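
The abstract is cut off before it describes the method, but the TL;DR says PPCL regularizes model responses to input perturbations. A minimal sketch of that general idea, consistency regularization between a clean and a perturbed input, might look like the following. The choice of Jensen-Shannon divergence, the `weight` parameter, and all function names are assumptions made for illustration; they are not confirmed by this page.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Symmetric Jensen-Shannon divergence, bounded above by ln 2."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label index."""
    return -math.log(probs[label])


def consistency_regularized_loss(clean_logits, perturbed_logits, label, weight=1.0):
    """Task loss on both inputs plus a penalty when their predictions disagree.

    `weight` (hypothetical) trades off the task loss against the consistency term.
    """
    p_clean = softmax(clean_logits)
    p_pert = softmax(perturbed_logits)
    task = cross_entropy(p_clean, label) + cross_entropy(p_pert, label)
    return task + weight * js_divergence(p_clean, p_pert)
```

When the model produces identical predictions for the clean and perturbed inputs, the divergence term is zero and only the task loss remains. The penalty grows as the two output distributions drift apart, which is what pushes the model toward perturbation-invariant predictions.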
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
