Tougher Text, Smarter Models: Raising the Bar for Adversarial Defence Benchmarks
Yang Wang, Chenghua Lin

TL;DR
This paper introduces a comprehensive benchmark for evaluating textual adversarial defenses across multiple datasets, tasks, and models, aiming to improve robustness in natural language processing systems.
Contribution
It provides an extensive, standardized benchmark for assessing state-of-the-art adversarial defense mechanisms in NLP, covering diverse datasets and tasks.
Findings
The benchmark reveals the strengths and weaknesses of current defenses.
Evaluation across multiple tasks highlights areas that need improvement.
The benchmark establishes a new standard for future adversarial robustness research.
Abstract
Recent advancements in natural language processing have highlighted the vulnerability of deep learning models to adversarial attacks. While various defence mechanisms have been proposed, there is a lack of comprehensive benchmarks that evaluate these defences across diverse datasets, models, and tasks. In this work, we address this gap by presenting an extensive benchmark for textual adversarial defence that significantly expands upon previous work. Our benchmark incorporates a wide range of datasets, evaluates state-of-the-art defence mechanisms, and extends the assessment to include critical tasks such as single-sentence classification, similarity and paraphrase identification, natural language inference, and commonsense reasoning. This work not only serves as a valuable resource for researchers and practitioners in the field of adversarial robustness but also identifies key areas for…
Taxonomy
Topics: Digital and Cyber Forensics · Adversarial Robustness in Machine Learning
