CERT-ED: Certifiably Robust Text Classification for Edit Distance

Zhuoqun Huang, Neil G Marchant, Olga Ohrimenko, Benjamin I. P. Rubinstein

arXiv:2408.00728·cs.CL·August 2, 2024

Open Access

TL;DR

This paper introduces CERT-ED, a certifiably robust text classification method that extends randomized smoothing to certify against all edit operations (insertion, deletion, and substitution), improving robustness to adversarial attacks in NLP.

Contribution

The paper adapts Randomized Deletion (Huang et al., 2023) to propose CERT-ED, a certification method covering all edit operations in natural language classification, and reports that it outperforms the Hamming distance method RanMASK on 4 out of 5 datasets.
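As a rough illustration of the idea behind a deletion-based smoothed classifier, the sketch below randomly deletes tokens from the input, queries a base classifier on each perturbed copy, and returns the majority vote. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `random_deletion`, `smoothed_predict`, and `toy_classifier`, the deletion probability `p_del`, and the sample count are all hypothetical, and the certification step that would turn vote statistics into a certified edit distance radius is omitted.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def random_deletion(tokens: Sequence[str], p_del: float, rng: random.Random) -> List[str]:
    """Delete each token independently with probability p_del (assumed noise model)."""
    return [t for t in tokens if rng.random() >= p_del]


def smoothed_predict(
    base_classifier: Callable[[List[str]], str],
    tokens: Sequence[str],
    p_del: float = 0.5,
    n_samples: int = 100,
    seed: int = 0,
) -> str:
    """Majority vote of the base classifier over randomly deleted copies of the input."""
    rng = random.Random(seed)
    votes = Counter(
        base_classifier(random_deletion(tokens, p_del, rng)) for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]


def toy_classifier(tokens: List[str]) -> str:
    """Hypothetical stand-in for a black-box model: keyword-based sentiment."""
    return "positive" if "good" in tokens else "negative"
```

Because the wrapper only calls `base_classifier` on perturbed inputs, any black-box model with the same call signature could be substituted for `toy_classifier`.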

Findings

1. CERT-ED outperforms RanMASK in 4 out of 5 datasets.

2. CERT-ED improves empirical robustness in 38 out of 50 attack settings.

3. The evaluation covers various threat models, including direct and transfer attacks.

Abstract

With the growing integration of AI in daily life, ensuring the robustness of systems to inference-time attacks is crucial. Among the approaches for certifying robustness to such adversarial examples, randomized smoothing has emerged as highly promising due to its nature as a wrapper around arbitrary black-box models. Previous work on randomized smoothing in natural language processing has primarily focused on specific subsets of edit distance operations, such as synonym substitution or word insertion, without exploring the certification of all edit operations. In this paper, we adapt Randomized Deletion (Huang et al., 2023) and propose the CERTified Edit Distance defense (CERT-ED) for natural language classification. Through comprehensive experiments, we demonstrate that CERT-ED outperforms the existing Hamming distance method RanMASK (Zeng et al., 2023) in 4 out of 5 datasets in terms of…
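To make the distinction between Hamming distance and edit distance concrete, the following sketch computes both at the token level. It uses the standard Levenshtein dynamic program (insertions, deletions, and substitutions each cost 1); the function names `edit_distance` and `hamming_distance` are illustrative and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # delete x
                    curr[j - 1] + 1,         # insert y
                    prev[j - 1] + (x != y),  # substitute x with y (free if equal)
                )
            )
        prev = curr
    return prev[-1]


def hamming_distance(a: Sequence, b: Sequence) -> int:
    """Number of differing positions; only defined for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))
```

For example, changing "the movie was great" to "the movie was not great" is a single word insertion, so `edit_distance` on the two token lists returns 1, while `hamming_distance` raises an error because the lengths differ. A substitution-only threat model cannot describe that perturbation, which is the gap a certificate over all edit operations is meant to close.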

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Randomized Smoothing · Randomized Deletion