Certified Robustness Under Bounded Levenshtein Distance

Elias Abad Rocamora; Grigorios G. Chrysos; Volkan Cevher

arXiv:2501.13676·cs.LG·February 21, 2025

TL;DR

This paper introduces LipsLev, a novel method to efficiently compute certified robustness of text classifiers against bounded Levenshtein distance perturbations, significantly improving speed and accuracy over previous methods.

Contribution

We develop the first approach to estimate Lipschitz constants for convolutional classifiers with respect to Levenshtein distance, enabling fast certification of robustness in text models.
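For readers unfamiliar with the metric, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other. The following is a minimal sketch of the standard dynamic-programming computation; it is a textbook routine included for illustration and is not taken from the LipsLev repository. The function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to transform string a into string b (illustrative helper)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string is i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]
```

A perturbation budget of distance 1 or 2, as in the findings reported on this page, therefore corresponds to one or two character-level edits of the input text.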

Findings

01

Achieved 38.80% verified accuracy at distance 1 on AG-News

02

Achieved 13.93% verified accuracy at distance 2 on AG-News

03

LipsLev is 4 orders of magnitude faster than existing methods

Abstract

Text classifiers are vulnerable to small perturbations that, if chosen adversarially, can dramatically change the output of the model. Verification methods can provide robustness certificates against such adversarial perturbations by computing a sound lower bound on the robust accuracy. Nevertheless, existing verification methods incur prohibitive costs and cannot practically handle Levenshtein distance constraints. We propose the first method for computing the Lipschitz constant of convolutional classifiers with respect to the Levenshtein distance. We use these Lipschitz constant estimates for training 1-Lipschitz classifiers. This enables computing the certified radius of a classifier in a single forward pass. Our method, LipsLev, is able to obtain $38.80\%$ and $13.93\%$ verified accuracy at distance $1$ and $2$, respectively, on the AG-News dataset, while being $4$ orders of magnitude…
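To illustrate why a Lipschitz bound yields a certificate from a single forward pass, the sketch below uses the generic margin argument: if the gap between the top logit and any other logit can change by at most $L$ per unit of Levenshtein distance, then the prediction cannot flip for any input at integer distance $k$ with $k \cdot L$ strictly smaller than the observed margin. This is an assumption-laden illustration, not the exact certificate formula from the paper; the names `certified_radius`, `verified_accuracy`, and the parameter `lipschitz_margin` are hypothetical.

```python
import math
from typing import Sequence


def certified_radius(logits: Sequence[float], lipschitz_margin: float = 1.0) -> int:
    """Largest integer k such that k * lipschitz_margin < margin.

    Assumption (not from the paper): every pairwise logit gap is
    lipschitz_margin-Lipschitz with respect to the Levenshtein distance.
    """
    top = max(range(len(logits)), key=lambda i: logits[i])
    runner_up = max(v for i, v in enumerate(logits) if i != top)
    margin = logits[top] - runner_up
    if margin <= 0:
        return 0
    # Strict inequality: a margin of exactly k * L does not certify radius k.
    return max(0, math.ceil(margin / lipschitz_margin) - 1)


def verified_accuracy(all_logits, labels, k: int, lipschitz_margin: float = 1.0) -> float:
    """Fraction of examples that are correctly classified AND certified at radius >= k."""
    verified = 0
    for logits, label in zip(all_logits, labels):
        predicted = max(range(len(logits)), key=lambda i: logits[i])
        if predicted == label and certified_radius(logits, lipschitz_margin) >= k:
            verified += 1
    return verified / len(labels)
```

Because the radius depends only on the logits of the clean input, certification costs no more than ordinary inference, which is consistent with the speed advantage the abstract describes.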

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

The authors address an interesting problem in NLP, which is of great relevance given the prevalence, and related risks, of large language models. From a wider perspective, it is important to devise techniques to provide robustness certifications which are practically computable. The paper has thus a strong motivation. With respect to the approach, the paper clearly presents the limitations of existing techniques and orderly discusses the proposed solution. The paper is well structured, and it se

Weaknesses

**Relevance of ReLU-activated convolutional networks** While the paper is generally well motivated and an architecture similar to the one discussed has been previously investigated, I think that further details on the relevance of the method presented would be helpful. Specifically, it is not clear how practically useful ReLU-activated, fully-connected convolutional networks are in text classification. As part of the motivation is to make NLP verification practically achievable, I think that a

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The problem is significant and challenging. The original problem of certified robustness typically focuses on continuous, fixed-dimension inputs, which is different from the real-world NLP setting. The authors instead considered Levenshtein distance, which is a reasonable metric. 2. The proposed solution is interesting, efficient, and simple. This is a major advantage of this paper compared with prior approaches. 3. The writing is quite clear and readable. I enjoy reading this paper a lot.

Weaknesses

1. One major problem is that the Lipschitzness constraint is very strong and may harm the expressive power of the resulting model, thus hampering clean accuracy. In the standard certified robustness community in CV, constraining the Lipschitz constant does not show performance comparable to IBP. This problem is also theoretically studied in [Zhang et al. 2022]. Do you think your models in the experiments suffer from this problem in clean accuracy compared with other approaches in small perturbation sc

Reviewer 03 · Rating 8 · Confidence 3

Strengths

Overall, this work is a solid academic contribution, albeit with limited prospects for scalability. 1. I like that this approach provides closed-form, deterministic certificates. It is the first paper that I have seen tackling the challenging Levenshtein-distance setting without resorting to randomized approaches. 2. The paper writing is clean and structurally sensible. The problem setting is well-motivated and the necessary prior work is precisely and clearly introduced. 3. Figure and table pre

Weaknesses

1. My main concern with this approach is that a) the certified edit distances are very small (one or two characters), b) the models are tiny (only one convolutional layer), and c) I think the possibility for future improvement is minimal. Layer-level Lipschitz constants have been more or less discarded by the image classification robustness community for several years now, as exponential signal attenuation makes it difficult to construct verifiable deep networks. This problem setting is arguably muc

Code & Models

Repositories

lions-epfl/lipslev
PyTorch · Official

Videos

Certified Robustness Under Bounded Levenshtein Distance · SlidesLive

Taxonomy

Topics: Fault Detection and Control Systems · Multi-Criteria Decision Making