Certified Robustness Under Bounded Levenshtein Distance
Elias Abad Rocamora, Grigorios G. Chrysos, Volkan Cevher

TL;DR
This paper introduces LipsLev, a novel method to efficiently compute certified robustness of text classifiers against bounded Levenshtein distance perturbations, significantly improving speed and accuracy over previous methods.
Contribution
We develop the first approach to estimate Lipschitz constants for convolutional classifiers with respect to Levenshtein distance, enabling fast certification of robustness in text models.
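For reference, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal Python sketch of the standard dynamic program follows; the function name `levenshtein` is illustrative and is not taken from the paper's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to transform `a` into `b` (row-by-row dynamic program)."""
    # prev[j] holds the distance between the current prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

A perturbation budget of distance 1 or 2, as in the findings below, therefore allows an attacker at most one or two such character edits.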
Findings
Achieved 38.80% verified accuracy at distance 1 on AG-News
Achieved 13.93% verified accuracy at distance 2 on AG-News
LipsLev is 4 orders of magnitude faster than existing methods
Abstract
Text classifiers suffer from small perturbations that, if chosen adversarially, can dramatically change the output of the model. Verification methods can provide robustness certificates against such adversarial perturbations by computing a sound lower bound on the robust accuracy. Nevertheless, existing verification methods incur prohibitive costs and cannot practically handle Levenshtein distance constraints. We propose the first method for computing the Lipschitz constant of convolutional classifiers with respect to the Levenshtein distance. We use these Lipschitz constant estimates for training 1-Lipschitz classifiers. This enables computing the certified radius of a classifier in a single forward pass. Our method, LipsLev, is able to obtain 38.80% and 13.93% verified accuracy at distance 1 and 2 respectively in the AG-News dataset, while being orders of magnitude…
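To illustrate how a Lipschitz bound yields a certificate from a single forward pass, the sketch below assumes that the margin between the true-class logit and every other logit changes by at most `lipschitz` per unit of Levenshtein distance. Under that assumption, the prediction cannot flip within any integer distance k satisfying lipschitz * k < margin. The function names and this margin-based formulation are illustrative assumptions, not the paper's exact procedure.

```python
import math
from typing import Sequence


def certified_radius(logits: Sequence[float], label: int, lipschitz: float = 1.0) -> int:
    """Largest integer k with lipschitz * k < margin (0 if nothing is certified).

    Assumption (not from the paper): every pairwise margin
    logits[label] - logits[c] is `lipschitz`-Lipschitz with respect to
    the Levenshtein distance.
    """
    runner_up = max(v for i, v in enumerate(logits) if i != label)
    margin = logits[label] - runner_up
    if margin <= 0:
        return 0  # misclassified or tied: no certificate
    return math.ceil(margin / lipschitz) - 1


def verified_accuracy(batch_logits, labels, k: int, lipschitz: float = 1.0) -> float:
    """Fraction of samples certified at Levenshtein distance k (k >= 1)."""
    certified = sum(
        certified_radius(logits, y, lipschitz) >= k
        for logits, y in zip(batch_logits, labels)
    )
    return certified / len(labels)
```

For a 1-Lipschitz classifier, a sample with margin 2.5 would be certified up to distance 2 under these assumptions, since the logits come from one forward pass and no search over perturbed strings is needed.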
Peer Reviews
Decision·ICLR 2025 Poster
The authors address an interesting problem in NLP, which is of great relevance given the prevalence, and related risks, of large language models. From a wider perspective, it is important to devise techniques to provide robustness certifications which are practically computable. The paper has thus a strong motivation. With respect to the approach, the paper clearly presents the limitations of existing techniques and orderly discusses the proposed solution. The paper is well structured, and it se
**Relevance of ReLU-activated convolutional networks** While the paper is generally well motivated and an architecture similar to the one discussed has been previously investigated, I think that further details on the relevance of the method presented would be helpful. Specifically, it is not clear how practically useful ReLU-activated, fully-connected convolutional networks are in text classification. As part of the motivation is to make NLP verification practically achievable, I think that a
1. The problem is significant and challenging. The original problem of certified robustness typically focuses on continuous, fixed-dimension inputs, which differs from the real-world NLP setting. The authors instead consider the Levenshtein distance, which is a reasonable metric. 2. The proposed solution is interesting, efficient, and simple. This is a major advantage of this paper compared with prior approaches. 3. The writing is quite clear and readable. I enjoyed reading this paper a lot.
1. One major problem is that the Lipschitzness constraint is very strong and may harm the expressive power of the resulting model, thus hampering the clean accuracy. In the standard certified robustness community in CV, constraining the Lipschitz constant does not show performance comparable to IBP. This problem is also theoretically studied in [Zhang et al. 2022]. Do you think your models in the experiments suffer from this problem in clean accuracy compared with other approaches in small perturbation sc
Overall, this work is a solid academic contribution, albeit with limited prospects for scalability. 1. I like that this approach provides closed-form, deterministic certificates. It is the first paper that I have seen tackling the challenging Levenshtein-distance setting without resorting to randomized approaches. 2. The paper writing is clean and structurally sensible. The problem setting is well-motivated and the necessary prior work is precisely and clearly introduced. 3. Figure and table pre
1. My main concern with this approach is that a) the certified edit distances are very small (one or two characters), b) the models are tiny (only one convolutional layer), and c) I think the possibility for future improvement is minimal. Layer-level Lipschitz constants have been more or less discarded by the image classification robustness community for several years now, as exponential signal attenuation makes it difficult to construct verifiable deep networks. This problem setting is arguably muc
Taxonomy
Topics: Fault Detection and Control Systems · Multi-Criteria Decision Making
