The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing
Blaise Delattre, Alexandre Araujo, Quentin Barthélemy, Alexandre Allauzen

TL;DR
This paper explores the relationship between Lipschitz constant, variance, and margin in randomized smoothing, proposing a new certification method that improves robustness and certified radius of neural networks against noise and adversarial attacks.
Contribution
It introduces a novel approach to leverage the variance-margin trade-off and Bernstein's inequality to enhance certified robustness in randomized smoothing.
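To illustrate why a variance-aware inequality can help, the sketch below compares a one-sided Hoeffding deviation with a one-sided empirical Bernstein deviation for samples in [0, 1]. It assumes the Maurer and Pontil form of the empirical Bernstein inequality; the exact bound used in the paper may differ. The function names and the toy sample distributions are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def hoeffding_deviation(n, delta):
    """One-sided Hoeffding deviation for n i.i.d. samples in [0, 1]."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def empirical_bernstein_deviation(samples, delta):
    """One-sided empirical Bernstein deviation for samples in [0, 1].

    Assumed form (Maurer and Pontil):
        sqrt(2 * V * ln(2/delta) / n) + 7 * ln(2/delta) / (3 * (n - 1)),
    where V is the unbiased sample variance.
    """
    n = len(samples)
    var = statistics.variance(samples)  # unbiased sample variance
    log_term = math.log(2.0 / delta)
    return math.sqrt(2.0 * var * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))


rng = random.Random(0)
n, delta = 10_000, 0.001

# Toy data (hypothetical): soft scores concentrated near 0.9 (low variance)
# versus hard 0/1 votes with mean about 0.9 (higher variance).
low_var = [0.9 + 0.02 * rng.random() for _ in range(n)]
high_var = [float(rng.random() < 0.9) for _ in range(n)]

hoeff = hoeffding_deviation(n, delta)
eb_low = empirical_bernstein_deviation(low_var, delta)
eb_high = empirical_bernstein_deviation(high_var, delta)

print(f"Hoeffding:                 {hoeff:.5f}")
print(f"Emp. Bernstein (low var):  {eb_low:.5f}")
print(f"Emp. Bernstein (high var): {eb_high:.5f}")
```

The Hoeffding deviation depends only on the sample count, whereas the empirical Bernstein deviation shrinks as the sample variance shrinks, which is the mechanism a variance-margin trade-off can exploit.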
Findings
Significant improvement in certified accuracy over state-of-the-art methods.
Effective use of pre-trained models for zero-shot certification radius enhancement.
Enhanced bounds and probabilistic analysis improve robustness guarantees.
Abstract
Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks. The certified radius in this context is a crucial indicator of the robustness of models. However, how can one design an efficient classifier with an associated certified radius? Randomized smoothing provides a promising framework by relying on noise injection into the inputs to obtain a smoothed and robust classifier. In this paper, we first show that the variance introduced by the Monte Carlo sampling in the randomized smoothing procedure closely interacts with two other important properties of the classifier, i.e., its Lipschitz constant and margin. More precisely, our work emphasizes the dual impact of the Lipschitz constant of the base classifier, on both the smoothed classifier and the empirical variance. To increase the…
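The noise-injection and Monte Carlo steps mentioned in the abstract can be sketched as follows. This is a minimal illustration of standard randomized smoothing certification in the style of Cohen et al., not the paper's LVM-RS procedure. The 1-D base classifier, the function names, and the use of a Hoeffding lower bound (in place of the Clopper-Pearson interval used by Cohen et al.) are assumptions made to keep the example self-contained.

```python
import math
import random
from statistics import NormalDist


def base_classifier(x):
    """Toy 1-D base classifier (hypothetical): class 1 if x > 0, else class 0."""
    return 1 if x > 0.0 else 0


def smoothed_certify(x, sigma, n, delta, rng):
    """Monte Carlo prediction of the smoothed classifier with an l2 certified radius.

    Returns (label, radius), or (None, 0.0) when the procedure abstains.
    """
    votes = [0, 0]
    for _ in range(n):
        # Inject Gaussian noise into the input and record the base prediction.
        votes[base_classifier(x + rng.gauss(0.0, sigma))] += 1

    top = 0 if votes[0] >= votes[1] else 1
    p_hat = votes[top] / n

    # Assumption: Hoeffding lower confidence bound on the top-class probability.
    p_lower = p_hat - math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # not confident enough to certify

    p_lower = min(p_lower, 1.0 - 1e-12)  # keep inv_cdf's argument inside (0, 1)
    radius = sigma * NormalDist().inv_cdf(p_lower)
    return top, radius


label, radius = smoothed_certify(x=1.0, sigma=0.5, n=10_000, delta=0.001,
                                 rng=random.Random(0))
print(label, round(radius, 3))
```

In this toy setting the decision boundary sits at distance 1.0 from the input, so the certified radius should come out below 1.0. A tighter confidence bound on the top-class probability translates directly into a larger radius.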
Peer Reviews
Decision: ICLR 2024 poster
This paper identifies an interesting circular dependency among the ingredients of the RS procedure, and proposes a principled way to improve the current certification radius in a zero-shot manner. In general, the paper is well written and each statement seems to be well supported by proof, empirical evaluation, and intuitive explanation. In particular, Figure 1 does a great job of summarizing the contribution of this paper. Experimental results show superior performance to the current state-of-the-art met
1. In terms of presentation, I feel the authors could add an overview paragraph for Section 3. I sometimes fail to connect the subsections. It may be good to map the flow of Section 3 to Figure 1. 2. It would be good to discuss the computational cost of LVM-RS compared to RS and deterministic Lipschitz.
1. RS is currently the state-of-the-art certification method for obtaining certificates of robustness against $\ell_p$ attacks. However, the interaction of the properties of the base classifier with the final certificate is not very well understood. This paper attempts to understand how the Lipschitz constant of the base classifier affects the final certificate. This is an interesting problem to study and is potentially very useful for the community. 2. Further, the paper revisits compu
1. Writing: The writing of the paper is quite wordy in many places, and confusing at times, leading to major issues in readability and clarity. Details: - Abstract: It is unclear what “variance introduced by RS”, “simplex projection”, and “variance-margin tradeoff” mean. It is unclear what Bernstein’s inequality has to do with any of this. - Introduction: It would be good to define prediction margin. All methods with certified radius deal with a way of controlling/estimating the Lipschitz c
### Originality
The work takes a new look at an existing topic, by incorporating Lipschitz constraints into randomized smoothing. It borrows tools from different fields, like concentration inequalities, projections onto the simplex (argmax, softmax, sparsemax), and robustness certification against adversarial attacks.
### Clarity
Fig. 1 helps a lot with understanding.
### Significance
The proposed algorithm LVM-RS can be seamlessly integrated into existing frameworks, it is very general (
I struggled a lot to understand the paper at times. There is a lack of detail, and even with the help of the related literature (e.g. Cohen et al.) I couldn't be sure of what the authors were trying to say. My questions are detailed below. Clarifications would benefit the paper a lot.
Taxonomy
Topics: Speech and Audio Processing · Face and Expression Recognition
Methods: Randomized Smoothing · Balanced Selection
