Some Fundamental Aspects about Lipschitz Continuity of Neural Networks
Grigory Khromov, Sidak Pal Singh

TL;DR
This paper empirically investigates the Lipschitz continuity of neural networks across various architectures and datasets, revealing key behaviors like the double descent trend and effects of label noise on smoothness and generalisation.
Contribution
It provides a comprehensive empirical analysis of neural network Lipschitz bounds, highlighting their fidelity, trends, and the impact of label noise on model properties.
Findings
Lower Lipschitz bounds show remarkable fidelity.
Double Descent trend observed in Lipschitz bounds.
Label noise affects function smoothness and generalisation.
Abstract
Lipschitz continuity is a crucial functional property of any predictive model that naturally governs its robustness, generalisation, and adversarial vulnerability. Contrary to other works that focus on obtaining tighter bounds and developing different practical strategies to enforce certain Lipschitz properties, we aim to thoroughly examine and characterise the Lipschitz behaviour of Neural Networks. Thus, we carry out an empirical investigation in a range of different settings (namely, architectures, datasets, label noise, and more) by exhausting the limits of the simplest and the most general lower and upper bounds. As a highlight of this investigation, we showcase a remarkable fidelity of the lower Lipschitz bound, identify a striking Double Descent trend in both upper and lower bounds to the Lipschitz constant, and explain the intriguing effects of label noise on function smoothness…
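To make the idea of "the simplest and the most general lower and upper bounds" concrete, here is a minimal sketch for a small two-layer ReLU network. The abstract as quoted here does not spell out the bounds, so the choices below are assumptions: the lower bound is taken as the largest spectral norm of the input-Jacobian over a sample of points, and the upper bound as the product of the layers' spectral norms. The network sizes, the random data, and all function names are hypothetical and serve illustration only.

```python
import numpy as np

def init_mlp(rng, d_in=8, d_hidden=16, d_out=3):
    # Hypothetical two-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
    W1 = rng.normal(size=(d_hidden, d_in)) / np.sqrt(d_in)
    b1 = 0.1 * rng.normal(size=d_hidden)
    W2 = rng.normal(size=(d_out, d_hidden)) / np.sqrt(d_hidden)
    b2 = np.zeros(d_out)
    return W1, b1, W2, b2

def forward(params, x):
    W1, b1, W2, b2 = params
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def input_jacobian(params, x):
    # For a ReLU network the Jacobian at x is W2 @ diag(active) @ W1,
    # where `active` marks the hidden units that are switched on at x.
    W1, b1, W2, _ = params
    active = (W1 @ x + b1 > 0).astype(W1.dtype)
    return W2 @ (active[:, None] * W1)

def lower_lipschitz_bound(params, X):
    # Assumed lower bound: largest Jacobian spectral norm over the sample X.
    return max(np.linalg.norm(input_jacobian(params, x), ord=2) for x in X)

def upper_lipschitz_bound(params):
    # Assumed upper bound: product of layer spectral norms (ReLU is 1-Lipschitz).
    W1, _, W2, _ = params
    return np.linalg.norm(W1, ord=2) * np.linalg.norm(W2, ord=2)

rng = np.random.default_rng(0)
params = init_mlp(rng)
X = rng.normal(size=(256, 8))  # stand-in for a dataset

lower = lower_lipschitz_bound(params, X)
upper = upper_lipschitz_bound(params)
print(f"lower bound: {lower:.4f}, upper bound: {upper:.4f}")
```

The true Lipschitz constant (in the 2-norm) lies between the two values. The lower bound depends on which points are sampled, while the upper bound depends only on the weights, which is why the gap between them can be large.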
Peer Reviews
Decision: ICLR 2024 poster
The paper is very nice to read (if the appendix had been printed aside), it covers the topic pretty well and its relationship with related works is well described. It is mostly an experimental study and it seems to me that the methodology is correct. The many experiments, in various regimes and on many different real-life instances, are quite convincing to me. All experiments lead to a reasonable or theoretically supported interpretation. I quite appreciate that many experiments are real life: wit…
The article covers many subjects and proposes many illustrations of their findings. However, a limitation of this work lies in the lack of theoretical insights on the different findings that are discussed (which is an easy weakness to raise for any experimental paper, I admit). Reading this article is a constant back-and-forth between the main article and its appendix. It often feels that the main article is a glossary to the appendix. As such it often feels like the article should be 'vectorized…
- This paper conducted extensive experiments to showcase its findings and offers a comprehensive exploration of experimental details and discussions.
- The paper raised intriguing facets of Lipschitz continuity within neural network models, which are likely to attract substantial interest from the deep learning community aiming to develop theory and practical algorithms based on these observations.
- While the paper provides thorough experiments and in-depth discussions, its novelty might be subject to question. As also mentioned in the paper, there is concurrent research with a similar focus that has also highlighted the connection between the Lipschitz constant and Double Descent, although they tracked only an estimate of the Lipschitz constant. I appreciate the authors’ efforts in sharing more empirical observations and discussions. But I am not very confident that the paper is complete
Extensive experiments to show the different aspects in the discussion, as well as to provide evidence for their intuition and hypothesis.
Limited theoretical explanation of the different aspects discussed.
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
Methods: Test
