Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness
Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung, Che-Rung Lee

TL;DR
This paper introduces a cost-effective Lipschitz continuity-based method that enhances adversarial robustness of neural networks without extensive data or high computational costs, making robust models more practical.
Contribution
The paper presents a novel, efficient approach leveraging Lipschitz continuity to improve adversarial robustness without additional data or gradient estimation.
Findings
Reduces computational overhead compared to traditional adversarial training
Maintains or improves robustness without extra generative data
Easily integrates with existing adversarial training frameworks
Abstract
As deep neural networks (DNNs) are increasingly deployed in sensitive applications, ensuring their security and robustness has become critical. A major threat to DNNs arises from adversarial attacks, where small input perturbations can lead to incorrect predictions. Recent advances in adversarial training improve robustness by incorporating additional examples from external datasets or generative models. However, these methods often incur high computational costs, limiting their practicality and hindering real-world deployment. In this paper, we propose a cost-efficient alternative based on Lipschitz continuity that achieves robustness comparable to models trained with extensive supplementary data. Unlike conventional adversarial training, our method requires only a single pass over the dataset without gradient estimation, making it highly efficient. Furthermore, our method can…
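For readers unfamiliar with the term, the standard definition of Lipschitz continuity, and the reason it relates to small input perturbations, can be written as follows. This is the textbook statement only; the symbols $f$, $L$, $\delta$, and $\epsilon$ and the choice of norm are assumptions made for illustration, and the paper's own formulation may differ.

```latex
% f is L-Lipschitz if, for all inputs x and x',
\| f(x) - f(x') \| \le L \, \| x - x' \|

% so a perturbation \delta with \|\delta\| \le \epsilon changes the output by at most
\| f(x + \delta) - f(x) \| \le L \, \epsilon
```

A smaller $L$ therefore limits how far a bounded perturbation can move the model's output.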
Peer Reviews
Decision: Submitted to ICLR 2025
The logic flow of this work is coherent, and the writing is clear and easy to understand. The idea of optimizing the Lipschitz constant to achieve certified adversarial robustness is also a classic approach. The implementation is very straightforward, and the proposed forged function can serve as a plug-in module that integrates with any CNN or transformer-based model.
Theoretical side: The bound is somewhat too loose. This paper derives the final optimization objective through the Gershgorin Circle Theorem, but this bound lacks guarantees due to the multiple assumptions made. Empirical side: On one hand, from Tables 1 and 2, it seems that the forged function does not show significant improvements; in fact, when combined with other robust methods, the accuracy under AutoAttack even decreases. On the other hand, the robustness of the forged function itself lac
The proposed forged function based on Lipschitz continuity is implemented during the inference phase, eliminating the need for retraining or model parameter adjustments. Compared to traditional adversarial training, this method offers a significant computational cost advantage, making it more efficient for applications with limited resources.
1. Please distinguish between the use of `\cite{}` and `\citep{}`.
2. The authors are encouraged to open-source their code.
3. In the "Related Work" section, the authors should mention the names of methods alongside author names to aid reader comprehension.
4. In Algorithm 1, Forged Function, please add a description of the hyperparameter $c^r$ in the `require` section.
5. AutoAttack is not the latest attack method; the authors are encouraged to use more advanced black-box attack methods, as ref

1. The method is highly cost-effective as it does not require model retraining or additional data. Moreover, due to its plug-and-play nature, the proposed method can be easily integrated with existing algorithms.
2. The authors provide detailed theoretical insights into the relationship between the Lipschitz constant and adversarial robustness.
3. The paper is well-organized and clearly presented, with extensive experiments across diverse datasets that validate the robustness of the proposed m

1. The experimental results, along with the authors' own analysis, indicate that the proposed method may reduce the classification accuracy on clean samples in some cases. While the authors discuss possible reasons, the paper could further analyze this trade-off and explore strategies to mitigate accuracy loss.
2. The choice of parameter $c^r$ is crucial to performance, but finding a single $c^r$ that performs well across all tasks is challenging based on the results presented. For example, $c^
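
One of the reviews above notes that the paper derives its objective through the Gershgorin Circle Theorem and that the resulting bound is loose. The sketch below is a generic illustration of that theorem, not the paper's derivation: it upper-bounds the spectral norm of a single weight matrix (the Lipschitz constant of a linear layer under the L2 norm) by the largest absolute row sum of its Gram matrix. The function name `gershgorin_spectral_norm_bound` and the random test matrix are hypothetical and chosen only for this example.

```python
import numpy as np


def gershgorin_spectral_norm_bound(weight: np.ndarray) -> float:
    """Upper-bound the spectral norm of `weight` using Gershgorin discs.

    Illustrative helper (an assumption of this sketch, not from the paper).
    """
    # The eigenvalues of W^T W are the squared singular values of W.
    gram = weight.T @ weight
    # Gershgorin: every eigenvalue lies in a disc centered at gram[i, i]
    # with radius sum_{j != i} |gram[i, j]|. The diagonal is non-negative,
    # so the largest eigenvalue is at most the largest absolute row sum.
    row_sums = np.abs(gram).sum(axis=1)
    return float(np.sqrt(row_sums.max()))


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

exact = float(np.linalg.norm(W, ord=2))  # largest singular value
bound = gershgorin_spectral_norm_bound(W)

print(f"exact spectral norm: {exact:.3f}")
print(f"Gershgorin bound:    {bound:.3f}")
```

The bound never falls below the exact value and is tight for matrices whose Gram matrix is diagonal, but for dense matrices it can be considerably larger, which is one way a Gershgorin-based bound can end up loose.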
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research
