A Recipe for Improved Certifiable Robustness

Kai Hu; Klas Leino; Zifan Wang; Matt Fredrikson

arXiv:2310.02513·cs.LG·June 25, 2024·1 cites

TL;DR

This paper enhances certifiable robustness of neural networks by optimizing Lipschitz-based methods, introducing novel architectural components and data augmentation to significantly improve verification accuracy on benchmarks.

Contribution

The work introduces a comprehensive evaluation and novel design techniques, including Cholesky-orthogonalized residual layers, to advance Lipschitz-based certification methods.
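As a rough illustration of what Cholesky-based orthogonalization can look like, the NumPy sketch below turns an unconstrained matrix into one with orthonormal rows by factoring its Gram matrix. The function name `cholesky_orthogonalize` and the matrix shapes are assumptions made for this example; the sketch shows the general linear-algebra idea and is not taken from the authors' code.

```python
import numpy as np


def cholesky_orthogonalize(v):
    """Return a matrix with orthonormal rows spanning the row space of v.

    Assumes v has shape (m, n) with m <= n and linearly independent rows,
    so that the Gram matrix v @ v.T is symmetric positive definite.
    """
    gram = v @ v.T                     # (m, m) Gram matrix
    low = np.linalg.cholesky(gram)     # lower-triangular, gram = low @ low.T
    # w = low^{-1} v, hence w @ w.T = low^{-1} gram low^{-T} = I
    return np.linalg.solve(low, v)


rng = np.random.default_rng(0)
v = rng.standard_normal((4, 6))        # hypothetical unconstrained weight
w = cholesky_orthogonalize(v)

# Deviation of w @ w.T from the identity, and the spectral norm of w.
err = np.abs(w @ w.T - np.eye(4)).max()
spectral_norm = np.linalg.norm(w, 2)
```

Because the rows of `w` are orthonormal, its spectral norm is 1, so a linear layer using `w` as its weight is 1-Lipschitz in the L2 norm. A production implementation would likely use a dedicated triangular solver and guard against ill-conditioned Gram matrices.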

Findings

01

Significant improvement in deterministic verification accuracy on benchmarks.

02

Introduction of Cholesky-orthogonalized residual layers enhances network capacity.

03

Verification accuracy increases by up to 8.5 percentage points.

Abstract

Recent studies have highlighted the potential of Lipschitz-based methods for training certifiably robust neural networks against adversarial attacks. A key challenge, supported both theoretically and empirically, is that robustness demands greater network capacity and more data than standard training. However, effectively adding capacity under stringent Lipschitz constraints has proven more difficult than it may seem, as evidenced by the fact that state-of-the-art approaches tend more towards underfitting than overfitting. Moreover, we posit that a lack of careful exploration of the design space for Lipschitz-based approaches has left potential performance gains on the table. In this work, we provide a more comprehensive evaluation to better uncover the potential of Lipschitz-based certification methods. Using a combination of novel techniques, design optimizations, and synthesis of…
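For readers unfamiliar with how a Lipschitz bound yields a certificate, the sketch below shows one standard argument: if the logit map is K-Lipschitz in the L2 norm, the gap between any two logits can shrink by at most sqrt(2) * K * eps under a perturbation of norm eps, so a prediction is certified when the top-two margin exceeds that amount. The function names, the toy logits, the bound `K = 1.0`, and the radius `eps = 0.1` are all illustrative assumptions, not values or code from the paper.

```python
import numpy as np


def certified_radius(logits, lipschitz_bound):
    """L2 radius within which the top-1 class provably cannot change,
    assuming the whole logit map is `lipschitz_bound`-Lipschitz in L2."""
    second, top = np.sort(logits)[-2:]
    return (top - second) / (np.sqrt(2.0) * lipschitz_bound)


def verified_accuracy(logits_batch, labels, lipschitz_bound, eps):
    """Fraction of points that are both correctly classified and
    certified at radius eps."""
    preds = logits_batch.argmax(axis=1)
    radii = np.array(
        [certified_radius(z, lipschitz_bound) for z in logits_batch]
    )
    return float(np.mean((preds == labels) & (radii > eps)))


# Toy logits for three inputs (hypothetical values).
logits_batch = np.array([
    [3.0, 1.0, 0.0],   # correct, large margin: certified
    [0.5, 0.4, 0.0],   # correct, margin too small: not certified
    [0.0, 2.0, 0.1],   # misclassified: never counted
])
labels = np.array([0, 0, 2])

acc = verified_accuracy(logits_batch, labels, lipschitz_bound=1.0, eps=0.1)
```

In this toy batch only the first input counts, so the verified accuracy is one third even though clean accuracy is two thirds. Tighter per-pair Lipschitz bounds can certify more points than the global sqrt(2) * K bound used here.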

Peer Reviews

Decision·ICLR 2024 poster

Reviewer 01 · Rating 5: marginally below the acceptance threshold · Confidence 3

Strengths

* By combining technical improvements on three aspects as mentioned in the summary, the paper shows a significant empirical improvement over previous works across all the datasets (e.g., +8% on CIFAR-10).
* This work provides suggestions on better settings for robust training, in terms of model architecture with additional layers, building orthogonal layers with Cholesky decomposition, and data augmentation with a newer diffusion model.

Weaknesses

* The paper reads like a manual search over settings (model architecture, orthogonal layers, diffusion model). It has engineering merits, but adding more dense layers and replacing the diffusion model already used in Hu et al., 2023 with a newer one is not much of a novel contribution.
* The benefits of the best choices found by the paper are not well explained. For example, the paper only explains that the Cholesky-based orthogonalization is more numerically stable and fast

Reviewer 02 · Rating 6: marginally above the acceptance threshold · Confidence 2

Strengths

1. This work studies the limitations of Lipschitz-based certification and proposes new architectures to mitigate the issue.
2. Strong empirical results: experiments show noticeable improvement over the baseline models.

Weaknesses

The authors need to include some intuitions when designing the layers.

Reviewer 03 · Rating 8: accept, good paper · Confidence 3

Strengths

It finds that an apparent limitation preventing prior work from discovering the full potential of Lipschitz-based certification stems from the framing and evaluation setup. Specifically, most prior work is framed around a particular novel technique intended to supersede the state-of-the-art, necessitating evaluations centered on standardized benchmark hyperparameter design spaces, rather than exploring more general methods for improving performance (e.g., architecture choice, data pipeline, etc.

Weaknesses

Section 4.3 seems to mainly discuss the comparison with RS-based methods, but Table 5 lists several other works that achieve better performance. It would be better to also discuss the comparison with these works; currently, Table 5 only reports their results without detailed discussion.

Code & Models

Repositories

hukkai/liresnet
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing

Methods: Average Pooling · Kaiming Initialization · 1x1 Convolution · Batch Normalization · Convolution · Residual Block · Residual Connection · Global Average Pooling · Max Pooling