A Closer Look at the Adversarial Robustness of Information Bottleneck Models

Iryna Korshunova; David Stutz; Alexander A. Alemi; Olivia Wiles; Sven Gowal

arXiv:2107.05712·cs.LG·July 14, 2021

TL;DR

This paper critically examines the adversarial robustness of information bottleneck models, revealing that they are not inherently robust and that earlier claims of improved defense were likely due to gradient obfuscation.

Contribution

It provides a comprehensive evaluation showing that information bottleneck models do not offer strong adversarial defenses when properly tested against white-box attacks.

Findings

01

Information bottleneck models are not robust against white-box $l_{\infty}$ attacks.

02

Previous claims of robustness may be due to gradient obfuscation.

03

Proper evaluation undermines earlier optimistic results.

Abstract

We study the adversarial robustness of information bottleneck models for classification. Previous works showed that the robustness of models trained with information bottlenecks can improve upon adversarial training. Our evaluation under a diverse range of white-box $l_{\infty}$ attacks suggests that information bottlenecks alone are not a strong defense strategy, and that previous results were likely influenced by gradient obfuscation.
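
To make the threat model concrete, the following is a minimal sketch of one standard white-box $l_{\infty}$ attack, projected gradient descent (PGD). It is not taken from the paper, which reports using a diverse range of attacks without this page naming them. The linear logistic classifier, the function names, and the values of `eps`, `step`, and `iters` are all assumptions chosen for illustration.

```python
import math
import random


def input_gradient(w, b, x, y):
    """Gradient of the logistic loss log(1 + exp(-m)) with respect to the input x.

    Here m = y * (w . x + b) and y is a label in {-1, +1}.
    The linear model is a hypothetical stand-in, not a model from the paper.
    """
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))  # dL/dm times dm/d(w . x)
    return [coeff * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, b, x, y, eps=0.2, step=0.05, iters=20, seed=0):
    """Sketch of PGD inside the l_inf ball of radius eps around x."""
    rng = random.Random(seed)
    # Random start inside the ball.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(iters):
        grad = input_gradient(w, b, x_adv, y)
        # Ascent step on the loss using only the sign of the gradient.
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back onto the ball by clipping each coordinate.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Illustrative values only.
w, b = [1.0, -2.0], 0.0
x, y = [0.3, -0.1], 1
x_adv = pgd_linf(w, b, x, y)
print(score(w, b, x), score(w, b, x_adv))
```

Each update depends only on the sign of the input gradient, so an attack of this kind stalls when the gradient is zero, noisy, or otherwise uninformative. A model can then appear robust without being so, which is the gradient obfuscation effect the abstract refers to.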

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning