Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models

Andy Zhou; Jindong Wang; Yu-Xiong Wang; Haohan Wang

arXiv:2311.01441 · cs.LG · February 6, 2024 · 1 citation

TL;DR

This paper introduces a lightweight framework combining knowledge distillation and data augmentation to significantly enhance out-of-distribution robustness in vision models, leveraging robust foundation models as teachers.

Contribution

It demonstrates that large pretrained models serve as effective teachers for robustness and proposes Discrete Adversarial Distillation (DAD) using VQGAN for improved data augmentation.

Findings

1. Strong out-of-distribution robustness gains
2. Improved clean accuracy across architectures
3. Minor computational overhead

Abstract

We propose a conceptually simple and lightweight framework for improving the robustness of vision models through the combination of knowledge distillation and data augmentation. We address the conjecture that larger models do not make for better teachers by showing strong gains in out-of-distribution robustness when distilling from pretrained foundation models. Following this finding, we propose Discrete Adversarial Distillation (DAD), which leverages a robust teacher to generate adversarial examples and a VQGAN to discretize them, creating more informative samples than standard data augmentation techniques. We provide a theoretical framework for the use of a robust teacher in the knowledge distillation with data augmentation setting and demonstrate strong gains in out-of-distribution robustness and clean accuracy across different student architectures. Notably, our method adds minor…
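The knowledge distillation component mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It implements the generic temperature-scaled distillation loss (cross-entropy on the true label plus a KL term that pulls the student toward the teacher), not the paper's exact DAD objective. The function names, the `temperature` and `alpha` defaults, and the toy logits are all assumptions made for illustration; the adversarial example generation and VQGAN discretization steps are not modeled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label] + eps)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic distillation loss (illustrative, not the paper's objective).

    (1 - alpha) * CE(student, label)
      + alpha * T^2 * KL(teacher_T || student_T)
    """
    hard = cross_entropy(softmax(student_logits), label)
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return (1 - alpha) * hard + alpha * temperature ** 2 * soft


if __name__ == "__main__":
    # Hypothetical logits for one 3-class sample whose true class is 2.
    teacher = [0.1, 0.2, 4.0]
    student = [1.5, 0.3, 1.0]
    print(distillation_loss(student, teacher, label=2))
```

In a setup like the one the abstract describes, the same loss could in principle be evaluated on both the clean image and its augmented counterpart, with the teacher's logits recomputed for each input; how the paper actually weights those terms is not stated in this summary.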

Code & Models

Repositories

lapisrocks/DiscreteAdversarialDistillation (PyTorch, official)

Videos

Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Knowledge Distillation